ETag.parse accepts characters above the obs-text range (U+0100 and up) #19

@OmarAlJarrah

Description

The character-set guard in ETag.parse is supposed to enforce the entity-tag grammar from RFC 7232 §2.3, where etagc = %x21 / %x23-7E / obs-text and obs-text = %x80-FF. The check rejects everything at or below SP (0x20), plus DEL (0x7F), but applies no upper bound, so any codepoint at or above 0x80 is accepted. Since value is a Python str, that includes codepoints above 0xFF (e.g. € = U+20AC), which are outside obs-text and not valid entity-tag characters. The inline comment even states that only obs-text 0x80-0xFF should stay permitted, so the code and its own comment disagree.

Where

packages/dexpace-sdk-core/src/dexpace/sdk/core/http/common/etag.py:103-108

# RFC 7232 §2.3: etagc = %x21 / %x23-7E / obs-text. Everything at or
# below SP (0x20) and DEL (0x7F) is outside the entity-tag character
# set; obs-text (0x80-0xFF) stays permitted.
if any(ord(ch) <= 0x20 or ord(ch) == 0x7F for ch in value):
    raise ValueError(f"Invalid ETag: illegal character in {raw!r}")

Impact

>>> from dexpace.sdk.core.http.common import ETag
>>> tag = ETag.parse('"a€b"')   # U+20AC, above obs-text
>>> tag.value
'a€b'
>>> str(tag)
'"a€b"'

A tag with a codepoint above 0xFF is accepted instead of being rejected as a malformed entity-tag. The accepted value then round-trips through ETag.__str__, and request_conditions._format_etags uses __str__ when building If-Match / If-None-Match request headers, so a non-conformant validator can flow back out onto the wire instead of being caught at parse time. ETag.parse is the canonical place to enforce this, and right now it does not match the grammar it documents.
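
As a rough illustration of why 0xFF is the natural ceiling: obs-text is defined in terms of single octets, and a Python str codepoint above 0xFF has no single-octet form. The sketch below uses a hypothetical helper, fits_in_single_octets, and assumes header values are serialized as Latin-1. That is an assumption about the transport layer, not something this issue confirms for the SDK.

```python
def fits_in_single_octets(value: str) -> bool:
    """Return True if every character maps to exactly one octet (0x00-0xFF).

    Hypothetical helper for illustration only; it is not part of
    dexpace.sdk.core. Latin-1 is used because it maps codepoints
    0x00-0xFF one-to-one onto octets.
    """
    try:
        value.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


print(fits_in_single_octets("a\xe9b"))  # True: U+00E9 is within obs-text
print(fits_in_single_octets("a€b"))     # False: U+20AC is above 0xFF
```

Under that assumption, a tag such as "a€b" could not be written out octet-for-octet at all, which is consistent with the grammar stopping at 0xFF.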

Suggested fix

Add the missing upper bound so codepoints above 0xFF are rejected, matching the comment and RFC 7232 §2.3:

if any(ord(ch) <= 0x20 or ord(ch) == 0x7F or ord(ch) > 0xFF for ch in value):
    raise ValueError(f"Invalid ETag: illegal character in {raw!r}")

A test pinning rejection of a character above 0xFF (e.g. ETag.parse('"a€b"')) would lock the behavior in alongside the existing control-char and space cases in tests/http/test_etag.py.
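
A minimal, self-contained sketch of what such a test could pin down. It does not import the SDK: has_illegal_etagc is a hypothetical stand-in that copies only the condition from the suggested fix, so the real test would call ETag.parse and expect ValueError instead.

```python
def has_illegal_etagc(value: str) -> bool:
    """Mirror of the proposed guard condition (hypothetical helper).

    Returns True when any character is at or below SP (0x20), is DEL
    (0x7F), or lies above the obs-text ceiling (0xFF).
    """
    return any(
        ord(ch) <= 0x20 or ord(ch) == 0x7F or ord(ch) > 0xFF
        for ch in value
    )


# Characters the guard should reject.
assert has_illegal_etagc("a€b")      # U+20AC, above obs-text (the new case)
assert has_illegal_etagc("a b")      # SP, 0x20
assert has_illegal_etagc("a\x7fb")   # DEL, 0x7F
assert has_illegal_etagc("a\tb")     # control character, 0x09

# Characters the guard should still accept.
assert not has_illegal_etagc("abc123")   # plain ASCII
assert not has_illegal_etagc("a\xe9b")   # U+00E9, inside obs-text
assert not has_illegal_etagc("a\xffb")   # U+00FF, upper edge of obs-text
```

The boundary cases at U+00FF and U+20AC are the ones that distinguish the fixed guard from the current one; the rest already behave the same way today.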

Labels

bug (Something isn't working), good first issue (Good for newcomers)
