ReverseMerge: Bedrock IAM Role auth (#1944) to main#1948
Conversation
…pter (#1944) * [FEAT] Allow Bedrock to fall through to boto3's default credential chain Match the S3/MinIO connector pattern: when AWS access keys are left blank on the Bedrock LLM and embedding adapter forms, drop them from the kwargs dict so boto3's default credential chain handles authentication. This unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on hosts that already have ambient AWS credentials (e.g. EKS workers with IRSA, EC2 with an instance profile). - llm1/static/bedrock.json: clarify access-key descriptions to mention IRSA and instance profile (already non-required at v0.163.2 base). - embedding1/static/bedrock.json: drop aws_access_key_id and aws_secret_access_key from top-level required; same description fix; expose aws_profile_name for parity with the LLM form. - base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters now strip empty access-key values from the validated kwargs before returning, so empty strings don't override boto3's default chain. AWSBedrockEmbeddingParameters fields gain explicit None defaults and an aws_profile_name field. Backward-compatible: existing adapters with access keys filled in continue to work unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FEAT] Add Authentication Type selector to Bedrock adapter form Add an explicit `auth_type` selector with two options, making the auth choice clear to users: - "Access Keys" (default): existing flow, keys required - "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on boto3's default credential chain (IRSA on EKS, task role on ECS, instance profile on EC2). Description on the selector explicitly notes this option is only for AWS-hosted Unstract deployments. The form-only auth_type field is stripped before LiteLLM validation in both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters. validate(). Empty access keys continue to be stripped so boto3 falls through to the default chain even when the access_keys arm is selected without values (matches the S3/MinIO connector pattern). Backward-compatible: legacy adapters without auth_type behave as "Access Keys" mode (the default), and existing keys are forwarded unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REVIEW] Address Bedrock auth_type review feedback Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on PR #1944. Behaviour fixes: - Stale-key leak in IAM Role mode: switching an existing adapter from Access Keys to IAM Role would carry truthy stored access keys through the strip-empty-only loop, so boto3 silently authenticated with the old long-lived credentials instead of falling through to the host's IRSA / instance-profile identity. Both LLM and embedding paths were affected. - Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or a malformed payload from a non-UI client passed through the dict comprehension untouched, with no enum guard. - Cross-field validation gap: explicit Access Keys mode with blank or whitespace-only values silently fell through to the default credential chain instead of surfacing the misconfiguration. Implementation: - Add a module-level _resolve_bedrock_aws_credentials helper used by both AWSBedrockLLMParameters.validate() and AWSBedrock EmbeddingParameters.validate(), so the auth-type contract is expressed once. - Validates auth_type against an allowlist (None | "access_keys" | "iam_role"); raises ValueError on anything else. - iam_role: unconditionally drops aws_access_key_id and aws_secret_access_key. - access_keys (explicit): requires non-blank values; raises ValueError if either is empty or whitespace-only. - Legacy (auth_type absent): retains the lenient strip behaviour so pre-PR adapter configurations continue to deserialise unchanged. - Restore aws_region_name as required (no `= None` default) on AWSBedrockEmbeddingParameters; only credentials may legitimately be absent. - Drop the orphan aws_profile_name field from embedding1/static/bedrock.json: it was added for parity with the LLM form but lives outside the auth_type oneOf and contradicts the selector's "no further input" semantics. The LLM form already had aws_profile_name pre-PR and is left alone for backwards compatibility. Tests: - New tests/test_bedrock_adapter.py covers 15 cases across LLM and embedding adapters: legacy-no-auth-type, explicit access_keys with valid/blank/whitespace keys, iam_role with stale/no keys, unknown auth_type rejection, cross-field validation, and preservation of unrelated params (model_id, aws_profile_name, region, thinking). Skipped (P2 nice-to-have): - Comment-scope clarification, MinIO reference rewording, validate-mutates-caller'\''s-dict, and the LLM form description nit about aws_profile_name visibility. These don'\''t change behaviour and can be addressed in a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
Caution Review failedPull request was closed or merged during review Summary by CodeRabbit
WalkthroughIntroduces ChangesBedrock Authentication Type Configuration
Sequence DiagramsequenceDiagram
participant User as User Config
participant Schema as Schema Validation
participant Validate as validate() Method
participant Resolve as _resolve_bedrock_aws_credentials
participant Output as Validated Output
User->>Schema: Provide config with optional auth_type
Schema->>Validate: Pass validated schema config
Validate->>Validate: Extract & remove auth_type from payload
Validate->>Validate: Call Pydantic with remaining fields
Validate->>Resolve: Pass validated kwargs + auth_type
alt auth_type = "access_keys"
Resolve->>Resolve: Enforce non-blank keys (raise if blank)
Resolve-->>Output: Keep credential fields
else auth_type = "iam_role"
Resolve->>Resolve: Remove credential fields
Resolve-->>Output: Drop access keys
else auth_type absent (legacy)
Resolve->>Resolve: Strip only blank credential values
Resolve-->>OUTPUT: Preserve non-blank keys
end
Output-->>User: Return processed validated config
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
|
| Filename | Overview |
|---|---|
| unstract/sdk1/src/unstract/sdk1/adapters/base1.py | Adds _resolve_bedrock_aws_credentials helper and auth_type handling to both LLM and Embedding Bedrock adapters; credentials are correctly excluded from Pydantic validation and resolved post-dump. |
| unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json | Moves aws_access_key_id/aws_secret_access_key from top-level properties into a JSON Schema dependencies block keyed on auth_type; adds auth_type enum selector with default access_keys. |
| unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/bedrock.json | Mirrors the LLM JSON schema change for the embedding adapter; identical auth_type dependency structure applied. |
| unstract/sdk1/tests/test_bedrock_adapter.py | New test file with 15 tests covering IAM role, access keys, legacy (no auth_type), blank/whitespace key rejection, and unknown auth_type for both LLM and embedding paths; missing whitespace-key variant for the embedding adapter. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[validate called with adapter_metadata] --> B{auth_type present?}
B -- "None (legacy)" --> C[Strip blank/None key fields silently]
B -- access_keys --> D{aws_access_key_id and aws_secret_access_key non-blank?}
B -- iam_role --> E[Pop both key fields unconditionally]
B -- unknown --> F[Raise ValueError]
D -- Yes --> G[Keep key fields in validated dict]
D -- No --> H[Raise ValueError]
C --> I[Return validated kwargs to LiteLLM]
E --> I
G --> I
H --> Z[Error surfaced to caller]
F --> Z
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json:76
**Misleading enum display label for IAM role option**
The label `"IAM Role / Instance Profile (on-prem AWS only)"` uses "on-prem AWS" in a way that may discourage users who are running Unstract on actual AWS infrastructure (ECS tasks with a task role, EKS pods with IRSA, EC2 with an instance profile) from selecting this mode. All those are cloud-AWS deployments, not "on-prem." A user reading this label could reasonably conclude the option doesn't apply to them and fall back to storing long-lived access keys instead. The full `description` text is accurate — the parenthetical on the display name is the problem. The same label is used in `embedding1/static/bedrock.json`.
### Issue 2 of 2
unstract/sdk1/tests/test_bedrock_adapter.py:185-200
**Missing whitespace-key test for the embedding adapter**
The LLM suite has `test_llm_access_keys_mode_whitespace_keys_raises` which verifies that a whitespace-only `aws_secret_access_key` is rejected. The embedding suite only has `test_embedding_access_keys_mode_blank_keys_raises` (empty string) but no equivalent whitespace test. The same `_resolve_bedrock_aws_credentials` code path runs for both adapters, so a regression in the `.strip()` check would go undetected for the embedding path.
Reviews (1): Last reviewed commit: "[HOTFIX] Add IAM Role / Instance Profile..." | Re-trigger Greptile
| ], | ||
| "enumNames": [ | ||
| "Access Keys", | ||
| "IAM Role / Instance Profile (on-prem AWS only)" |
There was a problem hiding this comment.
Misleading enum display label for IAM role option
The label "IAM Role / Instance Profile (on-prem AWS only)" uses "on-prem AWS" in a way that may discourage users who are running Unstract on actual AWS infrastructure (ECS tasks with a task role, EKS pods with IRSA, EC2 with an instance profile) from selecting this mode. All those are cloud-AWS deployments, not "on-prem." A user reading this label could reasonably conclude the option doesn't apply to them and fall back to storing long-lived access keys instead. The full description text is accurate — the parenthetical on the display name is the problem. The same label is used in embedding1/static/bedrock.json.
Prompt To Fix With AI
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json
Line: 76
Comment:
**Misleading enum display label for IAM role option**
The label `"IAM Role / Instance Profile (on-prem AWS only)"` uses "on-prem AWS" in a way that may discourage users who are running Unstract on actual AWS infrastructure (ECS tasks with a task role, EKS pods with IRSA, EC2 with an instance profile) from selecting this mode. All those are cloud-AWS deployments, not "on-prem." A user reading this label could reasonably conclude the option doesn't apply to them and fall back to storing long-lived access keys instead. The full `description` text is accurate — the parenthetical on the display name is the problem. The same label is used in `embedding1/static/bedrock.json`.
How can I resolve this? If you propose a fix, please make it concise.| """Embedding-side parity with the LLM stale-key fix.""" | ||
| out = AWSBedrockEmbeddingParameters.validate( | ||
| { | ||
| "auth_type": "iam_role", | ||
| "model": "amazon.titan-embed-text-v2:0", | ||
| "region_name": "us-east-1", | ||
| "aws_access_key_id": "STALE_KEY", | ||
| "aws_secret_access_key": "STALE_SECRET", | ||
| } | ||
| ) | ||
| assert "aws_access_key_id" not in out | ||
| assert "aws_secret_access_key" not in out | ||
| assert out["aws_region_name"] == "us-east-1" | ||
|
|
||
|
|
||
| def test_embedding_iam_role_mode_with_no_keys() -> None: |
There was a problem hiding this comment.
Missing whitespace-key test for the embedding adapter
The LLM suite has test_llm_access_keys_mode_whitespace_keys_raises which verifies that a whitespace-only aws_secret_access_key is rejected. The embedding suite only has test_embedding_access_keys_mode_blank_keys_raises (empty string) but no equivalent whitespace test. The same _resolve_bedrock_aws_credentials code path runs for both adapters, so a regression in the .strip() check would go undetected for the embedding path.
Prompt To Fix With AI
This is a comment left during a code review.
Path: unstract/sdk1/tests/test_bedrock_adapter.py
Line: 185-200
Comment:
**Missing whitespace-key test for the embedding adapter**
The LLM suite has `test_llm_access_keys_mode_whitespace_keys_raises` which verifies that a whitespace-only `aws_secret_access_key` is rejected. The embedding suite only has `test_embedding_access_keys_mode_blank_keys_raises` (empty string) but no equivalent whitespace test. The same `_resolve_bedrock_aws_credentials` code path runs for both adapters, so a regression in the `.strip()` check would go undetected for the embedding path.
How can I resolve this? If you propose a fix, please make it concise.
What
Back-merge of the Bedrock IAM Role / Instance Profile auth_type change (PR #1944, originally merged into
v0.163.2-hotfix) intomain.This PR is scoped to only the Bedrock changes. The other commit on
v0.163.2-hotfix(PR #1918, pluggable-worker import discovery) is excluded from this back-merge by request — it can be back-merged separately if needed.Cherry-picked commit:
c34f2409[HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter ([HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter #1944)Files touched (4):
unstract/sdk1/src/unstract/sdk1/adapters/base1.pyunstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.jsonunstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/bedrock.jsonunstract/sdk1/tests/test_bedrock_adapter.py(new)Why
Per the Hotfix Deployment Guide Step 9, hotfix changes must be back-merged to
mainafter production deployment so the fixes carry forward into future releases. This PR is the OSS-side back-merge for the Bedrock auth_type hotfix.Replaces #1946, which inadvertently included the unrelated PR #1918 (pluggable-worker hotfix) in the same back-merge.
How
Can this PR break any existing features. If yes, please list possible items. If no, please explain why.
No. The change was independently reviewed and merged via PR #1944. This PR only forwards that single commit to
main. Coverage is verified byunstract/sdk1/tests/test_bedrock_adapter.py(15 tests, all passing).Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
v0.163.2-hotfix)Dependencies Versions
Notes on Testing
Tested in PR #1944. No additional testing required for the back-merge itself.
Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.