fix: Prompt studio index and run issues #1
Merged
jaseemjaskp merged 1 commit into main on Feb 26, 2024
Conversation
nehabagdia approved these changes on Feb 26, 2024

Contributor: @jaseemjaskp I am not merging this yet as I want you to confirm whether org_id is really mandatory and whether the approach to multi-tenancy sounds OK.
Contributor: @nehabagdia I think it is impossible to get a platform key without org_id.
jaseemjaskp approved these changes on Feb 26, 2024
pk-zipstack pushed a commit that referenced this pull request on Aug 20, 2025
fix: Prompt studio index and run issues
ritwik-g added a commit that referenced this pull request on Nov 10, 2025
1. Remove deprecated acquire_slot() method (PR comment #1)
   - Method was marked deprecated and is not used anywhere in the codebase
   - check_and_acquire() is the only method used (atomic operation)
   - Dead-code removal improves maintainability
2. Add a detailed comment explaining TTL vs. manual cleanup (PR comment #2)
   - Clarifies why _cleanup_expired_entries() is needed even with a TTL
   - The Redis ZSET TTL expires the entire key, not individual entries
   - Manual cleanup (ZREMRANGEBYSCORE) removes stale entries within the ZSET
   - Both mechanisms work together for accurate rate limiting

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
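The TTL-vs-cleanup distinction above can be illustrated with a minimal sketch. This is an in-memory stand-in for the Redis ZSET (member → acquisition-timestamp score), not the project's actual `rate_limiter.py`: a key-level `EXPIRE` would drop the whole structure at once, so per-entry pruning in the style of `ZREMRANGEBYSCORE(key, "-inf", now - ttl)` is still needed before counting. The function names are illustrative.

```python
TTL_SECONDS = 6 * 3600  # mirrors the API_DEPLOYMENT_RATE_LIMIT_TTL_HOURS default

def cleanup_expired_entries(zset: dict[str, float], now: float) -> int:
    """Prune entries whose score (acquisition time) is older than the TTL.

    Equivalent in spirit to ZREMRANGEBYSCORE(key, "-inf", now - TTL_SECONDS):
    it removes individual stale members, which key-level TTL cannot do.
    Returns the number of entries pruned.
    """
    cutoff = now - TTL_SECONDS
    stale = [member for member, score in zset.items() if score <= cutoff]
    for member in stale:
        del zset[member]
    return len(stale)

def active_count(zset: dict[str, float], now: float) -> int:
    """Accurate usage (ZCARD) requires pruning stale entries first."""
    cleanup_expired_entries(zset, now)
    return len(zset)
```

Without the cleanup step, an execution that crashed without releasing its slot would count against the limit until the whole key expired.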
vishnuszipstack pushed a commit that referenced this pull request on Nov 10, 2025
…he and per-org locks (#1649)

* UN-2972: Implement API deployment rate limiting with Django cache and per-org locks

This commit implements a comprehensive rate-limiting system for API deployments with atomic operations, caching, and management commands.

Features:
- Organization-level and global concurrent request limits
- Per-organization Redis distributed locks to prevent race conditions
- Django cache integration for org limits (~95% reduction in DB queries)
- Automatic cache invalidation on limit updates
- Centralized constants to eliminate magic strings
- Management commands for easy administration

Technical Details:
- Uses a Redis ZSET for tracking active executions with TTL-based cleanup
- Per-org locks ensure atomic check-and-acquire operations
- Global limit enforced with eventual consistency (acceptable tolerance)
- Rate limiter fails open on Redis errors for system reliability
- Retry-After header in 429 responses for proper client backoff

New Files:
- backend/api_v2/rate_limit_constants.py: Centralized keys and constants
- backend/api_v2/rate_limiter.py: Core rate-limiting logic
- backend/api_v2/migrations/0003_add_organization_rate_limit.py: DB migration
- backend/api_v2/management/commands/set_org_rate_limit.py: Set org limits
- backend/api_v2/management/commands/get_org_rate_limit.py: View org limits
- backend/api_v2/management/commands/list_org_rate_limits.py: List all limits

Modified Files:
- backend/api_v2/models.py: OrganizationRateLimit model with auto cache clearing
- backend/api_v2/deployment_helper.py: Use atomic check_and_acquire
- backend/api_v2/admin.py: Admin interface for OrganizationRateLimit
- backend/api_v2/exceptions.py: RateLimitExceeded exception
- backend/backend/exceptions.py: Add Retry-After header to 429 responses
- backend/workflow_manager/workflow_v2/models/execution.py: Auto-release on completion
- backend/backend/settings/base.py: Rate-limiting configuration
- backend/sample.env: Configuration documentation

Configuration (environment variables):
- API_DEPLOYMENT_DEFAULT_RATE_LIMIT: Default per-org limit (default: 5)
- API_DEPLOYMENT_GLOBAL_RATE_LIMIT: System-wide limit (default: 50)
- API_DEPLOYMENT_RATE_LIMIT_TTL_HOURS: ZSET entry TTL (default: 6 hours)
- API_DEPLOYMENT_RATE_LIMIT_CACHE_TTL: Cache TTL (default: 1 hour)
- API_DEPLOYMENT_RATE_LIMIT_LOCK_TIMEOUT: Lock timeout (default: 2 seconds)
- API_DEPLOYMENT_RATE_LIMIT_LOCK_BLOCKING_TIMEOUT: Lock wait (default: 5 seconds)
- API_DEPLOYMENT_RATE_LIMIT_RETRY_AFTER: Retry-After value (default: 300 seconds)

Management Commands:
- python manage.py set_org_rate_limit <org-id-or-name> <limit>
- python manage.py get_org_rate_limit <org-id-or-name> [--clear-cache]
- python manage.py list_org_rate_limits [--with-usage]

Performance Impact:
- DB queries for org limits: ~95% reduction (cached)
- Rate-limit check latency: 10-20 ms (an acceptable tradeoff for no race conditions)
- Lock contention: minimal (per-org, not global)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* UN-2972 [FIX] Move signal imports to top of models.py to fix Ruff E402

Fix Ruff linting errors by moving the Django signal imports to the top of the file, per PEP 8 conventions.

Changes:
- Moved `from django.db.models.signals import post_delete` to line 7
- Moved `from django.dispatch import receiver` to line 8
- Removed the duplicate imports from lines 268-269 (bottom of file)
- The signal handler function remains at the bottom, unchanged

Fixes Ruff errors:
- E402: Module level import not at top of file (line 266)
- E402: Module level import not at top of file (line 267)

Also installed pre-commit hooks locally for future commits.
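The atomic check-and-acquire under a per-org lock described above can be sketched as follows. This is an illustrative, process-local stand-in (Python `threading.Lock` and sets) for what the commit implements with Redis distributed locks and a ZSET; the names `check_and_acquire` and `release` follow the commit message, but the signatures are assumptions.

```python
import threading
from collections import defaultdict

DEFAULT_ORG_LIMIT = 5  # the API_DEPLOYMENT_DEFAULT_RATE_LIMIT default above

# One lock per org: acquiring a slot for org A never blocks org B,
# which is the "minimal lock contention (per-org, not global)" property.
_org_locks: defaultdict[str, threading.Lock] = defaultdict(threading.Lock)
_active: defaultdict[str, set[str]] = defaultdict(set)

def check_and_acquire(org_id: str, execution_id: str,
                      limit: int = DEFAULT_ORG_LIMIT) -> bool:
    """Check usage and acquire a slot as one atomic step under the org lock."""
    with _org_locks[org_id]:
        if len(_active[org_id]) >= limit:  # the limiter blocks at usage >= limit
            return False
        _active[org_id].add(execution_id)
        return True

def release(org_id: str, execution_id: str) -> None:
    """Free the slot when the execution completes (or fails)."""
    with _org_locks[org_id]:
        _active[org_id].discard(execution_id)
```

Doing the check and the insert under one lock is what closes the race: two concurrent requests can no longer both observe `usage < limit` and both proceed.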
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Fix organization lookup in rate limit management commands

The Organization.organization_id field is a CharField (not a UUIDField), storing custom string IDs like "org_qijtoAkJNhznYhNt". The previous implementation attempted UUID parsing, which failed for these values.

Changes:
- Remove the UUID validation that assumes organization_id is a UUID
- Try the organization_id lookup first (handles both UUID and string formats)
- Fall back to a name lookup if the organization_id lookup fails
- Update the help text to remove the "UUID" reference

This allows commands to work with both:
- Organization ID: python manage.py set_org_rate_limit org_qijtoAkJNhznYhNt 10
- Organization name: python manage.py set_org_rate_limit zipstack 10

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Add delete_org_rate_limit management command and improve error logging

The new command allows administrators to remove custom organization rate limits, reverting organizations back to the system default limit.

Features:
- Delete the custom rate limit for an organization (by ID or name)
- Confirmation prompt (can be skipped with --force)
- Shows the default limit that will be used after deletion
- Warns if current usage exceeds the default
- Automatic cache clearing via the post_delete signal

Also improved error logging in WorkflowExecution to use logger.exception() instead of logger.error() for better stack-trace visibility.

Usage:
  python manage.py delete_org_rate_limit org_qijtoAkJNhznYhNt
  python manage.py delete_org_rate_limit zipstack --force

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Add cache clearing command and improve cache TTL behavior

New Features:
1. Management command to clear the organization rate-limit cache
   - Clear a specific org: --org-id <org_id>
   - Clear all orgs with custom limits: (default)
   - Clear ALL orgs: --all
   - Uses Redis pattern deletion (rate_limit:cache:org_limit:*) for performance; falls back to individual deletion
2. Reduced cache TTL from 1 hour to 10 minutes
   - Allows faster pickup of default-limit changes from ENV
   - Still provides a significant DB query reduction
3. Implement cache TTL refresh on every read
   - The TTL is extended by 10 minutes on every cache hit
   - Frequently used orgs stay cached indefinitely
   - Inactive orgs expire after 10 minutes
   - LRU-like behavior without an explicit LRU cache

Usage:
  # Clear a specific organization's cache
  python manage.py clear_org_rate_limit_cache --org-id org_qijtoAkJNhznYhNt
  # Clear the cache for all orgs with custom limits
  python manage.py clear_org_rate_limit_cache
  # Clear the cache for ALL organizations (uses pattern deletion if Redis)
  python manage.py clear_org_rate_limit_cache --all

This is useful when changing API_DEPLOYMENT_DEFAULT_RATE_LIMIT to ensure organizations pick up the new default value immediately.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Add comprehensive rate limiting documentation and update default limits

Documentation:
- Created docs/API_DEPLOYMENT_RATE_LIMITING.md with:
  - Architecture overview (per-org + global limits, Redis locks, cache)
  - All 7 ENV variables with detailed explanations
  - All 5 management commands with usage examples
  - Usage scenarios and best practices
  - Troubleshooting guide
  - Performance characteristics and security considerations
  - Confirms rate limiting ONLY affects API deployments (not ETL/tasks)

Default Limit Updates (3 locations):
- Organization limit: 5 → 20 concurrent requests
- Global limit: 50 → 100 concurrent requests
- Updated in:
  1. backend/api_v2/rate_limit_constants.py (RateLimitDefaults)
  2. backend/sample.env (environment variable defaults)
  3. backend/backend/settings/base.py (Django setting defaults)

Also updated the cache TTL default in base.py: 3600 s → 600 s (10 minutes), matching the change made in the previous commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Use centralized constants and remove Retry-After functionality

Issues Fixed:
1. RateLimitExceeded was not using the centralized RateLimitMessages constants
2. The response-format documentation didn't match the actual drf-standardized-errors format
3. The Retry-After header and retry_after_seconds were not useful (hard to predict accurately)

Changes:
- RateLimitExceeded now uses RateLimitMessages.get_org_limit_exceeded_message() and get_global_limit_exceeded_message() for consistent error messages
- Removed the retry_after_seconds parameter from RateLimitExceeded.__init__()
- Removed retry_after_seconds from the rate_limiter.py return dicts
- Removed retry_after_seconds from the deployment_helper.py exception call
- Removed dead Retry-After header code from backend/exceptions.py
- Removed API_DEPLOYMENT_RATE_LIMIT_RETRY_AFTER from:
  - sample.env
  - settings/base.py
  - rate_limit_constants.py (DEFAULT_RETRY_AFTER_SECONDS)
- Updated the documentation to show the actual drf-standardized-errors response format:
  { "type": "client_error", "errors": [{"code": "error", "detail": "...", "attr": null}] }
- Removed all Retry-After references from the documentation

Clients should implement their own retry logic with exponential backoff. Rate limits are released when active requests complete.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Fix late imports and comparison bug in rate limit management commands

- Move the APIDeploymentRateLimiter import to the module top (PEP 8 compliance)
- Fix an off-by-one warning bug: change > to >= to match the rate limiter's blocking logic
- Affects: delete_org_rate_limit, set_org_rate_limit, get_org_rate_limit, list_org_rate_limits

The rate limiter blocks when usage >= limit, so warnings must also use >= for consistency.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Refactor rate limiting to view layer with dual release paths

**Problem:** A review bot identified a critical bug where early failures (DB, file staging, queue dispatch) would leak rate-limit slots for 6 hours, potentially exhausting org/global quotas.

**Root Cause:**
- Rate limiting was in deployment_helper.py
- The helper has an internal try-catch that swallows exceptions and returns an error response
- Exceptions that don't propagate can't be caught by outer handlers
- No slot release on these failures → orphaned slots

**Solution:** Move rate limiting to the view layer with a dual release strategy:

1. **View Layer (api_deployment_views.py)**
   - Acquires a rate-limit slot before calling execute_workflow()
   - Wraps execute_workflow() in a try-catch
   - Releases the slot if an exception propagates (early setup failures)
2. **Helper Layer (deployment_helper.py)**
   - Accepts an optional execution_id parameter
   - The existing exception handler now also releases the slot
   - Handles failures in async dispatch that don't propagate
3. **Signal (workflow completion)**
   - Unchanged: releases the slot when the async job completes successfully

**Coverage:**
✅ Lines 197-241 failures (Tag creation, WorkflowExecution, file staging) → View catches & releases
✅ Lines 243-289 failures (Async dispatch, config checks) → Helper catches & releases
✅ Async job completion → Signal releases
✅ No double release (each path has one release point)
✅ No orphaned slots on any error path

**Files Changed:**
- api_deployment_views.py: Add rate-limit check + blanket exception handler
- deployment_helper.py: Accept execution_id param, add slot release to exception handler
- API_DEPLOYMENT_RATE_LIMITING.md: Update architecture diagram and code locations

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Use logger.exception() to capture full traceback in rate limit exception handler

Changed logger.error() to logger.exception() in the api_deployment_views.py exception handler to automatically include the full stack trace. This provides better debugging information when workflow execution fails during rate-limit-protected operations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

* Update documentation with generic organization IDs

* Address PR review comments: remove dead code and clarify TTL behavior

1. Remove deprecated acquire_slot() method (PR comment #1)
   - Method was marked deprecated and is not used anywhere in the codebase
   - check_and_acquire() is the only method used (atomic operation)
   - Dead-code removal improves maintainability
2. Add a detailed comment explaining TTL vs. manual cleanup (PR comment #2)
   - Clarifies why _cleanup_expired_entries() is needed even with a TTL
   - The Redis ZSET TTL expires the entire key, not individual entries
   - Manual cleanup (ZREMRANGEBYSCORE) removes stale entries within the ZSET
   - Both mechanisms work together for accurate rate limiting

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
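The view-layer half of the dual release strategy can be sketched as below. This is a hedged reconstruction: the names (`check_and_acquire`, `release`, `execute_workflow`) follow the commit messages, but the signatures, the `limiter` object, and `RateLimitExceeded`'s shape are illustrative assumptions, not the real `api_deployment_views.py` code.

```python
class RateLimitExceeded(Exception):
    """Illustrative stand-in for api_v2.exceptions.RateLimitExceeded (HTTP 429)."""

def run_deployment(limiter, org_id: str, execution_id: str, execute_workflow):
    # Acquire the slot before doing any work.
    if not limiter.check_and_acquire(org_id, execution_id):
        raise RateLimitExceeded(f"Concurrent request limit reached for {org_id}")
    try:
        # The helper layer swallows late (post-dispatch) errors and releases
        # the slot itself; the view only covers exceptions that propagate here.
        return execute_workflow(execution_id)
    except Exception:
        # Early setup failure (DB, file staging, ...): free the slot so it
        # doesn't stay orphaned for the 6-hour ZSET TTL, then re-raise.
        limiter.release(org_id, execution_id)
        raise
    # On success the slot stays held; the workflow-completion signal releases it.
```

Each failure path has exactly one release point, which is what rules out both orphaned slots and double releases.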
ritwik-g pushed a commit that referenced this pull request on May 5, 2026
…o CORS (#1938)

* UN-3439 [FIX] Accept wildcard subdomain origins in SocketIO and Django CORS

Production socket connections were failing for `*.env.us-central.unstract.com` because python-socketio does an exact-string comparison on `cors_allowed_origins`, so a literal `*` pattern silently rejected every real subdomain.

- Add `CORS_ALLOWED_ORIGIN_REGEXES` derived from `WEB_APP_ORIGIN_URL_WITH_WILD_CARD`.
- Wire SocketIO via `_RegexOrigin`, whose `__eq__` does the regex match: a single list entry covers all wildcard subdomains, no library subclass needed.
- Normalize `WEB_APP_ORIGIN_URL` through `urlparse` so trailing slashes / paths in env are stripped (also fixes the `…com//oauth-status/` double slash).
- Add a startup guard for malformed env values.

Resolves item #1 of UN-3439. Items #2/#3 (decoupling indexing from Socket.io, fallback) are owned separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3439 [FIX] Address PR review: canonical origin, fullmatch, unhashable RegexOrigin, tests

Addresses five review comments on #1938:

1. coderabbitai (Major): RFC 6454 canonicalization. Browsers serialize `Origin` headers with a lowercase host and no explicit default ports; `parsed_url.netloc` preserved both, so `https://APP.EXAMPLE.COM:443` would silently fail to match the browser's `https://app.example.com`. Switch to `parsed_url.hostname` + dropping default ports, and reject non-http(s) schemes at startup.
2. greptile (P2): `re.fullmatch` instead of `re.match`. With `re.match` plus `$`, a candidate ending in `\n` matches because `$` is allowed before an optional trailing newline. `fullmatch` removes the ambiguity.
3. self: `_RegexOrigin.__hash__` violated `a == b ⇒ hash(a) == hash(b)` (one fixed pattern hash vs. many matching strings). Today this is masked because python-socketio uses linear `__eq__` on a list, but if the allow-list is ever wrapped in a set, every legitimate subdomain would silently be rejected, which is exactly the failure mode UN-3439 closes. Make instances unhashable so the contract can't be broken.
4. self: No regression tests. Add `backend/utils/tests/test_cors_origin.py` (33 cases) covering: regex match/no-match, lookalike spoofing, scheme mismatch, trailing-newline rejection, the non-string equality protocol, unhashability, ReDoS bounds, URL normalization (case, default ports, trailing slash, paths, queries), startup-guard rejections (empty, no scheme, non-browser scheme, no host), and end-to-end via the same `RegexOrigin` path SocketIO uses.
5. self: Over-clever wildcard-to-regex builder. The `split('*').join(re.escape, ...)` construction generalised to N wildcards, but the input has exactly one; replace it with a direct rf-string that is self-evident on review.

Refactor for testability: extract `RegexOrigin` and `normalize_web_app_origin` into `backend/utils/cors_origin.py` (Django-free, importable from settings and tests). Settings now delegates to one helper call; `log_events.py` imports `RegexOrigin`. No behavioural change beyond what each comment fixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3439 [FIX] Address SonarCloud quality gate

The Sonar quality gate failed with C reliability + 5 security hotspots, all on the new test file:

- S905 (Bug, Major): `{ro}` was flagged as a no-side-effect statement (Sonar doesn't see the implicit `__hash__` call) and drove the C reliability rating. Fix: use `len({ro})` so the side effect is via an explicit function call; the test still asserts the same `TypeError`.
- S5727 (Code Smell, Critical): `assert ro != None` is tautological and doesn't exercise `__eq__`. Switch to `(ro == None) is False`, which directly tests that `NotImplemented` falls back to identity equality.
- S5332 × 5 (Hotspots): `http://` and `ftp://` literals in test data. These are intentional inputs proving the rejection logic. Annotate with `# NOSONAR` and an explanatory comment so the hotspots can be marked reviewed.

No production code changed; tests still 33/33 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3439 [FIX] Remove last S5727 code smell: test __eq__ via dunder

Sonar S5727 correctly inferred that ``ro == None`` is statically always False (NotImplemented falls back to identity), making the assertion look tautological. The intent is to lock the protocol contract: ``__eq__`` must return the ``NotImplemented`` sentinel for non-strings. Test that directly via ``ro.__eq__(None) is NotImplemented`` instead of going through ``==``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3439 [FIX] Address remaining CodeRabbit nits: port validation, ReDoS bound

Two minor follow-ups from the second CodeRabbit pass:

- `parsed.port` is a property that raises ValueError on malformed or out-of-range inputs (e.g. `:abc`, `:99999`). That bypassed our normalized config-error message and surfaced as a stack trace. Wrap the access and re-raise with the same actionable text. Adds two test cases (`https://example.com:abc`, `https://example.com:99999`) to lock the new behaviour.
- The 50 ms ReDoS timing bound is too tight for noisy CI runners. Loosen it to 500 ms, still orders of magnitude below what catastrophic backtracking would produce.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
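The ideas above (a regex-equal origin object, `fullmatch`, unhashability, RFC 6454-style normalization with guarded port access) fit together as in the sketch below. This is an illustrative reconstruction, not the contents of `backend/utils/cors_origin.py`; details such as the error-message text are assumptions.

```python
import re
from urllib.parse import urlparse

class RegexOrigin:
    """Equal to any string the pattern fully matches, so one list entry in
    python-socketio's exact-match `cors_allowed_origins` covers every
    wildcard subdomain."""

    # One pattern equals many strings, so `a == b => hash(a) == hash(b)`
    # cannot hold; make instances unhashable rather than break the contract.
    __hash__ = None

    def __init__(self, pattern: str) -> None:
        self._regex = re.compile(pattern)

    def __eq__(self, other: object):
        if not isinstance(other, str):
            return NotImplemented  # protocol contract for non-strings
        # fullmatch, not match + "$": "$" would tolerate a trailing newline
        return self._regex.fullmatch(other) is not None

def normalize_web_app_origin(url: str) -> str:
    """Canonical origin per RFC 6454: lowercase host, no default port,
    no path/query (strips trailing slashes such as ...com//oauth-status/)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"Malformed WEB_APP_ORIGIN_URL: {url!r}")
    try:
        port = parsed.port  # property: raises ValueError on ':abc', ':99999'
    except ValueError as exc:
        raise ValueError(f"Malformed WEB_APP_ORIGIN_URL: {url!r}") from exc
    host = parsed.hostname  # urlparse lowercases the host
    default = {"http": 80, "https": 443}[parsed.scheme]
    if port is None or port == default:
        return f"{parsed.scheme}://{host}"
    return f"{parsed.scheme}://{host}:{port}"
```

A wildcard env value with exactly one `*` can then be turned into a pattern with a direct rf-string, e.g. `rf"https://[a-z0-9-]+\.env\.example\.com"` (hypothetical domain), and wrapped in `RegexOrigin` for the SocketIO allow-list.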
kirtimanmishrazipstack added a commit that referenced this pull request on May 7, 2026
* UN-3439 [FIX] Accept wildcard subdomain origins in SocketIO and Django CORS (#1938)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ReverseMerge: V0.161.4 hotfix (#1943)

* Change csp to report only

* [HOTFIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var (v0.161.4) (#1939)

[HOTFIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var (#1937)

[FIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var

os.environ.get returns the raw string when the variable is set, so ENABLE_HIGHLIGHT_API_DEPLOYMENT="False" was truthy in Python (any non-empty string is truthy). Wrap it in CommonUtils.str_to_bool so "False" / "false" / "0" actually evaluate to False.

The setting is consumed by the cloud configuration plugin's spec default (ConfigSpec.default in plugins/configuration/cloud_config.py) on cloud and on-prem builds. With this fix, an admin who explicitly sets the env var to a falsy string sees highlight data stripped as expected.

Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com>
Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3448 [FIX] Remove vestigial `uv pip install` line in uv-lock-automation workflow (#1941)

* UN-3448 [FIX] Add --system flag to uv pip install in uv-lock-automation workflow

Modern uv requires uv pip install to run inside a virtual environment OR with the explicit --system flag. The workflow currently has neither, so it errors out:

  error: No virtual environment found for Python 3.12.9; run `uv venv` to create an environment, or pass `--system` to install into a non-virtual environment

This breaks every PR that touches a pyproject.toml (the workflow's paths filter triggers on those).
The last successful run was 2026-04-01, before a behaviour change in uv or astral-sh/setup-uv@v7. The --system flag is exactly what the error message suggests and is correct here: we install pip into the runner's system Python, and the downstream uv-lock.sh script creates its own venvs as needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3448 [FIX] Remove vestigial `uv pip install` line per review

Per @jaseemjaskp's review: the pre-step `uv pip install ... pip` does nothing useful for this workflow. The downstream uv-lock.sh script uses uv sync at line 74, which manages its own venvs internally and never invokes pip directly:

  $ grep -rn 'pip' docker/scripts/uv-lock-gen/
  docker/scripts/uv-lock-gen/uv-lock.sh:2:set -o pipefail

The only match is pipefail (a shell option), no real pip references. Removing the line entirely is cleaner than papering over it with --system. The line was likely copy-pasted from a sibling workflow that legitimately needed pip in the system Python.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ReverseMerge: V0.163.2 hotfix (#1946)

* [HOTFIX] Use importlib.util.find_spec for pluggable worker discovery (#1918)

* [FIX] Use importlib.util.find_spec for pluggable worker discovery

_verify_pluggable_worker_exists() previously checked for the literal file `pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin has been compiled to a .so (Nuitka, Cython, or any C extension): the module is perfectly importable, but the pre-check rejects it because only the .py extension is considered.

Replace the filesystem check with importlib.util.find_spec(), which is Python's standard way to ask "is this module resolvable by the import system?". It honors every registered finder (source .py, compiled .so, bytecode .pyc, namespace packages, zipimports), so the function now matches what its docstring claims: verifying the module can be loaded, not that a specific file extension is present.

Behavior is preserved for existing deployments:
- Images with no `pluggable_worker/<name>/` subpackage → find_spec raises ModuleNotFoundError (an ImportError subclass) → returns False.
- Images with source .py → find_spec resolves the .py → returns True.
- Images with compiled .so → find_spec resolves the .so → returns True.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Handle ValueError from find_spec in pluggable worker verification

Greptile-flagged edge case: importlib.util.find_spec() can raise ValueError (not just ImportError) when sys.modules has a partially initialised module entry with __spec__ = None from a prior failed import. Broaden the except clause to catch both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Resolve api-deployment worker directory from enum import path

worker.py:452 did worker_type.value.replace("-", "_") to derive the on-disk directory name. All WorkerType enum values already use underscores, so the replace was a no-op; for API_DEPLOYMENT, whose directory is "api-deployment" (hyphen), it resolved to "api_deployment" and the os.path.exists() check failed. Boot then logged a spurious "❌ Worker directory not found: /app/api_deployment" at ERROR level.

The task registration path (builder + celery autodiscover via to_import_path) is unaffected, so this was purely log noise, but noise at ERROR level masks real failures in log scans.

Fix: derive the directory from the authoritative to_import_path(), which already handles the hyphen case (api_deployment -> api-deployment).
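The find_spec-based check described above can be sketched as follows. The helper name and the `pluggable_worker.<name>.worker` module path follow the commit message; the exact signature of the real `_verify_pluggable_worker_exists()` is an assumption.

```python
import importlib.util

def module_resolvable(dotted_path: str) -> bool:
    """True iff the import system can resolve the module, regardless of
    whether it exists as source .py, compiled .so, bytecode, a namespace
    package, or inside a zipimport."""
    try:
        return importlib.util.find_spec(dotted_path) is not None
    except (ImportError, ValueError):
        # ImportError: a parent package is missing entirely.
        # ValueError: sys.modules holds a partially initialised entry
        # with __spec__ = None from a prior failed import.
        return False

def verify_pluggable_worker_exists(name: str) -> bool:
    # Mirrors the commit's module layout: pluggable_worker/<name>/worker
    return module_resolvable(f"pluggable_worker.{name}.worker")
```

Asking the import system directly is what makes the check extension-agnostic: a filesystem test for `worker.py` can only ever see source files.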
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter (#1944) * [FEAT] Allow Bedrock to fall through to boto3's default credential chain Match the S3/MinIO connector pattern: when AWS access keys are left blank on the Bedrock LLM and embedding adapter forms, drop them from the kwargs dict so boto3's default credential chain handles authentication. This unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on hosts that already have ambient AWS credentials (e.g. EKS workers with IRSA, EC2 with an instance profile). - llm1/static/bedrock.json: clarify access-key descriptions to mention IRSA and instance profile (already non-required at v0.163.2 base). - embedding1/static/bedrock.json: drop aws_access_key_id and aws_secret_access_key from top-level required; same description fix; expose aws_profile_name for parity with the LLM form. - base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters now strip empty access-key values from the validated kwargs before returning, so empty strings don't override boto3's default chain. AWSBedrockEmbeddingParameters fields gain explicit None defaults and an aws_profile_name field. Backward-compatible: existing adapters with access keys filled in continue to work unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FEAT] Add Authentication Type selector to Bedrock adapter form Add an explicit `auth_type` selector with two options, making the auth choice clear to users: - "Access Keys" (default): existing flow, keys required - "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on boto3's default credential chain (IRSA on EKS, task role on ECS, instance profile on EC2). Description on the selector explicitly notes this option is only for AWS-hosted Unstract deployments. 
The form-only auth_type field is stripped before LiteLLM validation in both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters.validate(). Empty access keys continue to be stripped so boto3 falls through to the default chain even when the access_keys arm is selected without values (matches the S3/MinIO connector pattern).

Backward-compatible: legacy adapters without auth_type behave as "Access Keys" mode (the default), and existing keys are forwarded unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [REVIEW] Address Bedrock auth_type review feedback

Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on PR #1944.

Behaviour fixes:
- Stale-key leak in IAM Role mode: switching an existing adapter from Access Keys to IAM Role would carry truthy stored access keys through the strip-empty-only loop, so boto3 silently authenticated with the old long-lived credentials instead of falling through to the host's IRSA / instance-profile identity. Both LLM and embedding paths were affected.
- Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or a malformed payload from a non-UI client passed through the dict comprehension untouched, with no enum guard.
- Cross-field validation gap: explicit Access Keys mode with blank or whitespace-only values silently fell through to the default credential chain instead of surfacing the misconfiguration.

Implementation:
- Add a module-level _resolve_bedrock_aws_credentials helper used by both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters.validate(), so the auth-type contract is expressed once.
- Validates auth_type against an allowlist (None | "access_keys" | "iam_role"); raises ValueError on anything else.
- iam_role: unconditionally drops aws_access_key_id and aws_secret_access_key.
- access_keys (explicit): requires non-blank values; raises ValueError if either is empty or whitespace-only.
- Legacy (auth_type absent): retains the lenient strip behaviour so pre-PR adapter configurations continue to deserialise unchanged.
- Restore aws_region_name as required (no `= None` default) on AWSBedrockEmbeddingParameters; only credentials may legitimately be absent.
- Drop the orphan aws_profile_name field from embedding1/static/bedrock.json: it was added for parity with the LLM form but lives outside the auth_type oneOf and contradicts the selector's "no further input" semantics. The LLM form already had aws_profile_name pre-PR and is left alone for backwards compatibility.

Tests:
- New tests/test_bedrock_adapter.py covers 15 cases across LLM and embedding adapters: legacy-no-auth-type, explicit access_keys with valid/blank/whitespace keys, iam_role with stale/no keys, unknown auth_type rejection, cross-field validation, and preservation of unrelated params (model_id, aws_profile_name, region, thinking).

Skipped (P2 nice-to-have):
- Comment-scope clarification, MinIO reference rewording, validate-mutates-caller's-dict, and the LLM form description nit about aws_profile_name visibility. These don't change behaviour and can be addressed in a follow-up.
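The auth-type contract described above can be sketched as a single resolver function. This is a hypothetical re-creation from the commit message, not the actual `_resolve_bedrock_aws_credentials` implementation; the function and constant names are illustrative.

```python
_CRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def resolve_aws_credentials(kwargs: dict) -> dict:
    """Apply the auth_type contract: allowlist the mode, then normalise
    credential fields so boto3 sees either real keys or none at all."""
    params = dict(kwargs)  # copy so the caller's dict is not mutated
    auth_type = params.pop("auth_type", None)  # form-only field, never forwarded

    if auth_type == "iam_role":
        # Unconditionally drop stored keys (even truthy stale ones) so boto3
        # falls through to the host identity: IRSA / task role / instance profile.
        for key in _CRED_KEYS:
            params.pop(key, None)
    elif auth_type == "access_keys":
        # Explicit mode: blank or whitespace-only keys are a misconfiguration,
        # not a silent fallback to the default credential chain.
        for key in _CRED_KEYS:
            value = params.get(key)
            if value is None or not str(value).strip():
                raise ValueError(f"{key} is required when auth_type='access_keys'")
    elif auth_type is None:
        # Legacy payloads (no auth_type): lenient strip of empty values only,
        # so pre-PR adapter configurations deserialise unchanged.
        for key in _CRED_KEYS:
            if not str(params.get(key) or "").strip():
                params.pop(key, None)
    else:
        raise ValueError(f"Unknown auth_type: {auth_type!r}")
    return params
```

Expressing the contract once and calling it from both validate() methods is what keeps the LLM and embedding paths from drifting apart again.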
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

For more information, see https://pre-commit.ci

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

---------

Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com>

* batch notification

---------

Co-authored-by: ali <117142933+muhammad-ali-e@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com>
Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com>
Co-authored-by: Praveen Kumar <praveen@zipstack.com>
Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com>
ritwik-g pushed a commit that referenced this pull request on May 8, 2026
* UN-3450 [FEAT] Add golden-path smoke test for callback worker

Adds workers/tests/test_callback_sanity.py mirroring the existing test_executor_sanity.py pattern. Covers:
- Worker enums and registry (WorkerType.CALLBACK, QueueName.CALLBACK + CALLBACK_API, TaskName.PROCESS_BATCH_CALLBACK, health port 8083, WorkerRegistry queue config + task routing).
- Celery task wiring (process_batch_callback, process_batch_callback_api, healthcheck — registration, retry config, max_retries, autoretry_for).
- Full dispatch -> task -> return round-trip via Celery eager mode, using the simple healthcheck task (process_batch_callback itself needs a configured InternalAPIClient + downstream HTTP, which is heavy mocking territory unsuitable for a smoke test — covered later in the #1.2 characterisation suite).

Coverage gains on the callback module:
- callback/__init__.py: 0% -> 100%
- callback/worker.py: 0% -> 52%
- callback/tasks.py: 0% -> 10%
- Module total: 0% -> 12%

14 tests, run in ~17s. Regression net for spine PRs #6 (CallbackStatus enum migration) and #13 (chord -> Barrier lift in callback chain).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* UN-3450 [FEAT] Address Greptile review on callback smoke test

Two P2 findings from Greptile, both fixed:
1. test_healthcheck_result_is_json_serializable: drop the `default=str` fallback in json.dumps. With it, any non-JSON-serializable value (UUID, datetime, custom object) gets silently coerced to a string — exactly the failure mode the test claims to catch. Without it, the assertion faithfully tests "round-trips cleanly via JSON" the way Celery's serializer would.
2. Add a registration check for `process_batch_callback_django_compat` (the backward-compat task name `workflow_manager.workflow_v2.file_execution_tasks.process_batch_callback`). Both this task and `process_batch_callback` delegate to `_process_batch_callback_core`, so refactors that touch the core (e.g. the upcoming CallbackStatus enum migration) affect this path too. Three-line addition.

15 tests now (was 14), runtime 11s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
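The `default=str` pitfall from finding 1 is easy to reproduce. A minimal sketch (the helper name is illustrative, not the actual test code): with the fallback, json.dumps coerces any unknown type to a string and the check passes vacuously; without it, non-serializable payloads fail loudly, matching what Celery's JSON serializer would reject.

```python
import json


def is_json_round_trippable(payload) -> bool:
    """Strict serializability check: no default=str fallback, so types that
    Celery's JSON serializer would reject (UUID, datetime, set, custom
    objects) surface as a failure instead of being silently stringified."""
    try:
        json.dumps(payload)
        return True
    except TypeError:
        return False
```

With `json.dumps(payload, default=str)` the second case below would wrongly report True, which is exactly why the review dropped the fallback.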
What
PromptIdeBaseTool
Why
How
...
Relevant Docs
Related Issues or PRs
Dependencies Versions / Env Variables
Notes on Testing
Screenshots
Checklist
I have read and understood the Contribution Guidelines.