Releases: NetApp/Innovation-Labs
netapp-neo-26.3.2
NetApp Neo v4.x — context lake microservice architecture for AI services via MCP service
NetApp Project Neo v4.0.3p9
What's New
Improved Service Resilience
Admin user creation is now decoupled from worker initialization, ensuring authentication always works even if background workers fail to start. Worker initialization also now automatically retries on failure instead of leaving the service in a broken state.
- Independent admin account creation -- The admin user is created as a standalone step before worker components initialize, so API authentication is available immediately after setup completes
- Automatic worker retry -- If worker initialization fails (e.g., due to a transient Graph API or database issue), the service automatically retries instead of requiring a manual restart
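The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual startup code: `start_service`, `create_admin`, `init_workers`, the attempt count, and the delay schedule are all hypothetical names and values chosen for the example.

```python
import time
from typing import Callable


def start_service(
    create_admin: Callable[[], None],
    init_workers: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Create the admin user first, then retry worker initialization.

    Returns True once the workers are up, False if every attempt failed.
    """
    # Step 1: admin creation does not depend on the workers, so API
    # authentication is usable even if every attempt below fails.
    create_admin()

    # Step 2: retry worker initialization with a doubling delay.
    for attempt in range(1, max_attempts + 1):
        try:
            init_workers()
            return True
        except Exception:  # e.g. a transient Graph API or database error
            if attempt == max_attempts:
                return False
            sleep(base_delay * 2 ** (attempt - 1))
    return False
```

Passing `sleep` as a parameter is only a convenience so the retry schedule can be observed without waiting.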
MCP & Search Fixes
Resolves multiple issues with the Model Context Protocol (MCP) integration, ACL-based access control, and NER entity search.
- ACL filtering fix -- Shares configured with acl_override_mode=everyone now correctly grant access instead of denying it when resolved principals don't match the user
- Auth persistence -- MCP OAuth RSA signing keys are now persisted to the database, so authentication tokens survive service restarts
- Group-based access control -- User group memberships are now fetched via Microsoft Graph at token validation time, enabling group-based ACL matching through MCP
- NER search improvements -- Fixed entity search 422 error, added relevance ranking (exact match, entity density, text length), pagination support, and per-file deduplication
- Share status transitions -- NER worker now correctly transitions share status from PROCESSING → READY when all work completes
- OAuthProvider abstraction -- Introduced OAuthProvider ABC for future Keycloak/generic OIDC provider support
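The intended ACL behaviour can be pictured with a small sketch. The function name, parameter names, and the exact override semantics are assumptions made for illustration; only the `acl_override_mode=everyone` setting and group-based matching come from the notes above.

```python
from typing import Iterable, Optional


def user_can_access(
    acl_override_mode: Optional[str],
    resolved_principals: Iterable[str],
    user_identities: Iterable[str],
) -> bool:
    """Decide access for one share (hypothetical helper).

    user_identities is assumed to hold the user's own ID plus the IDs of
    the groups fetched at token validation time.
    """
    # Assumed semantics: "everyone" short-circuits principal matching.
    if acl_override_mode == "everyone":
        return True
    # Otherwise at least one resolved principal must match the user or
    # one of the user's groups.
    return not set(resolved_principals).isdisjoint(user_identities)
```

For example, `user_can_access(None, ["group-a"], ["user-1", "group-a"])` grants access through group membership, while the same call without `"group-a"` in the identities does not.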
Container Images
All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p9.
| Image | Platforms | Description |
|---|---|---|
| netapp-neo-api:4.0.3p9 | amd64, arm64 | REST API + MCP transport |
| netapp-neo-worker:4.0.3p9 | amd64, arm64 | Background processing |
| netapp-neo-extractor:4.0.3p9 | amd64, arm64 | Content extraction (CPU) |
| netapp-neo-extractor-cuda:4.0.3p9 | amd64, arm64 | Content extraction (NVIDIA GPU) |
| netapp-neo-extractor-rocm:4.0.3p9 | amd64 | Content extraction (AMD GPU) |
| netapp-neo-ner:4.0.3p9 | amd64, arm64 | Named Entity Recognition |
Quick Start
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p9
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p9
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p9
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p9
Quality
- Full end-to-end testing passed on both CPU and GPU (CUDA) builds
- Validated across S3, NFS, and SMB storage backends
- 1,467+ files processed with NER entity detection (67,000+ entities on CPU, 8,700+ on GPU)
- Zero import errors across all Cython-compiled services
- Zero CUDA errors on NVIDIA RTX PRO 4000 Blackwell SFF
NetApp Project Neo v4.0.3p10
What's New
Fix: Worker Startup Hang on Large Datasets
Resolved a critical issue where all worker containers would hang indefinitely during initialization on systems with large file inventories (100k+ files), preventing all data ingestion, ACL resolution, and file processing.
- Root cause -- The ACL resolution backfill query used a correlated LIKE subquery on cast JSON text (metadata::text LIKE '%' || id || '%'), resulting in O(n*m) complexity that could take hours on large datasets. With all worker replicas running this query simultaneously, database contention compounded the problem.
- Fix -- The backfill is now deferred to a non-blocking background task that runs after workers are fully initialized. The query has been rewritten to use an efficient JSONB key lookup ((metadata::jsonb)->>'file_id') that is indexable and orders of magnitude faster.
- Impact -- Workers now start in seconds regardless of dataset size, immediately beginning file processing, ACL resolution, and Graph uploads.
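The deferral part of the fix can be illustrated with a minimal asyncio sketch. `run_backfill`, `start_worker`, and the log messages are hypothetical stand-ins; the real backfill is a database query, and the real worker does far more during initialization.

```python
import asyncio
from typing import List


async def run_backfill(log: List[str]) -> None:
    # Stand-in for the ACL backfill query; the real work is a database call.
    await asyncio.sleep(0)
    log.append("backfill done")


async def start_worker(log: List[str]) -> asyncio.Task:
    # Initialization completes first; the backfill is only scheduled here
    # and does not run until the event loop gets control back.
    log.append("worker initialized")
    task = asyncio.create_task(run_backfill(log))
    log.append("processing started")
    return task


async def main() -> List[str]:
    log: List[str] = []
    task = await start_worker(log)
    # Awaited here only so the example terminates cleanly; a long-running
    # service would keep the task reference and carry on with other work.
    await task
    return log
```

Running `asyncio.run(main())` shows the worker reporting that processing has started before the backfill finishes, which is the property the fix is after.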
Fix: Admin User Creation Decoupled from Worker Init
Admin user creation now runs independently of worker initialization with retry logic, ensuring authentication works even if worker startup encounters transient errors.
Container Images
All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p10.
| Image | Platforms | Description |
|---|---|---|
| netapp-neo-api:4.0.3p10 | amd64, arm64 | REST API + MCP transport |
| netapp-neo-worker:4.0.3p10 | amd64, arm64 | Background processing |
| netapp-neo-extractor:4.0.3p10 | amd64, arm64 | Content extraction (CPU) |
| netapp-neo-extractor-cuda:4.0.3p10 | amd64, arm64 | Content extraction (NVIDIA GPU) |
| netapp-neo-extractor-rocm:4.0.3p10 | amd64 | Content extraction (AMD GPU) |
| netapp-neo-ner:4.0.3p10 | amd64, arm64 | Named Entity Recognition |
Quick Start
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p10
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p10
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p10
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p10
Quality
- 1940/1940 end-to-end test work items passing (100% pass rate)
- Validated across SMB and NFS storage backends (CPU and GPU builds)
- ACL backfill verified: 10/10 manually-cleared files re-resolved after worker restart
netapp-neo-26.3.1
NetApp Neo v4.x — context lake microservice architecture for AI services via MCP service
NetApp Project Neo v4.0.3p8
What's New
SMB Mount Reliability
Resolved critical mount contention issues affecting deployments with concurrent file extractions across SMB/CIFS shares.
- Per-share mount locking -- Eliminates "Device or resource busy" errors caused by concurrent mount.cifs calls to the same share. Multiple extraction threads now serialize mount operations per share and reuse cached mounts.
- Mount reference counting -- TTL-based cleanup no longer unmounts shares while extractions are still in progress. Active references are tracked so long-running operations (e.g. Docling GPU processing on large files) complete without interruption.
- Unmount cache key fix -- Mounts with custom smb_mount_options are now correctly found during cleanup, preventing orphaned mounts that accumulated over time.
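Per-share locking combined with reference counting might look like the sketch below. `MountManager`, `use`, and `cleanup` are hypothetical names, and the `mount`/`unmount` callables stand in for the real mount.cifs and umount calls; TTL handling is left out to keep the example short.

```python
import threading
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator


@dataclass
class _ShareState:
    lock: threading.Lock = field(default_factory=threading.Lock)
    refs: int = 0
    mounted: bool = False


class MountManager:
    """Serializes mounts per share and tracks active users (illustrative)."""

    def __init__(self, mount: Callable[[str], None], unmount: Callable[[str], None]):
        self._mount = mount
        self._unmount = unmount
        self._guard = threading.Lock()  # protects the _shares dict itself
        self._shares: Dict[str, _ShareState] = {}

    def _state(self, share: str) -> _ShareState:
        with self._guard:
            return self._shares.setdefault(share, _ShareState())

    @contextmanager
    def use(self, share: str) -> Iterator[None]:
        state = self._state(share)
        with state.lock:  # only one thread mounts a given share at a time
            if not state.mounted:
                self._mount(share)
                state.mounted = True
            state.refs += 1  # cached mount is reused by later callers
        try:
            yield
        finally:
            with state.lock:
                state.refs -= 1

    def cleanup(self, share: str) -> bool:
        """Unmount only if no extraction currently holds a reference."""
        state = self._state(share)
        with state.lock:
            if state.mounted and state.refs == 0:
                self._unmount(share)
                state.mounted = False
                return True
        return False
```

An extraction would wrap its work in `with manager.use(share): ...`, and a periodic cleanup job would call `manager.cleanup(share)`, which returns False while any extraction is still inside the `with` block.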
Microsoft Graph Connector Registration
Fixed connector registration stalling in draft state and never completing schema provisioning.
- Connection state checking -- Registration now inspects the actual connection state (draft, ready, limitExceeded, obsolete) instead of only checking if the connection object exists. Clear error messages are provided for terminal states.
- Post-registration verification -- After schema registration, the connector state is verified and logged.
- Cython return type fix -- Removed -> Dict return type annotations on create_connection() and get_connection() that Cython enforced at runtime, causing "Expected dict, got ExternalConnection" failures in compiled builds.
Performance
- Async event loop unblocking -- Synchronous database calls and large JSON parsing moved to thread pools to prevent blocking the async event loop.
- Streaming enumeration -- File discovery uses NDJSON streaming via os.scandir, eliminating per-file stat RPCs over network filesystems.
- Bulk database operations -- Multi-row INSERT batching and chunked duplicate checking in the work queue reduce database round-trips during large crawls.
- Content chunking -- Single-encode byte-offset splitting for large document content.
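As a rough picture of streaming enumeration, the generator below walks a tree with os.scandir and yields one NDJSON line per file as soon as it is seen. The function name and the record fields (`path`, `size`, `mtime`) are assumptions for the example, not the extractor's actual schema, and error handling (for instance unreadable directories) is omitted.

```python
import json
import os
from typing import Iterator


def enumerate_ndjson(root: str) -> Iterator[str]:
    """Yield one JSON document per file, one line each (NDJSON)."""
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # DirEntry caches stat results. On Windows the data comes
                    # from the directory listing; on Linux the first stat()
                    # call on an entry still issues one system call.
                    info = entry.stat(follow_symlinks=False)
                    yield json.dumps({
                        "path": os.path.relpath(entry.path, root),
                        "size": info.st_size,
                        "mtime": info.st_mtime,
                    })
```

Because this is a generator, a consumer can start queuing work from the first line without waiting for the full listing.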
RHEL 8 Compatibility
- Removed noatime from CIFS mount options -- noatime is not a valid CIFS-specific mount option and can cause mount failures on older RHEL 8 kernels (4.18, pre-fs_context backport). CIFS inherently behaves as noatime, and our mounts are read-only, so the option was always a no-op.
Extractor Dependencies
- Added ffmpeg to extractor runtime packages for audio/video metadata extraction.
Container Images
All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p8.
| Image | Platforms | Description |
|---|---|---|
| netapp-neo-api:4.0.3p8 | amd64, arm64 | REST API + MCP transport |
| netapp-neo-worker:4.0.3p8 | amd64, arm64 | Background processing |
| netapp-neo-extractor:4.0.3p8 | amd64, arm64 | Content extraction (CPU) |
| netapp-neo-extractor-cuda:4.0.3p8 | amd64, arm64 | Content extraction (NVIDIA GPU) |
| netapp-neo-extractor-rocm:4.0.3p8 | amd64 | Content extraction (AMD GPU) |
| netapp-neo-ner:4.0.3p8 | amd64, arm64 | Named Entity Recognition |
Quick Start
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p8
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p8
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p8
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p8
Quality
- 4461/4461 end-to-end work items completed (100% pass rate)
- 0 failures across extraction, ACL resolution, and Graph upload pipelines
- Validated across S3, NFS, and SMB storage backends
- GPU-accelerated extraction tested with NVIDIA RTX PRO 4000 (CUDA)
- Microsoft Graph connector registration verified (draft -> schema -> ready)
- 31 unit tests passing for mount manager (including concurrency test)
NetApp Project Neo v4.0.3p7
What's New
Fix: Share Rules Filtering Completely Non-Functional
The extractor-based enumeration path (used by all deployments) bypassed ALL share-level rules — include/exclude patterns, size limits, and date filters had no effect. All files were enumerated, extracted, and uploaded regardless of configured rules.
Five bugs were identified and fixed:
- Extractor enumeration bypassed rules -- batch_insert_files_extractor() inserted all files and created work items without checking rules. Rules are now evaluated during enumeration, before database insertion. Files that don't match are never added to the work queue.
- Retroactive filter couldn't find current-scan files -- detect_rule_filtered_files() only checked files from previous scans (scan_version < current), missing all files on first crawl. Its results are now combined with enumeration-time filtering for complete coverage.
- Date filters ignored in extractor path -- modified_within_days, created_within_days, and all other date-based rules were only applied in the unused local VFS code path. They are now applied during extractor enumeration and retroactive detection.
- Path-qualified patterns silently matched nothing -- Patterns like PII/*.csv were tested against the filename only (data.csv), not the relative path (PII/data.csv). Patterns containing / now match against the full relative path.
- Date filter key mismatch -- The ShareRules model used modified_within_days but the date filter utility expected modified_time_within_days. Keys are now translated correctly between the two systems.
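The path-qualified matching rule can be sketched in a few lines. This uses the standard library's fnmatch purely for illustration and is not the project's matcher; `matches` is a hypothetical name.

```python
import fnmatch
import posixpath


def matches(pattern: str, relative_path: str) -> bool:
    """Pick the match target based on whether the pattern contains "/"."""
    # Patterns containing "/" are compared with the full relative path;
    # bare patterns are compared with the filename only.
    if "/" in pattern:
        target = relative_path
    else:
        target = posixpath.basename(relative_path)
    return fnmatch.fnmatchcase(target, pattern)
```

With this rule, `matches("PII/*.csv", "PII/data.csv")` succeeds, whereas testing the same pattern against the bare filename `data.csv` could never match. One caveat of the stand-in: fnmatch lets `*` cross directory separators, so it is looser than a path-aware glob.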
Improved: Pattern Matching Engine
Replaced the fragile string-replace regex conversion with a proper _glob_to_regex() method that uses placeholder-based **/* handling and re.escape() for special characters. This fixes issues with patterns containing . (e.g., *.csv previously matched *.xcsv), $ (e.g., ~$* temp file patterns), and nested wildcards.
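A glob-to-regex conversion along these lines is sketched below. Note the swap: instead of placeholder strings, this version tokenizes the wildcards with a regular expression and escapes everything between them with re.escape(), which achieves the same separation of wildcards from literals. The wildcard semantics chosen here (`*` stays within one path segment, `**/` spans zero or more directories) are assumptions, and `glob_to_regex` is not the project's `_glob_to_regex()`.

```python
import re

# Longest wildcard first, so "**/" is never read as "**" or "*" plus "/".
_TOKEN = re.compile(r"\*\*/|\*\*|\*|\?")

_WILDCARDS = {
    "**/": r"(?:.*/)?",  # zero or more leading directories
    "**": r".*",         # anything, including "/"
    "*": r"[^/]*",       # anything within one path segment
    "?": r"[^/]",        # exactly one character within a segment
}


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a compiled regex; use .fullmatch() on paths."""
    parts = []
    pos = 0
    for match in _TOKEN.finditer(pattern):
        # Literal text between wildcards is escaped, so ".", "$" and
        # similar characters only match themselves.
        parts.append(re.escape(pattern[pos:match.start()]))
        parts.append(_WILDCARDS[match.group()])
        pos = match.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))
```

For instance, `glob_to_regex("*.csv").fullmatch("dataxcsv")` is None because the dot is literal, and `glob_to_regex("~$*").fullmatch("~$report.docx")` matches because `$` is escaped rather than treated as an anchor.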
Added: 30 Unit Tests for Rules Filtering
Comprehensive test coverage for include patterns, exclude patterns, path-qualified patterns, double-star patterns, size filters, date filters, and combined rule scenarios. These tests exercise the actual filtering functions (not mocked) and will catch regressions.
Verified Rule Types
| Rule | Example | Verified |
|---|---|---|
| Include by extension | *.pdf | 23 of 476 files passed |
| Include by path | PII/*.csv | 5 of 476 files passed |
| Include by deep path | **/PII/*.csv | 5 of 476 files passed |
| Exclude by extension | *.mp4, *.png | 473 of 476 files passed (3 excluded) |
| Minimum file size | min_file_size: 1MB | 35 of 476 files passed |
| Maximum file size | max_file_size: 10MB | 467 of 476 files passed (9 excluded) |
| Combined rules | *.csv + min 50MB | 2 of 476 files passed |
Container Images
All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p7.
| Image | Platforms | Description |
|---|---|---|
| netapp-neo-api:4.0.3p7 | amd64, arm64 | REST API + MCP transport |
| netapp-neo-worker:4.0.3p7 | amd64, arm64 | Background processing |
| netapp-neo-extractor:4.0.3p7 | amd64, arm64 | Content extraction (CPU) |
| netapp-neo-extractor-cuda:4.0.3p7 | amd64, arm64 | Content extraction (NVIDIA GPU) |
| netapp-neo-extractor-rocm:4.0.3p7 | amd64 | Content extraction (AMD GPU) |
| netapp-neo-ner:4.0.3p7 | amd64, arm64 | Named Entity Recognition |
Quick Start
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p7
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p7
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p7
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p7
Quality
- 1050/1050 unit tests passing (100% pass rate, +30 new rules tests)
- E2E validated: all 7 rule types verified with exact file counts
- Includes all fixes from v4.0.3p6 (enumeration performance, event loop stability, Graph API retry, migration guard, libGL)
NetApp Project Neo v4.0.3p6
What's New
Fix: PDF Extraction (Docling) Broken in Production Images
The Cython-compiled (optimized) extractor image was missing libgl1 and libglib2.0-0, causing all PDF files to fail Docling extraction with libGL.so.1: cannot open shared object file. Added these packages to the shared packages-runtime.txt so both standard and optimized Dockerfiles include them.
Fix: Graph API Rate Limiter — Atomic Enforcement
The rate limiter had two bugs: database writes were never committed (invisible to other workers), and the read-check-increment cycle had a TOCTOU race condition allowing concurrent tasks to overshoot the limit. Replaced with atomic UPDATE ... WHERE count < limit statements with proper commits. Also implemented real exponential backoff (1s, 2s, 4s... up to 300s) replacing the previous fixed 30-second delay.
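The two ideas in this fix, a conditional atomic UPDATE and capped exponential backoff, can be demonstrated in isolation. The sketch uses sqlite3 as a stand-in for PostgreSQL so it runs anywhere; the `rate_limit` table, its columns, and both function names are hypothetical.

```python
import sqlite3
from typing import List


def try_acquire(conn: sqlite3.Connection, key: str, limit: int) -> bool:
    """Take one slot if the counter is still below the limit.

    The check and the increment happen in a single UPDATE statement, so
    there is no window between reading the count and writing it back.
    """
    cur = conn.execute(
        "UPDATE rate_limit SET count = count + 1 WHERE key = ? AND count < ?",
        (key, limit),
    )
    conn.commit()  # make the new count visible to other connections
    return cur.rowcount == 1


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 300.0) -> List[float]:
    """Delays of 1s, 2s, 4s, ... doubling each time and capped at `cap`."""
    return [min(base * 2 ** i, cap) for i in range(attempts)]
```

A caller that gets False from `try_acquire` would wait for the next delay from `backoff_delays` before trying again. Resetting the counter at the end of each rate window is outside the scope of this sketch.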
Fix: Graph API 429 and Network Errors Now Retried
- HTTP 429 ActivityLimitReached responses were incorrectly classified as "non-retryable" and failed immediately. They are now retried with Retry-After header support.
- httpx.ReadError, WriteTimeout, and other transient network errors were also non-retryable. The retry logic now catches parent classes (TimeoutException, NetworkError, ProtocolError) for comprehensive retry coverage.
Fix: Migration Lock Guard — Fast-Path Skip + Retry Backoff
Worker restarts during active crawls caused PostgreSQL lock timeouts on ALTER TABLE migrations. Added a fast-path pre-check that skips all DDL when migrations are already applied (the common case), and progressive retry (10s → 30s → 60s) for new-version deployments under load. Also fixed two migrations that had no lock_timeout at all.
Performance: Enumeration 265x Faster (3 hours → 41 seconds)
Complete rewrite of the file enumeration pipeline:
- Extractor: Replaced os.walk() + per-file os.stat() with os.scandir() for single-pass traversal with cached stat data. Eliminated O(n²) pagination bug that re-walked all previous files on each page.
- NDJSON Streaming: New /enumerate-stream endpoint streams file entries as they are discovered. Worker starts processing immediately instead of waiting for full file list.
- Bulk Database Operations: Multi-row INSERT/UPDATE replaces per-row statements. 139K files now require ~280 DB operations instead of ~700K.
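The effect of batching on the operation count is easy to see with a small helper. The batch size of 500 below is an assumption chosen for illustration; the notes do not state the value the worker actually uses, and `batched` is a hypothetical name.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(rows: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

Each slice would feed one multi-row statement, so 139,000 rows at an assumed 500 rows per batch come to 278 statements instead of one per row.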
Fix: Large File Upload Event Loop Freeze
Files >100MB caused the worker's async event loop to freeze for 10+ minutes, blocking health checks and all other work:
- Rechunking: _split_by_bytes_for_graph() was encoding 111 million characters individually. Replaced with single-encode byte-offset splitting (milliseconds instead of minutes).
- JSON Parsing: Extraction responses (10-50MB) parsed synchronously via response.json(). Moved to thread pool via run_in_executor.
- DB Storage: store_file_metadata() and store_file_content_chunks() for large files moved to thread pool.
- Graph Chunk Uploads: Pre-serialized JSON in thread pool, asyncio.sleep(0) between chunks to yield event loop control.
- Share Status Checks: _check_and_update_share_status() (called after every work item in all 3 worker types) moved to thread pool with explicit transaction commit/rollback. Previously left PostgreSQL connections in idle in transaction state, eventually blocking the API service.
- Event Loop Yields: Added await asyncio.sleep(0) between work item iterations in extraction and ACL resolution workers, ensuring health checks and heartbeats are processed during rapid file bursts.
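Single-encode byte-offset splitting, as named in the rechunking item, could be approached as in this sketch. `split_by_bytes` is a hypothetical analogue and not the project's `_split_by_bytes_for_graph()`; it assumes UTF-8 and a byte limit per chunk.

```python
from typing import List


def split_by_bytes(text: str, max_bytes: int) -> List[str]:
    """Split text into chunks of at most max_bytes UTF-8 bytes each."""
    if max_bytes < 4:
        raise ValueError("max_bytes must be at least 4 to fit any UTF-8 character")
    data = text.encode("utf-8")  # encode the whole text exactly once
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + max_bytes, len(data))
        # If the cut would land on a continuation byte (bit pattern
        # 10xxxxxx), back up to the start of that character.
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        chunks.append(data[start:end].decode("utf-8"))
        start = end
    return chunks
```

The work is proportional to the number of chunks plus the encoded size, rather than one encode call per character, and joining the chunks reproduces the original text.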
Container Images
All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p6.
| Image | Platforms | Description |
|---|---|---|
| netapp-neo-api:4.0.3p6 | amd64, arm64 | REST API + MCP transport |
| netapp-neo-worker:4.0.3p6 | amd64, arm64 | Background processing |
| netapp-neo-extractor:4.0.3p6 | amd64, arm64 | Content extraction (CPU) |
| netapp-neo-extractor-cuda:4.0.3p6 | amd64, arm64 | Content extraction (NVIDIA GPU) |
| netapp-neo-extractor-rocm:4.0.3p6 | amd64 | Content extraction (AMD GPU) |
| netapp-neo-ner:4.0.3p6 | amd64, arm64 | Named Entity Recognition |
Quick Start
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p6
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p6
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p6
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p6
Quality
- 1020/1020 unit tests passing (100% pass rate)
- End-to-end validated across SMB and NFS storage backends
- Tested with CPU and CUDA GPU extraction (NVIDIA RTX PRO 4000)
- Validated with 143,869-file SMB share: enumeration in 41s (previously 3h), full extraction + upload of all 143,869 files with worker and API healthy throughout
- Large file testing: 111MB CSV, 334MB CSV, 103MB XLSX processed successfully
- Worker sustained ~1,000 extractions/minute at 18-45% CPU (previously 100% with event loop freeze)
NetApp Project Neo v4.0.3p5
What's New
Bug Fixes
- fix(acl): Fixed an issue where ACL resolution work items would fail in certain deployment configurations, preventing file permissions from being resolved to Entra identities
- fix(s3): Fixed S3 storage backend connectivity issue that could prevent file extraction
- fix(nfs): Fixed NFS storage backend connectivity issue that could prevent file extraction
Self-Healing ACL Resolution
- On startup, the worker service now auto-retries all previously failed ACL resolution work items
- ACL resolution backfill automatically creates work items for files that have ACL data but missing resolved principals — no manual re-crawl needed after upgrading
Upgrade Notes
No action required. After pulling the new images and restarting:
- Previously failed ACL resolution work items are automatically retried
- Files missing resolved principals are automatically backfilled
- All fixes take effect immediately
Container Images
All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p5.
| Image | Platforms | Description |
|---|---|---|
| netapp-neo-api:4.0.3p5 | amd64, arm64 | REST API + MCP transport |
| netapp-neo-worker:4.0.3p5 | amd64, arm64 | Background processing |
| netapp-neo-extractor:4.0.3p5 | amd64, arm64 | Content extraction (CPU) |
| netapp-neo-extractor-cuda:4.0.3p5 | amd64, arm64 | Content extraction (NVIDIA GPU) |
| netapp-neo-extractor-rocm:4.0.3p5 | amd64 | Content extraction (AMD GPU) |
| netapp-neo-ner:4.0.3p5 | amd64, arm64 | Named Entity Recognition |
Quick Start
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p5
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p5
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p5
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p5
Quality
- E2E tested across S3, NFS, and SMB storage backends
- Validated on both CPU and CUDA GPU builds
- ACL extraction and Entra ID resolution verified end-to-end
NetApp Project Neo v4.0.3p4
What's New
Fix: ACL Principal Resolution
Resolved a bug where all ACL principals failed to resolve against Entra ID with the error resolve_principal_to_object_id() got an unexpected keyword argument 'sid'.
- Root cause -- The acl_resolution_worker passed an unsupported sid= keyword argument to GraphConnector.resolve_principal_to_object_id()
- Impact -- Every file showed resolved: false for all principals, preventing Entra-based access control from functioning
- Fix -- Removed the invalid sid kwarg; the SID is already preserved in the result dict as principal
Fix: Work Queue Metadata Deserialization (from 4.0.3p3)
Fixed claim_work deserialization of JSON metadata from the work queue, which was causing workers to crash on startup.
Fix: ACL Worker Import Path (from 4.0.3p3)
Corrected the import path for the ACL resolution worker module and fixed ShareStatus enum references.
Container Images
All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p4.
| Image | Platforms | Description |
|---|---|---|
| netapp-neo-api:4.0.3p4 | amd64, arm64 | REST API + MCP transport |
| netapp-neo-worker:4.0.3p4 | amd64, arm64 | Background processing |
| netapp-neo-extractor:4.0.3p4 | amd64, arm64 | Content extraction (CPU) |
| netapp-neo-extractor-cuda:4.0.3p4 | amd64, arm64 | Content extraction (NVIDIA GPU) |
| netapp-neo-extractor-rocm:4.0.3p4 | amd64 | Content extraction (AMD GPU) |
| netapp-neo-ner:4.0.3p4 | amd64, arm64 | Named Entity Recognition |
Quick Start
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p4
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p4
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p4
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p4
Quality
- 1020/1020 unit tests passing (100% pass rate)
- End-to-end validated: SMB share crawl with successful Entra ID principal resolution
NetApp Project Neo v4.0.3p3
What's New
Critical Fix: Worker Service Startup Crash (v4.0.3p2 regression)
Fixes a startup crash in the worker service introduced in v4.0.3p2 where the ACL resolution worker referenced a nonexistent module, causing the worker container to crash-loop immediately on boot.
- Import path correction -- acl_resolution_worker.py imported WorkQueueManager from a nonexistent app.work_queue_manager module; corrected to netapp_shared.work_queue
- Work queue metadata deserialization -- The claim_work() column mapping was missing the metadata column, causing ACL resolution to fail with "No file_id in work item metadata" even after the import was fixed. Metadata is now correctly mapped and JSON-deserialized.
Container Images
All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p3.
| Image | Platforms | Description |
|---|---|---|
| netapp-neo-api:4.0.3p3 | amd64, arm64 | REST API + MCP transport |
| netapp-neo-worker:4.0.3p3 | amd64, arm64 | Background processing |
| netapp-neo-extractor:4.0.3p3 | amd64, arm64 | Content extraction (CPU) |
| netapp-neo-extractor-cuda:4.0.3p3 | amd64, arm64 | Content extraction (NVIDIA GPU) |
| netapp-neo-extractor-rocm:4.0.3p3 | amd64 | Content extraction (AMD GPU) |
| netapp-neo-ner:4.0.3p3 | amd64, arm64 | Named Entity Recognition |
Quick Start
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p3
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p3
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p3
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p3
Quality
- 1020/1020 unit tests passing (100% pass rate)
- Full E2E validation: 43/43 tests passing across service health, authentication, share CRUD, SMB crawl pipeline, file operations, NER, and monitoring
- ACL resolution workers verified processing work items end-to-end