Previous name: Monitor
ModelFP is a Docker-first forensic workbench for auditing model repositories before and during execution. It treats Hugging Face, GitHub, and local model repositories as supply-chain artifacts, not just weight files, and turns each audit into a reproducible evidence package.
Each run can collect repository metadata, file hashes, static risk signals, optional runtime traces, an evidence graph, deterministic harm certificates, and a sanitized payload for AI-assisted review. Static checks cover config risk, unsafe serialization, custom code, non-model payloads, malware-like strings, repo hygiene, and fused repo-level signals. Runtime checks can observe behavior with Docker-contained strace and Python audit hooks.
The project has two citation layers:
- monitor: the original March 2025 dynamic audit prototype that used strace and Python audit hooks to observe ML model execution;
- ModelFP: the current Dockerized repo-level forensic workflow with static fusion, evidence graphs, harm certificates, dataset layout, and Codex/Claude skill packaging.
ModelFP is packaged for local CLI use and as agent skills:
- Codex skill: skills/codex/modelfp
- Claude skill adapter: skills/claude/ModelFP
- Local wrappers: scripts/
ModelFP audits a bounded context: repository, revision, files, config, metadata, container, command, inputs, and trace coverage. It does not prove universal model safety.
The dynamic-audit lineage comes from monitor; the current release is ModelFP.
Build the Docker images:

    ./scripts/build_images.sh

Audit a Hugging Face repo in static-only mode. The fingerprint and evidence stay under audit_datasets/; the downloaded model snapshot is deleted after the run by default.

    ./scripts/audit_hf_static.sh Helsinki-NLP/opus-mt-es-yua ./audit_datasets main

Audit a GitHub repo as a repo-level dataset:

    ./scripts/audit_github_static.sh https://github.com/AndrewDzzz/malicious_model_test ./audit_datasets main

Run static analysis on a local model snapshot or repo:

    ./scripts/audit_local_static.sh /path/to/local/repo ./outputs_static

Run controlled pickle detonation for .pickle and .pkl artifacts:

    ./scripts/audit_pickle_runtime.sh /path/to/local/repo ./outputs_pickle_runtime

Install or update the Codex skill:

    ./scripts/install_codex_skill.sh

Install a self-contained Claude skill folder into a chosen destination:

    ./scripts/install_claude_skill.sh /path/to/claude/skills

The Codex frontmatter name remains modelfp because Codex skill names are lowercase identifiers. The public project, UI display name, and Claude skill are named ModelFP.
ModelFP ships with Docker as the default execution boundary:
- modelfp:latest for static analysis, metadata collection, evidence normalization, rule checking, and pickle detonation helpers;
- modelfp:ml for controlled local model execution with CPU ML dependencies.
Remote fetch and metadata stages may use network access. Static analysis, normal runtime checks, and pickle detonation run offline with read-only target mounts. See docs/DOCKER_WORKFLOW.md.
Static modules run inside Docker and include:
- file inventory, SHA256, file type, risky extensions, archives, command and URL text patterns;
- repository hygiene: non-model payloads, repeated commits, README script or external app instructions, task mismatch, malware-hosting-like file trees;
- malware-style static triage: executable magic, download cradles, PowerShell stagers, reverse-shell snippets, persistence hooks, credential harvesting strings, miner strings, and obfuscation patterns;
- Python and Lambda AST checks for dangerous calls, subprocess(shell=True), unsafe deserialization, event logging, Flask debug mode, and private package indexes;
- config checks for auto_map, trust_remote_code, URLs, missing model_type, and custom-code loading;
- HDF5/Keras probing, pickle opcode probing, ModelScan integration, and fused static repo-level judgment.
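As an illustration of what pickle opcode probing can look like, the sketch below walks a pickle's opcode stream with the standard-library pickletools module and never unpickles the data. It is a minimal sketch, not ModelFP's implementation: the function name probe_pickle_opcodes and the choice of opcodes in RISKY_OPCODES are assumptions made for this example.

```python
import pickletools

# Assumed selection: opcodes that can import or call objects during unpickling.
RISKY_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def probe_pickle_opcodes(data: bytes) -> list:
    """Return (byte offset, opcode name, argument) for each risky opcode.

    The bytes are only disassembled, never loaded, so no pickled code runs.
    """
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in RISKY_OPCODES:
            findings.append((pos, opcode.name, arg))
    return findings
```

A pickle of plain containers and numbers yields no findings, while any object that pickles through __reduce__ shows up as an import opcode followed by REDUCE. A finding is a signal for review, not proof of malice, since many legitimate model checkpoints also use these opcodes.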
Dynamic modules are optional and also run inside Docker:
- strace syscall capture;
- Python audit hook capture;
- per-artifact pickle runtime detonation with network disabled;
- normalized evidence graph and deterministic harm certificates.
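Python audit hook capture can be sketched with the standard-library sys.addaudithook (Python 3.8 and later). The AuditRecorder class and the WATCHED_PREFIXES selection below are hypothetical names chosen for this example; ModelFP's actual hook and event filter may differ.

```python
import sys
from contextlib import contextmanager

# Assumed selection of audit event names (or name prefixes) worth recording.
WATCHED_PREFIXES = ("open", "os.system", "subprocess.", "socket.", "pickle.find_class")


class AuditRecorder:
    """Collects selected audit events while recording is switched on."""

    def __init__(self, prefixes=WATCHED_PREFIXES):
        self.prefixes = tuple(prefixes)
        self.events = []
        self._active = False
        self._installed = False

    def _hook(self, event, args):
        # The hook runs for every audited action, so keep it cheap.
        if self._active and event.startswith(self.prefixes):
            self.events.append((event, args))

    @contextmanager
    def recording(self):
        # Audit hooks cannot be removed once added, so install the hook once
        # and toggle a flag instead of uninstalling it.
        if not self._installed:
            sys.addaudithook(self._hook)
            self._installed = True
        self._active = True
        try:
            yield self
        finally:
            self._active = False
```

Code under observation runs inside `with recorder.recording():`, after which recorder.events holds (event name, argument tuple) pairs such as an "open" event with the file path. An in-process hook is observational only and can be bypassed by native code, which is one reason to pair it with syscall-level tracing inside a container.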
Dataset runs write one folder per audit:
audit_datasets/<repo_slug>/<audit_id>/
    dataset_manifest.json
    orchestrator.log
    metadata/
    outputs_static/
    outputs_runtime/ (optional)
    outputs_pickle_runtime/ (optional)
Primary facts are evidence_graph.json and verified harm_certificates.json. LLM payloads are secondary interpretation and must cite evidence IDs.
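The citation requirement can be checked mechanically. The sketch below is illustrative only and rests on assumptions about the data: it supposes the evidence graph has a "nodes" list whose entries carry an "id" field, and that statements cite evidence as bracketed IDs such as [EV-0001]. Neither the schema nor the ID format is taken from ModelFP, so adapt both to the real files.

```python
import re

# Hypothetical citation syntax: evidence IDs such as [EV-0001].
CITATION = re.compile(r"\[(EV-\d+)\]")


def check_citations(graph: dict, statements: list) -> dict:
    """Report statements with no citation and cited IDs missing from the graph."""
    # Assumed schema: {"nodes": [{"id": "EV-0001", ...}, ...]}
    known = {node["id"] for node in graph.get("nodes", [])}
    report = {"uncited": [], "unknown_ids": []}
    for text in statements:
        cited = CITATION.findall(text)
        if not cited:
            report["uncited"].append(text)
        report["unknown_ids"].extend(i for i in cited if i not in known)
    return report
```

An interpretation would be accepted only when both lists in the report are empty, which keeps the LLM layer tied to recorded evidence rather than free-form claims.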
The published repository excludes local audit outputs, model snapshots, sandbox canaries, logs, archives, and generated figures. Runtime wrappers keep outputs on the host but remove downloaded or cloned model snapshots by default. See docs/SANITIZATION.md.
See docs/METHODOLOGY.md for the audit route and evidence-chain design.
If you use the original dynamic audit idea, cite monitor. If you use the current Dockerized workflow, static fusion, evidence graph, harm certificates, or agent skill package, cite ModelFP. See CITATION.cff and docs/RELATED_WORK.md.