Merged
27 commits
7957e83
chore(deps): bump sentencepiece from 0.2.0 to 0.2.1 (#114)
dependabot[bot] Feb 13, 2026
04091e0
chore(deps): bump cryptography from 44.0.2 to 46.0.5 (#119)
dependabot[bot] Feb 13, 2026
1ab0b95
chore(deps): bump pillow and protobuf pins
sidmohan0 Feb 13, 2026
8163e80
chore: bump version to 4.3.0a1 [skip ci]
actions-user Feb 16, 2026
33fae4e
chore: bump version to 4.3.0b2 [skip ci]
actions-user Feb 19, 2026
744370b
chore: bump version to 4.3.0a2 [skip ci]
actions-user Feb 23, 2026
84d52db
chore: bump version to 4.3.0b3 [skip ci]
actions-user Feb 26, 2026
4108984
chore: bump version to 4.3.0a3 [skip ci]
actions-user Mar 2, 2026
cb7d951
chore: bump version to 4.3.0b4 [skip ci]
actions-user Mar 5, 2026
107df28
chore: bump version to 4.3.0a4 [skip ci]
actions-user Mar 9, 2026
0788e82
chore: bump version to 4.3.0b5 [skip ci]
actions-user Mar 12, 2026
14ae62e
chore: bump version to 4.3.0a5 [skip ci]
actions-user Mar 16, 2026
fac558d
chore: bump version to 4.3.0b6 [skip ci]
actions-user Mar 19, 2026
14347be
chore: bump version to 4.3.0a6 [skip ci]
actions-user Mar 23, 2026
189b77d
chore: bump version to 4.3.0b7 [skip ci]
actions-user Mar 26, 2026
d4602d0
chore: bump version to 4.3.0a7 [skip ci]
actions-user Mar 30, 2026
879ae70
chore: bump version to 4.3.0b8 [skip ci]
actions-user Apr 2, 2026
3b1a5e8
chore: bump version to 4.3.0a8 [skip ci]
actions-user Apr 6, 2026
372d308
chore: bump version to 4.3.0b9 [skip ci]
actions-user Apr 9, 2026
936ca14
chore: bump version to 4.3.0a9 [skip ci]
actions-user Apr 13, 2026
22e9820
chore: bump version to 4.3.0b10 [skip ci]
actions-user Apr 16, 2026
0a67ad3
chore: bump version to 4.3.0a10 [skip ci]
actions-user Apr 20, 2026
aa72c2e
chore: bump version to 4.3.0b11 [skip ci]
actions-user Apr 23, 2026
4752024
Add v4.4 bridge release runway (#130)
sidmohan0 Apr 26, 2026
4f22b6a
ci: allow prerelease base override (#131)
sidmohan0 Apr 26, 2026
6845ef7
chore: bump version to 4.4.0b1 [skip ci]
actions-user Apr 26, 2026
8ff0618
Make telemetry opt-in for v4.4 (#132)
sidmohan0 Apr 26, 2026
62 changes: 62 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,62 @@
+name: Bug report
+description: Report something that is broken or surprising.
+title: "bug: "
+labels: ["bug"]
+body:
+  - type: markdown
+    attributes:
+      value: Thanks for reporting a DataFog issue.
+  - type: textarea
+    id: summary
+    attributes:
+      label: Summary
+      description: What happened?
+    validations:
+      required: true
+  - type: textarea
+    id: reproduce
+    attributes:
+      label: Reproduction
+      description: Minimal code, command, or steps to reproduce.
+      render: python
+    validations:
+      required: true
+  - type: textarea
+    id: expected
+    attributes:
+      label: Expected behavior
+    validations:
+      required: true
+  - type: input
+    id: version
+    attributes:
+      label: DataFog version
+      placeholder: "4.3.0"
+  - type: input
+    id: python
+    attributes:
+      label: Python version
+      placeholder: "3.12.4"
+  - type: dropdown
+    id: profile
+    attributes:
+      label: Install profile
+      options:
+        - core
+        - cli
+        - nlp
+        - nlp-advanced
+        - ocr
+        - distributed
+        - all
+        - not sure
+  - type: textarea
+    id: environment
+    attributes:
+      label: Environment details
+      description: OS, package manager, relevant dependency versions, or CI link.
+  - type: textarea
+    id: extra
+    attributes:
+      label: Additional context
+      description: Logs, screenshots, or related issues.
5 changes: 5 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,5 @@
+blank_issues_enabled: true
+contact_links:
+  - name: Security report
+    url: https://github.com/DataFog/datafog-python/security/advisories/new
+    about: Please report security vulnerabilities privately.
41 changes: 41 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,41 @@
+name: Feature request
+description: Suggest an improvement or new workflow.
+title: "feat: "
+labels: ["enhancement"]
+body:
+  - type: textarea
+    id: problem
+    attributes:
+      label: Problem
+      description: What user problem would this solve?
+    validations:
+      required: true
+  - type: textarea
+    id: proposal
+    attributes:
+      label: Proposed solution
+      description: What would you like DataFog to do?
+    validations:
+      required: true
+  - type: dropdown
+    id: area
+    attributes:
+      label: Area
+      options:
+        - Core scan/redaction
+        - CLI
+        - LLM guardrails
+        - NLP engines
+        - OCR/image processing
+        - Spark/distributed processing
+        - Packaging/install
+        - Documentation
+        - Other
+  - type: textarea
+    id: alternatives
+    attributes:
+      label: Alternatives considered
+  - type: textarea
+    id: extra
+    attributes:
+      label: Additional context
37 changes: 37 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,37 @@
+## Summary
+
+Describe the change and why it is needed.
+
+## Type
+
+- [ ] Bug fix
+- [ ] Feature
+- [ ] Docs
+- [ ] Tests
+- [ ] Chore
+
+## Target Branch
+
+- [ ] This PR targets `dev`
+- [ ] This PR targets `main` for a release/hotfix
+
+## Validation
+
+Commands run:
+
+```bash
+
+```
+
+Optional profiles tested:
+
+- [ ] core
+- [ ] cli
+- [ ] nlp
+- [ ] nlp-advanced
+- [ ] ocr
+- [ ] distributed
+
+## Notes For Reviewers
+
+Mention API changes, migrations, warnings, or release-note needs.
10 changes: 9 additions & 1 deletion .github/workflows/ci.yml
@@ -29,8 +29,16 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.10", "3.11", "3.12"]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
         install-profile: ["core", "nlp", "nlp-advanced"]
+        exclude:
+          # v4.4.0 claims Python 3.13 support for core + CLI first.
+          # Optional heavyweight profiles remain validated separately before
+          # we advertise Python 3.13 support for them.
+          - python-version: "3.13"
+            install-profile: "nlp"
+          - python-version: "3.13"
+            install-profile: "nlp-advanced"
     steps:
       - uses: actions/checkout@v4
       - name: Set up Python
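Reviewer note: to see which jobs the `exclude` entries leave in the matrix, the expansion can be sketched in Python. This is only an illustration of the cross-product-minus-exclusions rule, not how GitHub Actions implements it; `expand_matrix` is a hypothetical helper name.

```python
from itertools import product

# Values copied from the ci.yml hunk above.
PYTHON_VERSIONS = ["3.10", "3.11", "3.12", "3.13"]
INSTALL_PROFILES = ["core", "nlp", "nlp-advanced"]
EXCLUDE = [
    {"python-version": "3.13", "install-profile": "nlp"},
    {"python-version": "3.13", "install-profile": "nlp-advanced"},
]


def expand_matrix(versions, profiles, exclude):
    """Return the (version, profile) pairs left after dropping excluded pairs."""
    excluded = {(e["python-version"], e["install-profile"]) for e in exclude}
    return [pair for pair in product(versions, profiles) if pair not in excluded]


jobs = expand_matrix(PYTHON_VERSIONS, INSTALL_PROFILES, EXCLUDE)
for version, profile in jobs:
    print(version, profile)
print(len(jobs), "jobs")  # 4 x 3 = 12 combinations minus 2 exclusions
```

On Python 3.13 only the `core` profile remains, which matches the comment in the hunk about claiming 3.13 support for core + CLI first.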
50 changes: 42 additions & 8 deletions .github/workflows/release.yml
@@ -27,7 +27,7 @@ on:
         default: false
         type: boolean
       version_override:
-        description: "Override version (e.g. 4.4.0) — stable only"
+        description: "Override stable version or prerelease base (e.g. 4.4.0)"
         required: false
         type: string

@@ -139,8 +139,40 @@ jobs:
           OMP_NUM_THREADS=4 MKL_NUM_THREADS=4 OPENBLAS_NUM_THREADS=4 python tests/simple_performance_test.py

   # ── 3. Build & Publish ────────────────────────────────────────────────
+  python313-core:
+    needs: determine-release
+    if: needs.determine-release.outputs.has_changes == 'true'
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          ref: ${{ needs.determine-release.outputs.target_branch }}
+
+      - name: Set up Python 3.13
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.13"
+          cache: "pip"
+
+      - name: Install core + CLI dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install pytest pytest-cov coverage
+          pip install -e ".[dev,cli]"
+
+      - name: Run Python 3.13 core + CLI tests
+        run: |
+          pytest tests/ \
+            -m "not slow" \
+            --ignore=tests/test_gliner_annotator.py \
+            --ignore=tests/test_image_service.py \
+            --ignore=tests/test_ocr_integration.py \
+            --ignore=tests/test_spark_integration.py \
+            --ignore=tests/test_text_service_integration.py
+
   publish:
-    needs: [determine-release, test]
+    needs: [determine-release, test, python313-core]
     runs-on: ubuntu-latest
     outputs:
       version: ${{ steps.version.outputs.version }}
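Reviewer note: the effect of adding `python313-core` to `publish.needs` can be pictured as a dependency graph. The sketch below orders the jobs with the standard-library `graphlib`. The edges for `python313-core` and `publish` come from this hunk; the edge for `test` is an assumption, since that job's `needs` is not shown here.

```python
from graphlib import TopologicalSorter

# Job name -> set of jobs it waits on.
NEEDS = {
    "determine-release": set(),
    "test": {"determine-release"},  # assumed; not visible in this hunk
    "python313-core": {"determine-release"},
    "publish": {"determine-release", "test", "python313-core"},
}


def run_order(needs):
    """Return one valid execution order in which every job follows its dependencies."""
    return list(TopologicalSorter(needs).static_order())


order = run_order(NEEDS)
print(order)
```

Under these assumptions `publish` always comes last, so a failing Python 3.13 core run blocks the release.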
@@ -182,6 +214,13 @@ jobs:

           # Strip any pre-release suffix to get base version
           BASE=$(echo "$CURRENT" | sed -E 's/(a|b)[0-9]+([.][0-9A-Za-z]+)?$//')
+          if [ -n "${{ inputs.version_override }}" ]; then
+            BASE="${{ inputs.version_override }}"
+            if echo "$BASE" | grep -Eq '(a|b)[0-9]+([.][0-9A-Za-z]+)?$'; then
+              echo "version_override must be a stable base version like 4.4.0, not a prerelease"
+              exit 1
+            fi
+          fi
           echo "Base version: $BASE"

           if [ "$TYPE" = "alpha" ]; then
@@ -199,12 +238,7 @@ jobs:
             VERSION="${BASE}b${BETA_NUM}"

           else
-            # Stable: use override or base version
-            if [ -n "${{ inputs.version_override }}" ]; then
-              VERSION="${{ inputs.version_override }}"
-            else
-              VERSION="$BASE"
-            fi
+            VERSION="$BASE"
           fi

           echo "version=$VERSION" >> $GITHUB_OUTPUT
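Reviewer note: the base-version logic in these two hunks can be mirrored in Python to check edge cases locally. This is a sketch of the shell logic shown above, not part of the workflow; `strip_prerelease` and `resolve_base` are hypothetical names, and the regular expression is the one the workflow passes to `sed` and `grep`.

```python
import re

# Same pattern as in the workflow: an alpha/beta suffix such as "a3" or "b11",
# optionally followed by one dotted segment, anchored at the end of the string.
PRERELEASE_SUFFIX = re.compile(r"(a|b)[0-9]+([.][0-9A-Za-z]+)?$")


def strip_prerelease(version: str) -> str:
    """Drop a trailing alpha/beta suffix, like the sed call in the workflow."""
    return PRERELEASE_SUFFIX.sub("", version, count=1)


def resolve_base(current: str, version_override: str = "") -> str:
    """Pick the base version: a validated override if given, else the stripped current version."""
    if version_override:
        if PRERELEASE_SUFFIX.search(version_override):
            raise ValueError(
                "version_override must be a stable base version like 4.4.0, "
                "not a prerelease"
            )
        return version_override
    return strip_prerelease(current)


print(resolve_base("4.3.0b11"))           # base taken from the current version
print(resolve_base("4.3.0b11", "4.4.0"))  # base taken from the override
```

With the override applied before the alpha/beta branches, a beta cut from `4.3.0b11` with `version_override=4.4.0` gets the base `4.4.0`, which is consistent with the `4.4.0b1` bump in the commit list.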
112 changes: 91 additions & 21 deletions CONTRIBUTING.md
@@ -1,29 +1,99 @@
-# Contributing guidelines
+# Contributing to DataFog Python

-# Contributors
+Thanks for helping improve DataFog. The project welcomes issues, bug reports,
+documentation fixes, tests, and pull requests.

-- sroy9675
-- pselvana
-- sidmohan0
+Please follow the [Code of Conduct](CODE_OF_CONDUCT.md) in all project spaces.
+
+## Branch And PR Policy
+
+DataFog uses `dev` as the default development branch and `main` as the stable
+release branch.
+
+Use this workflow for normal contributions:
+
+1. Fork the repository or create a topic branch from `dev`.
+2. Name branches with a GitHub username prefix when practical, for example
+   `sidmohan0/dfpy-v44-bridge` or `yourname/fix-cli-redaction`.
+3. Open pull requests into `dev`.
+4. Keep pull requests focused and include tests or docs when behavior changes.
+
+Use `main` only for stable release promotion or urgent release hotfixes.
+Do not use `dev` or `main` as working branches.
+
+Maintainers should prefer pull requests even for small changes. Protected branch
+rules should prevent branch deletion, require CI before merge, and avoid direct
+pushes except for explicit emergency maintenance.
+
+## Local Development
+
+```bash
+git clone https://github.com/datafog/datafog-python
+cd datafog-python
+python -m venv .venv
+source .venv/bin/activate  # Windows: .venv\Scripts\activate
+python -m pip install --upgrade pip
+pip install -e ".[dev,cli]"
+```
+
+For optional NLP or OCR work, install the relevant extras:
+
+```bash
+pip install -e ".[dev,cli,nlp]"
+pip install -e ".[dev,cli,nlp,nlp-advanced]"
+pip install -e ".[all,dev]"
+```

-for their help
+## Tests

-The datafog community appreciates your contributions via issues and
-pull requests. Note that the [code of conduct](CODE_OF_CONDUCT.md)
-applies to all interactions with the datafog project, including
-issues and pull requests.
+Run the core test suite before opening a pull request:

-When submitting pull requests, please follow the style guidelines of
-the project, ensure that your code is tested and documented, and write
-good commit messages, e.g., following [these
-guidelines](https://chris.beams.io/posts/git-commit/).
+```bash
+pytest tests/ -m "not slow" \
+  --ignore=tests/test_gliner_annotator.py \
+  --ignore=tests/test_image_service.py \
+  --ignore=tests/test_ocr_integration.py \
+  --ignore=tests/test_spark_integration.py \
+  --ignore=tests/test_text_service_integration.py
+```
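Reviewer note: the same ignore list appears in `release.yml` and in this CONTRIBUTING section, so the two can drift apart. One possible way to keep the core test command in a single place is a small helper that builds the argument list. This is a sketch only; `core_pytest_command` is a hypothetical name and nothing like it exists in this PR.

```python
import shlex

# Test files skipped by the core run, copied from the command above.
IGNORED_TEST_FILES = [
    "tests/test_gliner_annotator.py",
    "tests/test_image_service.py",
    "tests/test_ocr_integration.py",
    "tests/test_spark_integration.py",
    "tests/test_text_service_integration.py",
]


def core_pytest_command(ignored=IGNORED_TEST_FILES):
    """Build the argv list for the core test run (no shell quoting needed)."""
    return ["pytest", "tests/", "-m", "not slow", *(f"--ignore={p}" for p in ignored)]


# Print a copy-pasteable shell form of the command.
print(shlex.join(core_pytest_command()))
```

The list form can be passed directly to `subprocess.run(core_pytest_command(), check=True)` from a task runner.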

-By submitting a pull request, you are licensing your code under the
-project [license](LICENSE) and affirming that you either own copyright
-(automatic for most individuals) or are authorized to distribute under
-the project license (e.g., in case your employer retains copyright on
-your work).
+Run the focused test file for the area you changed whenever possible. For
+documentation-only changes, build the docs:

-### Legal Notice
+```bash
+sphinx-build -b html docs docs/_build/html
+```

-When contributing to this project, you must agree that you have authored 100% of the content, that you have the necessary rights to the content and that the content you contribute may be provided under the project license.
+## Pull Request Checklist
+
+Before requesting review:
+
+- Rebase or merge the latest `dev`.
+- Add or update tests for behavior changes.
+- Update docs for user-facing changes.
+- Keep public API changes explicit in the PR description.
+- Note any optional dependency profile you tested, such as `core`, `nlp`, or
+  `nlp-advanced`.
+
+## Commit Messages
+
+Use clear, descriptive commit messages. Conventional-style prefixes are welcome
+but not required, for example:
+
+- `fix: handle empty scan input`
+- `docs: clarify branch policy`
+- `test: cover v5 preview redaction wrapper`
+
+## Legal
+
+By submitting a pull request, you license your contribution under the project
+[license](LICENSE). You also affirm that you authored the contribution or have
+the right to submit it under the project license.
+
+## Contributors
+
+Thanks to early contributors including:
+
+- sroy9675
+- pselvana
+- sidmohan0
14 changes: 12 additions & 2 deletions README.md
@@ -25,6 +25,10 @@ pip install datafog[nlp-advanced]
 pip install datafog[all]
 ```

+Python 3.13 support is certified for the core SDK and CLI. Optional extras such
+as `nlp`, `nlp-advanced`, `ocr`, `distributed`, and `all` are available but not
+yet certified on Python 3.13.
+
 ## Quick Start

 ```python
@@ -132,9 +136,15 @@ datafog hash-text "john@example.com"

 ## Telemetry

-DataFog includes anonymous telemetry by default.
+DataFog telemetry is disabled by default.
+
+To opt in:
+
+```bash
+export DATAFOG_TELEMETRY=1
+```
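Reviewer note: this hunk documents two switches, `DATAFOG_TELEMETRY=1` to opt in and `DATAFOG_NO_TELEMETRY=1` to force telemetry off. A minimal sketch of how the two could be resolved follows. It is not DataFog's actual implementation: `telemetry_enabled` is a hypothetical helper, and treating only the literal value `"1"` as set, with the off switch taking precedence, is an assumption drawn from the README wording.

```python
import os


def telemetry_enabled(env=None):
    """Return True only when the user opted in and did not force telemetry off."""
    env = os.environ if env is None else env
    # Assumption: the "force off" switch wins over the opt-in switch.
    if env.get("DATAFOG_NO_TELEMETRY") == "1":
        return False
    # Disabled by default: only an explicit opt-in enables telemetry.
    return env.get("DATAFOG_TELEMETRY") == "1"


print(telemetry_enabled({}))
print(telemetry_enabled({"DATAFOG_TELEMETRY": "1"}))
print(telemetry_enabled({"DATAFOG_TELEMETRY": "1", "DATAFOG_NO_TELEMETRY": "1"}))
```

Passing the environment as a parameter keeps the check easy to test without mutating `os.environ`.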

-To opt out:
+To force telemetry off:

 ```bash
 export DATAFOG_NO_TELEMETRY=1