Merged
62 changes: 62 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,62 @@
name: Bug report
description: Report something that is broken or surprising.
title: "bug: "
labels: ["bug"]
body:
  - type: markdown
    attributes:
      value: Thanks for reporting a DataFog issue.
  - type: textarea
    id: summary
    attributes:
      label: Summary
      description: What happened?
    validations:
      required: true
  - type: textarea
    id: reproduce
    attributes:
      label: Reproduction
      description: Minimal code, command, or steps to reproduce.
      render: python
    validations:
      required: true
  - type: textarea
    id: expected
    attributes:
      label: Expected behavior
    validations:
      required: true
  - type: input
    id: version
    attributes:
      label: DataFog version
      placeholder: "4.3.0"
  - type: input
    id: python
    attributes:
      label: Python version
      placeholder: "3.12.4"
  - type: dropdown
    id: profile
    attributes:
      label: Install profile
      options:
        - core
        - cli
        - nlp
        - nlp-advanced
        - ocr
        - distributed
        - all
        - not sure
  - type: textarea
    id: environment
    attributes:
      label: Environment details
      description: OS, package manager, relevant dependency versions, or CI link.
  - type: textarea
    id: extra
    attributes:
      label: Additional context
      description: Logs, screenshots, or related issues.
5 changes: 5 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,5 @@
blank_issues_enabled: true
contact_links:
  - name: Security report
    url: https://github.com/DataFog/datafog-python/security/advisories/new
    about: Please report security vulnerabilities privately.
41 changes: 41 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,41 @@
name: Feature request
description: Suggest an improvement or new workflow.
title: "feat: "
labels: ["enhancement"]
body:
  - type: textarea
    id: problem
    attributes:
      label: Problem
      description: What user problem would this solve?
    validations:
      required: true
  - type: textarea
    id: proposal
    attributes:
      label: Proposed solution
      description: What would you like DataFog to do?
    validations:
      required: true
  - type: dropdown
    id: area
    attributes:
      label: Area
      options:
        - Core scan/redaction
        - CLI
        - LLM guardrails
        - NLP engines
        - OCR/image processing
        - Spark/distributed processing
        - Packaging/install
        - Documentation
        - Other
  - type: textarea
    id: alternatives
    attributes:
      label: Alternatives considered
  - type: textarea
    id: extra
    attributes:
      label: Additional context
37 changes: 37 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,37 @@
## Summary

Describe the change and why it is needed.

## Type

- [ ] Bug fix
- [ ] Feature
- [ ] Docs
- [ ] Tests
- [ ] Chore

## Target Branch

- [ ] This PR targets `dev`
- [ ] This PR targets `main` for a release/hotfix

## Validation

Commands run:

```bash

```

Optional profiles tested:

- [ ] core
- [ ] cli
- [ ] nlp
- [ ] nlp-advanced
- [ ] ocr
- [ ] distributed

## Notes For Reviewers

Mention API changes, migrations, warnings, or release-note needs.
10 changes: 9 additions & 1 deletion .github/workflows/ci.yml
@@ -29,8 +29,16 @@ jobs:
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11", "3.12"]
        python-version: ["3.10", "3.11", "3.12", "3.13"]
        install-profile: ["core", "nlp", "nlp-advanced"]
        exclude:
          # v4.4.0 claims Python 3.13 support for core + CLI first.
          # Optional heavyweight profiles remain validated separately before
          # we advertise Python 3.13 support for them.
          - python-version: "3.13"
            install-profile: "nlp"
          - python-version: "3.13"
            install-profile: "nlp-advanced"
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
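
The effect of the `exclude` entries on the job matrix can be sketched in a few lines of Python. This is only an illustration of how the cross product is pruned, not how GitHub Actions implements it; the function name `expand_matrix` and the variable names are hypothetical, while the version and profile values are taken from the hunk above.

```python
from itertools import product

# Values copied from the updated matrix in ci.yml.
python_versions = ["3.10", "3.11", "3.12", "3.13"]
install_profiles = ["core", "nlp", "nlp-advanced"]

# Combinations listed under `exclude` in the workflow.
excluded = {("3.13", "nlp"), ("3.13", "nlp-advanced")}


def expand_matrix(versions, profiles, excluded):
    """Hypothetical helper: cross product of both axes minus excluded pairs."""
    return [
        (version, profile)
        for version, profile in product(versions, profiles)
        if (version, profile) not in excluded
    ]


jobs = expand_matrix(python_versions, install_profiles, excluded)
for version, profile in jobs:
    print(f"python {version} / {profile}")
```

Under these assumptions, Python 3.13 appears only with the `core` profile, while 3.10 through 3.12 still run all three profiles.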
34 changes: 33 additions & 1 deletion .github/workflows/release.yml
@@ -139,8 +139,40 @@ jobs:
OMP_NUM_THREADS=4 MKL_NUM_THREADS=4 OPENBLAS_NUM_THREADS=4 python tests/simple_performance_test.py

  # ── 3. Build & Publish ────────────────────────────────────────────────
  python313-core:
    needs: determine-release
    if: needs.determine-release.outputs.has_changes == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          ref: ${{ needs.determine-release.outputs.target_branch }}

      - name: Set up Python 3.13
        uses: actions/setup-python@v5
        with:
          python-version: "3.13"
          cache: "pip"

      - name: Install core + CLI dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pytest pytest-cov coverage
          pip install -e ".[dev,cli]"

      - name: Run Python 3.13 core + CLI tests
        run: |
          pytest tests/ \
            -m "not slow" \
            --ignore=tests/test_gliner_annotator.py \
            --ignore=tests/test_image_service.py \
            --ignore=tests/test_ocr_integration.py \
            --ignore=tests/test_spark_integration.py \
            --ignore=tests/test_text_service_integration.py

  publish:
    needs: [determine-release, test]
    needs: [determine-release, test, python313-core]
    runs-on: ubuntu-latest
    outputs:
      version: ${{ steps.version.outputs.version }}
112 changes: 91 additions & 21 deletions CONTRIBUTING.md
@@ -1,29 +1,99 @@
# Contributing guidelines
# Contributing to DataFog Python

# Contributors
Thanks for helping improve DataFog. The project welcomes issues, bug reports,
documentation fixes, tests, and pull requests.

- sroy9675
- pselvana
- sidmohan0
Please follow the [Code of Conduct](CODE_OF_CONDUCT.md) in all project spaces.

## Branch And PR Policy

DataFog uses `dev` as the default development branch and `main` as the stable
release branch.

Use this workflow for normal contributions:

1. Fork the repository or create a topic branch from `dev`.
2. Name branches with a GitHub username prefix when practical, for example
`sidmohan0/dfpy-v44-bridge` or `yourname/fix-cli-redaction`.
3. Open pull requests into `dev`.
4. Keep pull requests focused and include tests or docs when behavior changes.

Use `main` only for stable release promotion or urgent release hotfixes.
Do not use `dev` or `main` as working branches.

Maintainers should prefer pull requests even for small changes. Protected branch
rules should prevent branch deletion, require CI before merge, and avoid direct
pushes except for explicit emergency maintenance.

## Local Development

```bash
git clone https://github.com/datafog/datafog-python
cd datafog-python
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
python -m pip install --upgrade pip
pip install -e ".[dev,cli]"
```

For optional NLP or OCR work, install the relevant extras:

```bash
pip install -e ".[dev,cli,nlp]"
pip install -e ".[dev,cli,nlp,nlp-advanced]"
pip install -e ".[all,dev]"
```

for their help
## Tests

The datafog community appreciates your contributions via issues and
pull requests. Note that the [code of conduct](CODE_OF_CONDUCT.md)
applies to all interactions with the datafog project, including
issues and pull requests.
Run the core test suite before opening a pull request:

When submitting pull requests, please follow the style guidelines of
the project, ensure that your code is tested and documented, and write
good commit messages, e.g., following [these
guidelines](https://chris.beams.io/posts/git-commit/).
```bash
pytest tests/ -m "not slow" \
--ignore=tests/test_gliner_annotator.py \
--ignore=tests/test_image_service.py \
--ignore=tests/test_ocr_integration.py \
--ignore=tests/test_spark_integration.py \
--ignore=tests/test_text_service_integration.py
```
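
If you prefer to launch the same selection from Python (for example from a local helper script), the command above could be mirrored roughly as follows. This is a sketch, not part of the repository: `IGNORED_TEST_FILES`, `core_test_args`, and `run_core_tests` are hypothetical names, and the only library call used is `pytest.main`, which accepts a list of command-line arguments.

```python
# Test files skipped by the core command above (copied from that command).
IGNORED_TEST_FILES = [
    "tests/test_gliner_annotator.py",
    "tests/test_image_service.py",
    "tests/test_ocr_integration.py",
    "tests/test_spark_integration.py",
    "tests/test_text_service_integration.py",
]


def core_test_args():
    """Hypothetical helper: build the pytest arguments for the core suite."""
    args = ["tests/", "-m", "not slow"]
    args += [f"--ignore={path}" for path in IGNORED_TEST_FILES]
    return args


def run_core_tests():
    """Hypothetical helper: run the core suite in-process, return the exit code."""
    import pytest  # imported lazily so the argument list can be built without pytest

    return int(pytest.main(core_test_args()))
```

Calling `run_core_tests()` from the repository root should behave like the shell command, assuming pytest is installed in the active environment.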

By submitting a pull request, you are licensing your code under the
project [license](LICENSE) and affirming that you either own copyright
(automatic for most individuals) or are authorized to distribute under
the project license (e.g., in case your employer retains copyright on
your work).
Run the focused test file for the area you changed whenever possible. For
documentation-only changes, build the docs:

### Legal Notice
```bash
sphinx-build -b html docs docs/_build/html
```

When contributing to this project, you must agree that you have authored 100% of the content, that you have the necessary rights to the content and that the content you contribute may be provided under the project license.
## Pull Request Checklist

Before requesting review:

- Rebase or merge the latest `dev`.
- Add or update tests for behavior changes.
- Update docs for user-facing changes.
- Keep public API changes explicit in the PR description.
- Note any optional dependency profile you tested, such as `core`, `nlp`, or
`nlp-advanced`.

## Commit Messages

Use clear, descriptive commit messages. Conventional-style prefixes are welcome
but not required, for example:

- `fix: handle empty scan input`
- `docs: clarify branch policy`
- `test: cover v5 preview redaction wrapper`

## Legal

By submitting a pull request, you license your contribution under the project
[license](LICENSE). You also affirm that you authored the contribution or have
the right to submit it under the project license.

## Contributors

Thanks to early contributors including:

- sroy9675
- pselvana
- sidmohan0
18 changes: 18 additions & 0 deletions SECURITY.md
@@ -0,0 +1,18 @@
# Security Policy

Please do not report security vulnerabilities in public issues.

Use GitHub private vulnerability reporting when available:

https://github.com/DataFog/datafog-python/security/advisories/new

If private vulnerability reporting is unavailable, contact the maintainers
directly and include:

- affected versions;
- a minimal reproduction or proof of concept;
- expected impact;
- any known mitigations.

We will acknowledge valid reports as quickly as practical and coordinate fixes
before public disclosure.