Adversarial Data Validation Navigational Toolkit
ADVNT exposes high-level helpers for common shift-aware workflows.
from advnt import run_adversarial_validation_workflow
artifacts = run_adversarial_validation_workflow(X_train, X_test)
print(artifacts["score"])
print(artifacts["feature_importances"])

from advnt import run_shift_preparation_workflow
bundle = run_shift_preparation_workflow(
    X_train,
    X_test,
    pseudo_label_threshold=0.1,
)
sample_weights = bundle["sample_weights"]
safe = bundle["safe_pseudo_labels"]

These helpers run AdversarialValidator internally and return outputs for common downstream tasks:
from advnt import (
    compute_density_ratio_weights_from_train_test,
    select_safe_pseudo_labels_from_train_test,
    extract_model_importances_from_train_test,
)
weights = compute_density_ratio_weights_from_train_test(X_train, X_test)
safe = select_safe_pseudo_labels_from_train_test(X_train, X_test, threshold=0.1)
importances = extract_model_importances_from_train_test(X_train, X_test)

neutralize_train_test_drift residualizes selected features against a generated
train/test indicator and optional predictor features. It returns train and test
residual blocks for the requested columns:
from advnt import neutralize_train_test_drift
train_resid, test_resid = neutralize_train_test_drift(
    X_train,
    X_test,
    features=["shifted_feature"],
    model_features=["stable_context_feature"],
)

An interactive Streamlit app is included for adversarial validation diagnostics.
- Upload train and test datasets (CSV/Parquet).
- Strict schema validation (train/test must have identical columns).
- Choose adversarial model: LGBMClassifier or LogisticRegression.
- Run adversarial validation and inspect CV AV AUC and fold-level scores.
- View top-N model-based feature importances.
- SHAP force plots for train-domain (class=0) and test-domain (class=1) rows.
- SHAP ranking via mean absolute SHAP values.

streamlit run streamlit_app.py

Streamlit Cloud installs the root requirements.txt before launching the app.
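
The app's "SHAP ranking via mean absolute SHAP values" can be pictured with a small standalone sketch. This is not the app's code: the function name `rank_by_mean_abs_shap`, the feature names, and the toy SHAP matrix are all hypothetical, and the example assumes SHAP values have already been computed as one row per sample and one column per feature.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Rank features by the mean of |SHAP value| across rows (largest first).

    shap_values: list of rows, each row holding one SHAP value per feature.
    feature_names: list of names, aligned with the columns of shap_values.
    """
    n_rows = len(shap_values)
    scored = []
    for j, name in enumerate(feature_names):
        mean_abs = sum(abs(row[j]) for row in shap_values) / n_rows
        scored.append((name, mean_abs))
    # Sort so the feature with the largest average magnitude comes first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy SHAP matrix with made-up values (3 samples, 3 hypothetical features).
toy_shap = [
    [0.50, -0.10, 0.00],
    [-0.70, 0.20, 0.10],
    [0.60, -0.30, -0.10],
]
ranking = rank_by_mean_abs_shap(toy_shap, ["f_a", "f_b", "f_c"])
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Taking the absolute value before averaging matters: a feature that pushes some rows toward the train domain and others toward the test domain would otherwise cancel out and look unimportant.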
You can use a sklearn-style neural model with a dedicated adversarial-validation head:
from advnt import AdversarialValidator, ADVMLPClassifier
av = AdversarialValidator(
    model=ADVMLPClassifier(
        hidden_dims=(64, 32),
        max_epochs=20,
        batch_size=256,
        random_state=42,
    ),
    random_state=42,
)
av.fit(X_train, X_test)
print(av.score_)

Note: this model requires torch to be installed.
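
Since torch is an extra requirement for this model, a script may want to check for it before building the neural validator. A minimal sketch using only the standard library; the helper name `torch_available` is hypothetical and is not part of ADVNT:

```python
import importlib.util


def torch_available() -> bool:
    """Return True if the `torch` package can be found in this environment."""
    # find_spec returns None for a top-level package that is not installed.
    return importlib.util.find_spec("torch") is not None


if torch_available():
    print("torch found: the neural adversarial model can be used.")
else:
    print("torch not found: install it (for example, pip install torch) first.")
```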
You can also train the neural estimators as regular models. The classifier
supports binary and multiclass labels and optionally accepts eval_set=[(X_test,)]
so the adversarial head treats that block as target-domain data while the main
head learns from your normal y labels:
from advnt import ADVMLPClassifier, ADVMLPRegressor
clf = ADVMLPClassifier(max_epochs=20, random_state=42)
clf.fit(X_train, y_train, eval_set=[(X_test,)])
reg = ADVMLPRegressor(max_epochs=20, random_state=42)
reg.fit(X_train, y_train, eval_set=[(X_test,)])
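
One way to picture the data the two heads see when eval_set is passed is sketched below. This is a conceptual illustration only, not ADVNT's internal code: the function `build_two_head_targets` is hypothetical, and the domain encoding (0 for train rows, 1 for the eval_set block) is an assumption that follows the class=0 / class=1 convention used by the Streamlit app.

```python
def build_two_head_targets(X_train, y_train, X_target):
    """Stack train and target-domain rows and build per-head targets.

    Returns (rows, domain_labels, task_labels, has_task_label), where:
      - domain_labels: 0 for train rows, 1 for target rows (adversarial head)
      - task_labels: y for train rows, None for target rows (main head)
      - has_task_label: mask of rows that contribute to the main-head loss
    """
    n_train, n_target = len(X_train), len(X_target)
    rows = list(X_train) + list(X_target)
    domain_labels = [0] * n_train + [1] * n_target
    task_labels = list(y_train) + [None] * n_target
    has_task_label = [True] * n_train + [False] * n_target
    return rows, domain_labels, task_labels, has_task_label


# Toy data with made-up values: 3 labeled train rows, 2 unlabeled target rows.
X_tr = [[0.1, 1.0], [0.2, 0.9], [0.3, 1.1]]
y_tr = [0, 1, 0]
X_te = [[0.8, 2.0], [0.9, 2.1]]

rows, domain, task, mask = build_two_head_targets(X_tr, y_tr, X_te)
print(domain)  # [0, 0, 0, 1, 1]
print(mask)    # [True, True, True, False, False]
```

In this picture the unlabeled target rows inform only the domain signal, which matches the description above: the main head learns from the normal y labels, and the adversarial head uses the eval_set block as target-domain data.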