You'll be working with historical order and shipment data from an e-commerce logistics platform to build and improve a delivery time prediction model. This exercise simulates real-world ML engineering challenges you might encounter when optimizing supply chain operations.
You're helping an e-commerce logistics platform improve their delivery time predictions. The platform connects multiple shippers (merchants) with various carriers (FedEx, UPS, USPS, DHL) to deliver packages across the US. Accurate delivery predictions are crucial for:
- Setting customer expectations at checkout
- Optimizing carrier selection
- Managing warehouse operations
- Improving customer satisfaction
- Explore the Data - Dive into historical shipment records to understand delivery patterns and spot any interesting anomalies
- Review the Model - Examine how we currently predict delivery times using package dimensions, distance, carrier, and service level
- Problem Solve - Identify data quality issues and propose model improvements
- Design for Production - Discuss how to deploy, monitor, and maintain this system at scale
- Python 3.12+
- Basic familiarity with pandas and scikit-learn/XGBoost
# Clone the repository
git clone https://github.com/stordco/ai-team-ds-interview-challenge.git
cd ai-team-ds-interview-challenge
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Train the baseline model
python -m src.train --save
# Try making some predictions!
python -m src.predict --interactive
- Look for Patterns: Are certain carriers or routes consistently faster/slower?
- Spot Anomalies: Any surprising delivery times that don't make sense?
- Seasonal Effects: Do delivery times vary by time of year?
- Geographic Quirks: Some state pairs might have unexpected behavior
- Feature Engineering: What additional features could improve predictions?
- Model Selection: Is XGBoost the best choice? What alternatives might work?
- Validation Strategy: How should we split data for time-series delivery predictions?
- Performance Metrics: Beyond RMSE, what metrics matter for the business?
- Real-time vs Batch: When do predictions need to happen?
- Model Monitoring: How do we know if the model degrades?
- Retraining Pipeline: How often should we update the model?
- Business Segments: Should different shippers get different models?
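As one possible starting point for the validation and metrics questions above, here is a minimal sketch of a time-ordered split and a few business-facing error metrics. It rests on assumptions: the `ship_date` column is hypothetical (the feature list only mentions `day_of_week` and `month`), the function names are illustrative and not part of the repository, and the one-day tolerance is an arbitrary choice.

```python
import numpy as np
import pandas as pd


def time_ordered_split(df, date_col, test_fraction=0.2):
    """Hold out the most recent rows so the test set never precedes training.

    `date_col` is an assumed timestamp column (hypothetical, e.g. a ship date).
    """
    ordered = df.sort_values(date_col).reset_index(drop=True)
    n_test = max(1, int(round(len(ordered) * test_fraction)))
    cutoff = len(ordered) - n_test
    return ordered.iloc[:cutoff], ordered.iloc[cutoff:]


def business_metrics(y_true, y_pred, tolerance_days=1.0):
    """Summarize prediction error in terms a logistics team might track."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    error = y_pred - y_true
    return {
        "rmse": float(np.sqrt(np.mean(error ** 2))),
        "mae": float(np.mean(np.abs(error))),
        # Share of predictions that land within the tolerance of the actual value.
        "within_tolerance": float(np.mean(np.abs(error) <= tolerance_days)),
        # Share of packages that arrived later than predicted (a broken promise).
        "late_rate": float(np.mean(y_true > y_pred)),
    }
```

Tracking `late_rate` separately from RMSE reflects that under-predicting delivery time is often costlier to customer trust than over-predicting it; whether that holds for this platform is a question worth raising in the discussion.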
- Training Data: Pre-extracted training data in data/training_data.csv
- Predictions: Test predictions using python -m src.predict --interactive
# Quick data exploration
python -c "
import pandas as pd
df = pd.read_csv('data/training_data.csv')
print('Dataset shape:', df.shape)
print('\nFirst few rows:')
print(df.head())
print('\nBasic statistics:')
print(df.describe())
"
# Interactive prediction mode (recommended for testing)
python -m src.predict --interactive
# Make a specific prediction
python -m src.predict --carrier FedEx --service-level ground \
--distance 500 --weight 2.5 --length 10 --width 8 --height 6 \
--from-state CA --to-state NY
# Batch predictions from JSON
echo '{"carrier": "FedEx", "service_level": "ground", "distance_miles": 500,
"weight": 2.5, "length": 10, "width": 8, "height": 6}' > request.json
python -m src.predict --json-file request.json
You're working with real-world inspired historical shipment data from our e-commerce logistics platform. The dataset represents thousands of completed deliveries with known outcomes.
The data/training_data.csv file contains pre-engineered features from our order and shipment history.
The BigQuery SQL that was used to query this data is in queries/parcel_features.sql.
Package Characteristics:
- height, length, width - Package dimensions (inches)
- weight - Package weight (pounds)
Shipping Details:
- distance_miles - Calculated distance between origin and destination
- carrier - The shipping company (FedEx, UPS, USPS, DHL)
- service_level - Speed of delivery (economy, standard, three_day, two_day, overnight)
Geographic Information:
- from_state, to_state - Origin and destination states
- Route characteristics that affect delivery times
Temporal Context:
- day_of_week - When the package was shipped
- month - Captures seasonal patterns
- Holiday and peak season indicators
What We're Predicting:
- delivery_days - The actual number of days it took to deliver (our target variable)
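To make the columns above concrete, the following is a rough sketch of two exploration helpers: one summarizes `delivery_days` per group (for example per carrier), the other flags rows with implausible values. The function names are illustrative and not part of the repository, and the plausibility rules (no negative delivery times, strictly positive dimensions and weight) are assumptions about the data rather than documented constraints.

```python
import pandas as pd


def summarize_delivery_days(df, by):
    """Count, mean, median and 95th percentile of delivery_days per group.

    `by` is a column name or list of names, e.g. "carrier" or
    ["carrier", "service_level"].
    """
    grouped = df.groupby(by)["delivery_days"]
    summary = grouped.agg(["count", "mean", "median"])
    summary["p95"] = grouped.quantile(0.95)
    # Slowest groups first, so outlier carriers or routes surface at the top.
    return summary.sort_values("mean", ascending=False)


def flag_suspect_rows(df):
    """Return rows whose target or package measurements look implausible."""
    dims = ["height", "length", "width", "weight"]
    bad_target = df["delivery_days"].isna() | (df["delivery_days"] < 0)
    bad_package = (df[dims] <= 0).any(axis=1) | df[dims].isna().any(axis=1)
    return df[bad_target | bad_package]
```

After loading the file with `pd.read_csv('data/training_data.csv')`, calling `summarize_delivery_days(df, ["carrier", "service_level"])` or `summarize_delivery_days(df, "month")` gives a first view of carrier and seasonal effects, and `flag_suspect_rows(df)` lists candidate data quality issues to inspect by hand.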
We're using XGBoost to predict delivery times based on the features above. The current implementation is basic - think of it as an MVP that needs your expertise to improve.
This is a collaborative technical discussion, not a test! We want to see how you think about ML problems and work with existing code.
Duration: ~90 minutes
Format: Screen sharing with live coding/analysis
What we're looking for:
- How you explore and understand data
- Your approach to debugging ML issues
- Ideas for improving model performance
- Thoughts on productionizing ML systems
Feel free to:
- Ask questions about the business context
- Think out loud as you explore
- Use any tools or libraries you're comfortable with
- Google things - we all do it!
Let's have fun exploring this delivery prediction challenge together!