📍 Based in Nairobi, Kenya
✉️ ngangam93@gmail.com
✅ Air Quality API: 18.184.3.203:8000/docs — Live on AWS EC2 Frankfurt
✅ Fraud Detection: 18.184.3.203:8003/docs
✅ Churn Prediction: 18.184.3.203:8002/docs
✅ Customer Segmentation: 18.184.3.203:8004/docs
✅ Credit Risk Scoring: 18.184.3.203:8005/docs
✅ Recommendation Dashboard: recommendation-system-dashboard.onrender.com
✅ Recommendation API: 18.184.3.203:8001/docs
✅ Air Quality API live on AWS EC2 Frankfurt — Deployed 30 May 2026. Docker containerised. No cold starts. No monthly suspensions. 24/7 uptime.
🔭 Currently building: Week 9 — Apache Airflow · DAGs · Parallel Model Training · Scheduling
🌱 Next: Week 10 — AWS S3 · RDS · ECR · HTTPS · Domain Name
🤝 Open to: African AI · banking · telecom · healthcare · environmental analytics
🏆 Best Paper Award — Beijing Institute of Technology 2018 · 34 countries
⚡ "Data is only as powerful as the institution's willingness to act on it. I have spent ten years building both."
| Achievement | Detail |
|---|---|
| 🌍 Environmental anomaly detection | ARIMA RMSE 9.93 · PM2.5 spike 469 µg/m³ detected · 11,998 OpenAQ readings · live dashboard + API |
| 🎬 Movie Recommendation System | Item-CF RMSE 0.9540 · P@10 69.7% · 943 users · 1,682 movies · CineAI Netflix-standard dashboard · live |
| 🎯 Real-time fraud detection | 284,807 transactions · Kafka streaming · 22ms response · live production API |
| 💳 Credit risk scoring + SHAP explainability | XGBoost · ROC-AUC 0.703 · SHAP waterfall · Basel III ready · live production API |
| 🔍 RAG Document Search — prototype | 1,244-page semantic search · 4,329 chunks · LaBSE · 109 languages · full system Week 12 |
| 🌍 Kiswahili NLP — prototype | Zero-shot classification · UNEP Strategic Objectives · mBERT · 104 languages · full system Week 15 |
| 📊 Institutional M&E Architecture | 50+ KPIs · Results-Based Management · World Bank KYEOP — A-rating from Ministry of Public Service |
| 🏆 Best Paper Award | 22nd International ICIT · Beijing Institute of Technology · 2018 · 34 countries |
| 🔄 8 production systems | 6 systems live on AWS EC2 Frankfurt · deployed 31 May 2026 |
One production-grade project per week — model development · containerisation · cloud deployment · live monitoring · automated retraining. Every week ships.
Week 8 complete — 8 done · 7 remaining
| Week | Project | Stack | Status |
|---|---|---|---|
| 01 | Churn Prediction Pipeline | XGBoost · FastAPI · Docker · PostgreSQL · Grafana | ✅ Live API · Repo |
| 02 | Real-Time Fraud Detection | LightGBM · Kafka · Redis · FastAPI · Docker · Grafana | ✅ Live API · Repo |
| 03 | Customer Segmentation | KMeans · PCA · MLflow · Evidently · Streamlit · FastAPI · Docker | ✅ Live API · Repo |
| 04 | RAG Document Search System | LaBSE · ChromaDB · FastAPI · pypdf · Docker | ✅ Local confirmed · AWS pending · Repo |
| 05 | Credit Risk Scoring + Propensity + RFM | XGBoost · SHAP · ADASYN · DVC · RFM · FastAPI · PostgreSQL · Grafana · Docker | ✅ Live API · Repo |
| 06 | Environmental Anomaly Detection + Time Series 🌍 | ARIMA · Prophet · LSTM (PyTorch) · Isolation Forest · Streamlit · FastAPI · Docker | ✅ Dashboard · API · Repo |
| 07 | Recommendation System | Item-CF · SVD · scikit-surprise · FastAPI · Streamlit · PostgreSQL · Docker | ✅ Dashboard · API · Repo |
| 08 | MLOps Automation | MLflow · DVC · Evidently AI · GitHub Actions · Prefect · Week 6 case study | ✅ Complete |
| 09 | Apache Airflow — Pipeline Orchestration | Airflow · DAGs · Scheduling · Alerting · Docker | 🔄 In Progress |
| 10 | Cloud Deployment — AWS / GCP | EC2 · RDS · ECR · HTTPS · Cloud Run | 🔲 Pending |
| 11 | Environmental Capstone 🌍 | Random Forest · LSTM · Global Forest Watch · FastAPI · Docker | 🔲 Pending |
| 12 | Advanced RAG Chatbot | LangChain · FAISS · HuggingFace · FastAPI · Docker | 🔲 Pending · Prototype |
| 13 | Apache Spark — Big Data | PySpark · Spark MLlib · dbt · BigQuery | 🔲 Pending |
| 14 | NLP — Text Classification | HuggingFace · BERT · spaCy · FastAPI · Docker | 🔲 Pending |
| 15 | Kiswahili NLP 🌍 | mBERT · AfriBERTa · HuggingFace Hub · AWS | 🔲 Pending · Prototype |
Every project: production-grade code · containerised deployment · documented README · tested endpoints · no shortcuts.
mBERT · AfriBERTa · HuggingFace Transformers · MLflow · FastAPI · Docker · AWS
Kiswahili environmental text classifier connecting East African language knowledge to global environmental monitoring. Over 200 million East Africans speak Kiswahili, yet most AI systems are built primarily for English, which leaves indigenous communities unable to contribute environmental observations in their own language.
Classifies Kiswahili text by UNEP Strategic Objective:
| Objective | Focus | Example |
|---|---|---|
| SO1 | Climate Stability | Mabadiliko ya tabianchi yanaathiri wakulima |
| SO2 | Biodiversity | Viumbe vingi vya porini viko hatarini kutoweka |
| SO3 | Pollution & Waste | Plastiki nyingi zinatupwa baharini |
🔨 Prototype Complete · Zero-shot classification running · Full system Week 15 · Open-source release on HuggingFace Hub · Repository
Item-CF · SVD · scikit-surprise · FastAPI · Streamlit · PostgreSQL · Docker · AWS EC2
Production recommendation engine trained on 100,000 ratings of 1,682 movies by 943 users. Item-CF is the best-performing model, with RMSE 0.9540 and Precision@10 of 69.7%. Netflix-standard CineAI dashboard with three tabs: recommendations, data insights, model performance.
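The idea behind Item-CF and Precision@10 can be sketched in a few lines of plain Python. This is a minimal illustration and not the project's code: the toy `RATINGS` dictionary, the function names, and the choice of cosine similarity over co-rated users are all assumptions made for the example (the live system uses scikit-surprise).

```python
from math import sqrt

# Toy ratings (user -> {item: rating}); values are invented for illustration.
RATINGS = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "D": 2},
    "u3": {"A": 1, "C": 5, "D": 4},
    "u4": {"B": 4, "C": 2, "D": 1},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> user ratings."""
    items = {}
    for user, rated in ratings.items():
        for item, rating in rated.items():
            items.setdefault(item, {})[user] = rating
    return items


def cosine(a, b):
    """Cosine similarity restricted to users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def predict(ratings, user, target):
    """Similarity-weighted average of the user's ratings on other items."""
    items = item_vectors(ratings)
    num = den = 0.0
    for item, rating in ratings[user].items():
        sim = cosine(items[target], items[item])
        num += sim * rating
        den += abs(sim)
    return num / den if den else 0.0


def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that the user found relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k
```

For example, `predict(RATINGS, "u1", "D")` estimates how user `u1` would rate the unseen item `D`, and `precision_at_k(ranked_items, liked_items, k=10)` scores a ranked list against the items a user actually liked.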
✅ Complete · Week 7 · Tests: 6/6 passing · Live Dashboard · Live API · Repository
⚠️ Deployment Notice (Kiswahili NLP): the full production system deploys to AWS EC2 in Week 15. The prototype is complete and documented above.
ARIMA · Prophet · LSTM (PyTorch) · Isolation Forest · Streamlit · FastAPI · Docker · AWS EC2
Production environmental monitoring pipeline trained on 11,998 real PM2.5 sensor readings from 5 Nairobi locations via OpenAQ. Answers two questions automatically for every hourly reading: What will PM2.5 be next? Is this reading dangerous?
Key EDA findings:
- PM2.5 peaks at 4am every day — night burning trapped in cold air
- Friday is consistently the worst day of the week
- Maximum spike: 469 µg/m³ on 2024-02-18 at 4am — 93x the WHO annual safe limit
- 1.8% of all readings exceed 55 µg/m³, roughly where the US EPA "Unhealthy" AQI category begins
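The arithmetic behind the last two findings can be checked with a short sketch. It assumes the WHO annual PM2.5 guideline of 5 µg/m³ (the value implied by the "93x" figure) and uses hypothetical function names; it is not the project's EDA code.

```python
WHO_ANNUAL_GUIDELINE = 5.0  # µg/m³, assumed WHO annual PM2.5 guideline
EPA_THRESHOLD = 55.0        # µg/m³, threshold quoted in the findings


def times_over_limit(value, limit=WHO_ANNUAL_GUIDELINE):
    """How many times a reading exceeds a reference limit."""
    return value / limit


def share_exceeding(readings, threshold=EPA_THRESHOLD):
    """Fraction of readings strictly above the threshold."""
    if not readings:
        return 0.0
    return sum(1 for r in readings if r > threshold) / len(readings)
```

Under that assumption, `times_over_limit(469)` comes to about 93.8, which the finding reports as 93x, and `share_exceeding` applied to the full set of readings would give the 1.8% figure.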
| Model | RMSE | MAE | Type |
|---|---|---|---|
| ARIMA ✅ Best | 9.93 | 8.35 | Forecasting |
| LSTM (PyTorch) | 19.46 | 17.87 | Deep Learning |
| Prophet | 22.05 | 19.40 | Forecasting |
| Isolation Forest | — | — | Anomaly Detection |
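For readers unfamiliar with the two error metrics in the table, here is a minimal sketch of how RMSE and MAE are computed. The function names are illustrative and the project may well rely on a library implementation instead.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error: penalises large misses more heavily."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: average size of a miss, in µg/m³ here."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

RMSE is never smaller than MAE on the same data, which matches every row of the table; a wide gap between the two suggests a few large errors, such as missed pollution spikes.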
✅ Complete · Week 6 · Tests: 10/10 passing · Live Dashboard · Live API · Repository
XGBoost · SHAP · ADASYN · DVC · RFM · FastAPI · PostgreSQL · Grafana · Docker · AWS EC2
Credit risk scoring for loan applicants that answers three questions simultaneously: Will they default? Will they accept the offer? How valuable are they? Built on 252,971 real LendingClub loans. ROC-AUC 0.703. SHAP explainability for every decision provides a Basel III compliant audit trail. Propensity scoring, RFM segmentation, 6-panel Grafana monitoring dashboard, PostgreSQL predictions storage.
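To make the ROC-AUC figure concrete, the sketch below computes it from first principles as the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter. The function name and the pairwise approach are illustrative assumptions; it is quadratic in the number of samples, so it suits small checks and not a dataset of this size.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for default, 0 for repaid. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A value of 0.5 means the ranking is no better than chance and 1.0 means every defaulter is ranked above every non-defaulter, so 0.703 sits between the two.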
✅ Complete · Week 5 · Live API · Repository
LaBSE · ChromaDB · FastAPI · pypdf · Docker
Semantic search across a 1,244-page technical document: 4,329 chunks indexed using LaBSE multilingual embeddings, which support 109 languages. Questions in any supported language retrieve relevant passages with exact page references in under one second. The full production system is scheduled for Week 12.
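The retrieval flow (chunk the pages, embed each chunk, rank by similarity, return the page number) can be sketched without any external services. In this illustration a toy bag-of-words vector stands in for LaBSE embeddings and a Python list stands in for ChromaDB; `chunk_pages`, `embed`, and `search` are hypothetical names, not the project's API.

```python
from collections import Counter
from math import sqrt


def chunk_pages(pages, max_words=50):
    """Split each page into word-limited chunks, keeping the page number."""
    chunks = []
    for page_no, text in enumerate(pages, start=1):
        words = text.split()
        for i in range(0, len(words), max_words):
            chunks.append({"page": page_no, "text": " ".join(words[i:i + max_words])})
    return chunks


def embed(text):
    """Toy bag-of-words vector; a stand-in for a sentence embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks, top_k=3):
    """Return the top_k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:top_k]
```

Each result carries its `page` field, which is how a search hit can be reported with an exact page reference. A word-count vector only matches shared vocabulary, so cross-language retrieval depends on a multilingual embedding model in place of `embed`.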
🔨 Prototype Complete · Week 4 · Full system Week 12 · Repository
KMeans · PCA · StandardScaler · MLflow · Evidently · Streamlit · FastAPI · Docker
Telecom customer segmentation: 7,032 customers grouped into 4 behavioural segments. Optimal K selected through elbow method and silhouette scoring. Live FastAPI inference, Streamlit dashboard, Evidently drift monitoring, MLflow experiment tracking.
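Silhouette scoring, one of the two criteria mentioned for choosing K, can be written out directly. The sketch below is a plain-Python illustration under the usual definition (Euclidean distance, score 0 for singleton clusters); the function name is an assumption and a library implementation is the more likely choice in practice.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette over all points; higher means tighter, better-separated clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Running this for each candidate K and keeping the K with the highest score is one common way to pick the number of segments; it is typically cross-checked against the elbow in the inertia curve.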
✅ Complete · Week 3 · Live API · Repository
LightGBM · Kafka · Redis · FastAPI · Prometheus · Grafana · Docker
Real-time fraud scoring: 284,807 transactions, 22ms response time. Kafka streaming pipeline with Redis caching, Prometheus monitoring and 5-panel Grafana dashboard.
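One common way caching keeps scoring latency low is the cache-aside pattern: look up a previously computed score first and call the model only on a miss. The sketch below shows that pattern with an in-memory class standing in for Redis. What the project actually caches is not stated, so the key choice (`txn_id`), the TTL, and all names here are assumptions.

```python
import time


class TTLCache:
    """In-memory stand-in for a Redis cache with per-key expiry."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def score_transaction(txn_id, features, model, cache):
    """Return (score, cache_hit); call the model only on a cache miss."""
    cached = cache.get(txn_id)
    if cached is not None:
        return cached, True
    score = model(features)
    cache.set(txn_id, score)
    return score, False
```

Here `model` is any callable that maps features to a fraud score. In a deployed version the same two-step logic would sit behind the API endpoint, with Redis replacing `TTLCache`.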
✅ Complete · Week 2 · Live API · Repository
XGBoost · SMOTE · FastAPI · Docker · PostgreSQL · Grafana
End-to-end telecom churn pipeline: feature engineering, class balancing, model training, containerised API deployment with live monitoring dashboard.
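The class-balancing step uses SMOTE, whose core idea is to create synthetic minority samples by interpolating between a real minority sample and one of its nearest minority neighbours. The sketch below is a simplified illustration of that idea only; the function name, the default `k`, and the seeding are assumptions, and a production pipeline would normally use a maintained library implementation.

```python
import random
from math import dist


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        idx = rng.randrange(len(minority))
        base = minority[idx]
        # k nearest minority neighbours of the chosen sample (excluding itself)
        others = [p for j, p in enumerate(minority) if j != idx]
        others.sort(key=lambda p: dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic
```

Because every synthetic point lies on a segment between two real churners, the new samples stay inside the region the minority class already occupies. Oversampling should be applied to the training split only, so that evaluation data remains untouched.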
✅ Complete · Week 1 · Live API · Repository
💻 Programming and Data Science
⚙️ Deployment and Infrastructure
| Area | Skills |
|---|---|
| Machine Learning | Fraud detection · churn · credit risk · segmentation · time series forecasting · anomaly detection · SHAP explainability |
| Deep Learning | LSTM (PyTorch) · sequence modelling · time series · OOP neural network architecture |
| MLOps | End-to-end pipelines · Docker · MLflow · Evidently drift monitoring · pytest · DVC |
| Streaming | Real-time scoring · Apache Kafka · Redis caching · 22ms response time |
| NLP & RAG | Semantic search · LaBSE · ChromaDB · vector embeddings · multilingual · 109 languages |
| Environmental ML | Air quality forecasting · PM2.5 anomaly detection · OpenAQ · ARIMA · Isolation Forest |
| Cloud | AWS EC2 · RDS · ECR · Docker · HTTPS · Render deployment |
| Research & M&E | MSc Marketing Analytics · World Bank KYEOP · RBM · 50+ KPI frameworks · Board diversity research |
- 🎓 MSc Marketing Analytics — University of Nairobi — In Progress 2026
- 🌍 Kiswahili NLP — Building African language AI for East African communities — Full system Week 15
- ☁️ AWS/GCP Cloud Certification — Target 2026
15 weeks. 15 production projects. One complete MLOps engineer. Building in public — no shortcuts.







