raja-taparia/README.md

🚀 Sunny Taparia

Builder | Designer | Architect | Founder | Former Head of Data & AI | Executive Technologist | HEC Paris EMBA

LinkedIn Medium Email

🎯 Executive Summary

I am a senior Data, Analytics, and AI leader with over two decades of experience operating at the intersection of complex systems architecture and high-level commercial strategy. I specialize in turning ambiguous enterprise requirements into highly governed, automated data foundations that empower organizations to make rapid, data-driven decisions at scale.

Throughout my career, including Head of Data roles at Evalueserve and Aon, I have guided technology roadmaps, scaled engineering practices, and managed global client portfolios. Backed by an Executive MBA from HEC Paris, I work on the principle that the most successful technology deployments align architectural purity with measurable business impact.

💡 Core Architectural Focus

🤖 Agentic & Generative AI

Engineering secure, multi-agent orchestrations (LangGraph, AutoGen) and advanced RAG pipelines. I focus on deploying autonomous AI with strict safety guardrails, human-in-the-loop validation, and robust enterprise compliance structures.

🏗️ Enterprise Data Lakehouses

Designing scalable, cloud-native processing architectures (Databricks, Snowflake, GCP, Azure) capable of securely handling millions of daily events.

🛡️ Zero-Trust Data Governance

Implementing declarative data contracts (YAML/dbt), automated Service Level Objectives (SLOs), and centralized governance registries to establish unshakeable trust between data producers and consumers.

📈 Strategic Leadership

Managing complex P&Ls, chairing enterprise governance councils, and bridging the communication gap between C-suite stakeholders and hands-on engineering teams.

🛠️ Technology Stack & Expertise

Cloud & Infrastructure | Data Engineering | AI & Machine Learning | Governance & BI
GCP, Azure, Docker, Databricks, Snowflake, dbt, Spark, Python, LangChain, LangGraph, Qdrant, Contracts, MDM, Power BI, Tableau

📂 Featured Architecture & Open-Source Implementations

I believe in building systems that are robust, transparent, and designed for scale. Below are selected repositories demonstrating my architectural approach across Agentic AI, Data Governance, and Legacy Migration.

1. data-contracts-and-governance

A comprehensive framework engineered to enforce data reliability and establish a zero-trust governance model within the modern data stack.

Architecture: Utilizes YAML for declarative contracts and integrates natively with dbt for transformation and constraints testing.

Business Impact: Reduces downstream data incidents by automating Service Level Objective (SLO) validation. Features a centralized governance registry that manages subscriber access, so engineering hours are spent on product development rather than reactive pipeline triage.
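
To make the idea of automated SLO validation concrete, here is a minimal sketch rather than the repository's actual code. The contract is written as the dictionary a YAML loader would return, and every field name (`max_staleness_minutes`, `min_row_count`, `max_null_fraction`) as well as the `DatasetStats` type is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical contract, shaped like the dict a YAML loader would return.
CONTRACT = {
    "dataset": "orders",
    "slo": {
        "max_staleness_minutes": 60,
        "min_row_count": 1000,
        "max_null_fraction": 0.01,
    },
}


@dataclass
class DatasetStats:
    """Observed facts about the latest load (assumed to come from a profiler)."""
    last_loaded_at: datetime
    row_count: int
    null_fraction: float


def check_slos(contract, stats, now):
    """Return a list of SLO violations; an empty list means compliant."""
    slo = contract["slo"]
    violations = []
    staleness = now - stats.last_loaded_at
    if staleness > timedelta(minutes=slo["max_staleness_minutes"]):
        violations.append(f"stale by {staleness}")
    if stats.row_count < slo["min_row_count"]:
        violations.append(f"row count {stats.row_count} below minimum")
    if stats.null_fraction > slo["max_null_fraction"]:
        violations.append(f"null fraction {stats.null_fraction} above maximum")
    return violations


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = DatasetStats(now - timedelta(minutes=10), 5000, 0.0)
stale = DatasetStats(now - timedelta(hours=3), 10, 0.5)
```

Calling `check_slos(CONTRACT, fresh, now)` yields no violations, while the `stale` example trips all three checks. A scheduler or CI job could fail the pipeline whenever the returned list is non-empty.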

2. 🤖 crafty-platform - Private Repo

An AI-powered, omnichannel agentic workspace designed from the ground up to act as an autonomous digital assistant for trades businesses.

Architecture: Built on a FastAPI backend utilizing pgvector for tenant-isolated semantic memory. Orchestrates a Supervisor Agent and specialized worker agents (Calendar, Mail, Operations) using LangGraph logic. Integrates Retell AI for native voice telephony, WhatsApp Cloud API, and Next.js/React Native frontends.

Business Impact: Solves the back-office scheduling bottleneck for tradespeople by providing 24/7 customer intake and zero-friction scheduling, complete with a built-in "Critic Agent" to ensure outbound communication safety and brand protection.

3. ⚖️ eor_copilot

A LangGraph-based multi-agent system designed to navigate HR and employment law compliance within highly regulated environments.

Architecture: Features a Supervisor → Retriever → Generator → Critic workflow. Utilizes local HuggingFace embeddings and Qdrant vector search to ensure proprietary legal frameworks remain within a secure VPC (Zero-Retention Local Embeddings).

Business Impact: Demonstrates how to safely deploy AI in regulated sectors. Reduces complex analytics deployment time by 40% and includes built-in missing-variable detection, high-risk matter escalation (for example, discrimination claims), and confidence scoring to maintain auditability.
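
The Supervisor → Retriever → Generator → Critic flow can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's LangGraph code: the required facts, the high-risk terms, the keyword-overlap retrieval (standing in for Qdrant vector search), and the confidence formula are all hypothetical placeholders.

```python
from dataclasses import dataclass, field

# Illustrative values only; the project's real rules are not shown here.
REQUIRED_FACTS = ("country", "employment_type")
HIGH_RISK_TERMS = ("discrimination", "harassment")


@dataclass
class CaseState:
    question: str
    facts: dict
    documents: list = field(default_factory=list)
    answer: str = ""
    confidence: float = 0.0
    status: str = "pending"


def supervisor(state):
    """Stop early when facts are missing or the matter is high risk."""
    missing = [name for name in REQUIRED_FACTS if name not in state.facts]
    if missing:
        state.status = "needs_info"
        state.answer = "Missing: " + ", ".join(missing)
    elif any(term in state.question.lower() for term in HIGH_RISK_TERMS):
        state.status = "escalated"
    return state


def retriever(state, corpus):
    """Keyword overlap stands in for vector search."""
    words = set(state.question.lower().split())
    state.documents = [d for d in corpus if words & set(d.lower().split())]
    return state


def generator(state):
    """Placeholder for the LLM call: just join the retrieved passages."""
    state.answer = " | ".join(state.documents)
    return state


def critic(state):
    """Toy confidence score: more supporting passages, more confidence."""
    state.confidence = min(1.0, len(state.documents) / 3)
    state.status = "answered" if state.confidence >= 0.5 else "escalated"
    return state


def run(question, facts, corpus):
    state = supervisor(CaseState(question, facts))
    if state.status != "pending":
        return state  # missing facts or high-risk matter: skip generation
    return critic(generator(retriever(state, corpus)))
```

The point of the design is that the supervisor and critic both have a way to stop or escalate, so a low-confidence or high-risk answer is never returned silently.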

4. rag-video-chatbot

An end-to-end Retrieval-Augmented Generation (RAG) system that transforms unstructured multimedia into instantly queryable knowledge bases.

Architecture: Orchestrates a complete ingestion pipeline utilizing OpenAI Whisper for transcription and custom pause-aware chunking to respect semantic boundaries. Embeddings are generated via Sentence-Transformers/Ollama, stored in Qdrant, and queried via a FastAPI backend utilizing reciprocal rank fusion (RRF).
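
For readers unfamiliar with reciprocal rank fusion, the sketch below shows the standard formula: each document scores the sum of `1 / (k + rank)` over every ranked list it appears in. The constant `k = 60` is the commonly used default, and the function name and tie-breaking rule are assumptions for illustration, not taken from the repository.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant that damps the influence of top ranks.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["a", "b", "c"]   # e.g. results of a vector search
keyword_hits = ["b", "d", "a"]  # e.g. results of a keyword search
fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
# fused == ["b", "a", "d", "c"]
```

Because RRF only looks at ranks, it can combine retrievers whose raw scores are on incompatible scales.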

Business Impact: Unlocks actionable insights from dark data (video transcripts, PDFs). Demonstrates advanced expertise in multi-modal vector search orchestration and building privacy-first generative AI applications capable of parsing unstructured media.
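
The pause-aware chunking mentioned in the architecture can be approximated as below. The repository's actual rule is not described here, so this is one plausible reading: transcript segments (dicts with `start`, `end`, and `text`, the shape Whisper produces) are merged until a silence of at least `pause_threshold` seconds or a size limit is hit. Both thresholds and the function names are hypothetical.

```python
def _merge(segments):
    """Collapse consecutive segments into one chunk with its time span."""
    return {
        "start": segments[0]["start"],
        "end": segments[-1]["end"],
        "text": " ".join(s["text"].strip() for s in segments),
    }


def chunk_by_pauses(segments, pause_threshold=1.0, max_chars=500):
    """Group segments, starting a new chunk at long pauses or size limits."""
    chunks, current = [], []
    for seg in segments:
        if current:
            gap = seg["start"] - current[-1]["end"]
            size = sum(len(s["text"]) for s in current)
            if gap >= pause_threshold or size + len(seg["text"]) > max_chars:
                chunks.append(_merge(current))
                current = []
        current.append(seg)
    if current:
        chunks.append(_merge(current))
    return chunks


segments = [
    {"start": 0.0, "end": 2.0, "text": " Hello there."},
    {"start": 2.1, "end": 4.0, "text": " How are you?"},
    {"start": 6.0, "end": 8.0, "text": " New topic."},
]
chunks = chunk_by_pauses(segments)
```

In this example the two-second silence before the third segment starts a new chunk, so the first two sentences stay together.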

5. delphi-to-rust-ai-orchestrator

An opinionated AI orchestration tool designed to automate the migration of legacy Delphi logic into idiomatic Rust code.

Architecture: A Python-based pipeline that leverages Anthropic's Claude to handle complex syntax translation while automatically generating Golden Master characterization tests to ensure logic preservation.

Business Impact: Provides a highly structured, scalable framework for modernizing deeply entrenched legacy systems, drastically reducing the manual engineering overhead associated with systemic technical debt.
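
The Golden Master idea can be shown in a few lines: record what the legacy code returns for a set of inputs, then require the migrated code to reproduce that snapshot exactly. The sketch below uses two Python stand-in functions in place of the Delphi original and the Rust translation; the function names, the discount rule, and the JSON snapshot format are all hypothetical.

```python
import json


def legacy_discount(total_cents, is_member):
    """Stand-in for behaviour observed in the legacy system."""
    if is_member and total_cents >= 10000:
        return total_cents * 90 // 100
    return total_cents


def migrated_discount(total_cents, is_member):
    """Stand-in for the translated implementation under test."""
    if is_member and total_cents >= 10000:
        return (total_cents * 9) // 10
    return total_cents


def record_golden_master(func, cases):
    """Capture the current outputs of func as a JSON snapshot."""
    return json.dumps(
        [{"args": list(args), "expected": func(*args)} for args in cases]
    )


def verify_against_golden_master(func, snapshot):
    """Return (args, expected, actual) for every case func gets wrong."""
    mismatches = []
    for case in json.loads(snapshot):
        actual = func(*case["args"])
        if actual != case["expected"]:
            mismatches.append((case["args"], case["expected"], actual))
    return mismatches


cases = [(9999, True), (10000, True), (10001, True), (10001, False)]
snapshot = record_golden_master(legacy_discount, cases)
mismatches = verify_against_golden_master(migrated_discount, snapshot)
```

Here `mismatches` is empty, so the migration preserves behaviour on the recorded cases. A translation that rounded differently (for example `total_cents - total_cents // 10`) would be flagged on the `10001` case.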

"The future of enterprise technology is not just about building smarter algorithms; it is about engineering systems that are secure, governed, and inextricably linked to business outcomes."

📫 Open to discussing data strategy, Agentic AI orchestration, and executive technical leadership. Let's connect.

Pinned

  1. rag-video-chatbot (Public)

    RAG chatbot for querying video transcripts and PDFs using embeddings, Qdrant, and FastAPI.

    Python

  2. data-contracts-and-governance (Public)

    A complete data governance platform for managing data contracts, running tests, validating SLOs, and tracking data quality, all with dbt and Databricks.

    Python

  3. delphi-to-rust-ai-orchestrator (Public)

    An AI orchestrator that migrates legacy Delphi logic and code into idiomatic Rust using an LLM (Anthropic Claude) and generates Golden Master tests.

    Python

  4. eor_copilot (Public)

    A LangGraph-based multi-agent system for HR/employment-law compliance using vector search and safety guardrails.

    Python

  5. dengue-predict (Public)

    Forked from raja-taparia-dsr-mc1/dengue-predict

    Jupyter Notebook

  6. telekom_customer_churn_predictor (Public)

    Telekom customer churn predictor.

    Python