Shanmukha Sai Dheeraz Chavali
MSc Data Science @ ETH Zürich & EPFL · Building intelligent systems at the intersection of ML, quantitative finance, and scalable data engineering.
My Journey & Impact
I am a Data Science MSc student jointly at ETH Zürich and EPFL, specializing in machine learning, NLP, and large-scale data systems. My work ranges from building enterprise-grade RAG pipelines and LLM evaluation frameworks to designing predictive models for financial risk and quantitative research.
With 3+ years of experience across industry and academia at Accenture, Philip Morris International, and the University of Basel, I have shipped production ML systems on AWS, built graph-based knowledge retrieval engines, and led cross-functional analytics projects that drove measurable business impact.
I am passionate about the convergence of AI and quantitative finance — designing systems that are not only technically rigorous but also deliver real-world value at scale.
Work Experience & Internships
A track record of building production ML systems and driving measurable impact
Data Science & AI Engineer Intern
- Architecting enterprise-scale RAG pipeline on AWS Bedrock with semantic chunking and custom re-ranking, reducing hallucination rates by ~40%
- Building A/B test harnesses via SageMaker MLflow across 15+ prompt-model combinations on correctness, faithfulness, and latency
Research & Teaching Assistant
- Spearheading interdisciplinary AI-humanities research integrating LLMs into historical scholarship
- Built graph-based RAG pipeline (Neo4j + LangChain) improving context retrieval accuracy by 40%
Research Assistant
- Developed scalable NLP pipelines (HuggingFace, spaCy) for sentiment analysis and topic modeling
- Designed multimodal ML models (TensorFlow, PyTorch) for cross-platform trade analysis
Advanced Application Engineering Sr. Analyst
- Built predictive analytics pipeline on AWS (S3, Glue, Redshift, SageMaker) generating $60K/quarter savings
- Designed Amazon Connect + Salesforce integration saving $40K/quarter in licensing costs
- Automated patient voice query documentation system saving $30K/quarter
Junior Data Engineer
- Designed hybrid vector-graph search system, improving precision by 85%
- Reduced storage costs by $25K/month via S3 intelligent tiering
A Collection of My Work
From enterprise ML pipelines to cutting-edge research — systems built for real-world impact
Financial Risk Analytics Automation Tool
End-to-end Python tool automating multi-source time-series ingestion, ensemble risk modelling, and LLM-powered memo generation into reproducible Plotly dashboards.
Enterprise RAG Pipeline (AWS Bedrock)
Semantic chunking + custom re-ranking RAG system on AWS Bedrock Knowledge Bases, reducing hallucination by ~40% with contextual guardrails.
Predictive Maintenance System
Ensemble model (XGBoost + Random Forest + TFT) on multivariate sensor time-series, reducing false failure alerts by 35% with a fully reproducible pipeline.
Deepfake Detection Pipeline
Hybrid CNN-LSTM (ResNeXt-101 + BiLSTM) pipeline achieving 89% accuracy on FaceForensics++, robust to StyleGAN3 and diffusion-based attacks.
Graph-RAG Historical NLP Pipeline
Neo4j + LangChain graph-based RAG enabling multi-hop reasoning across ancient scholarly texts, improving retrieval accuracy by 40% over keyword search.
Semantic Segmentation on Satellite Imagery
U-Net + EfficientNet-B7 achieving 92% mIoU on LandCover.ai; 40% training acceleration via FP16 mixed precision. Full W&B experiment tracking.
Technical Competencies
Technologies and frameworks I work with to build intelligent systems
Programming
ML / AI
Cloud & Data
Quant / Analytics
Academic Background
A rigorous foundation in data science, machine learning, and quantitative methods
Key Courses
Academic Coursework
A rigorous curriculum spanning mathematical foundations, machine learning, and applied AI systems
Get in Touch
Open to internship opportunities in Quant Research, ML Engineering, and Data Science