Available for opportunities

Shanmukha Sai Dheeraz Chavali

MSc Data Science @ ETH Zürich & EPFL · Building intelligent systems at the intersection of ML, quantitative finance, and scalable data engineering.

About Me

My Journey & Impact

I am a Data Science MSc student jointly at ETH Zürich and EPFL, specializing in machine learning, NLP, and large-scale data systems. My work ranges from building enterprise-grade RAG pipelines and LLM evaluation frameworks to designing predictive models for financial risk and quantitative research.

With 3+ years of industry experience at Accenture, Philip Morris International, and the University of Basel, I have shipped production ML systems on AWS, built graph-based knowledge retrieval engines, and led cross-functional analytics projects that drove measurable business impact.

I am passionate about the convergence of AI and quantitative finance — designing systems that are not only technically rigorous but also deliver real-world value at scale.

Currently: ETH Zürich · EPFL · Philip Morris International
3+ Years Experience
5+ Projects Shipped
2 Research Papers
Based in Zürich
Experience

Work Experience & Internships

A track record of building production ML systems and driving measurable impact

Mar 2026 – Present · Vaud, Switzerland

Data Science & AI Engineer Intern

Philip Morris International
  • Architecting enterprise-scale RAG pipeline on AWS Bedrock with semantic chunking and custom re-ranking, reducing hallucination rates by ~40%
  • Building A/B test harnesses via SageMaker MLflow to compare 15+ prompt-model combinations on correctness, faithfulness, and latency
AWS Bedrock · RAG · SageMaker · MLflow · LLM
Feb 2025 – Jan 2026 · Basel, Switzerland

Research & Teaching Assistant

University of Basel
  • Spearheaded interdisciplinary AI-humanities research integrating LLMs into historical scholarship
  • Built graph-based RAG pipeline (Neo4j + LangChain) improving context retrieval accuracy by 40%
Neo4j · LangChain · NLP · LLM · Research
Nov 2024 – Feb 2025 · Basel, Switzerland

Research Assistant

University of Basel (SNSF Project)
  • Developed scalable NLP pipelines (HuggingFace, spaCy) for sentiment analysis and topic modeling
  • Designed multimodal ML models (TensorFlow, PyTorch) for cross-platform trade analysis
HuggingFace · spaCy · TensorFlow · PyTorch · NLP
Feb 2022 – Aug 2024 · Bengaluru, India

Advanced Application Engineering Sr. Analyst

Accenture
  • Built predictive analytics pipeline on AWS (S3, Glue, Redshift, SageMaker) generating $60K/quarter savings
  • Designed Amazon Connect + Salesforce integration saving $40K/quarter in licensing costs
  • Automated patient voice query documentation system saving $30K/quarter
AWS · SageMaker · Salesforce · Redshift · Python
May 2021 – Dec 2021 · Chennai, India

Junior Data Engineer

Indium Software
  • Designed hybrid vector-graph search system with 85% precision improvement
  • Reduced storage costs by $25K/month via S3 intelligent tiering
AWS S3 · Data Engineering · ETL · Python
Projects

A Collection of My Work

From enterprise ML pipelines to cutting-edge research — systems built for real-world impact

Production-grade

Financial Risk Analytics Automation Tool

End-to-end Python tool automating multi-source time-series ingestion, ensemble risk modelling, and LLM-powered memo generation into reproducible Plotly dashboards.

Python · AWS · XGBoost · LLM · Plotly · SageMaker
Live at PMI

Enterprise RAG Pipeline (AWS Bedrock)

Semantic chunking + custom re-ranking RAG system on AWS Bedrock Knowledge Bases, reducing hallucination by ~40% with contextual guardrails.

Python · AWS Bedrock · SageMaker · MLflow · RAG · LLM

Predictive Maintenance System

Ensemble model (XGBoost + Random Forest + TFT) on multivariate sensor time-series, reducing false failure alerts by 35% with a fully reproducible pipeline.

Python · TensorFlow · PyTorch · XGBoost · LSTM · TFT · AWS

Deepfake Detection Pipeline

Hybrid CNN-LSTM (ResNeXt-101 + BiLSTM) pipeline achieving 89% accuracy on FaceForensics++, robust to StyleGAN3 and diffusion-based attacks.

Python · PyTorch · ResNeXt · LSTM · OpenCV · Computer Vision
Research

Graph-RAG Historical NLP Pipeline

Neo4j + LangChain graph-based RAG enabling multi-hop reasoning across ancient scholarly texts, improving retrieval accuracy by 40% over keyword search.

Python · Neo4j · LangChain · NLP · LLM · Graph DB

Semantic Segmentation on Satellite Imagery

U-Net + EfficientNet-B7 achieving 92% mIoU on LandCover.ai; 40% training acceleration via FP16 mixed precision. Full W&B experiment tracking.

Python · PyTorch · U-Net · EfficientNet · W&B · Computer Vision
Skills & Expertise

Technical Competencies

Technologies and frameworks I work with to build intelligent systems

Programming

Python · R · Java · C++ · SQL · TypeScript

ML / AI

PyTorch · TensorFlow · Scikit-learn · XGBoost · HuggingFace · LangChain · RAG · spaCy · LSTM · Transformers

Cloud & Data

AWS (Bedrock, SageMaker, S3, Glue, Redshift) · Azure · Spark · Kafka · Airflow · Databricks · Neo4j

Quant / Analytics

Time-Series Analysis · Statistical Inference · Bayesian Methods · Convex Optimization · Plotly · Tableau · Power BI
Education

Academic Background

A rigorous foundation in data science, machine learning, and quantitative methods

Key Courses

Large Language Models · Scientific Computing · Data Modeling & Databases
Theoretical Knowledge

Academic Coursework

A rigorous curriculum spanning mathematical foundations, machine learning, and applied AI systems

Contact

Get in Touch

Open to internship opportunities in Quant Research, ML Engineering, and Data Science