Document Theme Identifier
LLM-powered research chatbot that ingests 75+ docs, runs OCR + embeddings, and answers queries with citation-backed summaries.
I'm Anish Dahiya — a data scientist building practical AI products with high-velocity teams and documenting the process for builders. I help organizations move from prototype to production with pragmatic model design, robust data pipelines, and repeatable shipping patterns.
From experiment to production, I help teams launch data products that improve revenue, retention, and trust.
Predictive pricing engine that cleanses diamond attributes, trains ensemble regressors, and exposes results through a polished Flask web app.
Biomedical voice analytics pipeline that standardizes acoustic biomarkers and trains interpretable classifiers for early Parkinson's screening.
Deep learning system that recognizes keystrokes from keyboard audio via MFCC features and a custom 1D-CNN, shipped with a Streamlit UI for real-time inference.
End-to-end ML workflow that validates 590-sensor wafer batches, clusters signals, and selects the best Random Forest/XGBoost model per cluster.
Every chapter blends strategy, storytelling, and system design.
Joined a fast-moving ML team to ship production-ready models, tighten evaluation loops, and translate research spikes into real user impact.
Split time between research internships and my final-year project, hardening MLOps pipelines and documenting lessons for the next cohort.
Built a congestion prediction model for the Delhi Integrated Multi-Modal Transit System to forecast bus load and improve scheduling.
Solved curriculum-aligned problems and authored explanations for CS learners while pursuing my undergraduate degree.
Specialized in Artificial Intelligence & Machine Learning with hands-on projects across CV, NLP, and forecasting.