Served as lead engineer building a Bank Statement Processing Engine from scratch to extract financial data and insights.
Outlined schemas detailing extraction features and used few-shot prompting to categorize transactions into buckets.
Deployed a multi-model AI stack comprising Gemini 2.5 Pro/Flash and GPT-4, with intelligent fallback mechanisms.
Implemented a frontend analytics dashboard with Supabase SSO Login and Firebase integration, visualizing cash flow trends, balances, and categorized income/expenses.
Skills: Generative AI, Large Language Models, Prompt Engineering, ReactJS, Supabase, Firebase
Manager: Mr. Rajneesh Tiwary - Founder and CTO
Graduate Research Assistant
Organization: University of Wisconsin-Madison
Location: Madison, WI
Duration: September 2024 - May 2025 (9 Mos)
Designed a scalable ingestion and processing system for 80k+ medical documents, optimizing chunking, embedding, and retrieval efficiency.
Developed an agentic RAG system integrating retrieval, reasoning, and multi-agent orchestration to provide contextualized diagnostic insights for radiology.
Generated document embeddings and stored them in Chroma DB, serving locally hosted LLMs via Ollama for low-latency experimentation.
Engineered an Orchestrator LLM to route queries across three specialized nodes (Retriever-only, RAG, and general-purpose LLM) for adaptive, clinician-friendly handling.
Established an evaluation framework using NDCG@5, achieving a score of 0.857 on 1K+ radiology reports.
Skills: Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), LangGraph, Multi-Agent Systems
Advisor: Dr. Ran Zhang - Assistant Professor, Department of Radiology
Autonomous System Research Intern
Organization: Nokia Bell Labs
Location: Murray Hill, NJ
Duration: June 2024 - August 2024 (2 Mos)
Conceptualized a system to explore the use of autonomous agents for problem-solving, applying them to a controlled environment to evaluate adaptive learning and collaborative reasoning.
Created a multi-agent framework where each agent performed specialized actions through tool-calling interfaces.
The agents were coordinated by a central Orchestrator LLM that planned and optimized multi-step task execution.
Implemented the multi-agent system using LangGraph and Chain-of-Thought prompting to optimize reasoning and decision quality across agents.
Deployed the system via Streamlit for interactive visualization and analysis of multi-agent behaviors, enabling real-time testing and iterative refinement.
Skills: Generative AI, Large Language Models (LLMs), Multi-Agent Systems, LangGraph, Prompt Engineering, Streamlit, Git
Manager: Dr. Thomas Woo, Research Group Leader - Autonomous Systems Research Department
Master's Research
Organization: University of Wisconsin-Madison
Location: Madison, WI
Duration: September 2023 - May 2024 (9 Mos)
Benchmarked and profiled multiple Large Language Models (LLMs) under diverse configurations and hardware settings to analyze performance, latency, and efficiency trade-offs.
Investigated LLM compression and quantization techniques to optimize deployment on edge and mobile platforms without compromising accuracy.
Assessed model safety and quality trade-offs, measuring accuracy, hallucination frequency, and toxic output in compressed LLMs.
Co-authored PalmBench, a paper on evaluating compressed LLMs for mobile use cases, accepted at ICLR 2025.
Skills: Large Language Models (LLMs), Quantization, NumPy, Git
Advisor: Dr. Suman Banerjee, David J. DeWitt Professor, Department of Computer Science
Associate Engineer - AI/ML
Organization: Qualcomm
Location: Hyderabad, India
Duration: July 2022 - July 2023 (1 Yr)
Introduced evaluation frameworks and metrics for ML models to enhance efficiency and accuracy by 10.26%.
Benchmarked quantized and pruned models on Snapdragon SoCs using the Snapdragon Neural Processing Engine (SNPE) to assess latency, throughput, and power consumption under production workloads.
Collaborated with hardware and firmware teams to fine-tune AI workloads for mobile deployments, ensuring consistency across diverse device configurations.
Engineered an OCR + NLP based document processing pipeline to extract structured data from unstructured business documents for workflow automation.
Integrated and fine-tuned text extraction and entity recognition models, improving accuracy across diverse layouts.
Implemented a Human-in-the-Loop (HITL) feedback module allowing users to correct misclassifications, enabling automated model retraining and continuous accuracy improvement.
Reduced document processing time by 37% and achieved 85% accuracy.
Worked with product and backend teams to deploy the solution in production, improving scalability and efficiency.
Manager: Mr. Anand Chandrasekaran, Founder and CTO
Undergraduate Research Assistant
Organization: Bright Academy (Previously Solarillion Foundation)
Location: Chennai, India
Duration: February 2020 - June 2022 (2 Yrs 5 Mos)
Led the NLP team that developed a sign language translation system converting German weather forecast videos signed in German Sign Language into coherent German sentences.
Implemented a custom Multi-Context Transformer model with three parallel Transformer encoders trained on video inputs preprocessed into 8-, 12-, and 16-frame sequences to capture temporal context.
Fused the resulting representations into a unified feature vector, which was used by the decoder to generate accurate German sentences.
Reduced model parameters by 30.88% while maintaining 98.19% ROUGE-L and 86.65% BLEU-4, achieving state-of-the-art performance with lower computational cost.
Advisor: Mr. Vineeth Vijayaraghavan, Director - Research and Outreach
Student Researcher
Organization: Sri Sivasubramaniya Nadar College Of Engineering
Location: Chennai, India
Duration: December 2020 - April 2022 (1 Yr 5 Mos)
Proposed a novel Transformer-based architecture for fake news detection that considers both the title and content of a news article to assess its integrity.
Achieved 74.0% accuracy on a subset of the NELA-GT-2020 dataset. To our knowledge, FakeNews Transformer is the first published work to consider both title and content when evaluating a news article.
Proposed a robust and cost-effective automatic speech recognition model for the Tamil language, leveraging Baidu's Deep Speech architecture.
Benchmarked the model against Google's speech-to-text API, which it outperformed by 20%.