Working as the Founding AI Engineer, building intelligent agentic systems that enhance the platform's AI capabilities.
Architected a Medication Agent to provide reliable medication insights using LLM reasoning and fallback handling.
Built a centralized AI caching microservice that reduced duplicate LLM API costs by 40% and latency by 25%.
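A minimal sketch of the prompt-level caching idea behind such a service, using only the standard library; the key derivation and TTL policy here are illustrative assumptions, not the production design.

```python
import hashlib
import json
import time
from typing import Callable

class LLMResponseCache:
    """In-memory prompt-level cache: identical (model, prompt, params) requests
    reuse a stored completion instead of triggering a new LLM API call."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    def _key(self, model: str, prompt: str, params: dict) -> str:
        # Canonical JSON so logically identical requests hash to the same key.
        payload = json.dumps({"model": model, "prompt": prompt, "params": params},
                             sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str, params: dict,
                    call_llm: Callable[[str, str, dict], str]) -> str:
        key = self._key(model, prompt, params)
        hit = self._store.get(key)
        if hit is not None:
            stored_at, response = hit
            if time.time() - stored_at < self.ttl:
                return response          # cache hit: no API cost, no provider latency
            del self._store[key]         # expired entry
        response = call_llm(model, prompt, params)  # cache miss: one real API call
        self._store[key] = (time.time(), response)
        return response
```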
Skills: Generative AI, Large Language Models, Multi-Agent Systems, TypeScript
Manager: Mrs. Rukmini Banerjee - Founder
AI / ML Engineer
Organization: BrainWaves Digital
Location: Remote, USA
Duration: June 2025 - September 2025
Led the development of a Finance Agent from 0 to 1, extracting data and insights from bank statements.
Engineered a multi-model AI pipeline (Gemini 2.5 + GPT-4) with intelligent fallback logic (sketched below), improving extraction accuracy by 35% and reducing manual review time by 44%.
Created a dashboard with ReactJS, Supabase SSO, and Firebase to visualize cash flow, balances, and transactions.
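A simplified sketch of the fallback flow referenced above; call_gemini, call_gpt4, and the validation check are hypothetical placeholders for the real provider clients, shown only to illustrate the control flow.

```python
import json
from typing import Callable

def call_gemini(prompt: str) -> str:
    """Placeholder for the primary model client (e.g. Gemini 2.5)."""
    raise NotImplementedError

def call_gpt4(prompt: str) -> str:
    """Placeholder for the fallback model client (e.g. GPT-4)."""
    raise NotImplementedError

def looks_valid(raw: str) -> bool:
    """Toy validation: the extraction must be parseable JSON with a transactions list."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data.get("transactions"), list)

def extract_statement(prompt: str,
                      models: tuple[Callable[[str], str], ...] = (call_gemini, call_gpt4)) -> dict:
    """Try each model in priority order; fall back when a call fails or
    returns output that does not pass validation."""
    errors = []
    for call in models:
        try:
            raw = call(prompt)
        except Exception as exc:          # provider/network failure -> try next model
            errors.append(f"{call.__name__}: {exc}")
            continue
        if looks_valid(raw):
            return json.loads(raw)
        errors.append(f"{call.__name__}: failed validation")
    raise RuntimeError("All models failed: " + "; ".join(errors))
```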
Skills: Generative AI, Large Language Models, Multi-Agent Systems, Prompt Engineering, ReactJS
Manager: Mr. Rajneesh Tiwary - Founder and CTO
Graduate Research Assistant
Organization: University of Wisconsin-Madison
Location: Madison, WI
Duration: September 2024 - May 2025
Designed a scalable ingestion and processing system for 80k+ medical documents, optimizing chunking, embedding, and retrieval efficiency.
Developed an Agentic RAG system integrating retrieval, reasoning, and multi-agent orchestration to provide contextualized diagnostic insights for radiology.
Generated document embeddings and stored them in a Vector Database, serving locally hosted LLMs via Ollama for low-latency experimentation.
Engineered an Orchestrator LLM to route queries across three specialized nodes (Retriever-only, RAG, and general-purpose LLM) for adaptive, clinician-friendly handling.
Established an evaluation framework using NDCG@5, achieving a score of 0.857 on 1K+ radiology reports.
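For reference, NDCG@5 as used in that evaluation can be computed as below; the graded relevance judgments in the example are illustrative.

```python
import math

def dcg_at_k(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the top-k retrieved results."""
    return sum(rel / math.log2(rank + 2)           # rank is 0-based, hence +2
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances: list[float], k: int = 5) -> float:
    """NDCG@k: DCG of the system's ranking, normalized by the DCG of the
    ideal (descending-relevance) ordering."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Example: graded relevance of the top 5 retrieved chunks for one query, in ranked order.
print(round(ndcg_at_k([3, 2, 3, 0, 1], k=5), 3))
```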
Skills: Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Multi-Agent Systems, LangGraph, Vector DB
Advisor: Dr. Ran Zhang - Assistant Professor, Department of Radiology
Autonomous System Research Intern
Organization: Nokia Bell Labs
Location: Murray Hill, NJ
Duration: June 2024 - August 2024
Conceptualized a system exploring autonomous agents for problem-solving, applying them in a controlled environment to evaluate adaptive learning and collaborative reasoning.
Created a multi-agent framework in which each agent performed specialized actions through tool-calling interfaces.
Coordinated the agents through a central Orchestrator LLM that planned and optimized multi-step task execution (orchestration pattern sketched below).
Implemented the multi-agent system using LangGraph and Chain-of-Thought prompting to optimize reasoning and decision quality across agents.
Deployed the system via Streamlit for interactive visualization and analysis of multi-agent behaviors, enabling real-time testing and iterative refinement.
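A compressed, framework-free sketch of the orchestration pattern described above (the actual system used LangGraph); the agents, tools, and fixed plan are illustrative stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """A specialized agent exposing its capability through a tool-calling interface."""
    name: str
    tool: Callable[[str], str]

def search_tool(task: str) -> str:
    return f"[search results for: {task}]"        # stand-in for a real tool call

def calculator_tool(task: str) -> str:
    return f"[computed answer for: {task}]"       # stand-in for a real tool call

@dataclass
class Orchestrator:
    """Central planner: decomposes a goal into steps and routes each step to the
    agent whose tool fits, accumulating intermediate results as shared context."""
    agents: dict[str, Agent] = field(default_factory=dict)

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def plan(self, goal: str) -> list[tuple[str, str]]:
        # Illustrative fixed plan; the real system asked an LLM to produce this.
        return [("searcher", f"gather background on: {goal}"),
                ("calculator", f"quantify findings for: {goal}")]

    def run(self, goal: str) -> list[str]:
        context: list[str] = []
        for agent_name, step in self.plan(goal):
            result = self.agents[agent_name].tool(step)
            context.append(f"{agent_name} -> {result}")
        return context

orchestrator = Orchestrator()
orchestrator.register(Agent("searcher", search_tool))
orchestrator.register(Agent("calculator", calculator_tool))
print("\n".join(orchestrator.run("evaluate network anomaly reports")))
```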
Skills: Generative AI, Large Language Models (LLMs), Multi-Agent Systems, LangGraph, Prompt Engineering, Streamlit, Git
Manager: Dr. Thomas Woo, Research Group Leader - Autonomous Systems Research Department
Master's Research
Organization: University of Wisconsin-Madison
Location: Madison, WI
Duration: September 2023 - May 2024
Benchmarked and profiled multiple Large Language Models (LLMs) under diverse configurations and hardware settings to analyze performance, latency, and efficiency trade-offs (see the benchmarking sketch below).
Investigated LLM compression and quantization techniques to optimize deployment on edge and mobile platforms without compromising accuracy.
Assessed model safety and quality trade-offs, measuring accuracy, hallucination frequency, and toxic output in compressed LLMs.
Co-authored PalmBench, presented at ICLR 2025, on compressed LLM evaluation for mobile use cases.
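A small benchmarking harness in the spirit of that profiling work; generate() is a placeholder for any model inference call, and the loop is a sketch of the measurement approach, not the actual framework used.

```python
import statistics
import time
from typing import Callable

def benchmark(generate: Callable[[str], str], prompts: list[str],
              warmup: int = 2) -> dict[str, float]:
    """Time repeated inference calls and summarize latency / throughput."""
    for prompt in prompts[:warmup]:          # warm up caches / lazy initialization
        generate(prompt)

    latencies, total_chars = [], 0
    for prompt in prompts:
        start = time.perf_counter()
        output = generate(prompt)
        latencies.append(time.perf_counter() - start)
        total_chars += len(output)

    latencies.sort()
    p95_index = max(0, int(0.95 * len(latencies)) - 1)
    return {
        "mean_latency_s": statistics.mean(latencies),
        "p95_latency_s": latencies[p95_index],
        # Character throughput as a rough proxy; token counts need the model's tokenizer.
        "chars_per_second": total_chars / sum(latencies),
    }

# Example with a dummy "model" so the harness runs standalone.
if __name__ == "__main__":
    dummy = lambda prompt: prompt[::-1]
    print(benchmark(dummy, ["hello world"] * 20))
```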
Skills: Large Language Models (LLMs), Quantization, NumPy, Git
Advisor: Dr. Suman Banerjee, David J. DeWitt Professor, Department of Computer Science
Associate Engineer - AI/ML
Organization: Qualcomm
Location: Hyderabad, India
Duration: July 2022 - July 2023
Introduced evaluation frameworks and metrics for ML models, improving efficiency and accuracy by 10.26%.
Benchmarked quantized and pruned models on Snapdragon SoCs, assessing latency, throughput, and power consumption under production workloads on the Snapdragon Neural Processing Engine (SNPE).
Collaborated with hardware and firmware teams to optimize AI workloads for mobile deployments.
Engineered an OCR + NLP document processing pipeline (sketched below) to extract structured data from unstructured business documents for workflow automation.
Integrated and fine-tuned text extraction and entity recognition models, improving accuracy across diverse layouts.
Implemented a Human-in-the-Loop (HITL) feedback module allowing users to correct misclassifications, enabling automated model retraining and continuous accuracy improvement.
Reduced document processing time by 37% and achieved an accuracy of 85%.
Worked with product and backend teams to deploy the solution in production, improving scalability and efficiency.
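A skeletal sketch of the document pipeline and human-in-the-loop feedback described above; run_ocr, extract_entities, and request_human_review are hypothetical placeholders for the OCR engine, the fine-tuned entity-recognition models, and the review UI.

```python
from dataclasses import dataclass, field

def run_ocr(document_path: str) -> str:
    """Placeholder for the OCR step that turns a scanned document into raw text."""
    raise NotImplementedError

def extract_entities(text: str) -> dict[str, str]:
    """Placeholder for the entity-recognition step (vendor, date, amount, ...)."""
    raise NotImplementedError

def request_human_review(fields: dict[str, str]) -> dict[str, str]:
    """Placeholder for the HITL interface where a user confirms or fixes fields."""
    return fields

@dataclass
class FeedbackStore:
    """Collects human corrections so they can be replayed as training data
    the next time the extraction models are retrained."""
    corrections: list[tuple[str, dict[str, str]]] = field(default_factory=list)

    def record(self, text: str, corrected_fields: dict[str, str]) -> None:
        self.corrections.append((text, corrected_fields))

def process_document(document_path: str, feedback: FeedbackStore,
                     review: bool = False) -> dict[str, str]:
    """OCR -> entity extraction -> optional human review.
    Reviewed corrections are stored for automated retraining."""
    text = run_ocr(document_path)
    fields = extract_entities(text)
    if review:
        corrected = request_human_review(fields)
        if corrected != fields:
            feedback.record(text, corrected)   # queue correction for retraining
        return corrected
    return fields
```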
Manager: Mr. Anand Chandrasekaran, Founder and CTO
Undergraduate Research Assistant
Organization: Bright Academy (Previously Solarillion Foundation)
Location: Chennai, India
Duration: February 2020 - June 2022
Led the NLP team that developed a sign language translation system converting weather forecast videos in German Sign Language into coherent German sentences.
Implemented a custom Multi Context Transformer model with three parallel Transformer encoders trained on video inputs preprocessed into 8-, 12-, and 16-frame sequences to capture temporal context (fusion sketched below).
Fused the resulting representations into a unified feature vector, which was used by the decoder to generate accurate German sentences.
Reduced model parameters by 30.88% while maintaining 98.19% ROUGE-L and 86.65% BLEU-4, achieving state-of-the-art performance with lower computational cost.
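A condensed PyTorch sketch of the multi-context fusion idea: three parallel Transformer encoders over 8-, 12-, and 16-frame feature sequences, whose pooled outputs are fused into a single vector for the decoder. Dimensions and layer counts are illustrative, not the published configuration.

```python
import torch
import torch.nn as nn

class MultiContextEncoder(nn.Module):
    """Three parallel Transformer encoders, one per temporal context length,
    fused into a single feature vector consumed by the sentence decoder."""

    def __init__(self, feature_dim: int = 256, nhead: int = 4, num_layers: int = 2):
        super().__init__()
        def make_encoder():
            layer = nn.TransformerEncoderLayer(d_model=feature_dim, nhead=nhead,
                                               batch_first=True)
            return nn.TransformerEncoder(layer, num_layers=num_layers)

        self.enc_8, self.enc_12, self.enc_16 = make_encoder(), make_encoder(), make_encoder()
        self.fuse = nn.Linear(3 * feature_dim, feature_dim)   # unify the three contexts

    def forward(self, frames_8, frames_12, frames_16):
        # Each input: (batch, num_frames, feature_dim) visual features per context length.
        pooled = [enc(x).mean(dim=1)                          # mean-pool over time
                  for enc, x in ((self.enc_8, frames_8),
                                 (self.enc_12, frames_12),
                                 (self.enc_16, frames_16))]
        return self.fuse(torch.cat(pooled, dim=-1))           # fused context vector

# Example forward pass with random frame features.
model = MultiContextEncoder()
fused = model(torch.randn(2, 8, 256), torch.randn(2, 12, 256), torch.randn(2, 16, 256))
print(fused.shape)   # torch.Size([2, 256])
```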
Advisor: Mr. Vineeth Vijayaraghavan, Director - Research and Outreach
Student Researcher
Organization: Sri Sivasubramaniya Nadar College of Engineering
Location: Chennai, India
Duration: December 2020 - April 2022
Proposed FakeNews Transformer, a novel Transformer-based architecture for fake news detection that considers both the title and the content of a news article to determine its integrity (sketched below).
Achieved 74.0% accuracy on a subset of the NELA-GT 2020 dataset; to our knowledge, FakeNews Transformer is the first published work to consider both title and content when evaluating a news article.
Proposed a robust and cost-effective automatic speech recognition model for the Tamil language leveraging Baidu's Deep Speech architecture, outperforming Google's speech-to-text API by 20%.
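A minimal PyTorch sketch of the dual-input idea behind FakeNews Transformer: title and content are encoded separately and fused for classification. Vocabulary size, dimensions, and pooling are illustrative assumptions, not the published model.

```python
import torch
import torch.nn as nn

class TitleContentClassifier(nn.Module):
    """Encode title and content with separate Transformer encoders, then
    classify the concatenated pooled representations as real or fake."""

    def __init__(self, vocab_size: int = 30000, d_model: int = 128, nhead: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        make_encoder = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True),
            num_layers=2)
        self.title_encoder, self.content_encoder = make_encoder(), make_encoder()
        self.classifier = nn.Linear(2 * d_model, 2)   # two classes: real / fake

    def forward(self, title_ids: torch.Tensor, content_ids: torch.Tensor) -> torch.Tensor:
        title_repr = self.title_encoder(self.embed(title_ids)).mean(dim=1)
        content_repr = self.content_encoder(self.embed(content_ids)).mean(dim=1)
        return self.classifier(torch.cat([title_repr, content_repr], dim=-1))

# Example: batch of 2 articles with 16-token titles and 256-token bodies.
model = TitleContentClassifier()
logits = model(torch.randint(0, 30000, (2, 16)), torch.randint(0, 30000, (2, 256)))
print(logits.shape)   # torch.Size([2, 2])
```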