February 2024 - Present
TinyLLM: Enabling Efficient LLM Deployment
on Resource-Constrained Devices
Developing techniques to deploy Large Language Models (LLMs) on low-power, resource-constrained devices.
The primary focus is on exploring and implementing quantization methods to reduce the computational and memory requirements
of LLMs while preserving their performance and accuracy.
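As a minimal illustration of the quantization idea described above, the sketch below applies symmetric per-tensor int8 post-training quantization to a toy weight matrix. It is a simplified example, not the project's actual pipeline; function names and the per-tensor (rather than per-channel) granularity are illustrative assumptions.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# A toy weight matrix stands in for one LLM layer (illustrative only).
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Rounding error per element is bounded by scale / 2.
print(np.max(np.abs(w - w_hat)))
```

The int8 tensor plus one float scale takes roughly a quarter of the memory of the float32 original, which is the core of the memory saving on constrained devices.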
Fine-tuned the Mistral 7B LLM on the Samantha question-answering dataset, adapting the model to conversational and contextual question answering.
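Conversational fine-tuning of this kind typically starts by flattening multi-turn dialogues into prompt/response training pairs. The sketch below shows one common way to do that; the role names and record layout are illustrative assumptions, not the actual Samantha dataset schema.

```python
def to_training_pairs(conversation):
    """Pair each assistant turn with all preceding turns as its prompt.

    `conversation` is a list of {"role": ..., "text": ...} dicts
    (an assumed format, for illustration only).
    """
    pairs, context = [], []
    for turn in conversation:
        if turn["role"] == "assistant" and context:
            pairs.append({"prompt": "\n".join(context),
                          "response": turn["text"]})
        context.append(f'{turn["role"]}: {turn["text"]}')
    return pairs

# Tiny made-up dialogue to demonstrate the pairing.
demo = [
    {"role": "human", "text": "Hi, who are you?"},
    {"role": "assistant", "text": "I'm a conversational assistant."},
    {"role": "human", "text": "Can you answer questions about a passage?"},
    {"role": "assistant", "text": "Yes, ask away."},
]
for p in to_training_pairs(demo):
    print(p["prompt"].splitlines()[-1], "->", p["response"])
```

Each pair can then be tokenized and fed to a standard supervised fine-tuning loop.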
Identified unfair Terms of Service clauses on devices with minimal resources, using a two-stage knowledge distillation deep learning approach
built on state-of-the-art architectures such as BERT.
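The core of any such distillation setup is the loss that transfers a large teacher's soft predictions to a small student. The sketch below shows the standard temperature-softened KL objective (Hinton-style distillation); it is a minimal illustration of the mechanism, not the project's two-stage procedure, and the T=2.0 default is an assumption.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = np.asarray(logits, dtype=np.float64) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    so gradients keep a comparable magnitude across temperatures."""
    p = softmax(teacher_logits, T)  # teacher's soft targets
    q = softmax(student_logits, T)  # student's predictions
    return float(np.sum(p * (np.log(p) - np.log(q))) * T * T)

teacher = [2.0, 0.5, -1.0]   # e.g. BERT teacher logits (toy values)
student = [1.0, 0.8, -0.5]   # smaller student's logits (toy values)
print(distillation_loss(student, teacher))
```

In a two-stage setup this loss would be applied successively, e.g. teacher to intermediate model, then intermediate to the on-device student.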
Classified clickbait (articles with potentially misleading titles) using a state-of-the-art NLP architecture.