Developed a novel approach using synthetic plant images from generative models for rare species identification, achieving a 29% improvement in zero-shot learning and an additional 31% improvement in few-shot learning over pre-trained weights, setting a new state-of-the-art in rare flora classification.
Led research on LLMs as autonomous researchers, using a dataset of 1.2 million DBLP papers. My DPO-optimized models outperformed supervised fine-tuned models by 27% on the novel creativity index and achieved a 42% improvement in automated user satisfaction scores, with 89% expert validation of generated research ideas.
Engineered an enhanced intent classification system using contrastive learning on LLM-generated paraphrased data, achieving a 4.78% performance improvement over previous models and reaching 91.783% accuracy on an ambiguous version of the CLINC150 dataset, outperforming GPT-4 and Claude Opus by 3.1%.
Led a data-intensive project applying machine learning to satellite imagery to estimate Maternal and Child Health indicators, securing 1st place in a competitive Kaggle-style challenge among more than 30 teams and receiving a Letter of Recommendation for exceptional performance.
Executed comprehensive data preprocessing, cleaning, and feature engineering on geographically tagged datasets, using statistical evidence to understand the effect of each parameter and reducing the feature set from 10,000 to 70 informative features.
Optimized machine learning models using a boosting-on-error ensemble approach, reducing RMSE from 35.45 to 10.76.
Engineered a tailored recommendation system to match student interns with appropriate companies, utilizing data visualization tools like Seaborn and Matplotlib to derive business insights, resulting in a 20% increase in successful intern placements.
Compared Recommendation-Systems-as-a-Service offerings from various cloud platforms against the custom model, which used language models and skill extraction and achieved 15% higher accuracy than the leading cloud alternatives.
Designed and deployed the recommendation model in Python on an AWS EC2 instance, with a Flask API for real-time predictions, a data pipeline built on PyMongo, and automated re-training via cron jobs, leading to a 30% reduction in recommendation latency and 99.9% uptime.
Architected and deployed a robust Deep Learning model using advanced RNN, GRU, and LSTM networks for a state-of-the-art Digital Twin of Cryogenic Energy Storage Systems, achieving system prediction accuracy of 95% and reducing predictive maintenance costs by 25%.
Devised a machine learning solution based on few-shot learning that lets plant operators manage operations without prior domain knowledge, resulting in a 40% decrease in operational errors and a 30% improvement in decision-making efficiency.
Developed and launched an interactive web application to demonstrate the model's predictive capabilities, leading to a 50% increase in user engagement and providing a potential roadmap for scaling up to other energy systems.
Pioneered a task-agnostic model employing the Reptile meta-learning algorithm on the Abstraction and Reasoning Corpus, achieving a notable 78% accuracy across training and test datasets with a CNN architecture without using any domain-specific language.
Led a crowdsourcing initiative on Appen to annotate datasets, enabling the application of natural language rules to develop a robust NLP model that mirrors the agnostic capabilities of the original system.
The approach and the model's versatility lay the groundwork for better NLP model generalization, potentially reducing the need for task-specific training data by up to 53% while maintaining efficiency close to that of a DSL-based approach.
Developed Apache Airflow pipelines and high-performance gRPC APIs in Golang, boosting data processing efficiency by 70%, handling a 2x demand surge with improved traffic management capacity, and reducing service downtime by 40%.
Enhanced system reliability and performance by implementing Kafka for real-time event management and optimizing relational databases, which resulted in a 60% reduction in processing overhead, a 30% faster query response, and support for a 10x higher query volume.
Achieved a substantial 92.6% reduction in database storage by optimizing data structures, shrinking table join size from 175M to 13M.
Developed high-performance RPC APIs leveraging Redis, reducing query time from 250 ms to 10 ms.
Devised a tree-based form rendering system, reducing the operational changes needed for a new constraint from multiple steps to just 2.
Enhanced user interface workflows with a form flow visualizer using Graphviz, coupled with a sophisticated toolset: React, Redux, FusionJS, and BaseUI for frontend; Golang for backend services; and Hive and PrestoQL for data querying.
Engineered a scalable video recommendation system with the LightFM framework, using a hybrid model to address the cold-start problem and training it on a carefully curated dataset of more than 1 million records.
Developed RESTful APIs with NodeJS and integrated MongoDB for backend operations, enabling efficient data transactions and system scalability.
Designed and executed complex SQL and BigQuery analytics, complemented by visualizations in Google Data Studio, which informed strategic business decisions and improved user engagement metrics by over 28% and user retention by over 12%.
Conducted workshops for open-source awareness, including Python and Git workshops, to provide students with basic technology knowledge and help them start contributing to open-source projects.
Organized Kharagpur Winter of Code (KWoC), an Indian variant of Google Summer of Code (GSoC), promoting contributions to open-source projects by encouraging participation from both students and mentors.
Led the Open Source Summit as part of Kshitij (Techno-Management fest at IIT Kharagpur), which included workshops on Ethical Hacking and Golang, with participation from students across various colleges.
Created algorithm and data structure problems on topics such as dynamic programming and greedy algorithms to test students' knowledge.
Tested problems for upcoming contests organized by the company to ensure high-quality challenges.
Also handled content creation and doubt resolution, aligning problem difficulty and variety with the needs of the team.