Note: This list includes all the open-source projects I have worked on; it does not include projects done as part of my job.
Dense passage retrieval (DPR) is the algorithm behind search engines like Google: given a query, find all the documents whose vectors match it. Asymmetric dense passage retrieval uses two BERT-like models: a lightweight one trained to build search vectors for queries, and a larger one that is iteratively trained on every new document added to the database. I improved DPR accuracy by 4% across all major benchmarks.
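Here is a minimal sketch of the asymmetric setup, assuming Hugging Face `transformers`. The model names and the mean-pooling step are illustrative placeholders, not the project's actual code; in a real system the two encoders are trained jointly so their embedding spaces align.

```python
import torch
from transformers import AutoModel, AutoTokenizer

def embed(texts, model_name):
    """Mean-pool the last hidden state into one vector per text."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch).last_hidden_state       # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)     # (B, T, 1)
    return (out * mask).sum(1) / mask.sum(1)         # (B, H)

# Asymmetric setup: a lightweight encoder for queries, a larger one for
# documents. Both placeholders here share a 768-dim hidden size so the
# inner product below is well-defined.
query_vecs = embed(["who introduced the transformer?"],
                   "distilbert-base-uncased")
doc_vecs = embed(["Attention Is All You Need introduced the Transformer.",
                  "BERT is a bidirectional encoder."],
                 "bert-base-uncased")

scores = query_vecs @ doc_vecs.T   # inner-product relevance per document
print(scores.argmax(dim=1))        # index of the best-matching document
```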
A knowledge graph model that can generalize to unseen relationships without any additional training. It combines the RMPI graph network with BERT to learn embeddings for relationships, and attempts to generalize after training on the knowledge graph only once. I wrote my M.Sc. thesis on knowledge graph relation prediction, and this project was the outcome of the experiments I ran.
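To illustrate the core idea of generalizing to unseen relations, here is a sketch that embeds a relation from its textual description with BERT and plugs it into a DistMult-style triple score. This is a simplification, not the thesis implementation: the entity vectors would come from the RMPI graph encoder, and the random tensors below are placeholders.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def relation_embedding(description: str) -> torch.Tensor:
    """Embed a relation from its textual description (the [CLS] vector),
    so relations never seen during KG training still get a representation."""
    batch = tok(description, return_tensors="pt")
    with torch.no_grad():
        return bert(**batch).last_hidden_state[:, 0]   # (1, 768)

def score(head: torch.Tensor, rel_text: str, tail: torch.Tensor) -> torch.Tensor:
    """DistMult-style score <h, r, t>; entity vectors are placeholders here."""
    r = relation_embedding(rel_text).squeeze(0)
    return (head * r * tail).sum()

h, t = torch.randn(768), torch.randn(768)
print(score(h, "place of birth: the city where a person was born", t))
```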
Generates blog-like articles given a seed keyword and instructions on tone and writing style. The project is currently in maintenance mode with no active development. It uses a multi-threaded pipeline architecture, where each pipeline generates one part of the blog (see the sketch below), large language models like GPT-2.5 and GPT-3, and a fine-tuned Stable Diffusion model to generate images. I still use some of the pipelines actively in my other projects.
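A minimal sketch of the pipeline idea: one worker per blog section, run concurrently, with the parts stitched together in order. The section names are hypothetical, and `generate_section` is a stub standing in for the actual model call.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical section pipelines; in the project each wraps an LLM call.
def generate_section(section: str, keyword: str, tone: str) -> str:
    return f"[{section} about '{keyword}' in a {tone} tone]"  # stub

SECTIONS = ["introduction", "body", "conclusion"]

def generate_blog(keyword: str, tone: str) -> str:
    # Each pipeline generates one part of the blog concurrently,
    # then the parts are joined in their original order.
    with ThreadPoolExecutor(max_workers=len(SECTIONS)) as pool:
        parts = pool.map(lambda s: generate_section(s, keyword, tone), SECTIONS)
    return "\n\n".join(parts)

print(generate_blog("dense retrieval", "conversational"))
```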
A simple Chrome extension that grabs page headings and formats them for easy export.
A [Re]Science FAIR project in which we reproduced the paper "Counterfactual Generative Networks" by Sauer and Geiger. The paper proposes a method for generating counterfactuals for a given input. We implemented the paper and extended it to generate counterfactuals for a given input and a target class, while ensuring that the counterfactuals remain semantically similar to the input. Our analysis was accepted to the [Re]Science FAIR 2021 conference.
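The heart of CGN is an analytic composition of three independent mechanisms (shape, texture, background); swapping any one of them yields a counterfactual. Below is a toy sketch of that composition step, with random tensors standing in for the outputs of the three generators in the real model.

```python
import torch

def compose(mask: torch.Tensor, foreground: torch.Tensor,
            background: torch.Tensor) -> torch.Tensor:
    """CGN's analytic composition: the shape mask pastes the textured
    object onto the background."""
    return mask * foreground + (1 - mask) * background

# Toy tensors standing in for the three mechanism outputs (3-channel, 64x64).
m = torch.rand(1, 1, 64, 64)   # shape mask in [0, 1]
f = torch.rand(1, 3, 64, 64)   # texture / foreground
b = torch.rand(1, 3, 64, 64)   # background

x = compose(m, f, b)                       # factual image
x_cf = compose(m, torch.rand_like(f), b)   # counterfactual: same shape, new texture
```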