I am a Data Scientist at PepsiCo with a graduate degree from the Data Science Institute at Columbia University. I earned my undergraduate degree in Computer Science in India. During my graduate program I built the skills needed to become a full-fledged data scientist, taking courses such as statistical modeling, machine learning and applied deep learning.
My main interests lie in applying machine learning and deep learning techniques to solve real-world problems. I also enjoy working in the natural language processing domain. My course projects helped me gain the skills required to be a well-rounded data scientist. I worked at Johnson & Johnson as a data science intern, where I developed an interest in the healthcare domain. The problems being worked on in healthcare are very challenging and gave me good exposure to real-world data science applications.
Apart from aspiring to be a top-notch data scientist, I enjoy theatre, music and art, and I hope to someday walk the stage again and get involved in the performing arts.
Courses: Machine Learning for Data Science, Algorithms in Data Science, Probability and Statistics, Exploratory Data Analysis & Visualizations, Statistical Inference and Modelling, Applied ML, Personalization Theory & Application, Applied Deep Learning, Computer Systems for Big Data
Courses: Algorithms, Data Structures, Operating Systems, Computer Architecture, Database Management Systems, Artificial Intelligence, Compiler Design, Theory of Computing, Software Engineering
Summer exchange program on Computer Science and Electronics
I am a Data Scientist in the Data Science and Analytics team at PepsiCo eCommerce. I primarily use data science techniques to optimize sales and ROI for products sold by PepsiCo.
Analyzed voice-of-customer data regarding drug products using Natural Language Processing. Developed medical ontologies using Linguamatics and word embedding (fastText) techniques to perform semantic querying. Built a text analyzer for the voice-of-customer data using unsupervised clustering models in Python.
Modeled users’ real-time data using clustering techniques. Analyzed price movements in the financial and customer markets.
NumPy, pandas, scikit-learn, Matplotlib, NLTK, PySpark, D3.js
Foreign Technical Training Program
Best outgoing student in BE Computer Science and Engineering
Proficiency in Engineering Graphics
Highest total marks in BE Computer Science and Engineering
Applied machine learning to facial emotion analysis and to digital signal processing of audio signals. Categorized songs into emotions by extracting mid-term features to compute their valence and arousal values. Trained a Support Vector Machine regression model on these audio features and defined a Valence-Arousal coordinate plane to separate the emotions.
Publication Link
Created a platform to analyze news articles and events. Matched the topics of news articles with Wikipedia pages. Implemented a Wikipedia category search tree to categorize articles.
GitHub Repo
Detected tumor cells in pathology images using image segmentation and classification. Constructed a convolutional neural network model using the TensorFlow and Keras frameworks. Designed evaluation metrics to diagnose the presence of cancer in the cells.
GitHub Repo
Built production-grade recommendation systems on the Yelp dataset for various businesses. Predicted ratings for active users using collaborative filtering and non-negative matrix factorization. Designed a ‘wide and deep’ learning model for user recommendation.
GitHub Repo
Built production-grade recommendation systems on the MovieLens dataset. Designed user-based collaborative filtering and model-based matrix factorization using PySpark ML methods. Predicted each user's top 10 movie recommendations and developed evaluation metrics for the recommender model.
GitHub Repo
Identified the humanitarian crisis using UNHCR population of concern data. Explored the datasets with R tidyverse to find the countries with the highest refugee populations. Visualized the flow of refugees across the years using D3.js.
GitHub Repo
Performed topic modelling on Twitter posts using Latent Dirichlet Allocation. Evaluated the results using topic coherence to generate the most relevant hashtags. Performed further analysis on subtweets to obtain finer-grained results.
Things that interest me that are NOT in the data science space!
During college I was part of a theatre club called Theatron, where I was involved in several productions for MYTF and CTI by Crea-Shakthi. I love acting and being part of the production team for dramas. I organized theatre events such as street plays and mono acting at cultural festivals. I was the head of finance in my senior year at college. Outside of college I also worked for EVAM, and I was part of the organising team for The Hindu Theatre Festival in 2016 and 2017.
In my college days I was an active member of the Youth Red Cross foundation and the Rotaract Committee, which organized events for college students to contribute to the well-being of society.
I am a Trinity level 6 pianist and I enjoy playing some of my favorite songs in my free time. I also like to work on DIY arts & crafts projects for room decor or as presents for my friends. When I am having a lazy day I love to binge-watch YouTube videos from some of the lifestyle vloggers I follow, and occasionally I like to shoot some vlogs myself.
New York, NY 10027