Hello, I'm Vish, an incoming Data Scientist at DataKind. My portfolio showcases the work I completed in the Master of Information and Data Science (MIDS) program at UC Berkeley.
Based in New York City, I spend my free time tutoring underserved populations and improving my jazz piano skills. One of my passions is carrying on my family's legacy of cooking South Indian cuisine while adapting the recipes to my plant-based diet.
View My Resume
View My LinkedIn Profile
View My Recent Emissions Forecasting Project
Our project aimed to automate the tracking of animal populations and poaching in the wild. Using deep learning techniques, we wanted to understand whether a model can match the accuracy of manual surveys (79%) for leopards. The goal was to identify leopards based on their distinctive features and to flag untracked or out-of-distribution leopards in an unsupervised setting.
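
One common way to flag an untracked animal is to compare a feature embedding of a new sighting against a gallery of known individuals and treat a weak best match as "unknown". The sketch below illustrates that idea only; it is not the project's actual pipeline. The function names, the toy three-dimensional embeddings, and the 0.8 similarity threshold are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_or_flag(query, gallery, threshold=0.8):
    """Return (leopard_id, similarity) for the closest known leopard,
    or (None, similarity) when the best match falls below the threshold,
    i.e. the sighting is flagged as untracked / out-of-distribution.

    `threshold` is a hypothetical value; in practice it would be tuned
    on held-out sightings.
    """
    best_id, best_sim = None, -1.0
    for leopard_id, embedding in gallery.items():
        sim = cosine_similarity(query, embedding)
        if sim > best_sim:
            best_id, best_sim = leopard_id, sim
    if best_sim < threshold:
        return None, best_sim
    return best_id, best_sim


# Toy gallery with made-up embeddings (real ones would come from a trained model).
gallery = {
    "leopard_A": [1.0, 0.0, 0.0],
    "leopard_B": [0.0, 1.0, 0.0],
}

known = identify_or_flag([0.9, 0.1, 0.0], gallery)    # close to leopard_A
unknown = identify_or_flag([0.0, 0.0, 1.0], gallery)  # unlike any known leopard
```

Returning the similarity alongside the label keeps the decision inspectable, so borderline sightings can be routed to a human reviewer rather than silently accepted or rejected.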

Assessing essays can be extremely time-consuming and expensive, as teachers spend hours grading each essay individually. Automated essay scoring (AES) can help reduce cost and potential grading bias while saving time.
Our project aims to develop AES using BERT-base and BERT-LSTM models, testing whether an RNN layer is needed for an effective two-stage learning framework.


Natural resource managers need to predict how climate change will affect the composition of tree communities and the functionality of ecosystems. Forest dynamics such as tree species, age composition, and other ecosystem attributes can be studied to understand environmental disturbances and management activities.
Given a dataset of 15,120 training samples and 565,892 test samples, the task is to predict the forest cover type (one of 7 classes) for 30 m × 30 m sections of land based on 54 attributes (e.g., elevation, area, soil type, distance to water, and aspect).

In November 2020, our statistics team set out to model and understand the spread of the global COVID-19 pandemic. Using state-level data for the 50 US states, we wanted to develop a regression describing part of the causal chain of virus spread and casualties. Though the correlation between cases and deaths was clear, the lack of data on ICU capacity, healthcare access, and general population health made deaths difficult to study. Instead, we built a regression on the predictors of cases across the states.
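
To make the regression idea concrete, here is a minimal ordinary least squares sketch with a single predictor. It is an illustration under stated assumptions, not the team's model: the actual analysis used several predictors, and the function name and the numbers below are hypothetical placeholders, not COVID-19 data.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope). Assumes x and y have the same length
    and that x is not constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up, perfectly linear toy values (NOT real state-level data).
predictor = [1.0, 2.0, 3.0, 4.0]
outcome = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_ols(predictor, outcome)
```

With several predictors, the same least squares principle applies, but a statistics library is the usual choice because it also reports standard errors and diagnostics.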


