My Projects

Leopard Re-Identification

Problem Statement

Our project scope was to automate the tracking of animal populations and poaching in the wild. Utilizing deep learning techniques, we want to understand whether it is possible to achieve a manual survey accuracy (79%) for leopards. We will identify leopards based on distinctive features and flag untracked or out-of-distribution leopards in an unsupervised setting.

Methodology

Results

Automated Essay Scoring

Github | Paper

Problem Statement

The assessment of essays can be extremely time-consuming and expensive as teachers spend hours grading essays individually. Automated essay scoring (AES) can help reduce cost and potential grading biases and improve time efficiency.

Our project aims to develop AES by using BERT base and BERT-LSTM models to observe whether an RNN layer is needed for an effective two-stage learning framework.

Methodology

Results

Forest Cover Prediction

Github | Presentation

Problem Statement

Natural resource managers need to predict how climate change will affect the composition of tree communities and the functionality of ecosystems. Forest dynamics such as tree species, age composition, and other ecosystem attributes can be studied to understand environmental disturbances and management activities.

Given a dataset of 15,120 training samples and 565,892 test samples, predict the forest cover type (out of 7 classes) for 30x30m2 sections of land based on 54 attributes (ex. elevation, area, soil type, distance to water, aspect, etc.).

Methodology

Results

COVID Regression Modeling

Paper

Problem Statement

In November 2020, our statistics team attempted to model and understand the spread of the global COVID-19 pandemic. Utilizing a dataset of the 50 states in the US, we wanted to develop a regression outlining part of the causal chain of virus spread and casualties. Though the correlation between cases and deaths was undeniable, the lack of data on ICU capacity, healthcare access, and general population health contributed to the difficulty of studying deaths. Instead, we outlined a regression for the predictors of cases across the states.