🍄 Can we predict if a mushroom is poisonous? |
EDA, Predictive Analysis, Classification |
My team and I used the UCI Mushroom Data Set to prepare and analyze the data and to identify which mushroom features make a mushroom more likely to be inedible or poisonous. |
sklearn, pandas, NumPy, matplotlib, seaborn |
🎥 Sentiment Analysis on Movie Reviews |
EDA, Naive Bayes |
My team and I used a Multinomial Naive Bayes classifier to determine whether a movie review had negative or positive sentiment. |
pandas, BeautifulSoup, sklearn, matplotlib, re (regex) |
⛽️ Predicting Vehicle Weight |
EDA and Linear Regression |
Analyzed a vehicle dataset and built linear regression models that predict a vehicle's curb weight. |
pandas, matplotlib, seaborn |
🍷 Cleaning a Wine Dataset |
EDA and Imputing Data |
An exercise in which a partner and I studied a wine dataset, built up domain knowledge about wine, examined each numerical and categorical variable, corrected skew, normalized the data, and imputed missing values for several variables. |
pandas, matplotlib |
🎓 Decision Tree vs. Random Forest on NY State Graduation Data |
EDA, Supervised Machine Learning, Decision Trees/Random Forest |
In this analysis, we built three different kinds of decision tree and random forest models, choosing features through a feature importance analysis that used logistic regression on our Boolean variables. We trained the models on subsets of our data, evaluated their performance with confusion matrices, and chose the best one for prediction. |
pandas, matplotlib, sklearn, Yellowbrick |
🛍 K-Nearest Neighbors and Support Vector Machines to Predict Online Purchases |
EDA, KNN, SVM |
We used supervised learning methods, namely K-nearest neighbors and support vector machines, in Python to predict whether an online shopper would make a purchase. |
pandas, matplotlib, seaborn |
🎮 Sentiment Analysis - A Machine Learning Approach into Hideo Kojima's Divisive Platformer |
EDA, Naive Bayes, Feature Engineering, Natural Language Processing |
Our team performed sentiment analysis on tweets posted in anticipation of Hideo Kojima's video game Death Stranding, released in 2019. We sourced the tweets from two libraries, preprocessed them, stored them in MongoDB, and then performed sentiment analysis. |
pandas, matplotlib, pymongo, NLTK, json |
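
Two of the projects above rely on Multinomial Naive Bayes for sentiment analysis. For readers unfamiliar with the technique, here is a minimal from-scratch sketch of the idea. It is not the code from either project (those list sklearn and NLTK as tools). The function names `train_multinomial_nb` and `predict_sentiment` and the four toy reviews are made up for illustration, and tokenization is a plain whitespace split.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Count documents and words per class; alpha is the Laplace smoothing constant."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # assumption: naive whitespace tokenizer
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
    }


def predict_sentiment(model, text):
    """Return the class with the highest log prior plus summed log word likelihoods."""
    # Words never seen in training are ignored.
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]

    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for t in tokens:
            # Smoothed log likelihood of the word given the class
            score += math.log((counts[t] + alpha) / (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for this example only
reviews = [
    "a wonderful moving film",
    "great acting and a great story",
    "a dull boring mess",
    "terrible plot and boring acting",
]
sentiments = ["positive", "positive", "negative", "negative"]

model = train_multinomial_nb(reviews, sentiments)
print(predict_sentiment(model, "great film"))
print(predict_sentiment(model, "boring plot"))
```

In practice a library implementation such as sklearn's `MultinomialNB` paired with a count vectorizer would replace both functions; the sketch only shows what the classifier computes.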