I produced these data science projects during my Master's in Applied Data Science (MADS) at the University of Michigan. I have also included research I did as principal at rad disco.
Predicting short-term price movements in volatile cryptocurrency markets. Used a machine learning pipeline to deliver trade predictions with 97% accuracy for 2022-23 in the Avalanche crypto market (AVAX-USD) on Coinbase. Built to support real-time, high-frequency trading environments.
Presented a solution to workforce attrition for a model case-study client. I used a logistic regression model to identify the most likely causes of employee turnover at an Indian manufacturing firm. I used a correlation matrix to find highly correlated principal components, which were fed into an ROC/AUC analysis.
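A minimal sketch of the kind of workflow described above, not the project's actual code. It checks the correlation between two features, fits a small logistic regression by gradient descent, and scores it with ROC AUC. The feature names, the toy data, and the helper functions `pearson`, `fit_logistic`, `predict_proba`, and `roc_auc` are all hypothetical and written only for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def predict_proba(w, row):
    """Probability of the positive class; w[0] is the bias term."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit logistic regression weights by batch gradient descent."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, y in zip(rows, labels):
            err = predict_proba(w, row) - y
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        w = [wj - lr * g / len(rows) for wj, g in zip(w, grad)]
    return w

def roc_auc(labels, scores):
    """AUC as the chance a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data, scaled to [0, 1]: [overtime share, relative tenure]
rows = [[0.2, 0.8], [0.3, 0.7], [0.1, 0.9], [0.4, 0.6],
        [0.9, 0.1], [0.8, 0.2], [1.0, 0.1], [0.7, 0.3]]
left = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = employee left the firm

# Step 1: flag strongly correlated feature pairs before modelling
r = pearson([row[0] for row in rows], [row[1] for row in rows])

# Step 2: fit the model and score it on the same toy data
weights = fit_logistic(rows, left)
scores = [predict_proba(weights, row) for row in rows]
auc = roc_auc(left, scores)
```

The sign and size of each fitted weight give a rough indication of which feature pushes toward turnover. A real analysis would score on held-out data instead of the training rows.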
A study of asset prices and the tight correlation between sell volume and pessimistic economic announcements.
A study of American dependence on Chinese exports. Proposes margins, per industry sector, by which the US Trade Office may negotiate new trade agreements with member nations of the Comprehensive and Progressive Agreement for Trans-Pacific Partnership (CPTPP).
I used Python to discover affine clusters of tradeable commodities and to investigate how a major firm such as Citadel could use affine clusters of financial products to predict pricing.
I investigated whether adopting various fitness devices coincided with improved workout efficiency, as measured through workout intensity and heart rate graphs. Used in an introduction to data visualization course with Chris Brooks.
A means of embedding highly expressive Unicode graphics (emoji) into Altair area charts. Includes a method for locating the most expressive tweeted emoji for a real-time sports event.
A proprietary method for unbiased train-test splits that improves Random Forest model accuracy. Used to reduce model complexity in unsupervised learning projects, where high dimensional complexity must be resolved while preserving unbiased samples in sequential data sets. Applied to a prediction model in a personal fitness context.
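The proprietary method itself is not shown here. As a generic illustration of the underlying concern, the sketch below splits sequential data in time order, so the test set never contains records older than the training set. The function `chronological_split`, its `gap` parameter, and the heart-rate readings are hypothetical and assumed only for this example.

```python
def chronological_split(records, test_fraction=0.2, gap=0):
    """Split time-ordered records into train and test sets without shuffling.

    `gap` drops that many records just before the test set, so features
    built from a rolling window near the boundary cannot leak into it.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = max(1, int(round(len(records) * test_fraction)))
    split = len(records) - n_test
    train = records[: max(0, split - gap)]
    test = records[split:]
    return train, test

# Hypothetical daily resting heart-rate readings, oldest first
readings = list(range(60, 80))  # 20 values
train, test = chronological_split(readings, test_fraction=0.25, gap=2)
```

With these inputs the first 13 readings form the training set, the next 2 are discarded as the gap, and the last 5 form the test set.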