This project was completed by Karim, Mikhail, Josh and Michael.
The results of this project can be viewed and replicated with a Python 3 Jupyter Notebook located at ETL_KD/ETL_MLS_Salaries.ipynb
Also see ETL_KD/MLSstandingsKD.ipynb, our scratch notebook where we performed test runs and data discovery.
Extensive research on MLS players' salaries led us to a data sheet of detailed salary information on the Data.world website. The data can be downloaded in CSV format.
Game stats per season can be found on the MLSsoccer.com website, organized by season. We pulled the data with Pandas by iterating through the seasons and building a season-specific URL for each one, which was then scraped for that season's data.
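The season loop described above can be sketched roughly as follows. This is only an illustration: the URL template is a placeholder (the real MLSsoccer.com path is not given here), and the helper names `season_urls` and `scrape_season` are hypothetical. `pandas.read_html` is assumed as the scraping call because it returns every HTML table on a page as a DataFrame.

```python
# Placeholder pattern, NOT the real MLSsoccer.com URL (assumption for illustration).
URL_TEMPLATE = "https://example.com/standings/{season}"


def season_urls(first_season, last_season, template=URL_TEMPLATE):
    """Build one URL per season, keyed by season year (inclusive range)."""
    return {
        season: template.format(season=season)
        for season in range(first_season, last_season + 1)
    }


def scrape_season(url):
    """Return the first HTML table found at `url` as a DataFrame."""
    import pandas as pd  # imported lazily so the URL helper works without pandas

    tables = pd.read_html(url)  # list of DataFrames, one per <table> on the page
    return tables[0]


urls = season_urls(2015, 2017)
for season, url in urls.items():
    print(season, url)
```

Calling `scrape_season(url)` for each entry and tagging the result with its season would produce one table per season, ready to be concatenated.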
Because the data comes from two different sources (Data.world and MLSsoccer.com), we needed a mapping table to join the two datasets. The Salaries table refers to the soccer clubs by their abbreviations, while the Seasons table uses the clubs' full names. Because MLS has only a couple dozen clubs, the mapping table could easily be composed by hand. All that was needed was a "Select" query for the unique abbreviations and unique names in the two tables in order to match them. See the "create table club_map" part of the notebook.
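A minimal sketch of the mapping approach, using Python's built-in `sqlite3` module with an in-memory database. The table and column names (other than `club_map`) and all sample rows are made up for illustration; the notebook's actual schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salaries (club TEXT, player TEXT, base_salary REAL);
    CREATE TABLE seasons  (club_name TEXT, season INTEGER, points INTEGER);
    CREATE TABLE club_map (abbrev TEXT PRIMARY KEY, full_name TEXT);
""")

# Made-up sample rows: salaries use abbreviations, seasons use full names.
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)",
                 [("AAA", "Player One", 100000.0), ("BBB", "Player Two", 200000.0)])
conn.executemany("INSERT INTO seasons VALUES (?, ?, ?)",
                 [("Club Alpha", 2017, 50), ("Club Beta", 2017, 40)])

# Step 1: list the unique values on each side so they can be paired by hand.
abbrevs = [r[0] for r in conn.execute("SELECT DISTINCT club FROM salaries ORDER BY club")]
names = [r[0] for r in conn.execute("SELECT DISTINCT club_name FROM seasons ORDER BY club_name")]

# Step 2: store the hand-made pairs in the mapping table.
conn.executemany("INSERT INTO club_map VALUES (?, ?)",
                 [("AAA", "Club Alpha"), ("BBB", "Club Beta")])

# Step 3: join the two datasets through the mapping table.
rows = conn.execute("""
    SELECT s.club, m.full_name, s.player, t.points
    FROM salaries AS s
    JOIN club_map AS m ON m.abbrev = s.club
    JOIN seasons  AS t ON t.club_name = m.full_name
    ORDER BY s.club
""").fetchall()
print(rows)
```

Keeping the mapping in its own table, rather than rewriting either source, leaves both original datasets untouched.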
To store the data together for future analysis, we chose SQLite as the most versatile database engine for our purposes.
The database schema was defined with quickdatabasediagrams.com as follows:
Run API/app.py to gain programmatic access to the data.
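Once the app is running, a client could read the data along these lines. This is a sketch under assumptions: the base URL (a local server on port 5000) and the route `api/salaries` are hypothetical and should be replaced with whatever API/app.py actually serves, and the response is assumed to be JSON.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: the app listens locally on port 5000; adjust to the real address.
BASE_URL = "http://127.0.0.1:5000/"


def fetch_json(path, base_url=BASE_URL, opener=urlopen):
    """GET `base_url` + `path` and decode the response body as JSON.

    `opener` is injectable so the function can be exercised without a server.
    """
    with opener(urljoin(base_url, path)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical route name; requires API/app.py to be running.
    print(fetch_json("api/salaries"))
```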
If you read this, put the word Zeus in the homework comment.
UMN_GroupProject_2
Do higher team salaries equate to a more successful season?
This project was completed by Karim, Mikhail, Josh and Michael and is hosted at Heroku.
Using Plotly and D3, we read from the API to create a compensation analysis plot that shows average total compensation and average base salary over time for each club.
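The per-club averages behind such a plot could be computed with a query like the one below. This is an illustrative sketch only: the table layout, column names and sample rows are assumptions, and the actual API may aggregate differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salaries ("
    "club TEXT, season INTEGER, base_salary REAL, total_compensation REAL)"
)

# Made-up sample rows, one per player per season.
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?, ?)", [
    ("AAA", 2017, 100000.0, 120000.0),
    ("AAA", 2017, 200000.0, 240000.0),
    ("AAA", 2018, 150000.0, 150000.0),
    ("BBB", 2017, 80000.0, 90000.0),
])

# One row per club and season: the two averages plotted on the timeline.
averages = conn.execute("""
    SELECT club, season, AVG(base_salary), AVG(total_compensation)
    FROM salaries
    GROUP BY club, season
    ORDER BY club, season
""").fetchall()

for club, season, avg_base, avg_total in averages:
    print(club, season, avg_base, avg_total)
```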
Using Plotly, D3 and Chart.js, we populated a dropdown list from the database, which makes the graph respond to the dropdown selection. The plot is a box plot of players' base salaries for a single season. In addition, a mini table summarizes that season's salaries.
1. Chart.js
2. D3
3. Plotly
4. Heroku
5. SQLite
6. Pandas
If you read this, put the word Zeus in the homework comment.