This repository contains a project based on the final assignment for the BI Dashboards with IBM Cognos Analytics and Google Looker course on Coursera. The original assignment focused on creating and analyzing business intelligence (BI) dashboards/reports using IBM Cognos Analytics and Google Looker Studio.
For this project, I have adapted the assignment to utilize Snowflake's Snowsight. Additionally, I have provided a Streamlit app that produces the same visualizations using Streamlit-in-Snowflake (SiS).
-
You are welcome to use this repository as a reference or starting point for your own project.
-
If you choose to fork this repository, please ensure that you comply with the terms of the Apache License and give proper credit to the original authors.
- Analyze the historical trends in car sales for SwiftAuto Traders.
- Provide insights on car sales and profits for each dealer.
SwiftAuto Traders has hired a data scientist to analyze car sales and profits for each dealer. Their first task is to create visualizations and present them as a dashboard to the regional manager.
This repository contains all of the source files, scripts, output files, and guides for completing the final assignment using Snowflake Snowsight and Streamlit.
├── Automotive_Industry <- Automotive Industry Sample Data
│ ├── AU_Car_Models.csv <- Contains information about individual car models
│ ├── AU_Car_Recalls.csv <- Contains information about the number and type of safety recalls that affect the types of cars sold
│ ├── AU_Daily_Sales.csv <- Contains information about daily automotive sales and weather at the time of the initial sales visit
│ ├── AU_Dealers.csv <- Contains information about individual car dealers
│ ├── AU_Sales_By_Model.csv <- Contains sales information
│ └── AU_Sentiment.csv <- Contains information about customer sentiment
├── images <- Screenshot files used by the markdown files
├── Final_Assignment_Snowsight.md <- Visualizing Car Sales and Dealer Profits Using Snowflake Snowsight Guide
├── Final_Assignment_Streamlit.md <- Visualizing Car Sales and Dealer Profits Using Streamlit-in-Snowflake (SiS) Guide
├── requirements.txt <- Dependency items to be installed by pip
├── setup_swiftauto_traders.log <- Output from setup_swiftauto_traders.py script
├── setup_swiftauto_traders.py <- Python script to automate the setup of the Snowflake environment for this project
└── streamlit_swiftauto_traders.py <- Single-page Streamlit in Snowflake app for SwiftAuto Traders
For this project, we will use a modified subset of the Automotive Industry Sample Data, which is available in this repository.
The original dataset can be found here: Automotive Industry Sample Data
The terms of use are located at https://developer.ibm.com/terms/ibm-developer-terms-of-use/.
Task 1: Create a dashboard titled as Sales
to capture the following KPI metrics:
- Capture Profit (formatted to 1 decimal place in millions of US dollars)
- Capture Quantity sold
- Create a bar chart to capture Quantity sold by model
- Capture Average quantity sold
Task 2: Develop a column chart to display Profit by Dealer ID in the Sales
dashboard sorted in ascending order
Task 3: Create another dashboard titled as Service
and capture the following KPI metrics as visualizations:
- Create a column chart to capture the number of recalls per model of car
- Create a treemap to capture the customer sentiment by comparing positive, neutral, and negative reviews
- Create a line and column chart to capture the quantity of cars sold per month compared to the profit
- Create a pivot table with heatmap to capture the number of recalls by model and affected system
Task 4: Share the dashboards
This project has the following requirements:
- Python 3.10.12 or higher
- pip 24.3.1 or higher
- A Snowflake user with
ACCOUNTADMIN
role granted. You can create a trial Snowflake account here.- Select
Enterprise
for your Snowflake edition andAmazon Web Services
for your cloud provider - Choose the region nearest you, accept the terms, and click
GET STARTED
- Go to your email, and click on the activation link
- Select
The setup_swiftauto_traders.py Python script has been provided to setup the Snowflake environment for this project. It also handles the uploading of CSV dataset files to stage, creates and loads tables with inferred schemas, and establishes a basic role hierarchy and user setup to demonstrate dashboard sharing within Snowflake.
Install Snowflake Connector for Python
Make sure you have Python (version 3.10.12 or later) and pip
installed on your system. Then, install the required libraries using the provided requirements.txt
file. The command syntax is:
python3 -m pip install -r requirements.txt
Set Environment Variables for Snowflake Credentials
Replace the placeholder values with your actual Snowflake account details when setting the environment variables.
Bash (Linux/Mac)
- Open Terminal
- Use the
export
command to set environment variables:Alternatively, you can add the environment variables to your bash login scriptexport SNOWFLAKE_USER='your_username' export SNOWFLAKE_PASSWORD='your_password' export SNOWFLAKE_ACCOUNT='your_account'
.bash_profile
or.bashrc
so they are automatically set when you log in.
Windows
- Open Command Prompt.
- Use the
setx
command to set environment variables:setx SNOWFLAKE_USER "your_username" setx SNOWFLAKE_PASSWORD "your_password" setx SNOWFLAKE_ACCOUNT "your_account"
- Restart any open Command Prompt windows for the changes to take effect.
Execute setup_swiftauto_traders.py
python3 setup_swiftauto_traders.py
Output: setup_swiftauto_traders.log
After executing the provided setup_swiftauto_traders.py Python script, your Snowflake environment will be configured with the following components.
- The
SWIFTAUTO_DB.AUTOMOTIVE
namespace contains six tables:AU_CAR_MODELS
AU_CAR_RECALLS
AU_DAILY_SALES
AU_DEALERS
AU_SALES_BY_MODEL
AU_SENTIMENT
- Table schemas are automatically inferred from the structure of the corresponding CSV file.
- To ensure compatibility and ease of use, column names containing spaces are replaced with underscores.
- The
SWIFTAUTO_DB.PUBLIC
namespace contains anAUTOMOTIVE_INDUSTRY
stage with six CSV files:AU_Car_Models.csv
AU_Car_Recalls.csv
AU_Daily_Sales.csv
AU_Dealers.csv
AU_Sales_By_Model.csv
AU_Sentiment.csv
- The
SWIFTAUTO_DB.PUBLIC
namespace contains aCSV_FF
file format with the following options:- Fields are separated by commas
- The first row is used as column headers
- Leading and trailing spaces are removed from fields
- Fields may be optionally enclosed by double quotes
- '\N', 'NULL', '' values are treated as SQL NULL
- Empty fields are inserted as SQL NULL
- An error is raise if the number of columns in the file does not match the table schema
To properly demonstrate the sharing of Snowsight dashboards, the following role hierarchy and users were created, enabling the data scientist to present the dashboards to the regional manager as outlined in the project scenario.
NAME | COMMENT |
---|---|
DS_JSMITH | Data Scientist, SwiftAuto Traders |
RM_DENVER | Regional Manager, Denver, Colorado |
DATA_ANALYST | Functional role for data analysts |
DATA_SCIENTIST | Functional role for data scientists |
REGIONAL_MANAGER | Functional role for regional managers |
SWIFTAUTO_BI_CREATOR_ROLE | Access role that permits BI creator access for SWIFTAUTO_DB |
SWIFTAUTO_BI_VIEWER_ROLE | Access role that permits BI viewer access for SWIFTAUTO_DB |
SWIFTAUTO_READ_ROLE | Access role that permits read-only access for SWIFTAUTO_DB |
SWIFTAUTO_READWRITE_ROLE | Access role that permits read-write access for SWIFTAUTO_DB |
- Log in to Snowsight using the
DS_JSMITH
andRM_DENVER
accounts with the password specified in the setup_swiftauto_traders.py Python script. - Users are required to change their password upon their first login to the system.
- Only users who have previously signed in to Snowsight can be assigned share permissions.
Begin the Final Assignment using Snowsight
Begin the Final Assignment using Streamlit-in-Snowsight
- IBM Skills Network © IBM Corporation 2024. All rights reserved.