Making a Report

Jupyter Notebooks are a great way to do your data cleaning, data wrangling, and visualization. However, you often can't hand a Jupyter Notebook (.ipynb) to your audience, because most people aren't accustomed to opening that file type.

People are used to getting PDFs and looking at websites. Let's explore a couple of options with the tools we already have. To do more, you might have to pick up web programming skills and additional languages (JavaScript, CSS).

RMarkdown

One familiar option you may have heard of is RMarkdown. Using RMarkdown, you might use knitr to knit the code and outputs into a Word doc, HTML doc, or PDF.

Our Docker image includes RStudio, and you are able to create an RMarkdown doc and knit that into HTML or PDF. If you know R/RStudio, you can use this option!

RMarkdown Examples

All of these examples use knitr. The script to run is notebooks/iterate.R, and output files are created in the report directory.

Jupyter Notebooks

Jupyter notebooks have similar functionality. Normally, converting to PDF requires LaTeX and other supporting packages, which can get complicated quickly. Our Docker image has all of that installed, so converting to a PDF is fairly easy for us!

Use the terminal within the Docker container

  • Launcher > Terminal
  • Convert the notebook into HTML: jupyter nbconvert --to html --no-input --no-prompt my-notebook.ipynb
    • --no-input: do not display input cells (the code you wrote), just the outputs
    • --no-prompt: do not display input and output prompts
    • nbconvert docs on exporting
  • Convert the notebook into PDF: jupyter nbconvert --to pdf --no-input --no-prompt my-notebook.ipynb
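The two conversion commands above can also be launched from Python, which helps when several notebooks need converting. The following is a minimal sketch, not part of this repo: the function names build_nbconvert_command and convert_notebook are hypothetical, and it assumes jupyter is on the PATH, as it is inside the Docker container.

```python
import subprocess


def build_nbconvert_command(notebook, to="html", hide_code=True):
    """Return the nbconvert command as a list of arguments (hypothetical helper)."""
    command = ["jupyter", "nbconvert", "--to", to]
    if hide_code:
        # Same flags as the terminal commands: hide input cells and prompts.
        command += ["--no-input", "--no-prompt"]
    command.append(str(notebook))
    return command


def convert_notebook(notebook, to="html", hide_code=True):
    """Run nbconvert; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_nbconvert_command(notebook, to, hide_code), check=True)
```

Calling convert_notebook("my-notebook.ipynb", to="pdf") would run the same PDF command as the bullet above.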

Jupyter Notebook Examples

Similar to iterate.R, there is a report.py script that can be run in the terminal.

The report.py script executes a notebook with papermill, converts it to HTML, then relies on a function from automate.py to upload the result to GitHub Pages. You may need credentials to upload to GitHub Pages automatically; uploading by checking in the HTML or PDF yourself does not require additional credentials (refer back to Getting Started - Credentials).
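As a rough illustration of the execute-then-convert flow described above, here is a minimal sketch. It is not the contents of report.py: the function names executed_copy_path and run_and_convert and the "-executed" suffix are assumptions made for this example, and the upload step handled by automate.py is left out.

```python
import subprocess
from pathlib import Path


def executed_copy_path(notebook, suffix="-executed"):
    """Return a sibling path for the executed copy of the notebook."""
    notebook = Path(notebook)
    return notebook.with_name(notebook.stem + suffix + notebook.suffix)


def run_and_convert(notebook, parameters=None):
    """Execute a notebook with papermill, then convert the executed copy to HTML."""
    # Third-party package; imported here so the path helper works without it.
    import papermill as pm

    executed = executed_copy_path(notebook)
    pm.execute_notebook(str(notebook), str(executed), parameters=parameters or {})
    subprocess.run(
        ["jupyter", "nbconvert", "--to", "html", "--no-input", "--no-prompt", str(executed)],
        check=True,
    )
    # Recent nbconvert versions write the HTML next to the notebook by default.
    return executed.with_suffix(".html")
```

Writing the executed notebook to a separate file keeps the original notebook free of stale outputs, which makes it easier to review in version control.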

Within the terminal inside Docker (Launcher > Terminal):

  • Change into the notebook directory: cd notebooks
  • Execute the Python script: python report.py
    • Within report.py, the line TOKEN = os.environ["GITHUB_TOKEN_PASSWORD"] reads the token from an environment variable, so the token itself never appears in the code for everyone to see.
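The environment-variable pattern above can be made a little friendlier by failing with a clear message when the variable is missing. This is a minimal sketch, assuming the same GITHUB_TOKEN_PASSWORD variable name; the helpers get_github_token and mask are hypothetical and do not exist in report.py.

```python
import os


def get_github_token(variable="GITHUB_TOKEN_PASSWORD"):
    """Read the token from the environment; fail clearly if it is not set."""
    token = os.environ.get(variable)
    if not token:
        raise RuntimeError(f"Set the {variable} environment variable before running the script.")
    return token


def mask(token, visible=4):
    """Return a printable version that shows only the last few characters."""
    return "*" * max(len(token) - visible, 0) + token[-visible:]
```

If you ever need to confirm which token is loaded, printing mask(get_github_token()) avoids leaking the full value into notebook output or logs.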

Examples:

GitHub Pages

GitHub can display HTML pages and render them like a simple website. For those who don't want to venture into web programming, this is one way to get more functionality from the tools already at your disposal. People have used this to make their online portfolios or resumes.

You can set your master branch, or any other branch, as the one you tell GitHub to render as a webpage. Let's stick with the master branch for now.

  • GitHub pages docs
  • Convert the notebook to HTML: jupyter nbconvert --to html --no-input --no-prompt my-notebook.ipynb
    • Make sure you set up your file path correctly! For organization, our notebooks are in the notebooks folder, so your file path might be notebooks/3-make-report.ipynb.
  • A successful conversion creates a file called my-notebook.html.
  • Stage this file for the commit: git add my-notebook.html
  • Add commit message: git commit -m "Add report"
  • Push changes to remote: git push origin my-branch-name
  • Make a pull request and merge in my-branch-name to the master branch.
  • View your HTML page as a "website": navigate to https://YOUR-USERNAME.github.io/simple-coronavirus-report/my-notebook.html
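The URL in the last step follows a fixed pattern, so it can be assembled from its parts. Below is a minimal sketch; pages_url is a hypothetical helper, and it assumes the file is published from the root of the repository.

```python
from urllib.parse import quote


def pages_url(username, repo, html_path):
    """Build the GitHub Pages URL for a file committed to the published branch."""
    # quote() keeps "/" intact and escapes characters such as spaces.
    return f"https://{username}.github.io/{repo}/{quote(html_path.lstrip('/'))}"


print(pages_url("YOUR-USERNAME", "simple-coronavirus-report", "my-notebook.html"))
```

If the HTML file was committed inside a folder, include the folder in html_path, for example notebooks/3-make-report.html.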

You might need a GitHub personal access token for some functionality, such as uploading files from a script rather than adding and committing them yourself.

Getting a GitHub personal access token:

  • On the GitHub website: Settings > Developer Settings (left tab) > Personal Access Tokens > Generate a New Token
    • Select all scopes except delete_repo and admin:enterprise.
    • Copy and save that token somewhere! It's a long string of scrambled letters and numbers, and GitHub only shows it once.
