Requires: jupyter, pandas, and matplotlib.
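A quick way to confirm that the requirements are present is sketched below. It is a minimal check, not part of this repository: the helper name missing_packages is hypothetical, and it assumes each package's import name matches the name listed above.

```python
from importlib.util import find_spec

# Assumption: the import names are the same as the package names listed above.
REQUIRED = ("jupyter", "pandas", "matplotlib")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any package reported as missing can then be installed with whichever package manager the environment uses.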
The data/ directory can be provided on request.
When using Swan, choose Analytix under the Spark cluster option when asked to set up a configuration.
To create the data/ directory, run the runall.sh script.
Before running the script, you may need to initialize a Kerberos token for Hadoop with the command kinit afsusername, replacing afsusername with your username.
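The Kerberos step and the script run can be combined in a small wrapper, sketched here under a few assumptions: the function names are hypothetical, runall.sh is assumed to be a bash script in the current directory, and the check relies on klist -s, which exits with status 0 only when a valid ticket is cached.

```python
import shutil
import subprocess
import sys


def has_valid_ticket(check_cmd=("klist", "-s")):
    """Return True if the check command exists and exits with status 0."""
    if shutil.which(check_cmd[0]) is None:
        return False
    return subprocess.run(list(check_cmd)).returncode == 0


def main(username):
    if not has_valid_ticket():
        # kinit prompts for the password interactively.
        subprocess.run(["kinit", username], check=True)
    # Assumption: runall.sh is a bash script in the current directory.
    subprocess.run(["bash", "runall.sh"], check=True)


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved under a name of your choosing, the wrapper takes the username as its only argument. It requests a new ticket only when no valid one is cached.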