Skip to content

Latest commit

 

History

History
94 lines (65 loc) · 8.31 KB

File metadata and controls

94 lines (65 loc) · 8.31 KB

badge badge

tensorFlow-federated-data-science-project

Repository containing scaffolding for a Python 3-based data science project based on the TensorFlow Federated ecosystem.

Creating a new project from this template

Simply follow the instructions to create a new project repository from this template.

Project organization

Project organization is based on ideas from Good Enough Practices for Scientific Computing.

  1. Put each project in its own directory, which is named after the project.
  2. Put external scripts or compiled programs in the bin directory.
  3. Put raw data and metadata in a data directory.
  4. Put text documents associated with the project in the doc directory.
  5. Put all Docker related files in the docker directory.
  6. Install the Conda environment into an env directory.
  7. Put all notebooks in the notebooks directory.
  8. Put files generated during cleanup and analysis in a results directory.
  9. Put project source code in the src directory.
  10. Name all files to reflect their content or function.

Using Conda

Creating the Conda environment

After adding any necessary dependencies to the Conda environment.yml file you can create the environment in a sub-directory of your project directory by running the following command.

$ conda env create --prefix ./env --file environment.yml

Once the new environment has been created you can activate the environment with the following command.

$ conda activate ./env

Note that the env directory is not under version control as it can always be re-created from the environment.yml file as necessary.

Building JupyterLab extensions

If you wish to use any JupyterLab extensions included in the environment.yml file then you need to activate the environment and rebuild the JupyterLab application using the following commands to source the postBuild script.

$ conda activate $ENV_PREFIX # optional if environment already active
(/path/to/env) $ . postBuild

For convenience the above steps have been encapsulated by the bin/create-conda-env.sh script which can be run as follows.

./bin/create-conda-env.sh

Updating the Conda environment

If you add (remove) dependencies to (from) the environment.yml file after the environment has already been created, then you can update the environment with the following command.

$ conda env create --prefix ./env --file environment.yml --force

If you add any additional JupyterLab extensions, then the easiest way to update the environment is to re-run bin/create-conda-env.sh script to insure that JupyterLab is re-built with the new extensions.

Listing the full contents of the Conda environment

To list all of the packages installed in the environment run the following command.

conda list --prefix ./env

Using Docker

In order to build Docker images for your project and run containers you will need to install Docker and Docker Compose.

Detailed instructions for using Docker to build and image and launch containers can be found in the docker/README.md.