Merge pull request #4 from aisingapore/3-ci-cd

PR for #1

Syakyr authored Jan 4, 2024
2 parents 233b0b3 + ebfbe9f commit 33feaec

Showing 7 changed files with 190 additions and 94 deletions.
@@ -81,27 +81,47 @@ either method to gain access to a remote VSCode developer workspace.

=== "Run:ai"

=== "Workspaces"

Every end-user of Run:ai can quickly spin up a VSCode server
workspace using prebuilt blocks. While the
[steps for creating a workspace][workspace] are detailed in
Run:ai's documentation, listed below are the recommended
environment, compute resource, and data source for spinning up
your first VSCode workspace within AI Singapore's infrastructure.

- __Workspace name:__ `<YOUR_HYPHENATED_NAME>-vscode`
- __Environment:__ `aisg-vscode-server-v4-16-1`
- __Compute Resource:__ `cpu-mid`
- __Data Source:__ The persistent volume claim (PVC) that is
dedicated to your project. For a project named `sample-project`,
you may make use of `sample-project-pvc`.

Once you have selected the blocks, you can proceed to create the
workspace; you will then be redirected to the workspaces page,
where you can view the status of the workspace you have just
created.

![Run:ai Dashboard - Workspaces Page Post VSCode](assets/screenshots/runai-dashboard-workspaces-page-post-vscode.png)

=== "YAML"

You can create a VSCode workspace with the YAML file
`aisg-context/runai/02-vscode.yml`. Before that, however, you would
need to prepare the workspace by applying the following manifest:

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/01-workspace-prep.yml
```

After that, you can spin up the workspace with:

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/02-vscode.yml
```
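
If you prefer to track the workspace from the terminal rather than the
dashboard, a hedged aside (assuming Run:ai follows its usual
`runai-<PROJECT_NAME>` namespace convention on the cluster) is to watch
the pods directly:

```bash
# Watch the pods in your project's namespace until the workspace pod is Running
kubectl get pods -n runai-<PROJECT_NAME> --watch
```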

Once the workspace is active (indicated by a green status), you may
access the workspace by clicking on the `CONNECT` button and
@@ -144,12 +164,17 @@

[workspace]: https://docs.run.ai/v2.13/Researcher/user-interface/workspaces/create/workspace

??? info "Reference Link(s)"

- [Kubernetes Docs - Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes)

### Persistent Workspaces

!!! warning "Attention"
If you spun up your workspace using the Run:ai YAML method, you can
skip this step, since you have already prepared the workspace prior
to spinning it up.

As mentioned, a PVC should be attached to the workspaces to persist
changes to the filesystems. If a PVC is attached, the usual path to
access it would be `/<NAME_OF_DATA_SOURCE>`. For example, if the name of
@@ -196,22 +221,31 @@ Now, let's clone your repository from the remote:
$ cd {{cookiecutter.repo_name}}
```

=== "Run:ai YAML"

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/03-repo-download.yml
```

### Extensions for VSCode

You can install a multitude of extensions for your VSCode service, but
a couple of them are crucial for your workflow, especially if you
intend to use Jupyter notebooks within the VSCode environment.

- [`ms-python.python`][vsx-python]: Official extension by Microsoft for
rich support for many things Python.
- [`ms-toolsai.jupyter`][vsx-jy]: Official extension by Microsoft
for Jupyter support.
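
If your VSCode server image ships with the `code-server` CLI (an
assumption that depends on the image; the plain `code` CLI works the
same way), these extensions can also be installed from the integrated
terminal:

```bash
# Install the Python and Jupyter extensions from the command line
code-server --install-extension ms-python.python
code-server --install-extension ms-toolsai.jupyter
```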

!!! warning "Attention"
Do head over [here][jy-vscode] to learn how to enable the use of
virtual `conda` environments within VSCode.

[vsx-python]: https://marketplace.visualstudio.com/items?itemName=ms-python.python
[vsx-jy]: https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter
[jy-vscode]: ./05-virtual-env.md#jupyter-kernel-for-vscode

### Customising VSCode Server

@@ -266,29 +300,48 @@ custom image:

While Jupyter Notebooks are viewable, editable and executable within
a VSCode environment, most users are still more familiar with
Jupyter's interface for interacting with or editing notebooks.

=== "Run:ai Workspaces"
We can spin up a JupyterLab using the following recommended blocks:

- __Workspace name:__ `<YOUR_HYPHENATED_NAME>-jupyterlab`
- __Environment:__ `aisg-jupyterlab-server-0-1-0`
- __Compute Resource:__ `cpu-mid`
- __Data Source:__ The PVC that is dedicated to your project. For a
sample project, you may make use of `sample-project-pvc`.

!!! warning "Attention"
Under the `Environment` block, there is an expandable section
called `More settings`. Under this section, you can provide more
arguments for a container that will be spun up for the
workspace. For the JupyterLab interface to be able to access any
PVC mounted to the container, you should include the following argument: `--NotebookApp.notebook_dir="/path/to/pvc"`.

Once you have selected the blocks, you can proceed to create the
workspace and you will be redirected to the workspaces page. On this
page, you may view the status of the workspace that you have just
created.

![Run:ai Dashboard - Workspaces Page Post JupyterLab](assets/screenshots/runai-dashboard-workspaces-page-post-jupyterlab.png)

=== "Run:ai YAML"

You can also create a JupyterLab workspace with the YAML file
`aisg-context/runai/02b-jupyterlab.yml`. Before that, however, you
would need to prepare the workspace by applying the following
manifest:

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/01-workspace-prep.yml
```

After that, you can spin up the workspace with:

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/02b-jupyterlab.yml
```

Once the workspace is active (indicated by a green status), you may
access the workspace by clicking on the `CONNECT` button and choosing
@@ -328,9 +381,9 @@ directed to a view like the following:

![Run:ai - JupyterLab Server Welcome](assets/screenshots/runai-jupyterlab-server-launcher.png)

??? info "Reference Link(s)"

- [Jupyter Server Docs - Config file and command line options](https://jupyter-server.readthedocs.io/en/stable/other/full-config.html#other-full-config)

!!! Info
Do head over
@@ -373,6 +426,6 @@ server as well as any associated files can be found under
themselves is not feasible by default and while possible,
should be avoided.

??? info "Reference Link(s)"

- [Using Docker-in-Docker for your CI or testing environment? Think twice. - jpetazzo](https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/)

@@ -1,5 +1,7 @@
# Virtual Environment

## Creating Persistent `conda` Environments in the Workspace

While the Docker images you will be using to run experiments on Run:ai
would contain the `conda` environments you would need, you can also
create these virtual environments within your development environment,
@@ -47,11 +49,11 @@ workspace directory:
`pip install` executions. However, there's no similar flag for
`conda` at the moment, so the above is a blanket solution.
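
While the exact commands are covered in the collapsed steps above, a
rough, hedged sketch of the idea (assuming your PVC is mounted at
`/<NAME_OF_DATA_SOURCE>` and you have an environment YAML file on hand)
is to point `conda` at a prefix that lives on the PVC:

```bash
# Hedged sketch: the file name and paths below are placeholders.
# Creating the environment under the PVC keeps it across workspace restarts.
conda env create \
    -f <PATH_TO_CONDA_ENV_FILE>.yml \
    --prefix /<NAME_OF_DATA_SOURCE>/workspaces/<YOUR_HYPHENATED_NAME>/conda_envs/<ENV_NAME>
```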

??? info "Reference Link(s)"

- [`conda` Docs - Managing environments](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file)
- [StackOverflow - "Pip install killed - out of memory - how to get around it?"](https://stackoverflow.com/questions/57058641/pip-install-killed-out-of-memory-how-to-get-around-it)
- [phoenixNAP - Linux alias Command: How to Use It With Examples](https://phoenixnap.com/kb/linux-alias-command#:~:text=In%20Linux%2C%20an%20alias%20is,and%20avoiding%20potential%20spelling%20errors.)

## Jupyter Kernel for VSCode

@@ -146,6 +148,6 @@ within your `conda` environment.

- Test out the kernel by running the cells in the sample notebook.
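
The collapsed steps above cover the exact flow for this template; as a
hedged aside, one common way to expose a `conda` environment to the
Jupyter extension is to register it as a named kernel (assuming
`ipykernel` is installed in that environment, and with the name below
as a placeholder):

```bash
# Register the active conda environment as a Jupyter kernel that the
# VSCode kernel picker can then select.
python -m ipykernel install --user --name <ENV_NAME> --display-name "Python (<ENV_NAME>)"
```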

??? info "Reference Link(s)"

- [Jupyter Docs - Kernels (Programming Languages)](https://docs.jupyter.org/en/latest/projects/kernels.html)

@@ -16,6 +16,12 @@ at hand within our VSCode server workspace.
$ unzip mnist-pngs-data-aisg.zip
```

=== "Run:ai YAML"
```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/03b-data-download.yml
```
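
Whichever method you used, you can do a quick sanity check from a
workspace terminal that the dataset has landed on the PVC (a hedged
example; the exact path depends on where you ran the commands or on the
values set in the YAML file):

```bash
# List the extracted dataset on the project PVC
ls -lah /<NAME_OF_DATA_SOURCE>/workspaces/<YOUR_HYPHENATED_NAME>/data/mnist-pngs-data-aisg
```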

!!! info
The sample data for this guide's problem statement is made
accessible to the public. Hence any team or individual can download

@@ -32,9 +32,9 @@ the CLI.
[Hydra](https://hydra.cc/)'s
concepts before you move on.

??? info "Reference Link(s)"

- [Hydra Docs - Basic Override Syntax](https://hydra.cc/docs/advanced/override_grammar/basic/)
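
As a quick, hedged illustration of the override syntax referenced above
(the configuration keys are ones that appear later in this guide, while
the paths are placeholders), values are overridden by passing
`key=value` pairs on the command line:

```bash
# Override two config values of the data processing script at run time
python src/process_data.py \
    process_data.raw_data_dir_path=/path/to/raw \
    process_data.processed_data_dir_path=/path/to/processed
```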

## Data Preparation & Preprocessing

@@ -63,6 +63,13 @@ provided in this template:
$ docker push {{cookiecutter.registry_project_path}}/data-prep:0.1.0
```

=== "Run:ai YAML"

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/04-docker-build-dataprep.yml
```

Now that we have the Docker image pushed to the registry, we can submit
a job using that image to Run:ai\:

@@ -92,6 +99,13 @@ a job using that image to Run:ai\:
--command -- "/bin/bash -c 'source activate {{cookiecutter.repo_name}} && python src/process_data.py process_data.raw_data_dir_path=/<NAME_OF_DATA_SOURCE>/workspaces/<YOUR_HYPHENATED_NAME>/data/mnist-pngs-data-aisg process_data.processed_data_dir_path=/<NAME_OF_DATA_SOURCE>/workspaces/<YOUR_HYPHENATED_NAME>/data/processed/mnist-pngs-data-aisg-processed'"
```

=== "Run:ai YAML"

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/05-dataprep.yml
```
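
To follow the job's output from the terminal instead of the dashboard,
a hedged example (the namespace and pod name are placeholders that
depend on your project and on the values in the YAML file) would be:

```bash
# The pod name can be looked up with: kubectl get pods -n runai-<PROJECT_NAME>
kubectl logs -f -n runai-<PROJECT_NAME> <POD_NAME>
```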

After some time, the data processing job should conclude and we can
proceed with training the predictive model.
The processed data is exported to the directory
@@ -129,10 +143,10 @@ To log and upload artifacts to ECS buckets through MLflow, you need to
ensure that the client has access to the credentials of an account that
can write to a bucket.
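
In practice, one way to do this is to export the relevant credentials
as environment variables before running the client. A hedged example
for an S3-compatible store such as ECS (the values are placeholders,
and how credentials are actually injected depends on your setup):

```bash
# Credentials of an account that can write to the artifact bucket
export AWS_ACCESS_KEY_ID=<YOUR_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_ACCESS_KEY>
# Point MLflow's S3 client at the ECS endpoint instead of AWS
export MLFLOW_S3_ENDPOINT_URL=<ECS_S3_ENDPOINT_URL>
```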

??? info "Reference Link(s)"

- [MLflow Docs - Tracking](https://www.mlflow.org/docs/latest/tracking.html#)
- [MLflow Docs - Tracking (Artifact Stores)](https://www.mlflow.org/docs/latest/tracking.html#artifact-stores)

### Container for Experiment Job

@@ -159,6 +173,13 @@ we need to build the Docker image to be used for it:
$ docker push {{cookiecutter.registry_project_path}}/model-training:0.1.0
```

=== "Run:ai YAML"

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/06-docker-build-modeltraining.yml
```

Now that we have the Docker image pushed to the registry,
we can run a job using it:

@@ -198,6 +219,13 @@ we can run a job using it:
--command -- "/bin/bash -c 'source activate {{cookiecutter.repo_name}} && python src/train_model.py train_model.data_dir_path=/<NAME_OF_DATA_SOURCE>/workspaces/<YOUR_HYPHENATED_NAME>/data/processed/mnist-pngs-data-aisg-processed train_model.setup_mlflow=true train_model.mlflow_tracking_uri=<MLFLOW_TRACKING_URI> train_model.mlflow_exp_name=<NAME_OF_DEFAULT_MLFLOW_EXPERIMENT> train_model.model_checkpoint_dir_path=/<NAME_OF_DATA_SOURCE>/workspaces/<YOUR_HYPHENATED_NAME>/{{cookiecutter.repo_name}}/models train_model.epochs=3'"
```

=== "Run:ai YAML"

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/07-modeltraining.yml
```

Once you have successfully run an experiment, you may inspect the run
on the MLflow Tracking server. Through the MLflow Tracking server
interface, you can view the metrics and parameters logged for the run,
@@ -218,10 +246,10 @@ bucket. You can also compare runs with each other.
specific Run:ai job by using MLflow's search filter expressions
and API.

??? info "Reference Link(s)"

- [Run:ai Docs - Environment Variables inside a Run:ai Workload](https://docs.run.ai/v2.9/Researcher/best-practices/env-variables/)
- [MLflow Docs - Search Runs](https://mlflow.org/docs/latest/search-runs.html)

!!! info
If your project has GPU quotas assigned to it, you can make use of
@@ -366,10 +394,17 @@ by default.
--command -- "/bin/bash -c 'source activate {{cookiecutter.repo_name}} && python src/train_model.py --multirun train_model.data_dir_path=/<NAME_OF_DATA_SOURCE>/workspaces/<YOUR_HYPHENATED_NAME>/data/processed/mnist-pngs-data-aisg-processed train_model.setup_mlflow=true train_model.mlflow_tracking_uri=<MLFLOW_TRACKING_URI> train_model.mlflow_exp_name=<NAME_OF_DEFAULT_MLFLOW_EXPERIMENT> train_model.model_checkpoint_dir_path=/<NAME_OF_DATA_SOURCE>/workspaces/<YOUR_HYPHENATED_NAME>/{{cookiecutter.repo_name}}/models train_model.epochs=3'"
```

=== "Run:ai YAML"

```bash
# Change the values within the file if any before running this
kubectl apply -f aisg-context/runai/08-modeltraining-hp.yml
```
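
For a sense of what the `--multirun` flag seen in the command above
does, here is a hedged illustration using a parameter already present
in this guide: with Hydra's default sweeper this launches one run per
listed value, while with the Optuna sweeper referenced below such a
list is instead treated as a categorical search space to sample from.

```bash
# Sweep over several values of the number of training epochs
python src/train_model.py --multirun train_model.epochs=3,5,10
```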

![MLflow Tracking Server - Hyperparameter Tuning Runs](assets/screenshots/mlflow-tracking-hptuning-runs.png)

??? info "Reference Link(s)"

- [Run:ai Docs - Environment Variables inside a Run:ai Workload](https://docs.run.ai/v2.9/Researcher/best-practices/env-variables/)
- [Hydra Docs - Optuna Sweeper Plugin](https://hydra.cc/docs/plugins/optuna_sweeper/)
- [MLflow Docs - Search Syntax](https://www.mlflow.org/docs/latest/search-syntax.html)