update practices

guorbit · Dec 15, 2023 · 8015951 · 8015951
1 parent 3153be2
commit 8015951
Showing 6 changed files with 213 additions and 17 deletions.
diff --git a/docs/source/conventions_and_guidelines/conventions.md b/docs/source/conventions_and_guidelines/conventions.md
@@ -0,0 +1,133 @@
+## Conventions
+
+### Documentation
+Most of the codebase is documented with sphinx-autodoc, which generates documentation from the docstrings in the code. We use it mainly so that the documentation of all the projects is easy to find in one place and is up to date most of the time.
+This provides:
+1. A centralised place for all the documentation
+2. A way to easily understand why certain things are in place
+3. A way for new people to get up to speed with the project and the team
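+
+As an illustration of what autodoc picks up, the sketch below shows a function documented with reStructuredText field lists. `normalise` is a hypothetical example and not a function from the codebase; the docstring style is an assumption, so follow whatever style the project already uses.
+
+```python
+def normalise(values: list[float]) -> list[float]:
+    """Scale a list of numbers so that they sum to one.
+
+    :param values: Numbers whose total is not zero.
+    :returns: The values divided by their total.
+    :raises ValueError: If the total of ``values`` is zero.
+    """
+    total = sum(values)
+    if total == 0:
+        raise ValueError("values must not sum to zero")
+    return [v / total for v in values]
+```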
+
+A template for the documentation generator action can be found below:
+```yaml
+# Simple workflow for deploying static content to GitHub Pages
+name: Deploy Documentation on Pages
+
+on:
+  # Runs on pushes targeting the default branch
+  push:
+    branches: ["main"]
+
+  # Allows you to run this workflow manually from the Actions tab
+  workflow_dispatch:
+
+# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
+# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
+concurrency:
+  group: "pages"
+  cancel-in-progress: false
+
+jobs:
+  # Build the documentation with Sphinx and upload it as an artifact
+  build:
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python 3.10
+      uses: actions/setup-python@v3
+      with:
+        python-version: "3.10"
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install .[dev]
+        pip install -U sphinx
+        pip install furo
+        
+    - name: Build documentation
+      run: |
+        cd docs
+        sphinx-apidoc -e -M --force -o . ../utilities/
+        make html
+    - name: Upload build data
+      uses: actions/upload-artifact@v3
+      with:
+        name: documentation
+        path: ./docs/build/
+
+  deploy:
+    needs: build
+    environment:
+      name: documentation
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    steps:
+
+      - name: Checkout
+        uses: actions/checkout@v3
+      - name: Setup Pages
+        uses: actions/configure-pages@v3
+      - name: Download built directory
+        uses: actions/download-artifact@v3
+        with:
+          name: documentation
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v1
+        with:
+          # Upload entire repository
+          path: '.'
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v1
+
+```
+
+The action above can be updated to use other triggers, or to build from a project file.
+
+>**Note:** The action above is a template, also available as a template repository (using it is strongly encouraged). Update it to fit the project it is used for.
+
+### Typing
+
+Python typing is used to provide type hints so developers better understand what the code is doing, and to provide better static analysis of the codebase.
+
+Typing is added to the codebase using the following syntax:
+```python
+def function_name(arg1: type, arg2: type) -> return_type:
+    ...
+```
+
+Or, in practice:
+```python
+def add(a: int, b: int) -> int:
+    return a + b
+```
+
+Type hints can be used throughout the code, including with your own classes:
+```python
+class Person:
+    def __init__(self, name: str, age: int):
+        self.name = name
+        self.age = age
+
+def get_person_age(person: Person) -> int:
+    return person.age
+
+```
+
+It is recommended to set your IDE's type checker to basic mode to help you identify common problems.
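+
+Container and optional types follow the same pattern. The sketch below is only an illustration; `find_oldest` is a hypothetical helper, not part of the codebase.
+
+```python
+from typing import Optional
+
+
+def find_oldest(ages: dict[str, int]) -> Optional[str]:
+    """Return the name with the highest age, or None for an empty mapping."""
+    if not ages:
+        return None
+    return max(ages, key=lambda name: ages[name])
+```
+
+The `Optional[str]` return type makes the type checker flag callers that forget to handle the empty case.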
+
+### Linting
+
+Linting is used to provide a consistent code style across the codebase, and to help identify common problems.
+
+The template repository provides the necessary configuration for linting, and it is recommended to use it.
+
+
+
diff --git a/docs/source/conventions_and_guidelines/conventions_tree.rst b/docs/source/conventions_and_guidelines/conventions_tree.rst
@@ -0,0 +1,8 @@
+Conventions and guidelines
+==========================
+
+.. mdinclude:: conventions.md
+
+
+
+
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -7,12 +7,6 @@ Welcome to teamdocs's documentation!
 ====================================
 
 
-On this page you find the documentation for the GU Orbit software team.
-The documentation contains information about:
-- The Team
-- The Goal
-- The Projects
-- The Development procedures
 
 .. mdinclude:: main.md
 
@@ -21,6 +15,10 @@ The documentation contains information about:
    :caption: Table of contents:
 
    projects/project_tree.rst
+   conventions_and_guidelines/conventions_tree.rst
+   tips_and_recommendations/tips_tree.rst
+
+
 
 
 Indices and tables

diff --git a/docs/source/main.md b/docs/source/main.md
@@ -1,15 +1,26 @@
-# Hello world
 
-a lot of other stuf here ...
-```python
-print("Hello world")
-```
+On this page you will find the documentation for the GU Orbit software team.
+The documentation contains information about:
+- The Team
+- The Goal
+- The Projects
+- The Development procedures
 
-<details>
-<summary>Click to expand</summary>
+## The Team
+
+The GU Orbit software team is a team of students at the University of Glasgow, developing models, onboard processing software, and pipelines for the ASTREOUS 1 CubeSat project of GU Orbit.
+
+## The Goal
+
+The goal is to achieve efficient onboard processing capabilities for the ASTREOUS 1 CubeSat project of GU Orbit, providing better overall selection of the data to be downlinked.
+Using deep learning we aim to identify common features and patterns, making data selection more effective.
+
+
+Before you start working on the team, familiarise yourself with the team's documentation in the following sections:
+- How to contribute
+- How to communicate
+- How to distribute work
+- How to do code reviews
+- Conventions to follow
 
-```python
-print("Hello world")
-```
 
-</details>
diff --git a/docs/source/tips_and_recommendations/tips_tree.rst b/docs/source/tips_and_recommendations/tips_tree.rst
@@ -0,0 +1,6 @@
+Tips and recommendations
+========================
+
+.. mdinclude:: training.md
+
+.. mdinclude:: testing.md
diff --git a/docs/source/tips_and_recommendations/training.md b/docs/source/tips_and_recommendations/training.md
@@ -0,0 +1,40 @@
+## Training and Testing Deep models
+
+### Identifying common problems
+
+In most cases it is recommended to use a visualizer to monitor and log the model's performance during training.
+There are many visualizers, but my recommendation is to use either:
+- TensorBoard, which comes with TensorFlow and is served locally by default
+- Weights and Biases, a third-party tool that can be used with any framework and is accessible from the browser
+
+**Overfitting:** A very common problem, where the model fits the training data too closely and fails to generalise. It can be identified by comparing the training and validation loss: the validation loss increases while the training loss keeps decreasing.
+
+**Underfitting:** Another common problem, where the model is not complex enough to fit the data, or the data is so noisy that the model cannot generalise over it. It can be identified by large errors on both the training and validation data.
+
+**Vanishing gradient:** A problem that occurs when the gradients of the loss function become so small that the weights barely change and the model stops learning. It can be identified by inspecting the gradients of the model and seeing them get smaller and smaller.
+
+**Exploding gradient:** A problem that occurs when the gradients of the loss function become so large that the updates are unstable and the model cannot learn. It can be identified by inspecting the gradients of the model and seeing them get larger and larger.
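+
+One way to watch for the two gradient problems is to log the overall gradient norm at every step. The sketch below is framework-independent and takes gradients as plain lists of floats, one list per layer. The function names and the `low` and `high` thresholds are assumptions chosen for illustration and should be tuned per model.
+
+```python
+import math
+
+
+def global_grad_norm(grads: list[list[float]]) -> float:
+    """Return the L2 norm over all gradient values of all layers."""
+    return math.sqrt(sum(g * g for layer in grads for g in layer))
+
+
+def diagnose_gradients(
+    grads: list[list[float]],
+    low: float = 1e-6,  # assumed threshold, tune per model
+    high: float = 1e3,  # assumed threshold, tune per model
+) -> str:
+    """Classify the gradient norm as 'vanishing', 'exploding' or 'ok'."""
+    norm = global_grad_norm(grads)
+    if math.isnan(norm) or norm > high:
+        return "exploding"
+    if norm < low:
+        return "vanishing"
+    return "ok"
+```
+
+Logging the value returned by `global_grad_norm` to the visualizer makes either trend visible over time.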
+
+### Selecting hyperparameters
+Hyperparameters are the settings chosen by the user rather than learned during training, such as batch size and learning rate.
+As a general rule of thumb, start with the following:
+- Set learning rate to 0.001
+- Set batch size to the maximum that fits in memory
+- Set number of epochs to 5-6 for fine tuning, 30-100 for training from scratch (depending on the size of the model and dataset)
+- Set optimizer to Adam
+- Set loss function to categorical crossentropy for classification, mean squared error for regression
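+
+The rules of thumb above can be kept in one place as a small configuration object. This is a minimal sketch; `TrainingConfig` and `fine_tuning_config` are hypothetical names, and the batch size default is only a placeholder because the right value depends on the available memory.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class TrainingConfig:
+    """Starting-point hyperparameters; every default is meant to be tuned."""
+
+    learning_rate: float = 1e-3
+    batch_size: int = 32  # placeholder: raise until memory is full
+    epochs: int = 30  # lower end of the range for training from scratch
+    optimizer: str = "adam"
+    loss: str = "categorical_crossentropy"  # use "mse" for regression
+
+
+def fine_tuning_config(batch_size: int) -> TrainingConfig:
+    """Return the defaults adjusted for fine-tuning a pretrained model."""
+    return TrainingConfig(batch_size=batch_size, epochs=5)
+```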
+
+It is also recommended to calculate accuracy during training and validation. For generative image tasks, however, this becomes expensive.
+
+### Data augmentation
+Data augmentation is a technique used to increase the effective size of the dataset by applying transformations to the data, such as rotation and flipping.
+Many data loading libraries have this built in, including those of TensorFlow and PyTorch, and our utilities library.
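+
+To show what such transformations do, the sketch below flips and rotates a tiny image stored as nested lists. It is a plain-Python illustration with hypothetical function names, not the API of any of the libraries mentioned above.
+
+```python
+Image = list[list[int]]  # rows of pixel values
+
+
+def flip_horizontal(image: Image) -> Image:
+    """Mirror the image left to right."""
+    return [row[::-1] for row in image]
+
+
+def rotate_90_clockwise(image: Image) -> Image:
+    """Rotate the image by 90 degrees clockwise."""
+    return [list(row) for row in zip(*image[::-1])]
+
+
+def augment(image: Image) -> list[Image]:
+    """Return the original image plus its flipped and rotated variants."""
+    return [image, flip_horizontal(image), rotate_90_clockwise(image)]
+```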
+
+### Transfer learning
+Transfer learning is a technique that reuses a model trained on one task for a different task. This is typically done by removing the last layer of the model, adding a new one, and training only the new layer. This is useful when the dataset is small and the model is large, as the model keeps what it learned from a larger dataset and is then fine-tuned on the new task.
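+
+The layer swap described above can be sketched without any framework. `Layer` is a hypothetical stand-in for a framework layer and `prepare_for_transfer` is an illustrative name; a real model would also need its weights loaded.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class Layer:
+    """Hypothetical stand-in for a framework layer."""
+
+    name: str
+    trainable: bool = True
+
+
+def prepare_for_transfer(pretrained: list[Layer], new_head: Layer) -> list[Layer]:
+    """Freeze the pretrained layers, drop the old head and attach a new one."""
+    frozen = [Layer(layer.name, trainable=False) for layer in pretrained[:-1]]
+    return frozen + [new_head]
+```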
+
+### Early stopping
+Early stopping is a technique used to stop training when the model stops improving. This is done by monitoring the validation loss and stopping once it no longer decreases. This is useful to prevent overfitting and to save time.
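+
+A minimal sketch of this idea is shown below. The `patience` and `min_delta` parameters are assumptions borrowed from common practice: training stops after `patience` consecutive epochs without an improvement of at least `min_delta`.
+
+```python
+class EarlyStopping:
+    """Signal a stop once validation loss stops improving for `patience` epochs."""
+
+    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
+        self.patience = patience
+        self.min_delta = min_delta
+        self.best_loss = float("inf")
+        self.epochs_without_improvement = 0
+
+    def should_stop(self, val_loss: float) -> bool:
+        """Record this epoch's validation loss and report whether to stop."""
+        if val_loss < self.best_loss - self.min_delta:
+            self.best_loss = val_loss
+            self.epochs_without_improvement = 0
+        else:
+            self.epochs_without_improvement += 1
+        return self.epochs_without_improvement >= self.patience
+```
+
+Call `should_stop` once per epoch with the validation loss and break out of the training loop when it returns `True`.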
+
+### Model checkpointing
+Model checkpointing is a technique used to save the model during training so that it can be loaded later. This is useful to avoid losing progress if training is interrupted.
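+
+The sketch below shows the idea with plain JSON files. The function names and the stored fields are assumptions for illustration; in practice the saving utilities of the framework in use would replace them.
+
+```python
+import json
+import os
+import tempfile
+
+
+def save_checkpoint(path: str, epoch: int, weights: list[float]) -> None:
+    """Write the state to a temporary file, then move it into place."""
+    directory = os.path.dirname(os.path.abspath(path))
+    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
+        json.dump({"epoch": epoch, "weights": weights}, tmp)
+    # Replacing the file in one step avoids leaving a half-written checkpoint
+    os.replace(tmp.name, path)
+
+
+def load_checkpoint(path: str) -> tuple[int, list[float]]:
+    """Read back the epoch and weights written by save_checkpoint."""
+    with open(path) as handle:
+        state = json.load(handle)
+    return state["epoch"], state["weights"]
+```
+
+Saving at the end of every epoch, or whenever the validation loss improves, keeps the most useful state on disk.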