Bootstrapping a new RSP environment

gpfrancis edited this page Jul 20, 2023 · 9 revisions

Instructions for setting up a test version of the RSP

In this walkthrough, we're setting up a new environment with the following info:

Step 1.

Set up Vault secret directory

Instructions can be found here:

https://github.com/lsst-uk/rsp-uk-docs/blob/291591d2f6980c79c0ef5c8dc1490c9e28e9df13/notes/20230510-vault-path-setup.txt

For the "rsptest" environment, this was done by Gareth.

The roe environment (rsp.lsst.ac.uk) currently uses the Rubin team Vault service: https://vault.lsst.codes

The rsp-test environment uses the UK Vault service: http://vault.lsst.ac.uk


Step 2.

Create NFS server

See: https://lsst-uk.atlassian.net/wiki/spaces/LUSC/pages/3199270914/RSP+NFS+Server

Step 3.

Create DNS record that we'll be using (rsp-test.lsst.ac.uk)

Navigate to: https://kbs-ddi.is.ed.ac.uk/ DNS / Zones (lsst.ac.uk) / + (Add new DNS RR)

   RR Type: A
   Complete name: rsp-test.lsst.ac.uk
   TTL: 3600
   IP: 192.41.122.208

Note: The IP needs to exist among the available floating IPs in the OpenStack RSP project.
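
Once the record has propagated, it is worth confirming that the name resolves to the intended address before moving on. The following is a minimal sketch, assuming `dig` is installed on the machine you run it from; the function names `resolve_a` and `check_dns_record` are illustrative and not part of any existing tooling.

```shell
# Resolve the A record for a host name.
# Assumption: `dig` (from dnsutils / bind-utils) is available.
resolve_a() {
    dig +short A "$1" | tail -n 1
}

# Succeed only if the host resolves to the expected address.
check_dns_record() {
    local host expected actual
    host=$1
    expected=$2
    actual=$(resolve_a "$host")
    if [ "$actual" = "$expected" ]; then
        echo "OK: $host -> $actual"
    else
        echo "MISMATCH: $host -> '$actual' (expected $expected)" >&2
        return 1
    fi
}

# Usage:
#   check_dns_record rsp-test.lsst.ac.uk 192.41.122.208
```

To check the floating IP side as well, something like `openstack floating ip list -f value -c "Floating IP Address" | grep -Fx 192.41.122.208` should print the address if it is allocated to the project.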


Step 4.

Create OAuth app in GitHub

(see: https://github.com/organizations/LSP-UK/settings/applications/2212097)

For rsp-test, we used the https://github.com/organizations/LSP-UK/ organization.

Settings > Developer Settings > New OAuth App

   Application Name:            rsp-test
   Homepage URL:                https://rsp-test.lsst.ac.uk
   Authorization callback URL:  https://rsp-test.lsst.ac.uk/login

   Client ID:  Client_ID
   Client secret: (Generate a new client Secret) 

Note the Client ID and Client Secret; they will be needed later.


Step 5.

Create fork of phalanx - Create configs for new environment

Clone phalanx repo:

    git clone https://github.com/lsst-uk/phalanx-test

    pushd phalanx-test

    git checkout dev/stv-newenv

(Optional)

You can use the phalanx_customizer.py script to create a new environment as a clone of an existing one, applying the modifications defined in a YAML configuration file.

Fetch script

   wget https://raw.githubusercontent.com/stvoutsin/phlx-customizer/main/phalanx_customizer/phalanx_customizer.py

Fetch environment YAML files:

   wget https://raw.githubusercontent.com/stvoutsin/phlx-customizer/main/envs/roe.yaml
   wget https://raw.githubusercontent.com/stvoutsin/phlx-customizer/main/envs/rsptest.yaml

   pip install -r https://raw.githubusercontent.com/stvoutsin/phlx-customizer/main/requirements.txt

For usage see: https://github.com/stvoutsin/phlx-customizer

   python3 phalanx_customizer.py . roe.yaml rsptest.yaml

Note: Because the roe environment uses the Rubin team Vault service, it does not have an applications/vault-secrets-operator/values-roe.yaml file.

If we are using a different Vault service, we need to create a config for it like this:

nano applications/vault-secrets-operator/values-rsptest.yaml

vault-secrets-operator:
  environmentVars:
    - name: VAULT_TOKEN
      valueFrom:
        secretKeyRef:
          name: vault-secrets-operator
          key: VAULT_TOKEN
    - name: VAULT_TOKEN_LEASE_DURATION
      valueFrom:
        secretKeyRef:
          name: vault-secrets-operator
          key: VAULT_TOKEN_LEASE_DURATION
  vault:
    address: "https://vault.lsst.ac.uk"
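
YAML does not accept tab characters for indentation, and pasted configs pick them up easily. A quick check before committing might look like the sketch below; `check_no_tabs` is a hypothetical helper name, and the check is deliberately stricter than YAML itself because it flags a tab anywhere on a line.

```shell
# Print every line of a file that contains a tab character and fail if
# any are found. A clean file prints nothing and returns success.
check_no_tabs() {
    local file tab
    file=$1
    tab=$(printf '\t')
    if grep -n "$tab" "$file"; then
        echo "ERROR: tab characters found in $file" >&2
        return 1
    fi
}

# Usage:
#   check_no_tabs applications/vault-secrets-operator/values-rsptest.yaml
```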

Commit to github phalanx repo

       git add applications/*
       git add environment/*
       git commit -m ".."
       git push

Step 6.

Set up Object Store container in OpenStack

Create a container named async-test using the OpenStack CLI:

openstack container create async-test

Set to publicly accessible:

swift post async-test  -r ".r:*"
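
To confirm that the read ACL was applied, the output of `swift stat` can be inspected. This is a small sketch that assumes the usual `Read ACL:` line in that output; `read_acl_is_public` is an illustrative name.

```shell
# Read `swift stat <container>` output on stdin and succeed only if the
# "Read ACL" line grants world read access (.r:*).
read_acl_is_public() {
    grep -E '^[[:space:]]*Read ACL:' | grep -Fq '.r:*'
}

# Usage:
#   swift stat async-test | read_acl_is_public && echo "async-test is public"
```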

Step 7.

Generate SSL certificate for DNS record, pass it into secrets

Brief Description:

  • Create a VM, install nginx server on it
  • Point rsp-test.lsst.ac.uk to machine
  • Install certbot on machine
  • Run certbot to generate Let's Encrypt Certificate
  • Copy certs & Key off to another machine
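
The steps above could be scripted roughly as follows. This is only a sketch under several assumptions: a Debian/Ubuntu VM, the `python3-certbot-nginx` plugin package, root access, and the DNS record already pointing at the VM. Mapping certbot's `fullchain.pem` to `tls.pub` and `privkey.pem` to `tls.key` is also an assumption, chosen to match the file names used in the next step.

```shell
# Paths where certbot keeps the live certificate material for a domain.
letsencrypt_chain() { echo "/etc/letsencrypt/live/$1/fullchain.pem"; }
letsencrypt_key()   { echo "/etc/letsencrypt/live/$1/privkey.pem"; }

# Install nginx and certbot, request a certificate, and stage copies
# under the file names used later in this walkthrough.
# Assumptions: run as root; Debian/Ubuntu package names.
issue_certificate() {
    local domain outdir
    domain=$1
    outdir=$2
    apt-get update
    apt-get install -y nginx certbot python3-certbot-nginx
    certbot certonly --nginx -d "$domain"
    cp "$(letsencrypt_chain "$domain")" "$outdir/tls.pub"
    cp "$(letsencrypt_key "$domain")" "$outdir/tls.key"
}

# Usage (the output directory is hypothetical):
#   issue_certificate rsp-test.lsst.ac.uk /root/certs
```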

Step 8.

Generate Secrets and push to Vault service

git clone https://github.com/lsst-uk/phalanx-test

cd phalanx-test

Create Docker config for pulling / pushing Docker images

Here auth needs to be the auth token you can get from Docker. If necessary you can do this as follows:

  • Log in to https://hub.docker.com
  • Go to your account settings, then security (https://hub.docker.com/settings/security)
  • Click on "new access token", give it a name and "generate"
  • Make a note of the access token
  • Convert it to a base64-encoded string. For example, if your Docker username is fred and your token is dckr_pat_v-MNbFBIFAZP5pgNxPOAEshQWa1, run echo -n "fred:dckr_pat_v-MNbFBIFAZP5pgNxPOAEshQWa1" | base64 to get the string ZnJlZDpkY2tyX3BhdF92LU1OYkZCSUZBWlA1cGdOeFBPQUVzaFFXYTE=

Then put that string in the Docker config file:

    nano ~/.docker/config.json

    {
        "auths": {
            "https://index.docker.io/v1/": {
                "auth": "ZnJlZDpkY2tyX3BhdF92LU1OYkZCSUZBWlA1cGdOeFBPQUVzaFFXYTE="
            }
        }
    }
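
If you prefer not to edit the file by hand, the encoding and the file contents can be generated together. The helper names below (`docker_auth`, `docker_config_json`) are illustrative; the sketch simply reproduces the base64 step and the JSON layout shown above.

```shell
# Base64-encode "username:token" in the form used for the "auth" field.
docker_auth() {
    printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Print a minimal Docker config for Docker Hub using that auth string.
docker_config_json() {
    printf '{\n  "auths": {\n    "https://index.docker.io/v1/": {\n      "auth": "%s"\n    }\n  }\n}\n' \
        "$(docker_auth "$1" "$2")"
}

# Usage (DOCKER_TOKEN is a hypothetical variable holding your access token):
#   mkdir -p ~/.docker
#   docker_config_json fred "$DOCKER_TOKEN" > ~/.docker/config.json
```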

Write the following empty files

echo "{}" > google_creds.json  
echo "{}" > aws-credentials.ini
echo "{}" > butler-gcs-idf-creds.json
echo "{}" > postgres-credentials.txt

Also copy over the tls.key & tls.pub from the certificates we created in Step 7 here.

./generate_secrets.py rsptest

  [pull-secret .dockerconfigjson] (.docker/config.json to pull images)
  > /absolute_path_to_file/.docker/config.json
  Pass in the path to the Docker auth file here

  [rsp-alerts slack-webhook] (Slack webhook for alerts): [current: ] 
  Can leave empty

  [butler-secret aws-credentials.ini] (AWS credentials for butler)
  > /absolute_path_to_file/aws-credentials.ini
  Path to empty file

  [butler-secret butler-gcs-idf-creds.json] (Google credentials for butler)
  > /absolute_path_to_file/butler-gcs-idf-creds.json
  Path to empty file

  [butler-secret postgres-credentials.txt] (Postgres credentials for butler)
  > /absolute_path_to_file/postgres-credentials.txt
  Path to empty file

  [tap google_creds.json] (file containing google service account credentials)
  > /absolute_path_to_file/google_creds.json
  Path to empty file

  [mobu ALERT_HOOK] (Slack webhook for reporting mobu alerts. Or use None for no alerting.): [current: ] 
  Can leave empty 

  [gafaelfawr cloudsql] (Use CloudSQL? (y/n):): [current: ]
  > n

  [gafaelfawr ldap] (Use LDAP? (y/n):): [current: ]
  > n

  [gafaelfawr auth_type] (Use cilogon or github?): [current: ]
  > github

  [gafaelfawr github-client-secret] (GitHub client secret): [current: ]
  GitHub Client Secret from Step 4

  [installer argocd.admin.plaintext_password] (Admin password for ArgoCD?): [current: ]
  Password to use for ArgoCD

  [argocd dex.clientSecret] (OAuth client secret for ArgoCD (either GitHub or Google)?): [current: ]
  Can leave empty

  [vo-cutouts cloudsql] (Use CloudSQL? (y/n):): [current: ]
  > n

  [telegraf influx-token] (Token for communicating with monitoring InfluxDB2 instance): [current: ]
  Can leave empty

  [cert-manager enabled] (Use cert-manager? (y/n):): [current: ]
  > n
  Set to y once we want to enable cert-manager

  [ingress-nginx tls.key] (Certificate private key)
  > /path/tls.key
  Private key from the certificate we created in Step 7

  [ingress-nginx tls.crt] (Certificate chain)
  > /path/tls.pub
  Certificate chain from the certificate we created in Step 7

Manual step required:

Because we are using OpenStack S3 instead of GCS, a manual step is required here to enable S3 storage for the TAP service results.

We have to manually modify the file generated under secrets/tap to look like this:

{
  "AWS_ACCESS_KEY_ID": "<Access>",
  "AWS_SECRET_ACCESS_KEY": "<Secret>",
  "google_creds.json": "EMPTY\n"
}

For this you will have to generate new EC2-style credentials for accessing Swift/S3 and set the corresponding values there, e.g.:

$ openstack ec2 credentials create
$ openstack ec2 credentials list
+----------------------------------+----------------------------------+----------------------------------+----------------------------------+
| Access                           | Secret                           | Project ID                       | User ID                          |
+----------------------------------+----------------------------------+----------------------------------+----------------------------------+
| 01234567890abcdef01234567890abcd | ef01234567890abcdef01234567890ab | 5b5102968e5347ad89676ae42a5510df | 3393434ef6ce337a8cd16a0d8201add3 |
+----------------------------------+----------------------------------+----------------------------------+----------------------------------+
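
To avoid copy-and-paste mistakes, the JSON can be rendered from the Access and Secret values. This is a sketch only: `render_tap_secret` is a hypothetical helper, and it prints to standard output so that you decide where the result goes.

```shell
# Print the TAP secret JSON for the given S3 access key and secret key.
# The google_creds.json entry is kept as the literal placeholder "EMPTY\n".
render_tap_secret() {
    printf '{\n  "AWS_ACCESS_KEY_ID": "%s",\n  "AWS_SECRET_ACCESS_KEY": "%s",\n  "google_creds.json": "EMPTY\\n"\n}\n' "$1" "$2"
}

# Usage, with the values from the Access and Secret columns above:
#   render_tap_secret 01234567890abcdef01234567890abcd ef01234567890abcdef01234567890ab
```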

Write secrets to Vault service

export VAULT_ADDR=http://vault.lsst.ac.uk
export VAULT_TOKEN=${VAULT_TOKEN} # Get this from step 1
./write_secrets.sh rsptest

Step 9.

Create Kubernetes cluster

https://github.com/lsst-uk/rsp-uk-docs/wiki/RSP-Deployment-instructions-on-Openstack-with-Magnum


Step 10.

Install the RSP

Option 1. If we don't need to modify the install.sh script in phalanx:

Run the RSP installation as described in:

https://github.com/lsst-uk/rsp-uk-docs/wiki/RSP-Deployment-instructions-on-Openstack-with-Magnum

Option 2. If we DO have to modify the install.sh script in phalanx.

Normally the installation can be run with a single run of the Docker container, which runs everything automatically.

In the branch dev/stv-new, I've had to modify the install.sh script to use the UK Vault service, because the Vault address is hard-coded to https://vault.lsst.codes

This needs to be proposed as a change to the Rubin team, so that we don't have to edit that file manually.

To install on a new environment that uses the UK Vault service, we have to modify the install.sh script in phalanx before running it, which means running the installer container interactively.

To do this you can change the entrypoint like this:

sudo docker run   \
  -it  \
  --hostname installer  \
  --env REPO=${REPO:?}  \
  --env VAULT_TOKEN=${VAULT_TOKEN:?}  \
  --env BRANCH=${BRANCH:?}  \
  --env ENVIRONMENT=${ENVIRONMENT:?}     \
  --volume ${CUR_DIRECTORY:?}"/phlx-installer/certs:/etc/kubernetes/certs"  \
  --volume ${CUR_DIRECTORY:?}"/phlx-installer/kube/config:/root/.kube/config" \
  --volume ${CUR_DIRECTORY:?}"/phlx-installer/scripts/install.sh:/root/install.sh"  \
  --volume ${CUR_DIRECTORY:?}"/phlx-installer/scripts/helper.sh:/root/helper.sh" \
  --entrypoint bash \
  installer

Note the use of --entrypoint bash

This will allow us to make a change to the installation script before running it.
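
Before starting the container, it can help to verify that every host path passed to --volume exists, because Docker creates a missing bind-mount source as an empty directory instead of failing. The sketch below assumes the phlx-installer layout used in the command above; `check_installer_mounts` is an illustrative name.

```shell
# Check that every host path bind-mounted by the docker run command exists.
# Takes the directory that contains phlx-installer/ as its only argument.
check_installer_mounts() {
    local base missing f
    base="$1/phlx-installer"
    missing=0
    [ -d "$base/certs" ] || { echo "missing directory: $base/certs" >&2; missing=1; }
    for f in kube/config scripts/install.sh scripts/helper.sh; do
        [ -f "$base/$f" ] || { echo "missing file: $base/$f" >&2; missing=1; }
    done
    return "$missing"
}

# Usage:
#   check_installer_mounts "${CUR_DIRECTORY:?}" && echo "ready to run the installer"
```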

From the interactive shell:

Install nano or other editor

apt-get update && apt-get install -y nano

Install storage class

/root/helper.sh

Fetch phalanx REPO & BRANCH

git clone $REPO /phalanx
git -C /phalanx checkout $BRANCH

Modify installer script

cd /phalanx/installer/

nano install.sh
   ..
   export VAULT_ADDR=http://vault.lsst.ac.uk
   ..
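
The same edit can be made without an editor. This sketch assumes the script contains the literal URL https://vault.lsst.codes, as described above; `set_vault_addr` is a hypothetical helper, and the replacement address must not contain `|` or `&` because it is passed straight to sed.

```shell
# Rewrite the hard-coded Rubin Vault URL in an installer script in place.
# Arguments: path to install.sh, replacement Vault address.
set_vault_addr() {
    local script new_addr tmp
    script=$1
    new_addr=$2
    tmp=$(mktemp)
    sed "s|https://vault\.lsst\.codes|$new_addr|g" "$script" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through the existing file so its mode (executable bit) is kept.
    cat "$tmp" > "$script"
    rm -f "$tmp"
}

# Usage:
#   set_vault_addr /phalanx/installer/install.sh http://vault.lsst.ac.uk
#   grep -n 'vault' /phalanx/installer/install.sh
```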

Run installer

./install.sh $ENVIRONMENT $VAULT_TOKEN

If we don't need to modify the installer script, we can just run:

sudo docker run   \
  -it  \
  --hostname installer  \
  --env REPO=${REPO:?}  \
  --env VAULT_TOKEN=${VAULT_TOKEN:?}  \
  --env BRANCH=${BRANCH:?}  \
  --env ENVIRONMENT=${ENVIRONMENT:?}     \
  --volume ${CUR_DIRECTORY:?}"/phlx-installer/certs:/etc/kubernetes/certs"  \
  --volume ${CUR_DIRECTORY:?}"/phlx-installer/kube/config:/root/.kube/config" \
  --volume ${CUR_DIRECTORY:?}"/phlx-installer/scripts/install.sh:/root/install.sh"  \
  --volume ${CUR_DIRECTORY:?}"/phlx-installer/scripts/helper.sh:/root/helper.sh" \
  installer