This Ansible playbook allows you to setup an Ubuntu instance (18.04) ready for a Data Carpentry Genomic Workshop. This allows the workshop to easily be run on various cloud platforms.
You'll typically want two cores and ~4Gb RAM per instance.
- Ansible (eg
pip install ansible
)
-
Edit
hosts
to add the IP(s) or hostname(s) of your instances. Ensure you've runssh-add ~/.ssh/some_key
to allow you to access these without a password. The password for thedcuser
account can be changed invars/all.yml
. -
Run the playbook:
ansible-playbook -i hosts training-instance.yml
Typically you'd run this on a single instance, snapshot it to create an image and then launch many instances for workshop participants based on that image.
The instances can also be setup for the "Singularity Containers for Bioinformatics" and "Containers on HPC and Cloud with Singularity" tutorials written by Marco De La Pierre at the Pawsey Super Computing Centre.
ansible-galaxy install --force -r requirements.yml -p galaxy-roles/
ansible-playbook -i hosts containers.yml
In this case, users are given sudo
access in order to allow them to run docker build
etc.
(a NeCTAR image exists for this too, named containers-workshop-20200720
).
If you would like to launch a set of instance on Openstack / NeCTAR using the shell you'll need the openstack
commandline client (pip install python-openstackclient
), source your credentials (source My-Nectar-Project-openrc.sh
) and run something like:
openstack server create \
--image data-carpentry-genomics-20191121 \
--key-name mbp_hosts \
--flavor m3.small \
--availability-zone monash-01 \
--security-group default \
--max 20 \
--wait \
dc-genomics-training
This assumes there is a public/community image or snapshot named data-carpentry-genomics-20191121
- this one exists in NeCTAR since I created it.
You can get a list of the running server IPs to paste into the Etherpad with:
openstack server list --format csv | tail -n +2 | \
grep dc-genomics-training | cut -f 4 -d ',' | \
sed 's/\"//g' | cut -f 2 -d '=' | grep . | sort -h
When the workshop is finished and you want to destroy all the servers:
# Careful - all servers and their data in the current 'project'
# with 'dc-genomics-training' in the name will be destroyed, unrecoverably.
server_id_list=$(openstack server list --format csv | tail -n +2 | grep dc-genomics-training | cut -f 1 -d ',' | sed 's/\"//g' | cut -f 2 -d '=' | grep . | sort -h | xargs)
openstack server delete ${server_id_list}
If you have access to the instances as the sudo
user ubuntu
, you can change the dcuser
password en masse using Ansible like:
# Create a file with one host IP per line
openstack server list --format csv | tail -n +2 | \
grep dc-genomics-training | cut -f 4 -d ',' | sed 's/\"//g' | cut -f 2 -d '=' | grep . \
| cut -d "'" -f 4 | sort -h \
>hosts
# Ensure you have the passlib library installed locally
pip install passlib
# Use Ansible to change the password on all hosts
ansible all -i hosts -m user -a "name=dcuser update_password=always password={{ 'thenewpassword' | password_hash('sha512') }}" -u ubuntu --become
- Add RStudio Server
- Edit .bashrc (remove
ll
alias, maybe other things) - Automate generating a NeCTAR / OpenStack image after instance setup, eg http://khmel.org/?p=1188
- Create a testing task that runs through the Data Wrangling section commands to test that the instance works.