Until recently, I have always used cc.start_cluster() to start up multiple cores. But @angus-g's recent work has shown we can do a better job by starting a scheduler using the following protocol:
1. In a terminal on VDI (either over VNC, or through SSH inside screen/tmux), run: dask-scheduler
2. This should output the scheduler address, like tcp://10.0.64.24:8786.
3. In another terminal (ensuring that the default conda module has cosima_cookbook installed, as all workers will need access to it), run: dask-worker tcp://10.0.64.24:8786 --memory-limit 4e9 --nprocs 6 --nthreads 1 --local-directory /local/g40/amh157
4. Then make sure the following cell matches the scheduler address.
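For reference, the notebook cell in question is typically just a dask Client pointed at the scheduler. This is a hypothetical example (the address is the one from this issue; substitute whatever dask-scheduler printed on your node):

```python
# Connect a dask Client to the manually started scheduler.
# The address below is an example; replace it with the one printed
# by dask-scheduler on your node.
from dask.distributed import Client

scheduler_address = "tcp://10.0.64.24:8786"  # changes whenever you get a new node
# client = Client(scheduler_address)  # uncomment once the scheduler is running
```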
I have implemented this in a lot of the access-om2 report notebooks, but it is clunky and requires some manual intervention. For example, whenever I am allocated a different node I have to change the tcp address, and others will need to modify the local directory if they want to run it.
The ideal solution would be a cookbook function that does this for us, taking arguments such as memory-limit and nprocs. Is this possible? It would effectively be a replacement for start_cluster(), easily deployed to everyone.
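Purely as a sketch of what such a helper might look like (nothing here exists in cosima_cookbook yet; the function name and defaults are hypothetical, chosen to mirror the dask-worker invocation above), dask.distributed's LocalCluster can take the same knobs directly:

```python
# Hypothetical replacement for cc.start_cluster(): wrap LocalCluster so
# users never have to copy a tcp address or edit a local directory.
import tempfile

from dask.distributed import Client, LocalCluster


def start_cluster(memory_limit="4GB", nprocs=6, nthreads=1,
                  local_directory=None, **kwargs):
    """Start a local dask cluster and return a connected Client.

    Defaults mirror the manual invocation:
    dask-worker ... --memory-limit 4e9 --nprocs 6 --nthreads 1
    """
    if local_directory is None:
        # e.g. /local/<project>/<user> on VDI; fall back to a temp dir
        local_directory = tempfile.mkdtemp(prefix="dask-worker-")
    cluster = LocalCluster(
        n_workers=nprocs,
        threads_per_worker=nthreads,
        memory_limit=memory_limit,
        local_directory=local_directory,
        **kwargs,
    )
    return Client(cluster)
```

Because the scheduler address lives inside the returned Client, notebooks would no longer need the hard-coded tcp cell at all.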