Until recently, I have always used cc.start_cluster() to start up multiple cores. But @angus-g's recent work has shown we can do a better job by starting a scheduler using the following protocol:
1. In a terminal on VDI (either over VNC, or through SSH inside screen/tmux), run: dask-scheduler
2. This should output the scheduler address, like tcp://10.0.64.24:8786.
3. In another terminal (ensuring that the default conda module has cosima_cookbook installed, as all workers will need access to it), run: dask-worker tcp://10.0.64.24:8786 --memory-limit 4e9 --nprocs 6 --nthreads 1 --local-directory /local/g40/amh157
4. Then make sure the following cell matches the scheduler address.
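For reference, the notebook cell in question is typically just a dask Client pointed at the scheduler. This is a hypothetical example (the address is the one from this issue; substitute whatever dask-scheduler printed on your node):

```python
# Connect a dask Client to the manually started scheduler.
# The address below is an example; replace it with the one printed
# by dask-scheduler on your node.
from dask.distributed import Client

scheduler_address = "tcp://10.0.64.24:8786"  # changes whenever you get a new node
# client = Client(scheduler_address)  # uncomment once the scheduler is running
```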
I have implemented this in a lot of the access-om2 report notebooks, but it is clunky and requires some manual intervention. For example, whenever I am allocated a different node I have to change the tcp address, and others will need to modify the local directory if they want to run it.
The ideal solution would be a cookbook function that does this for us, taking arguments such as memory-limit and nprocs. Is this possible? It would effectively be a replacement for start_cluster(), easily deployed to everyone.
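Purely as a sketch of what such a helper might look like (nothing here exists in cosima_cookbook yet; the function name and defaults are hypothetical, chosen to mirror the dask-worker invocation above), dask.distributed's LocalCluster can take the same knobs directly:

```python
# Hypothetical replacement for cc.start_cluster(): wrap LocalCluster so
# users never have to copy a tcp address or edit a local directory.
import tempfile

from dask.distributed import Client, LocalCluster


def start_cluster(memory_limit="4GB", nprocs=6, nthreads=1,
                  local_directory=None, **kwargs):
    """Start a local dask cluster and return a connected Client.

    Defaults mirror the manual invocation:
    dask-worker ... --memory-limit 4e9 --nprocs 6 --nthreads 1
    """
    if local_directory is None:
        # e.g. /local/<project>/<user> on VDI; fall back to a temp dir
        local_directory = tempfile.mkdtemp(prefix="dask-worker-")
    cluster = LocalCluster(
        n_workers=nprocs,
        threads_per_worker=nthreads,
        memory_limit=memory_limit,
        local_directory=local_directory,
        **kwargs,
    )
    return Client(cluster)
```

Because the scheduler address lives inside the returned Client, notebooks would no longer need the hard-coded tcp cell at all.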