allow launcher to target remote compute like Kubernetes or AWS ECS #1702

Open
3 tasks
Tracked by #1696
mwartell opened this issue Oct 17, 2022 · 2 comments

Comments

@mwartell (Collaborator) commented Oct 17, 2022

The launcher phase of armory gathers up environment variables and local mounts, then asks a local Docker daemon to run an engine container. This yields a partially remote application, but one that can only be launched from localhost.

It would be useful to make the engine "run anywhere", where "anywhere" could be a Kubernetes cluster, AWS Elastic Container Service (ECS), HashiCorp Nomad, Azure Kubernetes Service, etc.

This would allow an Armory user to spawn multiple concurrent instances like:

for epsilon in 0.1 0.2 0.3; do
    armory run my-experiment.yaml --attack.epsilon=$epsilon --remote=my-kubernetes-cluster
done

The major hurdles to implementing this are:

  • a determination by the Armory leads on which remotes are worth adding
  • a mechanism for the remote to pull resources (weights, datasets, repos, etc.) at run time
  • a mechanism to get results and logs back from the remote

The code should be as platform-agnostic as possible so that, for example, an Azure deployment is easy once the ECS part is written. I am not aware of a generic adaptation library that could help with this.
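As a sketch of what such a platform-agnostic dispatch layer might look like (all names here are illustrative, not part of armory today), a thin shell function could map a --remote target onto the backend command that would launch the engine there:

```shell
#!/bin/sh
# Hypothetical sketch only: map a --remote target to the command that
# would launch the armory engine container on that backend. The remote
# names, image name, and command shapes are assumptions for illustration.

build_launch_cmd() {
  remote="$1"   # e.g. local-docker, my-kubernetes-cluster, my-ecs-cluster
  image="$2"    # engine image to run

  case "$remote" in
    local-docker)
      # today's behavior: ask the local Docker daemon
      echo "docker run --rm $image" ;;
    *kubernetes*|*k8s*)
      # a Kubernetes backend would submit a pod instead
      echo "kubectl run armory-engine --image=$image --restart=Never" ;;
    *ecs*)
      # an ECS backend would launch a task on the named cluster
      echo "aws ecs run-task --cluster $remote --task-definition armory-engine" ;;
    *)
      echo "unsupported remote: $remote" >&2
      return 1 ;;
  esac
}

# prints the command a kubernetes-style remote would use
build_launch_cmd my-kubernetes-cluster twosixarmory/engine:latest
```

The point of the sketch is that only this dispatch layer varies per platform; resource pulls and result retrieval would sit behind the same kind of narrow interface.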

@christopherwoodall (Contributor) commented Oct 28, 2022

Containerized Armory POC: https://gist.github.com/christopherwoodall/2b30dbd807e6eb77edef25ad1e862820

curl -fsSLO https://gist.github.com/christopherwoodall/2b30dbd807e6eb77edef25ad1e862820/raw/2d0695bf1b08af6ae7a30fab9538db600f22bf81/armory-launcher.sh && \
chmod +x armory-launcher.sh && \
./armory-launcher.sh

Currently it only supports Docker, but it demonstrates that the entire application can be containerized, allowing armory to target cluster environments without modifying the host system.

To further aid this effort, work has been done to consolidate armory into a single container image. Aside from simplicity, this reduces the application size 4x, to 12 GB from the 48 GB of previous versions.
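For illustration only (the image name and arguments are assumptions, not published artifacts), a single-image armory is exactly what makes cluster targets easy: the script below emits a minimal Kubernetes Job manifest for one run, which on a real cluster would be piped to `kubectl apply -f -`:

```shell
#!/bin/sh
# Illustrative sketch: generate a minimal Kubernetes Job manifest for a
# single-image armory run. Image name and config path are hypothetical.

manifest=$(cat <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: armory-run
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: armory
        image: twosixarmory/armory:latest   # assumed image name
        args: ["run", "my-experiment.yaml"] # assumed entrypoint args
EOF
)

printf '%s\n' "$manifest"
```

A Job (rather than a bare pod) fits the batch nature of an armory evaluation: it runs to completion, and its logs can be fetched afterwards, which speaks to the "get results and logs back" hurdle above.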

@christopherwoodall (Contributor) commented Oct 28, 2022

I also think @shenshaw26's comment is relevant:

On Containerization:

I am still not sure what we are trying to achieve with the armory containerization bits. For example, why don't we just let armory be the execution engine, such that armory run ... executes everything in the native environment, and then, if a person wants to use a docker container, they can simply call docker run -it [container] ... armory run ..... It seems like there are a variety of ways one might want to run armory inside docker (e.g. mount external drives for storage vs. keep it all completely contained in the container) and it's a lot of code maintenance / support to try to cover these bits.
It seems like there is some machinery in armory pointing to an approach where armory would manage multiple armory instances in parallel... is that something we actually support? If so, do we really need to? If it is a real requirement, we should use some sort of orchestration layer on top of docker (e.g. Kubernetes, Mesos, etc.) to make this happen... am I missing something?
What is the level of use of the various armory execution modes (e.g. armory exec, --interactive, etc.)? In my very short time here, I have really only seen people use armory run thing.json, and if that is all that is used, we could drastically simplify the codebase, which would make it much easier to maintain. Also, FWIW, many of these things feel like they belong in docs for how to do things in docker.

As well as @davidslater's comment:

Regarding containerization:

Portability/replicability/integration. For the program, it is extremely important that we can replicate all that is being done with a defensive technique. Once we start adding a lot of docker connections that we don't have control over, it becomes much harder to validate/verify what has been done on their end. In my experience, usage of native mode has been rare (mainly due to folks who can't install docker), so it doesn't make sense to me to default to that. I do think it would be good to enable the default environment mode to be configurable so that people don't have to pass kwargs all the time for that, though.

The multiple armory instances functionality can be dropped. Typically, we do parallelization with separate armory run commands, and an orchestrator that does this parallelization and then collates results should probably be separate from the core codebase.
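That "orchestrator outside the core" idea can be sketched in plain shell: fan out separate armory run commands with xargs rather than having armory manage parallel instances itself. In this sketch, armory is replaced by echo so it runs anywhere; on a real system the echo would be dropped:

```shell
#!/bin/sh
# Sketch of external orchestration: launch independent `armory run`
# commands in parallel with xargs. `echo` stands in for the real armory
# binary, so this only prints the commands that would be executed.
# Output order may vary because up to 3 jobs run concurrently (-P3).

out=$(printf '%s\n' 0.1 0.2 0.3 |
  xargs -n1 -P3 -I{} echo armory run my-experiment.yaml --attack.epsilon={})

printf '%s\n' "$out"
```

Collating results would then be a separate post-processing step over each run's output directory, keeping parallelization entirely outside the core codebase, as suggested.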

I use --interactive mode pretty extensively. I think Lucas does as well. It's very helpful for setting breakpoints, debugging issues, iterating more quickly, etc. For exec mode, I only occasionally use that, but it's almost identical to what launch --interactive is doing. The --jupyter mode sees less use, but is also one of the more requested modes, so I would like to expand usage of that.
