Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/eng 107 strike agent robopages nerve testing templates #6

Merged
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions dreadnode_cli/agent/cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ def init(

AgentConfig(project_name=project_name, strike=strike).write(directory=directory)

install_template(template, directory, {"project_name": project_name})
install_template(template, directory, {"project_name": project_name, "guidance": strike_response.guidance or ""})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thought here that I didn't think about on our call, we actually have both guidance at a strike level, as well as guidance at a zone level (which would be tricky to apply here as the agent is expected to operate equivalently in all zones.

The agent will probably work fine, but be missing some useful information like "In this zone, you should be targeting X server"

Might be worth exploring dynamic guidance gathering on container start as an extension/alternative here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i realized that after pushing XD ... i'll start simple and add the zones guidance as part of the prompt but yes, it seems like a dropship endpoint to fetch strikes specific guidance would be ideal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monoxgas added the zones guidance to the nerve_basic and rigging_loop templates - i'm really not happy with this approach and 100% agree that the cleanest solution would be for the strike.yml file itself to define what to expose to the agent and then a http://dropship/context endpoint to return an AgentContext model at runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of the booleans we talked about during the call, what if the strike.yml contains a piece of jinja2 template code defining what to return (of its own fields) to the agent? Something like:

...
...

dropship:
    ....

zones:
  ...

agent_context: |
    {{ self.guidance }}
    {% if self.zones is defined and self.zones|length > 0 %}
    You can interact with the following zones:
    {% for zone in self.zones %}
        {{ zone.name }}: {{ zone.guidance }}
    {% endfor %}
    {% endif %}

So that the http://dropship/context handler reads this as part of the Strike model, sets self to the model itself and returns the output to the agent. I think this approach is cleaner and more versatile than the booleans we discussed and could be leveraged for all sorts of different cases.

Copy link
Contributor

@monoxgas monoxgas Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we are getting closer to the current design of having the dropship prepare endpoints/env vars for the agent so it has the context it needs. I like the idea of expanding this in general. I'll have to give the jinja template some thought, I'd like to avoid it if we can to reduce complexity, but it adds flexibility if we need it.

Here is where we can pass all any strike information to the agent via an ENV var: https://github.com/dreadnode/crucible/blob/2509eaee1311408e5701f65cd9bbac07493def6b/components/dropship/app/manager/base.py#L240 (guidance is a concatenation of strike + zone guidance, but they aren't delineated)

I also originally mounted a route on the dropship so the agent could get the guidance, but took it out: https://github.com/dreadnode/crucible/blob/2509eaee1311408e5701f65cd9bbac07493def6b/components/dropship/app/manager/base.py#L352

Would be easy to add back in a context endpoint that returned some structured JSON - sounds like a good place to put guidance, network layout/services/endpoints/hosts, and output structure/formatting/rules.

That seems like a better place to do any final adaptations to the agent before it kicks off. Nerve/rigging comes up, hits that endpoint, parses from known structure and adjusts internal logic + templates as needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monoxgas i have to merge this as it is in order to work on https://linear.app/dreadnode/issue/ENG-99/cli-agent-templates-per-strike ... i hope that's ok!


print()
print(f"Initialized [b]{directory}[/]")
Expand Down Expand Up @@ -203,7 +203,7 @@ def deploy(
return

with Live(formatted, refresh_per_second=2) as live:
while run.status not in ["completed", "failed", "timeout"]:
while run.status not in ["completed", "failed", "timeout", "terminated"]:
time.sleep(1)
run = client.get_strike_run(run.id)
live.update(format_run(run))
Expand Down
1 change: 1 addition & 0 deletions dreadnode_cli/agent/templates/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
class Template(str, enum.Enum):
rigging_basic = "rigging_basic"
rigging_loop = "rigging_loop"
nerve_basic = "nerve_basic"


def template_description(template: Template) -> str:
Expand Down
18 changes: 18 additions & 0 deletions dreadnode_cli/agent/templates/nerve_basic/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# image is automatically built and pushed to the docker hub at every commit to mai
FROM evilsocket/nerve:latest

WORKDIR /app
COPY task.yml .

# install required packages depending on the strike
RUN apt-get update && apt-get install -y postgresql-client wget curl

# the entrypoint is the nerve binary set by the base image

CMD [ \
# the generator to use
"-G", "http://dropship/v1", \
# the task file
"-Ttask.yml", \
# openai tool calling json format does not support example payloads, so force our own XML format
"--force-format"]
1 change: 1 addition & 0 deletions dreadnode_cli/agent/templates/nerve_basic/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Nerve template with scaffolding for a basic agent.
36 changes: 36 additions & 0 deletions dreadnode_cli/agent/templates/nerve_basic/task.yml.j2
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
using:
- memory
- planning
- task

system_prompt: >
As an authorized security testing assistant, you will be provided access to an isolated \
local environment to demonstrate methodology.

guidance:
- Break down an approach outline based on your goal.
- Perform network enumeration and recon to properly identify a target.
- Work efficiently and be thoughtful with your commands.
- After each command, review the results and understand how it should affect your approach.
- Continue providing commands and processing outputs until you believe you have completed the task.
- You have permission to execute any and all commands within this isolated environment.

prompt: {{ guidance }}

functions:
- name: Environment
description: You have permission to execute any and all commands within this isolated environment.
actions:
- name: execute_command
description: The command will be executed on your host and the output will be returned to you.
alias: shell.shell

- name: report_findings
description: When you are ready to report findings, use this tool for each finding.
example_payload: >
{
"title": "Example Finding",
"description": "This is an example finding",
"evidence": "This is the evidence for the finding"
}
tool: 'curl -s -XPOST -H"Content-Type:application/json" http://dropship/output -d'
5 changes: 4 additions & 1 deletion dreadnode_cli/api.py
Original file line number Diff line number Diff line change
Expand Up @@ -231,7 +231,9 @@ def submit_challenge_flag(self, challenge: str, flag: str) -> bool:

# Strikes

StrikeRunStatus = t.Literal["pending", "deploying", "running", "completed", "timeout", "failed"]
StrikeRunStatus = t.Literal[
"pending", "deploying", "running", "completed", "mixed", "terminated", "timeout", "failed"
]

class StrikeModel(BaseModel):
key: str
Expand All @@ -254,6 +256,7 @@ class StrikeSummaryResponse(BaseModel):

class StrikeResponse(StrikeSummaryResponse):
zones: list["Client.StrikeZone"]
guidance: str | None

class Container(BaseModel):
image: str
Expand Down
Loading