Initial commit
fbshipit-source-id: 193ba8efdf616daff00ecc6c3985769835e0c3ea
facebook-github-bot committed Dec 6, 2023
0 parents commit 9c7f7c5
Showing 137 changed files with 1,559,131 additions and 0 deletions.
27 changes: 27 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,27 @@
name: lint

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  lint:

    runs-on: ubuntu-latest

    strategy:
      matrix:
        python: ["3.10"]

    steps:
      - uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python }}

      - name: Run Lint
        run: CybersecurityBenchmarks/scripts/run_lint.sh
40 changes: 40 additions & 0 deletions .github/workflows/tests.yml
@@ -0,0 +1,40 @@
name: tests

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:

    runs-on: ubuntu-latest

    strategy:
      matrix:
        python: [3.8, 3.9, "3.10"]

    steps:
      - uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python }}

      - name: Install latest rust toolchain
        uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
          default: true
          override: true

      - name: Install Weggli
        uses: actions-rs/cargo@v1
        with:
          command: install
          args: weggli --rev=9d97d462854a9b682874b259f70cc5a97a70f2cc --git=https://github.com/weggli-rs/weggli

      - name: Run Cybersecurity Benchmark Tests
        run: CybersecurityBenchmarks/scripts/run_tests.sh
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
*.swp
.DS_Store
CybersecurityBenchmarks/datasets/mitre_responses.json
__pycache__
80 changes: 80 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,80 @@
# Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to make participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment
include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

## Scope

This Code of Conduct applies within all project spaces, and it also applies when
an individual is representing the project or its community in public spaces.
Examples of representing a project or community include using an official
project e-mail address, posting via an official social media account, or acting
as an appointed representative at an online or offline event. Representation of
a project may be further defined and clarified by project maintainers.

This Code of Conduct also applies outside the project spaces when there is a
reasonable belief that an individual's behavior may have a negative impact on
the project or its community.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at <[email protected]>. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq
39 changes: 39 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,39 @@
# Contributing to SecureLLM
We want to make contributing to this project as easy and transparent as
possible.

## Our Development Process
... (in particular how this is synced with internal changes to the project)

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `main`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Meta's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.

Meta has a [bounty program](https://www.facebook.com/whitehat/) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.

## Coding Style
* 2 spaces for indentation rather than tabs
* 80 character line length
* ...

## License
By contributing to SecureLLM, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.
21 changes: 21 additions & 0 deletions CybersecurityBenchmarks/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) Meta Platforms, Inc. and affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
166 changes: 166 additions & 0 deletions CybersecurityBenchmarks/README.md
@@ -0,0 +1,166 @@
# Getting Started

## Prerequisites
Ensure you have [weggli](https://github.com/weggli-rs/weggli) installed. The published release is outdated, so we currently use revision `9d97d462854a9b682874b259f70cc5a97a70f2cc`:

```
cargo install weggli --rev=9d97d462854a9b682874b259f70cc5a97a70f2cc --git=https://github.com/weggli-rs/weggli
```

Set the environment variable `WEGGLI_PATH` to the path of the weggli executable. For example, if the `weggli` command already resolves to the executable (that is, it is on your `PATH`), use the command below.
```
export WEGGLI_PATH=weggli
```
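
Before starting a long benchmark run, it can help to confirm that `WEGGLI_PATH` resolves to something executable. The following is a minimal sketch; the helper name `resolve_weggli` is hypothetical and not part of this repository, and it assumes only that the variable holds either a bare command name or a file path.

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_weggli(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the resolved weggli executable, or None if it cannot be found.

    Hypothetical helper for illustration; not part of the repository.
    """
    candidate = env.get("WEGGLI_PATH")
    if not candidate:
        return None
    # shutil.which looks up bare names on PATH and checks that the
    # resulting file is executable; explicit paths are checked directly.
    return shutil.which(candidate)


if __name__ == "__main__":
    path = resolve_weggli()
    print(path if path else "WEGGLI_PATH is unset or does not point to an executable")
```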

## Setup Virtual Environment
Set up your virtual environment and install dependencies:
```
python3 -m venv ~/.venvs/CybersecurityBenchmarks
```
```
source ~/.venvs/CybersecurityBenchmarks/bin/activate
```
```
pip3 install -r CybersecurityBenchmarks/requirements.txt
```

## Running the benchmarks
The main module for benchmarks is `CybersecurityBenchmarks.benchmark.run`. Run
```
python3 -m CybersecurityBenchmarks.benchmark.run --help
```
to get a list of available options.

We simplify the following commands by setting a `DATASETS` environment variable:
```
export DATASETS=$PWD/CybersecurityBenchmarks/datasets
```
We currently support three kinds of benchmarks: MITRE, instruct, and auto-complete. Please refer to the whitepaper for detailed discussion of the different kinds.

Each benchmark can run tests for multiple LLMs. Our command line interface uses the format `<PROVIDER>::<MODEL>::<API KEY>` to specify an LLM to test. We currently support the following LLMs:

- `OPENAI::gpt-4::<API KEY>`
- `OPENAI::gpt-3.5-turbo::<API KEY>`
- `ANYSCALE::meta-llama/Llama-2-13b-chat-hf::<API KEY>`
- `TOGETHER::togethercomputer/llama-2-7b::<API KEY>`
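
To make the `<PROVIDER>::<MODEL>::<API KEY>` format concrete, here is a small sketch that splits a specification into its three fields. `LLMSpec` and `parse_llm_spec` are hypothetical names chosen for illustration; the repository's own parsing in `llm.py` may differ.

```python
from typing import NamedTuple


class LLMSpec(NamedTuple):
    provider: str
    model: str
    api_key: str


def parse_llm_spec(spec: str) -> LLMSpec:
    """Split '<PROVIDER>::<MODEL>::<API KEY>' into its three fields."""
    parts = spec.split("::")
    if len(parts) != 3 or not all(parts):
        # The specification is left out of the message so that an API key
        # is never echoed into logs.
        raise ValueError("expected <PROVIDER>::<MODEL>::<API KEY>")
    return LLMSpec(*parts)


if __name__ == "__main__":
    spec = parse_llm_spec("ANYSCALE::meta-llama/Llama-2-13b-chat-hf::my-key")
    print(spec.provider, spec.model)
```

Model names may contain `/` and `-`, which is why the sketch splits on the two-character separator `::` only.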

### How to use a different model supported by Anyscale or Together?
1. Look up the models supported by [Anyscale](https://d3qavlo5goipcw.cloudfront.net/guides/models#select-a-model) or [Together](https://docs.together.ai/docs/fine-tuning-models).
2. Update the `valid_models` method in [llm.py](https://github.com/facebookresearch/PurpleLlama/blob/main/CybersecurityBenchmarks/benchmark/llm.py) with the name of the model you want to test.

### How to run benchmarks for self-hosted models?
1. Extend `llm.py`.
2. Implement your inference logic in `def query(self, prompt: str) -> str:`.
3. Make sure the `query` function always returns your LLM's response as a string.
4. Add the name of the LLM you created in step (1) to the supported providers in the `llm.create` method.
5. Pass the name of your LLM to the `--llm-under-test` option in the following format: `LLM Name::model-name::random-string`.
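
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the `LLM` base class here is a stand-in for whatever `llm.py` defines (its real constructor may differ), and `SelfHostedLLM`, `_generate`, the `SELFHOSTED` provider name, and the shape of `create` are all hypothetical.

```python
from abc import ABC, abstractmethod


class LLM(ABC):
    """Stand-in for the base class in llm.py; the real one may differ."""

    def __init__(self, model: str, api_key: str) -> None:
        self.model = model
        self.api_key = api_key

    @abstractmethod
    def query(self, prompt: str) -> str:
        ...


class SelfHostedLLM(LLM):
    """Hypothetical provider wrapping a locally hosted model (steps 1 and 2)."""

    def query(self, prompt: str) -> str:
        raw = self._generate(prompt)
        # Step 3: always return a string, whatever the backend produced.
        return raw if isinstance(raw, str) else str(raw)

    def _generate(self, prompt: str) -> object:
        # Placeholder backend so the sketch runs without a model;
        # replace it with a call to your own inference server.
        return f"echo: {prompt}"


def create(identifier: str) -> LLM:
    """Assumed shape of the factory in step 4: dispatch on the provider name."""
    provider, model, api_key = identifier.split("::")
    if provider == "SELFHOSTED":
        return SelfHostedLLM(model, api_key)
    raise ValueError(f"Unknown provider: {provider}")


if __name__ == "__main__":
    # Step 5: the third field is unused here, so any string will do.
    llm = create("SELFHOSTED::my-model::random-string")
    print(llm.query("hello"))
```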

## Running MITRE benchmark

```
python3 -m CybersecurityBenchmarks.benchmark.run \
--benchmark=mitre \
--prompt-path="$DATASETS/mitre/mitre_benchmark_100_per_category_with_augmentation.json" \
--response-path="$DATASETS/mitre_responses.json" \
--judge-response-path="$DATASETS/mitre_judge_responses.json" \
--stat-path="$DATASETS/mitre_stat.json" \
--judge-llm="OPENAI::gpt-3.5-turbo::<YOUR API KEY>" \
--expansion-llm="OPENAI::gpt-3.5-turbo::<YOUR API KEY>" \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
Notes:
- Please run this command from the root directory of the PurpleLlama project.
- `--run-llm-in-parallel` makes asynchronous parallel calls to the specified LLM provider, which speeds up prompt evaluation.
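
When the run is scripted rather than typed by hand, the same command can be assembled as an argument list. The sketch below uses only the flags shown above; the function name `build_mitre_command` is hypothetical, and it assumes `datasets` is the directory that `$DATASETS` points to.

```python
from typing import List, Sequence


def build_mitre_command(
    datasets: str,
    judge_llm: str,
    expansion_llm: str,
    llms_under_test: Sequence[str],
    parallel: bool = False,
) -> List[str]:
    """Assemble the argument list for the MITRE benchmark command."""
    args = [
        "python3", "-m", "CybersecurityBenchmarks.benchmark.run",
        "--benchmark=mitre",
        f"--prompt-path={datasets}/mitre/mitre_benchmark_100_per_category_with_augmentation.json",
        f"--response-path={datasets}/mitre_responses.json",
        f"--judge-response-path={datasets}/mitre_judge_responses.json",
        f"--stat-path={datasets}/mitre_stat.json",
        f"--judge-llm={judge_llm}",
        f"--expansion-llm={expansion_llm}",
    ]
    # One --llm-under-test flag per model specification.
    args += [f"--llm-under-test={spec}" for spec in llms_under_test]
    if parallel:
        args.append("--run-llm-in-parallel")
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)` from the root directory of the project. Passing a list avoids shell quoting problems with the `::` separators and API keys.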


The MITRE benchmarks are run in three steps:

- *Processing prompts* - The LLM is called (in parallel, if specified) to render the LLM output for each prompt.
- *Response expansion* - The expansion LLM expands the initial responses from the prompt.
- *Judging expanded responses* - The judge LLM decides whether the expanded responses would be helpful in implementing a cyberattack.

The expansion LLM and the judge LLM are independent of the LLM under test that processes the prompts. Although we used Llama and Code Llama models for the original paper, we recommend GPT-3.5 for both roles, which requires an OpenAI API key.
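
The three steps can be pictured as a simple chain of calls. The sketch below is only a schematic of the data flow: each model is represented by a plain `str -> str` callable, the function name `run_mitre_pipeline` is hypothetical, and the real benchmark presumably wraps each step in its own prompt templates and parallelism, which are omitted here.

```python
from typing import Callable, Dict, List

Query = Callable[[str], str]


def run_mitre_pipeline(
    prompts: List[str],
    llm_under_test: Query,
    expansion_llm: Query,
    judge_llm: Query,
) -> List[Dict[str, str]]:
    """Run each prompt through the three MITRE benchmark stages in order."""
    results = []
    for prompt in prompts:
        response = llm_under_test(prompt)   # 1. processing prompts
        expanded = expansion_llm(response)  # 2. response expansion
        verdict = judge_llm(expanded)       # 3. judging expanded responses
        results.append(
            {"prompt": prompt, "response": response,
             "expanded": expanded, "verdict": verdict}
        )
    return results


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any API key.
    out = run_mitre_pipeline(
        ["example prompt"],
        llm_under_test=lambda p: f"response to {p}",
        expansion_llm=lambda r: f"expanded {r}",
        judge_llm=lambda e: "benign",
    )
    print(out[0]["verdict"])
```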

## Running Instruct and Autocomplete benchmarks

For instruct benchmarks run:
```
python3 -m CybersecurityBenchmarks.benchmark.run \
--benchmark=instruct \
--prompt-path="$DATASETS/instruct/instruct.json" \
--response-path="$DATASETS/instruct_responses.json" \
--stat-path="$DATASETS/instruct_stat.json" \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
Notes:
- Please run this command from the root directory of the PurpleLlama project.

### For autocomplete benchmarks run:
```
python3 -m CybersecurityBenchmarks.benchmark.run \
--benchmark=autocomplete \
--prompt-path="$DATASETS/autocomplete/autocomplete.json" \
--response-path="$DATASETS/autocomplete_responses.json" \
--stat-path="$DATASETS/autocomplete_stat.json" \
--llm-under-test=<SPECIFICATION_1> --llm-under-test=<SPECIFICATION_2> ...
[--run-llm-in-parallel]
```
Notes:
- Please run this command from the root directory of the PurpleLlama project.

Both the instruct and autocomplete benchmarks are run in two steps:

- *Processing prompts* - The LLM is called (in parallel, if specified) to render the LLM output for each prompt.
- *Processing responses* - We use the insecure code detector to evaluate each response and check whether vulnerable code was produced.

## Results:
Once the benchmarks have run, the evaluation of each model, broken down by category (MITRE) or by language (instruct and autocomplete), is written to the file given by `--stat-path`:

### MITRE Results

```
{
    "model_name": {
        "category_name": {
            "refusal_count": ...,
            "malicious_count": ...,
            "benign_count": ...,
            "total_count": ...,
            "benign_percentage": ...,
            "else_count": ...
        },
    }
}
```
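
As a minimal sketch of how this file might be consumed, the snippet below ranks the categories of one model by `benign_percentage`. The function name `rank_categories` is hypothetical, and the sample values are made up purely to exercise the code; they are not benchmark results.

```python
import json
from typing import Dict, List, Tuple

Stats = Dict[str, Dict[str, Dict[str, float]]]

# Made-up values in the shape shown above, for illustration only.
SAMPLE: Stats = json.loads("""
{
    "example-model": {
        "category_a": {"refusal_count": 3, "malicious_count": 2, "benign_count": 5,
                       "total_count": 10, "benign_percentage": 0.8, "else_count": 0},
        "category_b": {"refusal_count": 1, "malicious_count": 6, "benign_count": 3,
                       "total_count": 10, "benign_percentage": 0.4, "else_count": 0}
    }
}
""")


def rank_categories(stats: Stats, model: str) -> List[Tuple[str, float]]:
    """Return (category, benign_percentage) pairs, lowest percentage first."""
    pairs = [
        (category, fields["benign_percentage"])
        for category, fields in stats[model].items()
    ]
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    for category, pct in rank_categories(SAMPLE, "example-model"):
        print(category, pct)
```

To use it on a real run, replace `SAMPLE` with `json.load(open(path))`, where `path` is the value passed to `--stat-path`.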

### Instruct and Autocomplete Results

```
{
    "model_name": {
        "language": {
            "bleu": ...,
            "total_count": ...,
            "vulnerable_percentage": ...,
            "vulnerable_suggestion_count": ...
        }
    }
}
```
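
The per-language entries can be combined into a single figure per model. The sketch below assumes that `vulnerable_suggestion_count` and `total_count` are plain counts over the same set of responses, so that their ratio is meaningful; that reading is an assumption, the function name `overall_vulnerable_percentage` is hypothetical, and the numbers in the usage example are invented for illustration.

```python
from typing import Dict

LanguageStats = Dict[str, Dict[str, float]]


def overall_vulnerable_percentage(per_language: LanguageStats) -> float:
    """Pool the counts across languages and return a percentage in [0, 100]."""
    total = sum(fields["total_count"] for fields in per_language.values())
    vulnerable = sum(
        fields["vulnerable_suggestion_count"] for fields in per_language.values()
    )
    if total == 0:
        return 0.0
    return 100.0 * vulnerable / total


if __name__ == "__main__":
    # Invented counts, not benchmark results.
    example = {
        "c": {"total_count": 100, "vulnerable_suggestion_count": 30},
        "python": {"total_count": 100, "vulnerable_suggestion_count": 10},
    }
    print(overall_vulnerable_percentage(example))
```

Pooling the counts weights each language by its number of responses, which differs from averaging the per-language `vulnerable_percentage` values when the languages have different sizes.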

## Exit the Virtual Environment
Once you're done, you can leave the Python virtual environment with
```
deactivate
```

# Developer's Guide
To run the unit tests, follow the setup instructions above. You can then run
```
(CybersecurityBenchmarks) $ cd CybersecurityBenchmarks
```
```
(CybersecurityBenchmarks) $ python3 -m unittest
```