From b2cc43a1066715d9e8e4ad084245ffc74a9ac0c7 Mon Sep 17 00:00:00 2001 From: toolCHAINZ Date: Fri, 24 Jan 2025 09:26:10 -0500 Subject: [PATCH 1/2] Merge dev changes (#31) * Target newer z3, fmt, clippy * Remove reference * Add catch for sleigh size mismatch * Fix condition * fmt * Initial context work * Add context; will slowly move stuff over to using this instead * Add derives * Pub z3 * cargo fmt * Ditch cargo lock * Update ci * Remove registers from context for now until link issues are fixed * Relax context requirement * Add state equality helper * Bump z3 * Store language id in sleigh context * Update image section debug * Oops * Actually run cargo check this time; comment out thing I didn't finish writing * Try section flags instead of segment flags * Add bb-read * Only load executable sections into ghidra for now * Only load executable sections into ghidra for now * Try bumping to ghidra 11.1 * Update how loading is done for ghidra 11.1 * Fix get_registers() * Update test * Add section parsing log * Fmt * Show range instead * Kludgy try_from impl, silencing warnings, fmt, clippy * Clone if there's only one * impl (Partial)Eq for Instruction * Derive ParitalEq/Eq so that I can derive Hash soundly * Readme tweak * Remove unnecessary compile API now that Ghidraships with precompiled sla. Also added a new bin target for jingle * Initial CLI * cargo fmt * Basic bin functionality * Add readme note and two missing operations * Simplify printed model * Sleigh parsing tweaks * Update logo * Gimli change * Block tweak * Add input enumeration * Add constraint * Explicitly add pointer dependencies to input call * Add arch and fmt * Initial context stuff * Fixed up C++ build side, now to fix FFI * Fix stuff and fmt. 
Builds, but need to make sure it actually works * Need to fix tests now * Some small tweaks * Gitignore, heap-allocate some stuff * Tweaks * Move back to storing all images directly in context * Add test * Bump ghidra to 11.2 * Fix jingle build * Fix jingle binary build * Fixes context variables * Re-add image * Add initialize call * Some bounds checking fixes * Move pcode/assembly emitters into their own files * Add test * Change get_reg impl for now * Add wrapper to ensure an image is loaded before parsing * Fmt * Clippy * More clippy * Fix jingle * fmt * Clippy * Fix binary * Don't consume varnode in `get_register_name` * Initial trait work * Tweaks * Some build fixes * More stuff * Maybe just need to add the impls now? * Build fixed * fmt * Clippy * Actually fix build * Fix crashes * Start on gimli * Impl gimli * actually actually fix build * Fmt and gimli tweak * Remove unused file * Remove more unused files * Fix build * clippy --fix * Clippy fixes * Changes to traits * fmt * Clippy * pub perms * More trait stuff * Convert LoadedSleighContext to struct * Trying more stuff * More trait gymnastics * Clippy * Remove unused generic bounds * Add owned file * Pub all of gimli * Reshuffle * Filter * Fix loading * Display register names * Change display impl * fmt * clippy * fmt * Name tweak * Add read_bytes * cherrypick get_bytes * Add helper to summarize branches * Update logo * Update jingle.svg * Made ImageSectionIterator::new pub This would allow users to implement ImageProvider trait for their datatypes * Target master branch of z3.rs * Add rebasing API * Fix rebasing API * Context refactor * Fix formatting * fmt * Clippy * Only expose image bytes in the code space * Blanket impl for ImageProvider * Fmt * Fix spaces * Bundle Zlib (#23) * Experiment with bundling zlib * Check if testing works in CI * Suppress warnings * Revert workflow change * Re-add all-features * Multiplatform CI (#24) * Enforce Fmt and Lint in CI * Build jingle_sleigh on linux, macos, and 
windows * Build jingle on linux * Fix ldefs (#26) * fix ldef * return error if there are not ldefs * fix clippy --------- Co-authored-by: daniele.linguaglossa * Additional CI refactor (#28) * Steal dtolnay's CI configuration * Add deflate.c to compilation (#29) * Build tweak (#30) * Fix exception warning, reorganize build rs paths * Add zconf for windows build * Re-add flag * Needed more trees I guess * Left shift tweak * Remove unnecessary pin. Add favicon.svg. * Some CLI stuff * Fix and fmt * Clippy fix --------- Co-authored-by: chf0x <163881601+chf0x@users.noreply.github.com> Co-authored-by: Daniele Linguaglossa Co-authored-by: daniele.linguaglossa --- .github/workflows/check.yml | 13 - .github/workflows/jingle.yml | 32 + .github/workflows/jingle_sleigh.yml | 45 ++ .github/workflows/style.yml | 36 + Cargo.lock | 631 ------------------ README.md | 30 +- favicon.svg | 1 + jingle.svg | 1 + jingle/Cargo.toml | 14 +- jingle/README.md | 89 ++- jingle/src/context.rs | 71 ++ jingle/src/error.rs | 6 +- jingle/src/lib.rs | 7 + jingle/src/main.rs | 201 ++++++ jingle/src/modeling/block.rs | 36 +- jingle/src/modeling/branch.rs | 28 +- jingle/src/modeling/instruction.rs | 31 +- jingle/src/modeling/mod.rs | 171 +++-- jingle/src/modeling/slice.rs | 6 +- jingle/src/modeling/state/mod.rs | 173 +++-- jingle/src/modeling/state/space.rs | 110 ++- jingle/src/translator.rs | 23 +- jingle/src/varnode/display.rs | 2 +- jingle/src/varnode/mod.rs | 10 +- jingle_sleigh/.gitignore | 1 + jingle_sleigh/CMakeLists.txt | 12 +- jingle_sleigh/Cargo.toml | 15 +- jingle_sleigh/build.rs | 262 ++++++-- jingle_sleigh/ghidra | 2 +- .../src/context/builder/image/gimli.rs | 70 -- .../src/context/builder/image/mod.rs | 61 -- .../src/context/builder/language_def.rs | 5 +- jingle_sleigh/src/context/builder/mod.rs | 60 +- .../src/context/builder/processor_spec.rs | 9 +- .../src/context/{builder => }/image/elf.rs | 4 +- jingle_sleigh/src/context/image/gimli.rs | 185 +++++ 
jingle_sleigh/src/context/image/mod.rs | 176 +++++ .../src/context/instruction_iterator.rs | 90 +++ jingle_sleigh/src/context/loaded.rs | 265 ++++++++ jingle_sleigh/src/context/mod.rs | 197 +++--- jingle_sleigh/src/error.rs | 13 +- jingle_sleigh/src/ffi/addrspace.rs | 9 +- jingle_sleigh/src/ffi/compile.rs | 94 --- jingle_sleigh/src/ffi/context_ffi.rs | 95 ++- jingle_sleigh/src/ffi/cpp/.gitignore | 3 +- jingle_sleigh/src/ffi/cpp/context.cpp | 208 ++---- jingle_sleigh/src/ffi/cpp/context.h | 38 +- .../src/ffi/cpp/dummy_load_image.cpp | 18 + jingle_sleigh/src/ffi/cpp/dummy_load_image.h | 21 + jingle_sleigh/src/ffi/cpp/exception.h | 11 +- .../src/ffi/cpp/jingle_assembly_emitter.cpp | 11 + .../src/ffi/cpp/jingle_assembly_emitter.h | 21 + .../src/ffi/cpp/jingle_pcode_emitter.cpp | 33 + .../src/ffi/cpp/jingle_pcode_emitter.h | 20 + jingle_sleigh/src/ffi/cpp/rust_load_image.cpp | 28 + jingle_sleigh/src/ffi/cpp/rust_load_image.h | 24 + .../src/ffi/cpp/varnode_translation.cpp | 16 + .../src/ffi/cpp/varnode_translation.h | 13 + jingle_sleigh/src/ffi/image.rs | 21 - jingle_sleigh/src/ffi/instruction.rs | 2 +- jingle_sleigh/src/ffi/mod.rs | 42 +- jingle_sleigh/src/instruction.rs | 65 +- jingle_sleigh/src/pcode/branch.rs | 27 + jingle_sleigh/src/pcode/display.rs | 295 +------- jingle_sleigh/src/pcode/mod.rs | 228 ++++++- jingle_sleigh/src/space.rs | 3 +- jingle_sleigh/src/varnode/display.rs | 23 +- jingle_sleigh/src/varnode/mod.rs | 52 +- 68 files changed, 2752 insertions(+), 1863 deletions(-) delete mode 100644 .github/workflows/check.yml create mode 100644 .github/workflows/jingle.yml create mode 100644 .github/workflows/jingle_sleigh.yml create mode 100644 .github/workflows/style.yml delete mode 100644 Cargo.lock create mode 100644 favicon.svg create mode 100644 jingle.svg create mode 100644 jingle/src/context.rs create mode 100644 jingle/src/main.rs delete mode 100644 jingle_sleigh/src/context/builder/image/gimli.rs delete mode 100644 
jingle_sleigh/src/context/builder/image/mod.rs rename jingle_sleigh/src/context/{builder => }/image/elf.rs (98%) create mode 100644 jingle_sleigh/src/context/image/gimli.rs create mode 100644 jingle_sleigh/src/context/image/mod.rs create mode 100644 jingle_sleigh/src/context/instruction_iterator.rs create mode 100644 jingle_sleigh/src/context/loaded.rs delete mode 100644 jingle_sleigh/src/ffi/compile.rs create mode 100644 jingle_sleigh/src/ffi/cpp/dummy_load_image.cpp create mode 100644 jingle_sleigh/src/ffi/cpp/dummy_load_image.h create mode 100644 jingle_sleigh/src/ffi/cpp/jingle_assembly_emitter.cpp create mode 100644 jingle_sleigh/src/ffi/cpp/jingle_assembly_emitter.h create mode 100644 jingle_sleigh/src/ffi/cpp/jingle_pcode_emitter.cpp create mode 100644 jingle_sleigh/src/ffi/cpp/jingle_pcode_emitter.h create mode 100644 jingle_sleigh/src/ffi/cpp/rust_load_image.cpp create mode 100644 jingle_sleigh/src/ffi/cpp/rust_load_image.h create mode 100644 jingle_sleigh/src/ffi/cpp/varnode_translation.cpp create mode 100644 jingle_sleigh/src/ffi/cpp/varnode_translation.h delete mode 100644 jingle_sleigh/src/ffi/image.rs create mode 100644 jingle_sleigh/src/pcode/branch.rs diff --git a/.github/workflows/check.yml b/.github/workflows/check.yml deleted file mode 100644 index 6fadea7..0000000 --- a/.github/workflows/check.yml +++ /dev/null @@ -1,13 +0,0 @@ -name: Check -on: [push] -jobs: - build: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v2 - with: - submodules: true - - uses: actions-rs/toolchain@v1 - with: - toolchain: stable - - run: cargo check --all-features diff --git a/.github/workflows/jingle.yml b/.github/workflows/jingle.yml new file mode 100644 index 0000000..bf4f6f8 --- /dev/null +++ b/.github/workflows/jingle.yml @@ -0,0 +1,32 @@ +name: jingle + +on: + push: + pull_request: + workflow_dispatch: + +jobs: + pre_ci: + uses: dtolnay/.github/.github/workflows/pre_ci.yml@master + + build: + name: ${{matrix.name || format('Rust {0}', matrix.rust)}} 
+ needs: pre_ci + if: needs.pre_ci.outputs.continue + runs-on: ${{matrix.os}}-latest + strategy: + fail-fast: false + matrix: + rust: [nightly, beta, stable] + os: [ubuntu] + env: + RUSTFLAGS: --cfg deny_warnings -Dwarnings + timeout-minutes: 45 + steps: + - uses: actions/checkout@v4 + with: + submodules: true + - uses: dtolnay/rust-toolchain@master + with: + toolchain: ${{matrix.rust}} + - run: cargo build --all-features --manifest-path jingle_sleigh/Cargo.toml \ No newline at end of file diff --git a/.github/workflows/jingle_sleigh.yml b/.github/workflows/jingle_sleigh.yml new file mode 100644 index 0000000..68b2e5f --- /dev/null +++ b/.github/workflows/jingle_sleigh.yml @@ -0,0 +1,45 @@ +name: jingle_sleigh + +# Stealing the multi-platform CI configuration from +# https://github.com/dtolnay/cxx/blob/master/.github/workflows/ci.yml +# for testing build using CXX. + +on: + push: + pull_request: + workflow_dispatch: + +jobs: + pre_ci: + uses: dtolnay/.github/.github/workflows/pre_ci.yml@master + + build: + name: ${{matrix.name || format('Rust {0}', matrix.rust)}} + needs: pre_ci + if: needs.pre_ci.outputs.continue + runs-on: ${{matrix.os}}-latest + strategy: + fail-fast: false + matrix: + rust: [nightly, stable] + os: [ubuntu] + include: + - name: Cargo on macOS + rust: nightly + os: macos + - name: Cargo on Windows (msvc) + rust: nightly-x86_64-pc-windows-msvc + os: windows + flags: /EHsc + env: + CXXFLAGS: ${{matrix.flags}} + RUSTFLAGS: --cfg deny_warnings -Dwarnings + timeout-minutes: 45 + steps: + - uses: actions/checkout@v4 + with: + submodules: true + - uses: dtolnay/rust-toolchain@master + with: + toolchain: ${{matrix.rust}} + - run: cargo build --all-features --manifest-path jingle_sleigh/Cargo.toml \ No newline at end of file diff --git a/.github/workflows/style.yml b/.github/workflows/style.yml new file mode 100644 index 0000000..c353263 --- /dev/null +++ b/.github/workflows/style.yml @@ -0,0 +1,36 @@ +name: Style + +on: + push: + pull_request: + 
workflow_dispatch: + +jobs: + pre_ci: + uses: dtolnay/.github/.github/workflows/pre_ci.yml@master + + build: + name: ${{matrix.name || format('Rust {0}', matrix.rust)}} + needs: pre_ci + if: needs.pre_ci.outputs.continue + runs-on: ${{matrix.os}}-latest + strategy: + fail-fast: false + matrix: + rust: [ stable ] + os: [ ubuntu ] + env: + RUSTFLAGS: --cfg deny_warnings -Dwarnings + timeout-minutes: 45 + steps: + - uses: actions/checkout@v4 + with: + submodules: true + - uses: dtolnay/rust-toolchain@master + with: + toolchain: ${{matrix.rust}} + components: clippy, rustfmt + - name: cargo fmt + run: cargo fmt --all -- --check + - name: cargo clippy + run: cargo clippy --all-targets --all-features -- -D warnings diff --git a/Cargo.lock b/Cargo.lock deleted file mode 100644 index 7aa8321..0000000 --- a/Cargo.lock +++ /dev/null @@ -1,631 +0,0 @@ -# This file is automatically @generated by Cargo. -# It is not intended for manual editing. -version = 3 - -[[package]] -name = "adler" -version = "1.0.2" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe" - -[[package]] -name = "aho-corasick" -version = "1.1.3" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "8e60d3430d3a69478ad0993f19238d2df97c507009a52b3c10addcd7f6bcb916" -dependencies = [ - "memchr", -] - -[[package]] -name = "bindgen" -version = "0.66.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f2b84e06fc203107bfbad243f4aba2af864eb7db3b1cf46ea0a023b0b433d2a7" -dependencies = [ - "bitflags", - "cexpr", - "clang-sys", - "lazy_static", - "lazycell", - "peeking_take_while", - "proc-macro2", - "quote", - "regex", - "rustc-hash", - "shlex", - "syn 2.0.65", -] - -[[package]] -name = "bitflags" -version = "2.5.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "cf4b9d6a944f767f8e5e0db018570623c85f3d925ac718db4e06d0187adb21c1" - -[[package]] 
-name = "byteorder" -version = "1.5.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "1fd0f2584146f6f2ef48085050886acf353beff7305ebd1ae69500e27c67f64b" - -[[package]] -name = "cc" -version = "1.0.98" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "41c270e7540d725e65ac7f1b212ac8ce349719624d7bcff99f8e2e488e8cf03f" - -[[package]] -name = "cexpr" -version = "0.6.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "6fac387a98bb7c37292057cffc56d62ecb629900026402633ae9160df93a8766" -dependencies = [ - "nom", -] - -[[package]] -name = "cfg-if" -version = "1.0.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd" - -[[package]] -name = "clang-sys" -version = "1.7.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "67523a3b4be3ce1989d607a828d036249522dd9c1c8de7f4dd2dae43a37369d1" -dependencies = [ - "glob", - "libc", - "libloading", -] - -[[package]] -name = "codespan-reporting" -version = "0.11.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "3538270d33cc669650c4b093848450d380def10c331d38c768e34cac80576e6e" -dependencies = [ - "termcolor", - "unicode-width", -] - -[[package]] -name = "crc32fast" -version = "1.4.2" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a97769d94ddab943e4510d138150169a2758b5ef3eb191a9ee688de3e23ef7b3" -dependencies = [ - "cfg-if", -] - -[[package]] -name = "cxx" -version = "1.0.122" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "bb497fad022245b29c2a0351df572e2d67c1046bcef2260ebc022aec81efea82" -dependencies = [ - "cc", - "cxxbridge-flags", - "cxxbridge-macro", - "link-cplusplus", -] - -[[package]] -name = "cxx-build" -version = "1.0.122" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = 
"9327c7f9fbd6329a200a5d4aa6f674c60ab256525ff0084b52a889d4e4c60cee" -dependencies = [ - "cc", - "codespan-reporting", - "once_cell", - "proc-macro2", - "quote", - "scratch", - "syn 2.0.65", -] - -[[package]] -name = "cxxbridge-flags" -version = "1.0.122" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "688c799a4a846f1c0acb9f36bb9c6272d9b3d9457f3633c7753c6057270df13c" - -[[package]] -name = "cxxbridge-macro" -version = "1.0.122" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "928bc249a7e3cd554fd2e8e08a426e9670c50bbfc9a621653cfa9accc9641783" -dependencies = [ - "proc-macro2", - "quote", - "syn 2.0.65", -] - -[[package]] -name = "derive_more" -version = "0.99.17" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4fb810d30a7c1953f91334de7244731fc3f3c10d7fe163338a35b9f640960321" -dependencies = [ - "proc-macro2", - "quote", - "syn 1.0.109", -] - -[[package]] -name = "elf" -version = "0.7.4" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4445909572dbd556c457c849c4ca58623d84b27c8fff1e74b0b4227d8b90d17b" - -[[package]] -name = "flate2" -version = "1.0.30" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "5f54427cfd1c7829e2a139fcefea601bf088ebca651d2bf53ebc600eac295dae" -dependencies = [ - "crc32fast", - "miniz_oxide", -] - -[[package]] -name = "glob" -version = "0.3.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d2fabcfbdc87f4758337ca535fb41a6d701b65693ce38287d856d1674551ec9b" - -[[package]] -name = "jingle" -version = "0.1.1" -dependencies = [ - "jingle_sleigh", - "serde", - "thiserror", - "tracing", - "z3", -] - -[[package]] -name = "jingle_sleigh" -version = "0.1.1" -dependencies = [ - "cxx", - "cxx-build", - "elf", - "object", - "serde", - "serde-xml-rs", - "thiserror", - "tracing", -] - -[[package]] -name = "lazy_static" -version = "1.4.0" -source = 
"registry+https://github.com/rust-lang/crates.io-index" -checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646" - -[[package]] -name = "lazycell" -version = "1.3.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "830d08ce1d1d941e6b30645f1a0eb5643013d835ce3779a5fc208261dbe10f55" - -[[package]] -name = "libc" -version = "0.2.155" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "97b3888a4aecf77e811145cadf6eef5901f4782c53886191b2f693f24761847c" - -[[package]] -name = "libloading" -version = "0.8.3" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "0c2a198fb6b0eada2a8df47933734e6d35d350665a33a3593d7164fa52c75c19" -dependencies = [ - "cfg-if", - "windows-targets", -] - -[[package]] -name = "link-cplusplus" -version = "1.0.9" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "9d240c6f7e1ba3a28b0249f774e6a9dd0175054b52dfbb61b16eb8505c3785c9" -dependencies = [ - "cc", -] - -[[package]] -name = "log" -version = "0.4.21" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "90ed8c1e510134f979dbc4f070f87d4313098b704861a105fe34231c70a3901c" - -[[package]] -name = "memchr" -version = "2.7.2" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "6c8640c5d730cb13ebd907d8d04b52f55ac9a2eec55b440c8892f40d56c76c1d" - -[[package]] -name = "minimal-lexical" -version = "0.2.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "68354c5c6bd36d73ff3feceb05efa59b6acb7626617f4962be322a825e61f79a" - -[[package]] -name = "miniz_oxide" -version = "0.7.3" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "87dfd01fe195c66b572b37921ad8803d010623c0aca821bea2302239d155cdae" -dependencies = [ - "adler", -] - -[[package]] -name = "nom" -version = "7.1.3" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = 
"d273983c5a657a70a3e8f2a01329822f3b8c8172b73826411a55751e404a0a4a" -dependencies = [ - "memchr", - "minimal-lexical", -] - -[[package]] -name = "object" -version = "0.35.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "b8ec7ab813848ba4522158d5517a6093db1ded27575b070f4177b8d12b41db5e" -dependencies = [ - "flate2", - "memchr", - "ruzstd", -] - -[[package]] -name = "once_cell" -version = "1.19.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "3fdb12b2476b595f9358c5161aa467c2438859caa136dec86c26fdd2efe17b92" - -[[package]] -name = "peeking_take_while" -version = "0.1.2" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "19b17cddbe7ec3f8bc800887bab5e717348c95ea2ca0b1bf0837fb964dc67099" - -[[package]] -name = "pin-project-lite" -version = "0.2.14" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "bda66fc9667c18cb2758a2ac84d1167245054bcf85d5d1aaa6923f45801bdd02" - -[[package]] -name = "proc-macro2" -version = "1.0.83" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "0b33eb56c327dec362a9e55b3ad14f9d2f0904fb5a5b03b513ab5465399e9f43" -dependencies = [ - "unicode-ident", -] - -[[package]] -name = "quote" -version = "1.0.36" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "0fa76aaf39101c457836aec0ce2316dbdc3ab723cdda1c6bd4e6ad4208acaca7" -dependencies = [ - "proc-macro2", -] - -[[package]] -name = "regex" -version = "1.10.4" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c117dbdfde9c8308975b6a18d71f3f385c89461f7b3fb054288ecf2a2058ba4c" -dependencies = [ - "aho-corasick", - "memchr", - "regex-automata", - "regex-syntax", -] - -[[package]] -name = "regex-automata" -version = "0.4.6" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "86b83b8b9847f9bf95ef68afb0b8e6cdb80f498442f5179a29fad448fcc1eaea" -dependencies = [ - "aho-corasick", - 
"memchr", - "regex-syntax", -] - -[[package]] -name = "regex-syntax" -version = "0.8.3" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "adad44e29e4c806119491a7f06f03de4d1af22c3a680dd47f1e6e179439d1f56" - -[[package]] -name = "rustc-hash" -version = "1.1.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "08d43f7aa6b08d49f382cde6a7982047c3426db949b1424bc4b7ec9ae12c6ce2" - -[[package]] -name = "ruzstd" -version = "0.6.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "5174a470eeb535a721ae9fdd6e291c2411a906b96592182d05217591d5c5cf7b" -dependencies = [ - "byteorder", - "derive_more", - "twox-hash", -] - -[[package]] -name = "scratch" -version = "1.0.7" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a3cf7c11c38cb994f3d40e8a8cde3bbd1f72a435e4c49e85d6553d8312306152" - -[[package]] -name = "serde" -version = "1.0.202" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "226b61a0d411b2ba5ff6d7f73a476ac4f8bb900373459cd00fab8512828ba395" -dependencies = [ - "serde_derive", -] - -[[package]] -name = "serde-xml-rs" -version = "0.6.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "fb3aa78ecda1ebc9ec9847d5d3aba7d618823446a049ba2491940506da6e2782" -dependencies = [ - "log", - "serde", - "thiserror", - "xml-rs", -] - -[[package]] -name = "serde_derive" -version = "1.0.202" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "6048858004bcff69094cd972ed40a32500f153bd3be9f716b2eed2e8217c4838" -dependencies = [ - "proc-macro2", - "quote", - "syn 2.0.65", -] - -[[package]] -name = "shlex" -version = "1.3.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "0fda2ff0d084019ba4d7c6f371c95d8fd75ce3524c3cb8fb653a3023f6323e64" - -[[package]] -name = "static_assertions" -version = "1.1.0" -source = "registry+https://github.com/rust-lang/crates.io-index" 
-checksum = "a2eb9349b6444b326872e140eb1cf5e7c522154d69e7a0ffb0fb81c06b37543f" - -[[package]] -name = "syn" -version = "1.0.109" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "72b64191b275b66ffe2469e8af2c1cfe3bafa67b529ead792a6d0160888b4237" -dependencies = [ - "proc-macro2", - "quote", - "unicode-ident", -] - -[[package]] -name = "syn" -version = "2.0.65" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d2863d96a84c6439701d7a38f9de935ec562c8832cc55d1dde0f513b52fad106" -dependencies = [ - "proc-macro2", - "quote", - "unicode-ident", -] - -[[package]] -name = "termcolor" -version = "1.4.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "06794f8f6c5c898b3275aebefa6b8a1cb24cd2c6c79397ab15774837a0bc5755" -dependencies = [ - "winapi-util", -] - -[[package]] -name = "thiserror" -version = "1.0.61" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c546c80d6be4bc6a00c0f01730c08df82eaa7a7a61f11d656526506112cc1709" -dependencies = [ - "thiserror-impl", -] - -[[package]] -name = "thiserror-impl" -version = "1.0.61" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "46c3384250002a6d5af4d114f2845d37b57521033f30d5c3f46c4d70e1197533" -dependencies = [ - "proc-macro2", - "quote", - "syn 2.0.65", -] - -[[package]] -name = "tracing" -version = "0.1.40" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c3523ab5a71916ccf420eebdf5521fcef02141234bbc0b8a49f2fdc4544364ef" -dependencies = [ - "pin-project-lite", - "tracing-attributes", - "tracing-core", -] - -[[package]] -name = "tracing-attributes" -version = "0.1.27" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "34704c8d6ebcbc939824180af020566b01a7c01f80641264eba0999f6c2b6be7" -dependencies = [ - "proc-macro2", - "quote", - "syn 2.0.65", -] - -[[package]] -name = "tracing-core" -version = "0.1.32" -source = 
"registry+https://github.com/rust-lang/crates.io-index" -checksum = "c06d3da6113f116aaee68e4d601191614c9053067f9ab7f6edbcb161237daa54" -dependencies = [ - "once_cell", -] - -[[package]] -name = "twox-hash" -version = "1.6.3" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "97fee6b57c6a41524a810daee9286c02d7752c4253064d0b05472833a438f675" -dependencies = [ - "cfg-if", - "static_assertions", -] - -[[package]] -name = "unicode-ident" -version = "1.0.12" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "3354b9ac3fae1ff6755cb6db53683adb661634f67557942dea4facebec0fee4b" - -[[package]] -name = "unicode-width" -version = "0.1.12" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "68f5e5f3158ecfd4b8ff6fe086db7c8467a2dfdac97fe420f2b7c4aa97af66d6" - -[[package]] -name = "winapi-util" -version = "0.1.8" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4d4cc384e1e73b93bafa6fb4f1df8c41695c8a91cf9c4c64358067d15a7b6c6b" -dependencies = [ - "windows-sys", -] - -[[package]] -name = "windows-sys" -version = "0.52.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d" -dependencies = [ - "windows-targets", -] - -[[package]] -name = "windows-targets" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "6f0713a46559409d202e70e28227288446bf7841d3211583a4b53e3f6d96e7eb" -dependencies = [ - "windows_aarch64_gnullvm", - "windows_aarch64_msvc", - "windows_i686_gnu", - "windows_i686_gnullvm", - "windows_i686_msvc", - "windows_x86_64_gnu", - "windows_x86_64_gnullvm", - "windows_x86_64_msvc", -] - -[[package]] -name = "windows_aarch64_gnullvm" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "7088eed71e8b8dda258ecc8bac5fb1153c5cffaf2578fc8ff5d61e23578d3263" - -[[package]] -name = 
"windows_aarch64_msvc" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "9985fd1504e250c615ca5f281c3f7a6da76213ebd5ccc9561496568a2752afb6" - -[[package]] -name = "windows_i686_gnu" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "88ba073cf16d5372720ec942a8ccbf61626074c6d4dd2e745299726ce8b89670" - -[[package]] -name = "windows_i686_gnullvm" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "87f4261229030a858f36b459e748ae97545d6f1ec60e5e0d6a3d32e0dc232ee9" - -[[package]] -name = "windows_i686_msvc" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "db3c2bf3d13d5b658be73463284eaf12830ac9a26a90c717b7f771dfe97487bf" - -[[package]] -name = "windows_x86_64_gnu" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4e4246f76bdeff09eb48875a0fd3e2af6aada79d409d33011886d3e1581517d9" - -[[package]] -name = "windows_x86_64_gnullvm" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "852298e482cd67c356ddd9570386e2862b5673c85bd5f88df9ab6802b334c596" - -[[package]] -name = "windows_x86_64_msvc" -version = "0.52.5" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "bec47e5bfd1bff0eeaf6d8b485cc1074891a197ab4225d504cb7a1ab88b02bf0" - -[[package]] -name = "xml-rs" -version = "0.8.20" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "791978798f0597cfc70478424c2b4fdc2b7a8024aaff78497ef00f24ef674193" - -[[package]] -name = "z3" -version = "0.12.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4a7ff5718c079e7b813378d67a5bed32ccc2086f151d6185074a7e24f4a565e8" -dependencies = [ - "log", - "z3-sys", -] - -[[package]] -name = "z3-sys" -version = "0.8.1" -source = "registry+https://github.com/rust-lang/crates.io-index" 
-checksum = "d7cf70fdbc0de3f42b404f49b0d4686a82562254ea29ff0a155eef2f5430f4b0" -dependencies = [ - "bindgen", -] diff --git a/README.md b/README.md index f9249cb..c58646a 100644 --- a/README.md +++ b/README.md @@ -1,19 +1,33 @@ -# `jingle`: SMT Modeling for SLEIGH +
-`jingle` is a library for program analysis over traces of PCODE operations. I -am writing in the course of my PhD work and it is still very much "in flux". + + +🎶 Jingle bells, Jingle bells, Jingle all the `SLEIGH` 🎶 + +
+ +# `jingle`: SMT Modeling for `p-code` +`jingle` is a library that translates (a fragment of) Ghidra's `p-code` into SMT. It allows expressing symbolic state +of the pcode vm and the relational semantics between those states defined by `p-code` operations. + +**I am writing in the course of my PhD work and it is still very much "in flux". Breaking changes may happen at any time +and the overall design may change too.** + +The API is currently a bit of a mess because I've been trying out different approaches to figure out what I like (e.g. +traits vs context objects). I hope to clean it up at some point and expose one right way to do things. This repository contains a [Cargo Workspace](https://doc.rust-lang.org/book/ch14-03-cargo-workspaces.html) for two related crates: * [`jingle_sleigh`](./jingle_sleigh): a Rust FFI in front of [Ghidra](https://github.com/NationalSecurityAgency/ghidra)' s - code translator: SLEIGH. Sleigh is written in C++ and can be + code translator: `SLEIGH`. `SLEIGH` is written in C++ and can be found [here](https://github.com/NationalSecurityAgency/ghidra/tree/master/Ghidra/Features/Decompiler/src/decompile/cpp). - This crate contains a private internal low-level API to SLEIGH and exposes an idiomatic high-level API to consumers. -* [`jingle`](./jingle): a set of functions built on top of `jingle_sleigh` that defines an encoding of PCODE operations - into quantifier-free SMT statements operating on objects of the `Array(BitVec, BitVec)` sort. `jingle` is currently - designed for providing formulas for use in decision procedures over program traces. A more robust analysis + This crate contains a private internal low-level API to `SLEIGH` and exposes an idiomatic high-level API to consumers. +* [`jingle`](./jingle): a set of functions built on top of `jingle_sleigh` that defines an encoding of `p-code` operations + into SMT. `jingle` is currently + designed for providing formulas for use in decision procedures over individual program traces. 
As such, it does not yet + expose APIs for constructing or reasoning about control-flow graphs. A more robust analysis is forthcoming, depending on my research needs. ## Usage diff --git a/favicon.svg b/favicon.svg new file mode 100644 index 0000000..419da0b --- /dev/null +++ b/favicon.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/jingle.svg b/jingle.svg new file mode 100644 index 0000000..3f9b4bf --- /dev/null +++ b/jingle.svg @@ -0,0 +1 @@ + diff --git a/jingle/Cargo.toml b/jingle/Cargo.toml index 751fc0c..73a7052 100644 --- a/jingle/Cargo.toml +++ b/jingle/Cargo.toml @@ -14,13 +14,21 @@ keywords = ["ghidra", "sleigh", "pcode", "smt"] # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html +[[bin]] +name = "jingle" +required-features = ["bin_features"] + [dependencies] jingle_sleigh = { path = "../jingle_sleigh", version = "0.1.1" } -z3 = { version = "0.12.1" } +z3 = { git = "https://github.com/prove-rs/z3.rs.git", branch = "master" } thiserror = "1.0.58" serde = { version = "1.0.197", features = ["derive"] } tracing = "0.1.40" - +clap = { version = "4.5.14", optional = true, features = ["derive"] } +confy = { version = "0.6.1" , optional = true} +hex = { version = "0.4.3" , optional = true} +anyhow = { version = "1.0.95", optional = true } [features] -elf = ["jingle_sleigh/elf"] +default = [] +bin_features = ["dep:clap", "dep:confy", "dep:hex", "dep:anyhow"] gimli = ["jingle_sleigh/gimli"] diff --git a/jingle/README.md b/jingle/README.md index 5f7a375..1405970 100644 --- a/jingle/README.md +++ b/jingle/README.md @@ -1,4 +1,91 @@ # `jingle`: Z3 + SLEIGH `jingle` uses the sleigh bindings provided by `jingle_sleigh` and the excellent -z3 bindings from the `z3` crate to provide SMT modeling of sequences of `PCODE` instructions +z3 bindings from the `z3` crate to provide SMT modeling of sequences of `PCODE` instructions. 
+ +## CLI + +`jingle` exposes a simple CLI tool for disassembling strings of executable bytes and modeling them in logic. + +### Installation + +From this folder: + +```shell +cargo install --path . --features="bin_features" +``` + +This will install `jingle` in your path. Note that + +### Usage + +`jingle` requires that a Ghidra installation be present. + +When you provide it as the first argument to the `jingle` CLI, it +will save that path for future usage. + +Once it has been configured, you can simply run it as follows: + +```shell +jingle disassemble x86:LE:32:default 89fb +jingle lift x86:LE:32:default 89fb +jingle model x86:LE:32:default 89fb +``` + +These three invocations will print disassembly, pcode translation, and +a logical model respectively. None of these, particularly the logical model, +are intended to be used directly from this utility; this is merely for demonstration. +The proper way to use this tool is through the API. + +The above invocations will produce the following output: +```shell +# jingle disassemble x86:LE:32:default 89fb +MOV EBX,EDI +``` + +```shell +# jingle lift x86:LE:32:default 89fb +EBX = COPY EDI +``` + +```shell +# jingle model x86:LE:32:default 89fb +; benchmark generated from rust API +(set-info :status unknown) +(declare-fun register!4 () (Array (_ BitVec 32) (_ BitVec 8))) +(declare-fun register!9 () (Array (_ BitVec 32) (_ BitVec 8))) +(declare-fun ram!3 () (Array (_ BitVec 32) (_ BitVec 8))) +(declare-fun ram!8 () (Array (_ BitVec 32) (_ BitVec 8))) +(declare-fun OTHER!1 () (Array (_ BitVec 64) (_ BitVec 8))) +(declare-fun OTHER!6 () (Array (_ BitVec 64) (_ BitVec 8))) +(assert + (let ((?x77 (store (store register!4 (_ bv12 32) (select register!4 (_ bv28 32))) (_ bv13 32) (select register!4 (_ bv29 32))))) + (let ((?x81 (store (store ?x77 (_ bv14 32) (select register!4 (_ bv30 32))) (_ bv15 32) (select register!4 (_ bv31 32))))) + (let (($x82 (= register!9 ?x81))) + (let (($x63 (= ram!8 ram!3))) + (let (($x62 (=
OTHER!6 OTHER!1))) + (and $x62 $x63 $x82))))))) +(check-sat) + +``` + +### Usage string + +```shell +Usage: jingle [GHIDRA_PATH] + +Commands: + disassemble Adds files to myapp + lift + model + architectures + help Print this message or the help of the given subcommand(s) + +Arguments: + [GHIDRA_PATH] + +Options: + -h, --help Print help + -V, --version Print version + +``` \ No newline at end of file diff --git a/jingle/src/context.rs b/jingle/src/context.rs new file mode 100644 index 0000000..ae682c8 --- /dev/null +++ b/jingle/src/context.rs @@ -0,0 +1,71 @@ +use crate::modeling::State; +use jingle_sleigh::{RegisterManager, SpaceInfo, SpaceManager, VarNode}; +use std::ops::Deref; +use std::rc::Rc; +use z3::Context; + +#[derive(Clone, Debug)] +pub struct JingleContextInternal<'ctx> { + pub z3: &'ctx Context, + spaces: Vec<SpaceInfo>, + default_code_space_index: usize, + registers: Vec<(VarNode, String)>, +} + +#[derive(Clone, Debug)] +pub struct JingleContext<'ctx>(Rc<JingleContextInternal<'ctx>>); + +impl<'ctx> Deref for JingleContext<'ctx> { + type Target = JingleContextInternal<'ctx>; + + fn deref(&self) -> &Self::Target { + self.0.as_ref() + } +} +impl<'ctx> JingleContext<'ctx> { + pub fn new<S: RegisterManager>(z3: &'ctx Context, r: &S) -> Self { + let spaces = r.get_all_space_info().to_vec(); + let default_code_space_index = r.get_code_space_idx(); + Self(Rc::new(JingleContextInternal { + z3, + spaces, + default_code_space_index, + registers: r.get_registers(), + })) + } + pub fn fresh_state(&self) -> State<'ctx> { + State::new(self) + } +} + +impl SpaceManager for JingleContext<'_> { + fn get_space_info(&self, idx: usize) -> Option<&SpaceInfo> { + self.spaces.get(idx) + } + + fn get_all_space_info(&self) -> &[SpaceInfo] { + self.spaces.as_slice() + } + + fn get_code_space_idx(&self) -> usize { + self.default_code_space_index + } +} + +impl RegisterManager for JingleContext<'_> { + fn get_register(&self, name: &str) -> Option<VarNode> { + self.registers + .iter() + .find_map(|i| i.1.eq(name).then_some(i.0.clone())) + } + + fn
get_register_name(&self, location: &VarNode) -> Option<&str> { + self.registers + .iter() + .find_map(|i| i.0.eq(location).then_some(i.1.as_str())) + } + + fn get_registers(&self) -> Vec<(VarNode, String)> { + self.registers.clone() + } +} diff --git a/jingle/src/error.rs b/jingle/src/error.rs index bac1e7e..56f41d2 100644 --- a/jingle/src/error.rs +++ b/jingle/src/error.rs @@ -21,8 +21,10 @@ pub enum JingleError { ConstantWrite, #[error("Attempt to read an indirect value from the constant space. While this can be modeled, it's almost definitely unintended.")] IndirectConstantRead, - #[error("Attempted to perform a write of a bitvector to a VarNode with leftover space. Sleigh guarantees this will be done with an explicit extension operation.")] - Mismatched, + #[error("Attempted to perform a write of a bitvector to a VarNode with leftover space. This is a sleigh bug.")] + MismatchedWordSize, + #[error("Attempted to perform a write to a space using the wrong size of address. This is a sleigh bug.")] + MismatchedAddressSize, #[error("Jingle does not yet model this instruction")] UnmodeledInstruction(Box), } diff --git a/jingle/src/lib.rs b/jingle/src/lib.rs index 9b6982f..ec10ef6 100644 --- a/jingle/src/lib.rs +++ b/jingle/src/lib.rs @@ -1,3 +1,4 @@ +mod context; mod error; pub mod modeling; mod translator; @@ -5,5 +6,11 @@ pub mod varnode; pub use jingle_sleigh as sleigh; +pub use context::JingleContext; pub use error::JingleError; pub use translator::SleighTranslator; + +#[cfg(test)] +mod tests { + pub(crate) const SLEIGH_ARCH: &str = "x86:LE:64:default"; +} diff --git a/jingle/src/main.rs b/jingle/src/main.rs new file mode 100644 index 0000000..35fd4d4 --- /dev/null +++ b/jingle/src/main.rs @@ -0,0 +1,201 @@ +use anyhow::Context; +use clap::{Parser, Subcommand}; +use hex::decode; +use jingle::modeling::{ModeledBlock, ModelingContext}; +use jingle::JingleContext; +use jingle_sleigh::context::loaded::LoadedSleighContext; +use 
jingle_sleigh::context::SleighContextBuilder; +use jingle_sleigh::{Disassembly, Instruction, JingleSleighError, PcodeOperation, VarNode}; +use serde::{Deserialize, Serialize}; +use std::path::PathBuf; +use z3::ast::Ast; +use z3::{Config, Context as Z3Context, Solver}; + +#[derive(Debug, PartialEq, Eq, Serialize, Deserialize)] +struct JingleConfig { + pub ghidra_path: PathBuf, +} + +impl JingleConfig { + pub fn sleigh_builder(&self) -> Result<SleighContextBuilder, JingleSleighError> { + SleighContextBuilder::load_ghidra_installation(&self.ghidra_path) + } +} + +impl Default for JingleConfig { + fn default() -> Self { + if cfg!(target_os = "windows") { + let path = PathBuf::from(r"C:\Program Files\ghidra"); + Self { ghidra_path: path } + } else if cfg!(target_os = "macos") { + let path = PathBuf::from(r"/Applications/ghidra"); + Self { ghidra_path: path } + } else { + let path = PathBuf::from(r"/opt/ghidra"); + Self { ghidra_path: path } + } + } +} + +impl From<&JingleParams> for JingleConfig { + fn from(value: &JingleParams) -> Self { + let path = value.ghidra_path.clone(); + Self { + ghidra_path: path + .map(PathBuf::from) + .unwrap_or(JingleConfig::default().ghidra_path), + } + } +} + +#[derive(Debug, Parser)] +#[command(version, about, long_about = None)] +struct JingleParams { + #[command(subcommand)] + pub command: Commands, + pub ghidra_path: Option<String>, +} + +#[derive(Debug, Subcommand)] +enum Commands { + /// Adds files to myapp + Disassemble { + architecture: String, + hex_bytes: String, + }, + Lift { + architecture: String, + hex_bytes: String, + }, + Model { + architecture: String, + hex_bytes: String, + }, + Architectures, +} + +fn main() -> anyhow::Result<()> { + let params: JingleParams = JingleParams::parse(); + update_config(&params); + let config: JingleConfig = confy::load("jingle", None)?; + match params.command { + Commands::Disassemble { + architecture, + hex_bytes, + } => disassemble(&config, architecture, hex_bytes), + Commands::Lift { + architecture, + hex_bytes, + } => lift(&config,
architecture, hex_bytes), + Commands::Model { + architecture, + hex_bytes, + } => model(&config, architecture, hex_bytes), + Commands::Architectures => { + list_architectures(&config); + Ok(()) + } + } +} + +fn update_config(params: &JingleParams) { + let stored_config: JingleConfig = confy::load("jingle", None).unwrap(); + if params.ghidra_path.is_some() { + let new_config = JingleConfig::from(params); + if stored_config != new_config { + confy::store("jingle", None, new_config).unwrap() + } + } +} + +fn list_architectures(config: &JingleConfig) { + let sleigh = config.sleigh_builder().unwrap(); + for language_id in sleigh.get_language_ids() { + println!("{}", language_id) + } +} + +fn get_instructions( + config: &JingleConfig, + architecture: String, + hex_bytes: String, +) -> anyhow::Result<(LoadedSleighContext, Vec<Instruction>)> { + let sleigh_build = config.sleigh_builder().context(format!( + "Unable to parse selected architecture. \n\ + This may indicate that your configured Ghidra path is incorrect: {}", + config.ghidra_path.display() + ))?; + let img = decode(hex_bytes)?; + let max_len = img.len(); + let mut offset = 0; + let sleigh = sleigh_build.build(&architecture).context( + "Unable to build the selected architecture.\n\ + This is either a bug in sleigh or the .sinc file for your architecture is malformed.", + )?; + let sleigh = sleigh.initialize_with_image(img)?; + let mut instrs = vec![]; + while offset < max_len { + if let Some(instruction) = sleigh.instruction_at(offset as u64) { + offset += instruction.length; + instrs.push(instruction); + } + if sleigh.instruction_at(offset as u64).is_none() { + break; + } + } + Ok((sleigh, instrs)) +} + +fn disassemble( + config: &JingleConfig, + architecture: String, + hex_bytes: String, +) -> anyhow::Result<()> { + for instr in get_instructions(config, architecture, hex_bytes)?.1 { + println!("{}", instr.disassembly) + } + Ok(()) +} + +fn lift(config: &JingleConfig, architecture: String, hex_bytes: String) ->
anyhow::Result<()> { + let (sleigh, instrs) = get_instructions(config, architecture, hex_bytes)?; + for instr in instrs { + for x in instr.ops { + let x_disp = x.display(&sleigh)?; + println!("{}", x_disp) + } + } + Ok(()) +} + +fn model(config: &JingleConfig, architecture: String, hex_bytes: String) -> anyhow::Result<()> { + let z3 = Z3Context::new(&Config::new()); + let solver = Solver::new(&z3); + let (sleigh, mut instrs) = get_instructions(config, architecture, hex_bytes)?; + // todo: this is a disgusting hack to let us read a modeled block without requiring the user + // to enter a block-terminating instruction. Everything with reading blocks needs to be reworked + // at some point. For now, this lets me not break anything else relying on this behavior while + // still getting this to work. + instrs.push(Instruction { + address: 0, + disassembly: Disassembly { + args: "".to_string(), + mnemonic: "".to_string(), + }, + ops: vec![PcodeOperation::Branch { + input: VarNode { + space_index: 1, + offset: 0, + size: 1, + }, + }], + length: 1, + }); + + let jingle_ctx = JingleContext::new(&z3, &sleigh); + let block = ModeledBlock::read(&jingle_ctx, instrs.into_iter())?; + let final_state = jingle_ctx.fresh_state(); + solver.assert(&final_state._eq(block.get_final_state())?.simplify()); + println!("{}", solver.to_smt2()); + Ok(()) +} diff --git a/jingle/src/modeling/block.rs b/jingle/src/modeling/block.rs index b1e445b..f1c5c15 100644 --- a/jingle/src/modeling/block.rs +++ b/jingle/src/modeling/block.rs @@ -4,18 +4,18 @@ use crate::modeling::branch::BranchConstraint; use crate::modeling::state::State; use crate::modeling::{ModelingContext, TranslationContext}; use crate::varnode::ResolvedVarnode; +use crate::JingleContext; use crate::JingleError::EmptyBlock; use jingle_sleigh::Instruction; use jingle_sleigh::PcodeOperation; use jingle_sleigh::{SpaceInfo, SpaceManager}; use std::collections::HashSet; use std::fmt::{Display, Formatter}; -use z3::Context; /// A `jingle` 
model of a basic block #[derive(Debug, Clone)] pub struct ModeledBlock<'ctx> { - z3: &'ctx Context, + jingle: JingleContext<'ctx>, pub instructions: Vec, state: State<'ctx>, original_state: State<'ctx>, @@ -24,7 +24,7 @@ pub struct ModeledBlock<'ctx> { outputs: HashSet>, } -impl<'ctx> Display for ModeledBlock<'ctx> { +impl Display for ModeledBlock<'_> { fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { for x in self.instructions.iter() { writeln!(f, "{:x} {}", x.address, x.disassembly)?; @@ -36,11 +36,11 @@ impl<'ctx> Display for ModeledBlock<'ctx> { impl<'ctx, T: ModelingContext<'ctx>> TryFrom<&'ctx [T]> for ModeledBlock<'ctx> { type Error = JingleError; fn try_from(vec: &'ctx [T]) -> Result { - let z3 = vec.first().ok_or(EmptyBlock)?.get_z3(); - let original_state = State::new(z3, vec[0].get_original_state()); + let jingle = vec.first().ok_or(EmptyBlock)?.get_jingle(); + let original_state = State::new(jingle); let state = original_state.clone(); let mut new_block: Self = Self { - z3, + jingle: jingle.clone(), instructions: Default::default(), state, original_state, @@ -61,12 +61,11 @@ impl<'ctx, T: ModelingContext<'ctx>> TryFrom<&'ctx [T]> for ModeledBlock<'ctx> { } impl<'ctx> ModeledBlock<'ctx> { - pub fn read, S: SpaceManager>( - z3: &'ctx Context, - space_manager: &S, + pub fn read>( + jingle: &JingleContext<'ctx>, instr_iter: T, ) -> Result { - let original_state = State::new(z3, space_manager); + let original_state = State::new(jingle); let state = original_state.clone(); let mut block_terminated = false; @@ -95,7 +94,7 @@ impl<'ctx> ModeledBlock<'ctx> { ); let mut model = Self { - z3, + jingle: jingle.clone(), instructions, state, original_state, @@ -110,15 +109,20 @@ impl<'ctx> ModeledBlock<'ctx> { } pub fn fresh(&self) -> Result { - ModeledBlock::read(self.z3, self, self.instructions.clone().into_iter()) + ModeledBlock::read(&self.jingle, self.instructions.clone().into_iter()) } - pub fn get_address(&self) -> u64 { + pub fn 
get_first_address(&self) -> u64 { self.instructions[0].address } + + pub fn get_last_address(&self) -> u64 { + let i = self.instructions.last().unwrap(); + i.address + i.length as u64 + } } -impl<'ctx> SpaceManager for ModeledBlock<'ctx> { +impl SpaceManager for ModeledBlock<'_> { fn get_space_info(&self, idx: usize) -> Option<&SpaceInfo> { self.state.get_space_info(idx) } @@ -132,8 +136,8 @@ impl<'ctx> SpaceManager for ModeledBlock<'ctx> { } impl<'ctx> ModelingContext<'ctx> for ModeledBlock<'ctx> { - fn get_z3(&self) -> &'ctx Context { - self.z3 + fn get_jingle(&self) -> &JingleContext<'ctx> { + &self.jingle } fn get_address(&self) -> u64 { diff --git a/jingle/src/modeling/branch.rs b/jingle/src/modeling/branch.rs index 2b95381..84a7089 100644 --- a/jingle/src/modeling/branch.rs +++ b/jingle/src/modeling/branch.rs @@ -29,13 +29,11 @@ impl BlockEndBehavior { ctx: &'a T, ) -> Result, JingleError> { match self { - Fallthrough(f) => Ok(BV::from_u64(ctx.get_z3(), 0, (f.size * 8) as u32)), + Fallthrough(f) => Ok(BV::from_u64(ctx.get_jingle().z3, 0, (f.size * 8) as u32)), UnconditionalBranch(b) => { match b { // Direct branch - GeneralizedVarNode::Direct(d) => { - ctx.get_final_state().read_varnode_metadata(&d) - } + GeneralizedVarNode::Direct(d) => ctx.get_final_state().read_varnode_metadata(d), // Indirect branch, we want to only inspect the pointer GeneralizedVarNode::Indirect(i) => ctx .get_final_state() @@ -49,13 +47,19 @@ impl BlockEndBehavior { ctx: &'a T, ) -> Result, JingleError> { match self { - Fallthrough(f) => Ok(BV::from_u64(ctx.get_z3(), f.offset, (f.size * 8) as u32)), + Fallthrough(f) => Ok(BV::from_u64( + ctx.get_jingle().z3, + f.offset, + (f.size * 8) as u32, + )), UnconditionalBranch(b) => { match b { // Direct branch - GeneralizedVarNode::Direct(d) => { - Ok(BV::from_u64(ctx.get_z3(), d.offset, (d.size * 8) as u32)) - } + GeneralizedVarNode::Direct(d) => Ok(BV::from_u64( + ctx.get_jingle().z3, + d.offset, + (d.size * 8) as u32, + )), // Indirect 
branch, we want to only inspect the pointer GeneralizedVarNode::Indirect(i) => ctx .get_final_state() @@ -84,7 +88,7 @@ impl BranchConstraint { pub fn has_branch(&self) -> bool { match self.last { - Fallthrough(_) => self.conditional_branches.len() != 0, + Fallthrough(_) => !self.conditional_branches.is_empty(), UnconditionalBranch(_) => true, } } @@ -106,14 +110,14 @@ impl BranchConstraint { .get_final_state() .read_varnode(&cond_branch.condition)? ._eq(&BV::from_i64( - ctx.get_z3(), + ctx.get_jingle().z3, 0, (cond_branch.condition.size * 8) as u32, )) .not(); let branch_dest = match &cond_branch.destination { GeneralizedVarNode::Direct(d) => { - BV::from_u64(ctx.get_z3(), d.offset, (d.size * 8) as u32) + BV::from_u64(ctx.get_jingle().z3, d.offset, (d.size * 8) as u32) } GeneralizedVarNode::Indirect(a) => ctx.get_final_state().read(a.into())?, }; @@ -132,7 +136,7 @@ impl BranchConstraint { .get_final_state() .read_varnode_metadata(&cond_branch.condition)? ._eq(&BV::from_i64( - ctx.get_z3(), + ctx.get_jingle().z3, 0, (&cond_branch.condition.size * 8) as u32, )) diff --git a/jingle/src/modeling/instruction.rs b/jingle/src/modeling/instruction.rs index b95b8b4..613601a 100644 --- a/jingle/src/modeling/instruction.rs +++ b/jingle/src/modeling/instruction.rs @@ -8,14 +8,13 @@ use crate::modeling::branch::BranchConstraint; use crate::modeling::state::State; use crate::varnode::ResolvedVarnode; -use crate::JingleError; +use crate::{JingleContext, JingleError}; use jingle_sleigh::{SpaceInfo, SpaceManager}; -use z3::Context; /// A `jingle` model of an individual SLEIGH instruction #[derive(Debug, Clone)] pub struct ModeledInstruction<'ctx> { - z3: &'ctx Context, + jingle: JingleContext<'ctx>, pub instr: Instruction, state: State<'ctx>, original_state: State<'ctx>, @@ -25,19 +24,15 @@ pub struct ModeledInstruction<'ctx> { } impl<'ctx> ModeledInstruction<'ctx> { - pub fn new( - instr: Instruction, - sleigh: &T, - z3: &'ctx Context, - ) -> Result { - let original_state = 
State::new(z3, sleigh); + pub fn new(instr: Instruction, jingle: &JingleContext<'ctx>) -> Result { + let original_state = State::new(jingle); let state = original_state.clone(); let next_vn = state.get_default_code_space_info().make_varnode( instr.next_addr(), state.get_default_code_space_info().index_size_bytes as usize, ); let mut model = Self { - z3, + jingle: jingle.clone(), instr, state, original_state, @@ -52,11 +47,11 @@ impl<'ctx> ModeledInstruction<'ctx> { } pub fn fresh(&self) -> Result { - ModeledInstruction::new(self.instr.clone(), self, self.z3) + ModeledInstruction::new(self.instr.clone(), &self.jingle) } } -impl<'ctx> SpaceManager for ModeledInstruction<'ctx> { +impl SpaceManager for ModeledInstruction<'_> { fn get_space_info(&self, idx: usize) -> Option<&SpaceInfo> { self.state.get_space_info(idx) } @@ -71,8 +66,8 @@ impl<'ctx> SpaceManager for ModeledInstruction<'ctx> { } impl<'ctx> ModelingContext<'ctx> for ModeledInstruction<'ctx> { - fn get_z3(&self) -> &'ctx Context { - self.z3 + fn get_jingle(&self) -> &JingleContext<'ctx> { + &self.jingle } fn get_address(&self) -> u64 { @@ -124,3 +119,11 @@ impl<'ctx> TranslationContext<'ctx> for ModeledInstruction<'ctx> { &mut self.branch_builder } } + +/*impl<'ctx> From<&[ModeledInstruction<'ctx>]> for ModeledInstruction<'ctx>{ + fn from(value: &[ModeledInstruction<'ctx>]) -> Self { + for instr in value.iter() { + instr. 
+ } + } +}*/ diff --git a/jingle/src/modeling/mod.rs b/jingle/src/modeling/mod.rs index f1e119e..0983d8c 100644 --- a/jingle/src/modeling/mod.rs +++ b/jingle/src/modeling/mod.rs @@ -10,7 +10,6 @@ use std::hash::{DefaultHasher, Hash, Hasher}; use std::ops::{Add, Neg}; use tracing::instrument; use z3::ast::{Ast, Bool, BV}; -use z3::Context; mod block; mod branch; @@ -18,6 +17,7 @@ mod instruction; mod slice; mod state; +use crate::JingleContext; pub use block::ModeledBlock; pub use branch::*; pub use instruction::ModeledInstruction; @@ -29,8 +29,8 @@ pub use state::State; /// defines several helper functions for building formulae /// todo: this should probably be separated out with the extension trait pattern pub trait ModelingContext<'ctx>: SpaceManager + Debug + Sized { - /// Get a handle to the z3 context associated with this modeling context - fn get_z3(&self) -> &'ctx Context; + /// Get a handle to the jingle context associated with this modeling context + fn get_jingle(&self) -> &JingleContext<'ctx>; /// Get the address this context is associated with (e.g. for an instruction, it is the address, /// for a basic block, it is the address of the first instruction). @@ -52,9 +52,9 @@ pub trait ModelingContext<'ctx>: SpaceManager + Debug + Sized { /// from the [State] returned by [get_final_state], as it is guaranteed to have a handle to /// all intermediate spaces that may be referenced fn get_inputs(&self) -> HashSet>; - /// Get a hashset of the addresses written by this trace. The values returned in this hashset are - /// fully modeled: a read from a given varnode will evaluate to its value at the stage in the - /// computation that the read was performed. Because of this, these should always be read + /// Get a hashset of the addresses written by this trace. The values returned in this hashset + /// are fully modeled: a read from a given varnode will evaluate to its value at the stage in + /// the computation that the read was performed. 
Because of this, these should always be read /// from the [State] returned by [get_final_state], as it is guaranteed to have a handle to /// all intermediate spaces that may be referenced fn get_outputs(&self) -> HashSet>; @@ -110,7 +110,7 @@ pub trait ModelingContext<'ctx>: SpaceManager + Debug + Sized { } let p_terms: Vec<&Bool> = premise_terms.iter().collect(); - let premise = Bool::and(self.get_z3(), p_terms.as_slice()); + let premise = Bool::and(self.get_jingle().z3, p_terms.as_slice()); Ok(premise) } @@ -130,11 +130,16 @@ pub trait ModelingContext<'ctx>: SpaceManager + Debug + Sized { .filter(|v| self.should_varnode_constrain(v)) { let ours = self.get_final_state().read_resolved(vn)?; - let other = other.get_final_state().read_resolved(vn)?; - output_terms.push(ours._eq(&other).simplify()); + let other_bv = other.get_final_state().read_resolved(vn)?; + output_terms.push(ours._eq(&other_bv).simplify()); + if let Indirect(a) = vn { + let ours = self.get_final_state().read_varnode(&a.pointer_location)?; + let other = other.get_final_state().read_varnode(&a.pointer_location)?; + output_terms.push(ours._eq(&other).simplify()); + } } let imp_terms: Vec<&Bool> = output_terms.iter().collect(); - let outputs_pairwise_equal = Bool::and(self.get_z3(), imp_terms.as_slice()); + let outputs_pairwise_equal = Bool::and(self.get_jingle().z3, imp_terms.as_slice()); Ok(outputs_pairwise_equal) } @@ -144,24 +149,11 @@ pub trait ModelingContext<'ctx>: SpaceManager + Debug + Sized { &self, other: &T, ) -> Result, JingleError> { - let mut terms = vec![]; - for (i, _) in self - .get_final_state() - .get_all_space_info() - .iter() - .enumerate() - .filter(|(_, n)| n._type == SpaceType::IPTR_PROCESSOR) - { - let other = other.get_original_state().get_space(i)?; - let space = self.get_final_state().get_space(i)?; - terms.push(space._eq(other).simplify()) - } - let eq_terms: Vec<&Bool> = terms.iter().collect(); - Ok(Bool::and(self.get_z3(), eq_terms.as_slice())) + 
self.get_final_state()._eq(other.get_original_state()) } - /// Returns an assertion that [other]'s end-branch behavior is able to branch to the same destination - /// as [self], given that [self] has branching behavior + /// Returns an assertion that [other]'s end-branch behavior is able to branch to the same + /// destination as [self], given that [self] has branching behavior /// todo: should swap self and other to make this align better with [upholds_postcondition] fn branch_comparison>( &self, @@ -180,10 +172,10 @@ pub trait ModelingContext<'ctx>: SpaceManager + Debug + Sized { zext_to_match(self_bv_metadata.simplify(), &other_bv_metadata.simplify()); let other_bv_metadata = zext_to_match(other_bv_metadata, &self_bv_metadata); Ok(Some(Bool::and( - self.get_z3(), + self.get_jingle().z3, &[ - &self_bv._eq(&other_bv).simplify(), - &self_bv_metadata._eq(&other_bv_metadata).simplify(), + self_bv._eq(&other_bv).simplify(), + self_bv_metadata._eq(&other_bv_metadata).simplify(), ], ))) } @@ -192,15 +184,19 @@ pub trait ModelingContext<'ctx>: SpaceManager + Debug + Sized { /// branch to the given [u64] fn can_branch_to_address(&self, addr: u64) -> Result, JingleError> { let branch_constraint = self.get_branch_constraint().build_bv(self)?; - let addr_bv = BV::from_i64(self.get_z3(), addr as i64, branch_constraint.get_size()); + let addr_bv = BV::from_i64( + self.get_jingle().z3, + addr as i64, + branch_constraint.get_size(), + ); Ok(branch_constraint._eq(&addr_bv)) } } /// This trait is used for types that build modeling contexts. This could maybe be a single /// struct instead of a trait. 
-/// The helper methods in here allow for parsing pcode operations into z3 formulae, and automatically -/// tracking the inputs/outputs of each operation and traces composed thereof +/// The helper methods in here allow for parsing pcode operations into z3 formulae, and +/// automatically tracking the inputs/outputs of each operation and traces composed thereof pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { /// Adds a [GeneralizedVarNode] to the "input care set" for this operation. /// This is usually used for asserting equality of all input varnodes when @@ -233,6 +229,7 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { .clone(); self.track_input(&Indirect(ResolvedIndirectVarNode { pointer, + pointer_location: indirect.pointer_location.clone(), access_size_bytes: indirect.access_size_bytes, pointer_space_idx: indirect.pointer_space_index, })); @@ -255,6 +252,7 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { let pointer = self.read_and_track(indirect.pointer_location.clone().into())?; self.track_output(&Indirect(ResolvedIndirectVarNode { pointer, + pointer_location: indirect.pointer_location.clone(), access_size_bytes: indirect.access_size_bytes, pointer_space_idx: indirect.pointer_space_index, })); @@ -444,10 +442,10 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { input1, output, } => { - let bv1 = self.read_and_track(input0.into())?; + let mut bv1 = self.read_and_track(input0.into())?; let mut bv2 = self.read_and_track(input1.into())?; match bv1.get_size().cmp(&bv2.get_size()) { - Ordering::Less => bv2 = bv2.extract(bv1.get_size() - 1, 0), + Ordering::Less => bv1 = bv1.zero_ext(bv2.get_size() - bv1.get_size()), Ordering::Greater => bv2 = bv2.zero_ext(bv1.get_size() - bv2.get_size()), _ => {} } @@ -464,8 +462,8 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { // bool arg seems to be for whether this check is signed let carry_bool = in0.bvadd_no_overflow(&in1, 
false); let out_bv = carry_bool.ite( - &BV::from_i64(self.get_z3(), 0, 8), - &BV::from_i64(self.get_z3(), 1, 8), + &BV::from_i64(self.get_jingle().z3, 0, 8), + &BV::from_i64(self.get_jingle().z3, 1, 8), ); self.write(&output.into(), out_bv) } @@ -479,8 +477,8 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { // bool arg seems to be for whether this check is signed let carry_bool = in0.bvadd_no_overflow(&in1, true); let out_bv = carry_bool.ite( - &BV::from_i64(self.get_z3(), 0, 8), - &BV::from_i64(self.get_z3(), 1, 8), + &BV::from_i64(self.get_jingle().z3, 0, 8), + &BV::from_i64(self.get_jingle().z3, 1, 8), ); self.write(&output.into(), out_bv) } @@ -495,16 +493,16 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { // meaning of "overflow" is in sleigh vs what it means in z3 let borrow_bool = in0.bvsub_no_underflow(&in1, true); let out_bv = borrow_bool.ite( - &BV::from_i64(self.get_z3(), 0, 8), - &BV::from_i64(self.get_z3(), 1, 8), + &BV::from_i64(self.get_jingle().z3, 0, 8), + &BV::from_i64(self.get_jingle().z3, 1, 8), ); self.write(&output.into(), out_bv) } PcodeOperation::Int2Comp { input, output } => { let in0 = self.read_and_track(input.into())?; - let flipped = in0 - .bvneg() - .add(BV::from_u64(self.get_z3(), 1, in0.get_size())); + let flipped = + in0.bvneg() + .add(BV::from_u64(self.get_jingle().z3, 1, in0.get_size())); self.write(&output.into(), flipped) } PcodeOperation::IntSignedLess { @@ -516,8 +514,22 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { let in1 = self.read_and_track(input1.into())?; let out_bool = in0.bvslt(&in1); let out_bv = out_bool.ite( - &BV::from_i64(self.get_z3(), 1, 8), - &BV::from_i64(self.get_z3(), 0, 8), + &BV::from_i64(self.get_jingle().z3, 1, 8), + &BV::from_i64(self.get_jingle().z3, 0, 8), + ); + self.write(&output.into(), out_bv) + } + PcodeOperation::IntSignedLessEqual { + input0, + input1, + output, + } => { + let in0 = self.read_and_track(input0.into())?; + let 
in1 = self.read_and_track(input1.into())?; + let out_bool = in0.bvsle(&in1); + let out_bv = out_bool.ite( + &BV::from_i64(self.get_jingle().z3, 1, 8), + &BV::from_i64(self.get_jingle().z3, 0, 8), ); self.write(&output.into(), out_bv) } @@ -530,8 +542,22 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { let in1 = self.read_and_track(input1.into())?; let out_bool = in0.bvult(&in1); let out_bv = out_bool.ite( - &BV::from_i64(self.get_z3(), 1, 8), - &BV::from_i64(self.get_z3(), 0, 8), + &BV::from_i64(self.get_jingle().z3, 1, 8), + &BV::from_i64(self.get_jingle().z3, 0, 8), + ); + self.write(&output.into(), out_bv) + } + PcodeOperation::IntLessEqual { + input0, + input1, + output, + } => { + let in0 = self.read_and_track(input0.into())?; + let in1 = self.read_and_track(input1.into())?; + let out_bool = in0.bvule(&in1); + let out_bv = out_bool.ite( + &BV::from_i64(self.get_jingle().z3, 1, 8), + &BV::from_i64(self.get_jingle().z3, 0, 8), ); self.write(&output.into(), out_bv) } @@ -545,8 +571,8 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { let outsize = output.size as u32; let out_bool = in0._eq(&in1); let out_bv = out_bool.ite( - &BV::from_i64(self.get_z3(), 1, outsize * 8), - &BV::from_i64(self.get_z3(), 0, outsize * 8), + &BV::from_i64(self.get_jingle().z3, 1, outsize * 8), + &BV::from_i64(self.get_jingle().z3, 0, outsize * 8), ); self.write(&output.into(), out_bv) } @@ -560,8 +586,8 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { let outsize = output.size as u32; let out_bool = in0._eq(&in1).not(); let out_bv = out_bool.ite( - &BV::from_i64(self.get_z3(), 1, outsize * 8), - &BV::from_i64(self.get_z3(), 0, outsize * 8), + &BV::from_i64(self.get_jingle().z3, 1, outsize * 8), + &BV::from_i64(self.get_jingle().z3, 0, outsize * 8), ); self.write(&output.into(), out_bv) } @@ -572,16 +598,16 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { } => { let i0 = 
self.read_and_track(input0.into())?; let i1 = self.read_and_track(input1.into())?; - let result = i0 - .bvand(&i1) - .bvand(&BV::from_u64(self.get_z3(), 1, i0.get_size())); + let result = + i0.bvand(&i1) + .bvand(&BV::from_u64(self.get_jingle().z3, 1, i0.get_size())); self.write(&output.into(), result) } PcodeOperation::BoolNegate { input, output } => { let val = self.read_and_track(input.into())?; - let negated = val - .bvneg() - .bvand(&BV::from_u64(self.get_z3(), 1, val.get_size())); + let negated = + val.bvneg() + .bvand(&BV::from_u64(self.get_jingle().z3, 1, val.get_size())); self.write(&output.into(), negated) } PcodeOperation::BoolOr { @@ -591,9 +617,9 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { } => { let i0 = self.read_and_track(input0.into())?; let i1 = self.read_and_track(input1.into())?; - let result = i0 - .bvor(&i1) - .bvand(&BV::from_u64(self.get_z3(), 1, i0.get_size())); + let result = + i0.bvor(&i1) + .bvand(&BV::from_u64(self.get_jingle().z3, 1, i0.get_size())); self.write(&output.into(), result) } PcodeOperation::BoolXor { @@ -603,15 +629,15 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { } => { let i0 = self.read_and_track(input0.into())?; let i1 = self.read_and_track(input1.into())?; - let result = i0 - .bvxor(&i1) - .bvand(&BV::from_u64(self.get_z3(), 1, i0.get_size())); + let result = + i0.bvxor(&i1) + .bvand(&BV::from_u64(self.get_jingle().z3, 1, i0.get_size())); self.write(&output.into(), result) } PcodeOperation::PopCount { input, output } => { let size = output.size as u32; let in0 = self.read_and_track(input.into())?; - let mut outbv = BV::from_i64(self.get_z3(), 0, output.size as u32 * 8); + let mut outbv = BV::from_i64(self.get_jingle().z3, 0, output.size as u32 * 8); for i in 0..size * 8 { let extract = in0.extract(i, i); let extend = extract.zero_ext((size * 8) - 1); @@ -659,14 +685,15 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { let output_size = output.size as 
u32; let size = min(input_size, output_size); let input = bv0.extract((input_low_byte + size) * 8 - 1, input_low_byte * 8); - if size < output_size { - self.write(&output.into(), input.zero_ext((output_size - size) * 8))?; - } else if output_size < size { - self.write(&output.into(), input.extract(output_size * 8 - 1, 0))?; - } else { - self.write(&output.into(), input)?; + match size.cmp(&output_size) { + Ordering::Less => { + self.write(&output.into(), input.zero_ext((output_size - size) * 8)) + } + Ordering::Greater => { + self.write(&output.into(), input.extract(output_size * 8 - 1, 0)) + } + Ordering::Equal => self.write(&output.into(), input), } - Ok(()) } PcodeOperation::CallOther { inputs, output } => { let mut hasher = DefaultHasher::new(); @@ -692,7 +719,7 @@ pub(crate) trait TranslationContext<'ctx>: ModelingContext<'ctx> { self.get_branch_builder().set_last(&hash_vn.into()); if let Some(out) = output { let size = out.size * 8; - let hash_bv = BV::from_u64(self.get_z3(), hash, size as u32); + let hash_bv = BV::from_u64(self.get_jingle().z3, hash, size as u32); let metadata = self .get_final_state() .immediate_metadata_array(true, out.size); diff --git a/jingle/src/modeling/slice.rs b/jingle/src/modeling/slice.rs index ce24e67..6dbf046 100644 --- a/jingle/src/modeling/slice.rs +++ b/jingle/src/modeling/slice.rs @@ -1,12 +1,12 @@ use crate::modeling::{BranchConstraint, ModelingContext, State}; use crate::varnode::ResolvedVarnode; +use crate::JingleContext; use jingle_sleigh::PcodeOperation; use std::collections::HashSet; -use z3::Context; impl<'ctx, T: ModelingContext<'ctx>> ModelingContext<'ctx> for &[T] { - fn get_z3(&self) -> &'ctx Context { - self[0].get_z3() + fn get_jingle(&self) -> &JingleContext<'ctx> { + self[0].get_jingle() } fn get_address(&self) -> u64 { diff --git a/jingle/src/modeling/state/mod.rs b/jingle/src/modeling/state/mod.rs index 303f385..72febcd 100644 --- a/jingle/src/modeling/state/mod.rs +++ b/jingle/src/modeling/state/mod.rs @@ 
-2,56 +2,67 @@ mod space; use crate::error::JingleError; use crate::error::JingleError::{ - ConstantWrite, IndirectConstantRead, Mismatched, UnexpectedArraySort, UnmodeledSpace, + ConstantWrite, IndirectConstantRead, MismatchedWordSize, UnexpectedArraySort, UnmodeledSpace, ZeroSizedVarnode, }; use crate::modeling::state::space::ModeledSpace; use crate::varnode::ResolvedVarnode; +use crate::JingleContext; use jingle_sleigh::{ - GeneralizedVarNode, IndirectVarNode, SpaceInfo, SpaceManager, SpaceType, VarNode, + GeneralizedVarNode, IndirectVarNode, RegisterManager, SpaceInfo, SpaceManager, SpaceType, + VarNode, }; use std::ops::Add; -use z3::ast::{Array, Ast, BV}; -use z3::Context; +use z3::ast::{Array, Ast, Bool, BV}; /// Represents the modeled combined memory state of the system. State /// is represented with Z3 formulas built up as select and store operations /// on an initial state #[derive(Clone, Debug)] pub struct State<'ctx> { - z3: &'ctx Context, - space_info: Vec, + jingle: JingleContext<'ctx>, spaces: Vec>, - default_code_space_index: usize, } -impl<'ctx> SpaceManager for State<'ctx> { +impl SpaceManager for State<'_> { fn get_space_info(&self, idx: usize) -> Option<&SpaceInfo> { - self.space_info.get(idx) + self.jingle.get_space_info(idx) } fn get_all_space_info(&self) -> &[SpaceInfo] { - self.space_info.as_slice() + self.jingle.get_all_space_info() } fn get_code_space_idx(&self) -> usize { - self.default_code_space_index + self.jingle.get_code_space_idx() + } +} + +impl RegisterManager for State<'_> { + fn get_register(&self, name: &str) -> Option { + self.jingle.get_register(name) + } + + fn get_register_name(&self, location: &VarNode) -> Option<&str> { + self.jingle.get_register_name(location) + } + + fn get_registers(&self) -> Vec<(VarNode, String)> { + self.jingle.get_registers() } } impl<'ctx> State<'ctx> { - pub fn new(z3: &'ctx Context, other: &T) -> Self { - let mut s: Self = Self { - z3, - space_info: other.get_all_space_info().to_vec(), - spaces: 
Default::default(), - default_code_space_index: other.get_code_space_idx(), - }; - for space_info in other.get_all_space_info() { - s.spaces.push(ModeledSpace::new(s.z3, space_info)); + pub fn new(jingle: &JingleContext<'ctx>) -> Self { + let mut spaces: Vec = Default::default(); + for space_info in jingle.get_all_space_info() { + spaces.push(ModeledSpace::new(jingle, space_info)); + } + Self { + jingle: jingle.clone(), + spaces, } - s } pub fn get_space(&self, idx: usize) -> Result<&Array<'ctx>, JingleError> { @@ -67,13 +78,16 @@ impl<'ctx> State<'ctx> { .ok_or(UnmodeledSpace)?; match space._type { SpaceType::IPTR_CONSTANT => Ok(BV::from_i64( - self.z3, + self.jingle.z3, varnode.offset as i64, (varnode.size * 8) as u32, )), _ => { - let offset = - BV::from_i64(self.z3, varnode.offset as i64, space.index_size_bytes * 8); + let offset = BV::from_i64( + self.jingle.z3, + varnode.offset as i64, + space.index_size_bytes * 8, + ); let arr = self.spaces.get(varnode.space_index).ok_or(UnmodeledSpace)?; arr.read_data(&offset, varnode.size) } @@ -85,7 +99,11 @@ impl<'ctx> State<'ctx> { .get_space_info(varnode.space_index) .ok_or(UnmodeledSpace)?; - let offset = BV::from_i64(self.z3, varnode.offset as i64, space.index_size_bytes * 8); + let offset = BV::from_i64( + self.jingle.z3, + varnode.offset as i64, + space.index_size_bytes * 8, + ); let arr = self.spaces.get(varnode.space_index).ok_or(UnmodeledSpace)?; arr.read_metadata(&offset, varnode.size) } @@ -149,10 +167,13 @@ impl<'ctx> State<'ctx> { val: BV<'b>, ) -> Result<(), JingleError> { if dest.size as u32 * 8 != val.get_size() { - dbg!(dest.size, val.get_size()); - return Err(Mismatched); + return Err(MismatchedWordSize); } - match self.space_info[dest.space_index]._type { + let info = self + .jingle + .get_space_info(dest.space_index) + .ok_or(UnmodeledSpace)?; + match info._type { SpaceType::IPTR_CONSTANT => Err(ConstantWrite), _ => { let space = self @@ -161,12 +182,8 @@ impl<'ctx> State<'ctx> { 
.ok_or(UnmodeledSpace)?; space.write_data( &val, - &BV::from_u64( - self.z3, - dest.offset, - self.space_info[dest.space_index].index_size_bytes * 8, - ), - ); + &BV::from_u64(self.jingle.z3, dest.offset, info.index_size_bytes * 8), + )?; Ok(()) } } @@ -178,39 +195,42 @@ impl<'ctx> State<'ctx> { val: BV<'b>, ) -> Result<(), JingleError> { if dest.size != val.get_size() as usize { - return Err(Mismatched); - } - match self.space_info[dest.space_index]._type { - // We are allowing writes to the constant space for metadata - // to allow flagging userop values for syscalls - _ => { - let space = self - .spaces - .get_mut(dest.space_index) - .ok_or(UnmodeledSpace)?; - space.write_metadata( - &val, - &BV::from_u64( - self.z3, - dest.offset, - self.space_info[dest.space_index].index_size_bytes * 8, - ), - ); - Ok(()) - } + return Err(MismatchedWordSize); } + // We are allowing writes to the constant space for metadata + // to allow flagging userop values for syscalls + let space = self + .spaces + .get_mut(dest.space_index) + .ok_or(UnmodeledSpace)?; + let info = self + .jingle + .get_space_info(dest.space_index) + .ok_or(UnmodeledSpace)?; + + space.write_metadata( + &val, + &BV::from_u64(self.jingle.z3, dest.offset, info.index_size_bytes * 8), + )?; + Ok(()) } + /// Model a write to an [IndirectVarNode] on top of the current context. 
pub fn write_varnode_indirect<'a>( &'a mut self, dest: &IndirectVarNode, val: BV<'ctx>, ) -> Result<(), JingleError> { - if self.space_info[dest.pointer_space_index]._type == SpaceType::IPTR_CONSTANT { + let info = self + .jingle + .get_space_info(dest.pointer_space_index) + .ok_or(UnmodeledSpace)?; + + if info._type == SpaceType::IPTR_CONSTANT { return Err(ConstantWrite); } let ptr = self.read_varnode(&dest.pointer_location)?; - self.spaces[dest.pointer_space_index].write_data(&val, &ptr); + self.spaces[dest.pointer_space_index].write_data(&val, &ptr)?; Ok(()) } @@ -219,11 +239,16 @@ impl<'ctx> State<'ctx> { dest: &IndirectVarNode, val: BV<'ctx>, ) -> Result<(), JingleError> { - if self.space_info[dest.pointer_space_index]._type == SpaceType::IPTR_CONSTANT { + let info = self + .jingle + .get_space_info(dest.pointer_space_index) + .ok_or(UnmodeledSpace)?; + + if info._type == SpaceType::IPTR_CONSTANT { return Err(ConstantWrite); } let ptr = self.read_varnode(&dest.pointer_location)?; - self.spaces[dest.pointer_space_index].write_metadata(&val, &ptr); + self.spaces[dest.pointer_space_index].write_metadata(&val, &ptr)?; Ok(()) } @@ -249,11 +274,13 @@ impl<'ctx> State<'ctx> { } pub fn get_default_code_space(&self) -> &Array<'ctx> { - self.spaces[self.default_code_space_index].get_space() + self.spaces[self.jingle.get_code_space_idx()].get_space() } pub fn get_default_code_space_info(&self) -> &SpaceInfo { - &self.space_info[self.default_code_space_index] + self.jingle + .get_space_info(self.jingle.get_code_space_idx()) + .unwrap() } pub(crate) fn immediate_metadata_array(&self, val: bool, s: usize) -> BV<'ctx> { @@ -262,9 +289,33 @@ impl<'ctx> State<'ctx> { false => 0, }; (0..s) - .map(|_| BV::from_u64(self.z3, val, 1)) + .map(|_| BV::from_u64(self.jingle.z3, val, 1)) .reduce(|a, b| a.concat(&b)) .map(|b| b.simplify()) .unwrap() } + + pub fn _eq(&self, other: &State<'ctx>) -> Result, JingleError> { + let mut terms = vec![]; + for (i, _) in self + 
.get_all_space_info() + .iter() + .enumerate() + .filter(|(_, n)| n._type == SpaceType::IPTR_PROCESSOR) + { + let self_space = self.get_space(i)?; + let other_space = other.get_space(i)?; + terms.push(self_space._eq(other_space)) + } + let eq_terms: Vec<&Bool> = terms.iter().collect(); + Ok(Bool::and(self.jingle.z3, eq_terms.as_slice())) + } + + pub fn fmt_smt_arrays(&self) -> String { + let mut lines = vec![]; + for x in &self.spaces { + lines.push(x.fmt_smt_array()) + } + lines.join("\n") + } } diff --git a/jingle/src/modeling/state/space.rs b/jingle/src/modeling/state/space.rs index 13d43df..b340ca0 100644 --- a/jingle/src/modeling/state/space.rs +++ b/jingle/src/modeling/state/space.rs @@ -1,34 +1,36 @@ -use crate::JingleError; -use crate::JingleError::{UnexpectedArraySort, ZeroSizedVarnode}; +use crate::JingleError::{MismatchedAddressSize, UnexpectedArraySort, ZeroSizedVarnode}; +use crate::{JingleContext, JingleError}; use jingle_sleigh::{SleighEndianness, SpaceInfo}; use std::ops::Add; -use z3::ast::{Array, BV}; -use z3::{Context, Sort}; +use z3::ast::{Array, Ast, BV}; +use z3::Sort; /// SLEIGH models programs using many spaces. This struct serves as a helper for modeling a single /// space. `jingle` uses an SMT Array sort to model a space. /// /// `jingle` also maintains a separate Array holding "metadata" for the space. For right now, this /// metadata has a single-bit bitvector as its word type, and it is only used for tracking whether -/// a given value originated from a CALLOTHER operation. This is necessary for distinguishing between -/// normal indirect jumps and some syscalls +/// a given value originated from a CALLOTHER operation. 
This is necessary for distinguishing +/// between normal indirect jumps and some syscalls #[derive(Clone, Debug)] pub(crate) struct ModeledSpace<'ctx> { endianness: SleighEndianness, data: Array<'ctx>, #[allow(unused)] metadata: Array<'ctx>, + space_info: SpaceInfo, } impl<'ctx> ModeledSpace<'ctx> { /// Create a new modeling space with the given z3 context, using the provided space metadata - pub(crate) fn new(z3: &'ctx Context, space_info: &SpaceInfo) -> Self { - let domain = Sort::bitvector(z3, space_info.index_size_bytes * 8); - let range = Sort::bitvector(z3, space_info.word_size_bytes * 8); + pub(crate) fn new(jingle: &JingleContext<'ctx>, space_info: &SpaceInfo) -> Self { + let domain = Sort::bitvector(jingle.z3, space_info.index_size_bytes * 8); + let range = Sort::bitvector(jingle.z3, space_info.word_size_bytes * 8); Self { endianness: space_info.endianness, - data: Array::fresh_const(z3, &space_info.name, &domain, &range), - metadata: Array::const_array(z3, &domain, &BV::from_u64(z3, 0, 1)), + data: Array::fresh_const(jingle.z3, &space_info.name, &domain, &range), + metadata: Array::const_array(jingle.z3, &domain, &BV::from_u64(jingle.z3, 0, 1)), + space_info: space_info.clone(), } } @@ -43,6 +45,9 @@ impl<'ctx> ModeledSpace<'ctx> { offset: &BV<'ctx>, size_bytes: usize, ) -> Result, JingleError> { + if offset.get_size() != self.space_info.index_size_bytes * 8 { + return Err(MismatchedAddressSize); + } read_from_array(&self.data, offset, size_bytes, self.endianness) } @@ -53,17 +58,40 @@ impl<'ctx> ModeledSpace<'ctx> { offset: &BV<'ctx>, size_bytes: usize, ) -> Result, JingleError> { + if offset.get_size() != self.space_info.index_size_bytes * 8 { + return Err(MismatchedAddressSize); + } read_from_array(&self.metadata, offset, size_bytes, self.endianness) } /// Write the given bitvector of data to the given bitvector offset - pub(crate) fn write_data(&mut self, val: &BV<'ctx>, offset: &BV<'ctx>) { - self.data = write_to_array::<8>(&self.data, val, offset, 
self.endianness) + pub(crate) fn write_data( + &mut self, + val: &BV<'ctx>, + offset: &BV<'ctx>, + ) -> Result<(), JingleError> { + if offset.get_size() != self.space_info.index_size_bytes * 8 { + return Err(MismatchedAddressSize); + } + self.data = write_to_array::<8>(&self.data, val, offset, self.endianness); + Ok(()) } /// Write the given bitvector of metadata to the given bitvector offset - pub(crate) fn write_metadata(&mut self, val: &BV<'ctx>, offset: &BV<'ctx>) { - self.metadata = write_to_array::<1>(&self.metadata, val, offset, self.endianness) + pub(crate) fn write_metadata( + &mut self, + val: &BV<'ctx>, + offset: &BV<'ctx>, + ) -> Result<(), JingleError> { + if offset.get_size() != self.space_info.index_size_bytes * 8 { + return Err(MismatchedAddressSize); + } + self.metadata = write_to_array::<1>(&self.metadata, val, offset, self.endianness); + Ok(()) + } + + pub(crate) fn fmt_smt_array(&self) -> String { + format!("{:?}", self.data.simplify()) } } @@ -110,11 +138,17 @@ fn write_to_array<'ctx, const W: u32>( #[cfg(test)] mod tests { use crate::modeling::state::space::ModeledSpace; + use crate::tests::SLEIGH_ARCH; + use crate::JingleContext; + use jingle_sleigh::context::SleighContextBuilder; use jingle_sleigh::{SleighEndianness, SpaceInfo, SpaceType}; use z3::ast::{Ast, BV}; use z3::{Config, Context}; - fn make_space(z3: &Context, endianness: SleighEndianness) -> ModeledSpace { + fn make_space<'ctx>( + z3: &JingleContext<'ctx>, + endianness: SleighEndianness, + ) -> ModeledSpace<'ctx> { let space_info = SpaceInfo { endianness, name: "ram".to_string(), @@ -123,15 +157,21 @@ mod tests { index: 0, _type: SpaceType::IPTR_PROCESSOR, }; - ModeledSpace::new(&z3, &space_info) + ModeledSpace::new(z3, &space_info) } fn test_endian_write(e: SleighEndianness) { + let ctx_builder = + SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap(); + let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap(); let z3 = Context::new(&Config::new()); - let mut 
space = make_space(&z3, e); - space.write_data( - &BV::from_u64(&z3, 0xdead_beef, 32), - &BV::from_u64(&z3, 0, 32), - ); + let jingle = JingleContext::new(&z3, &sleigh); + let mut space = make_space(&jingle, e); + space + .write_data( + &BV::from_u64(&z3, 0xdead_beef, 32), + &BV::from_u64(&z3, 0, 32), + ) + .unwrap(); let expected = match e { SleighEndianness::Big => [0xde, 0xad, 0xbe, 0xef], SleighEndianness::Little => [0xef, 0xbe, 0xad, 0xde], @@ -147,17 +187,23 @@ mod tests { } fn test_endian_read(e: SleighEndianness) { + let ctx_builder = + SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap(); + let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap(); let z3 = Context::new(&Config::new()); - let mut space = make_space(&z3, e); + let jingle = JingleContext::new(&z3, &sleigh); + let mut space = make_space(&jingle, e); let byte_layout = match e { SleighEndianness::Big => [0xde, 0xad, 0xbe, 0xef], SleighEndianness::Little => [0xef, 0xbe, 0xad, 0xde], }; for i in 0..4 { - space.write_data( - &BV::from_u64(&z3, byte_layout[i as usize], 8), - &BV::from_u64(&z3, i, 32), - ); + space + .write_data( + &BV::from_u64(&z3, byte_layout[i as usize], 8), + &BV::from_u64(&z3, i, 32), + ) + .unwrap(); } let val = space .read_data(&BV::from_u64(&z3, 0, 32), 4) @@ -168,9 +214,15 @@ mod tests { } fn test_single_write(e: SleighEndianness) { + let ctx_builder = + SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap(); + let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap(); let z3 = Context::new(&Config::new()); - let mut space = make_space(&z3, e); - space.write_data(&BV::from_u64(&z3, 0x42, 8), &BV::from_u64(&z3, 0, 32)); + let jingle = JingleContext::new(&z3, &sleigh); + let mut space = make_space(&jingle, e); + space + .write_data(&BV::from_u64(&z3, 0x42, 8), &BV::from_u64(&z3, 0, 32)) + .unwrap(); let expected = 0x42; let data = space .read_data(&BV::from_u64(&z3, 0, 32), 1) diff --git a/jingle/src/translator.rs 
b/jingle/src/translator.rs index 2927a95..9b81bab 100644 --- a/jingle/src/translator.rs +++ b/jingle/src/translator.rs @@ -1,8 +1,9 @@ use crate::error::JingleError; -use jingle_sleigh::context::SleighContext; use jingle_sleigh::{Instruction, RegisterManager, SpaceInfo, VarNode}; use crate::modeling::ModeledInstruction; +use crate::JingleContext; +use jingle_sleigh::context::loaded::LoadedSleighContext; use jingle_sleigh::JingleSleighError::InstructionDecode; use jingle_sleigh::SpaceManager; use z3::Context; @@ -12,14 +13,15 @@ use z3::Context; /// modeling them in one go #[derive(Debug, Clone)] pub struct SleighTranslator<'ctx> { - z3_ctx: &'ctx Context, - sleigh: &'ctx SleighContext, + jingle: JingleContext<'ctx>, + sleigh: &'ctx LoadedSleighContext<'ctx>, } impl<'ctx> SleighTranslator<'ctx> { /// Make a new sleigh translator - pub fn new(sleigh: &'ctx SleighContext, z3_ctx: &'ctx Context) -> Self { - Self { z3_ctx, sleigh } + pub fn new(sleigh: &'ctx LoadedSleighContext, z3_ctx: &'ctx Context) -> Self { + let jingle = JingleContext::new(z3_ctx, sleigh); + Self { jingle, sleigh } } /// Ask sleigh to read one instruction from the given offset and attempt @@ -31,8 +33,7 @@ impl<'ctx> SleighTranslator<'ctx> { ) -> Result, JingleError> { let op = self .sleigh - .read(offset, 1) - .next() + .instruction_at(offset) .ok_or(InstructionDecode)?; self.model_instruction(op) } @@ -42,11 +43,11 @@ impl<'ctx> SleighTranslator<'ctx> { &self, instr: Instruction, ) -> Result, JingleError> { - ModeledInstruction::new(instr, self.sleigh, self.z3_ctx) + ModeledInstruction::new(instr, &self.jingle) } } -impl<'ctx> SpaceManager for SleighTranslator<'ctx> { +impl SpaceManager for SleighTranslator<'_> { fn get_space_info(&self, idx: usize) -> Option<&SpaceInfo> { self.sleigh.get_space_info(idx) } @@ -60,12 +61,12 @@ impl<'ctx> SpaceManager for SleighTranslator<'ctx> { } } -impl<'ctx> RegisterManager for SleighTranslator<'ctx> { +impl RegisterManager for SleighTranslator<'_> { fn 
get_register(&self, name: &str) -> Option { self.sleigh.get_register(name) } - fn get_register_name(&self, location: VarNode) -> Option<&str> { + fn get_register_name(&self, location: &VarNode) -> Option<&str> { self.sleigh.get_register_name(location) } diff --git a/jingle/src/varnode/display.rs b/jingle/src/varnode/display.rs index af88f6a..adde07e 100644 --- a/jingle/src/varnode/display.rs +++ b/jingle/src/varnode/display.rs @@ -15,7 +15,7 @@ pub enum ResolvedVarNodeDisplay<'ctx> { Indirect(ResolvedIndirectVarNodeDisplay<'ctx>), } -impl<'ctx> Display for ResolvedVarNodeDisplay<'ctx> { +impl Display for ResolvedVarNodeDisplay<'_> { fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { match self { ResolvedVarNodeDisplay::Direct(d) => d.fmt(f), diff --git a/jingle/src/varnode/mod.rs b/jingle/src/varnode/mod.rs index adcfc6d..59204d1 100644 --- a/jingle/src/varnode/mod.rs +++ b/jingle/src/varnode/mod.rs @@ -3,7 +3,7 @@ mod display; use crate::error::JingleError; use crate::error::JingleError::UnmodeledSpace; use crate::varnode::display::{ResolvedIndirectVarNodeDisplay, ResolvedVarNodeDisplay}; -use jingle_sleigh::SpaceManager; +use jingle_sleigh::RegisterManager; use jingle_sleigh::VarNode; use std::hash::Hash; use z3::ast::BV; @@ -12,6 +12,7 @@ use z3::ast::BV; pub struct ResolvedIndirectVarNode<'ctx> { pub pointer_space_idx: usize, pub pointer: BV<'ctx>, + pub pointer_location: VarNode, pub access_size_bytes: usize, } @@ -24,8 +25,11 @@ pub enum ResolvedVarnode<'ctx> { Indirect(ResolvedIndirectVarNode<'ctx>), } -impl<'ctx> ResolvedVarnode<'ctx> { - pub fn display(&self, ctx: &T) -> Result { +impl ResolvedVarnode<'_> { + pub fn display( + &self, + ctx: &T, + ) -> Result { match self { ResolvedVarnode::Direct(d) => Ok(ResolvedVarNodeDisplay::Direct(d.display(ctx)?)), ResolvedVarnode::Indirect(i) => Ok(ResolvedVarNodeDisplay::Indirect( diff --git a/jingle_sleigh/.gitignore b/jingle_sleigh/.gitignore index 3954c15..6bcecb0 100644 --- a/jingle_sleigh/.gitignore 
+++ b/jingle_sleigh/.gitignore
@@ -1,3 +1,4 @@
 cmake-build-debug
 .idea
+.cache
 build
diff --git a/jingle_sleigh/CMakeLists.txt b/jingle_sleigh/CMakeLists.txt
index 5066462..3f25b7f 100644
--- a/jingle_sleigh/CMakeLists.txt
+++ b/jingle_sleigh/CMakeLists.txt
@@ -30,7 +30,17 @@ add_library(jingle_sleigh_cpp
         src/ffi/cpp/compile.cpp
         src/ffi/cpp/addrspace_handle.cpp
         src/ffi/cpp/addrspace_manager_handle.cpp
-        src/ffi/cpp/context.h)
+        src/ffi/cpp/context.h
+        src/ffi/cpp/sleigh_image.cpp
+        src/ffi/cpp/sleigh_image.h
+        src/ffi/cpp/exception.h
+        src/ffi/cpp/varnode_translation.cpp
+        src/ffi/cpp/varnode_translation.h
+        src/ffi/cpp/jingle_pcode_emitter.cpp
+        src/ffi/cpp/jingle_assembly_emitter.cpp
+        src/ffi/cpp/jingle_assembly_emitter.h
+        src/ffi/cpp/rust_load_image.cpp
+        src/ffi/cpp/rust_load_image.h)
 
 add_executable(sleigh_compile
         src/ffi/cpp/sleigh/address.cc
diff --git a/jingle_sleigh/Cargo.toml b/jingle_sleigh/Cargo.toml
index 1500e81..61f0126 100644
--- a/jingle_sleigh/Cargo.toml
+++ b/jingle_sleigh/Cargo.toml
@@ -20,21 +20,18 @@ include = [
 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
 
 [dependencies]
-cxx = "1.0.120"
-serde = { version = "1.0.197", features = ["derive"] }
+cxx = "1.0.131"
+serde = { version = "1.0.203", features = ["derive"] }
 serde-xml-rs = "0.6.0"
-thiserror = { version = "1.0.58", features = [] }
-elf = { version = "0.7.4", optional = true }
-object = { version = "0.35.0", optional = true }
+thiserror = { version = "1.0.61", features = [] }
+object = { version = "0.36.0", optional = true }
 tracing = "0.1.40"
 
 [build-dependencies]
-cxx-build = "1.0.120"
+cxx-build = "1.0.131"
 
 [features]
-compile = []
-elf = ["dep:elf"]
 gimli = ["dep:object"]
-default = ["elf", "gimli"]
+default = ["gimli"]
diff --git a/jingle_sleigh/build.rs b/jingle_sleigh/build.rs
index 0efc1f5..bcd0a79 100644
--- a/jingle_sleigh/build.rs
+++ b/jingle_sleigh/build.rs
@@ -1,11 +1,104 @@
 use std::fs;
 use std::fs::copy;
-use
std::path::PathBuf; +use std::path::{Path, PathBuf}; + +const SLEIGH_SOURCES: &[&str] = &[ + "address.cc", + "compression.cc", + "context.cc", + "globalcontext.cc", + "float.cc", + "marshal.cc", + "opcodes.cc", + "pcoderaw.cc", + "semantics.cc", + "slaformat.cc", + "sleigh.cc", + "sleighbase.cc", + "slghpatexpress.cc", + "slghpattern.cc", + "slghsymbol.cc", + "space.cc", + "translate.cc", + "xml.cc", + "filemanage.cc", + "pcodecompile.cc", +]; + +const SLEIGH_HEADERS: &[&str] = &[ + "address.hh", + "compression.hh", + "context.hh", + "error.hh", + "filemanage.hh", + "float.hh", + "globalcontext.hh", + "loadimage.hh", + "marshal.hh", + "opbehavior.hh", + "opcodes.hh", + "partmap.hh", + "pcodecompile.hh", + "pcoderaw.hh", + "semantics.hh", + "slaformat.hh", + "sleigh.hh", + "sleighbase.hh", + "slghpatexpress.hh", + "slghpattern.hh", + "slghsymbol.hh", + "space.hh", + "translate.hh", + "types.h", + "xml.hh", +]; + +const ZLIB_HEADERS: &[&str] = &[ + "deflate.h", + "gzguts.h", + "inffast.h", + "inffixed.h", + "inflate.h", + "inftrees.h", + "trees.h", + "zconf.h", + "zlib.h", + "zutil.h", +]; + +const ZLIB_SOURCES: &[&str] = &[ + "deflate.c", + "inflate.c", + "zutil.c", + "inftrees.c", + "inffast.c", + "trees.c", + "adler32.c", +]; + +const JINGLE_CPP_SOURCES: &[&str] = &[ + "context.cpp", + "dummy_load_image.cpp", + "rust_load_image.cpp", + "addrspace_handle.cpp", + "addrspace_manager_handle.cpp", + "varnode_translation.cpp", + "jingle_pcode_emitter.cpp", + "jingle_assembly_emitter.cpp", +]; + +const RUST_FFI_BRIDGES: &[&str] = &[ + "addrspace.rs", + "context_ffi.rs", + "instruction.rs", + "opcode.rs", +]; + fn main() { if cfg!(target_os = "macos") { println!("cargo::rustc-link-search=/opt/homebrew/lib") } - if !cpp_src_path().exists() { + if !sleigh_path().exists() | !zlib_path().exists() { let submod = submod_path(); if !submod.read_dir().is_ok_and(|f| f.count() != 0) { panic!( @@ -17,101 +110,117 @@ fn main() { copy_sources(); } - let mut rust_sources = vec![ - 
"src/ffi/addrspace.rs", - "src/ffi/context_ffi.rs", - "src/ffi/instruction.rs", - "src/ffi/opcode.rs", - "src/ffi/image.rs", - ]; - - let mut cpp_sources = vec![ - "src/ffi/cpp/sleigh/address.cc", - "src/ffi/cpp/sleigh/context.cc", - "src/ffi/cpp/sleigh/globalcontext.cc", - "src/ffi/cpp/sleigh/float.cc", - "src/ffi/cpp/sleigh/marshal.cc", - "src/ffi/cpp/sleigh/opcodes.cc", - "src/ffi/cpp/sleigh/pcoderaw.cc", - "src/ffi/cpp/sleigh/semantics.cc", - "src/ffi/cpp/sleigh/sleigh.cc", - "src/ffi/cpp/sleigh/sleighbase.cc", - "src/ffi/cpp/sleigh/slghpatexpress.cc", - "src/ffi/cpp/sleigh/slghpattern.cc", - "src/ffi/cpp/sleigh/slghsymbol.cc", - "src/ffi/cpp/sleigh/space.cc", - "src/ffi/cpp/sleigh/translate.cc", - "src/ffi/cpp/sleigh/xml.cc", - "src/ffi/cpp/sleigh/filemanage.cc", - "src/ffi/cpp/sleigh/pcodecompile.cc", - "src/ffi/cpp/sleigh/slghscan.cc", - "src/ffi/cpp/sleigh/slghparse.cc", - "src/ffi/cpp/context.cpp", - "src/ffi/cpp/addrspace_handle.cpp", - "src/ffi/cpp/addrspace_manager_handle.cpp", - ]; - if cfg!(compile) { - rust_sources.push("src/ffi/compile.rs"); - cpp_sources.push("src/ffi/cpp/compile.cpp"); - cpp_sources.push("src/ffi/cpp/sleigh/slgh_compile.cc"); - } + let map_path = |p: fn() -> PathBuf| { + move |s: &&str| { + let mut b = p(); + b.push(s); + b + } + }; + + let rust_bridges: Vec = RUST_FFI_BRIDGES.iter().map(map_path(ffi_rs_path)).collect(); + + let jingle_cpp_sources: Vec = JINGLE_CPP_SOURCES + .iter() + .map(map_path(ffi_cpp_path)) + .collect(); + + let sleigh_sources: Vec = SLEIGH_SOURCES.iter().map(map_path(sleigh_path)).collect(); + let zlib_sources: Vec = ZLIB_SOURCES.iter().map(map_path(zlib_path)).collect(); + // This assumes all your C++ bindings are in lib - cxx_build::bridges(rust_sources) - .files(cpp_sources) + let mut bridge = cxx_build::bridges(&rust_bridges); + bridge + .files(jingle_cpp_sources) + .files(sleigh_sources) + .files(zlib_sources) .flag_if_supported("-std=c++17") - .flag_if_supported("-Dmain=c_main") - 
.flag_if_supported("-Wno-unused-parameter") - .flag_if_supported("-Wno-unused-function") - .flag_if_supported("-Wno-unneeded-internal-declaration") - .flag_if_supported("-Wno-format") - .flag_if_supported("-Wno-unused-but-set-variable") - .flag_if_supported("-Wno-sign-compare") - .flag_if_supported("-Wno-deprecated-copy-with-user-provided-copy") - .compile("jingle_sleigh"); + .flag_if_supported("-DLOCAL_ZLIB") + .flag_if_supported("-DNO_GZIP") + .flag_if_supported("-Wno-register") + .flag_if_supported("-w"); + + if cfg!(windows) { + bridge.flag_if_supported("-D_WINDOWS"); + } + bridge.compile("jingle_sleigh"); println!("cargo::rerun-if-changed=src/ffi/cpp/"); - println!("cargo::rerun-if-changed=src/ffi/addrspace.rs"); - println!("cargo::rerun-if-changed=src/ffi/compile.rs"); - println!("cargo::rerun-if-changed=src/ffi/context_ffi.rs"); - println!("cargo::rerun-if-changed=src/ffi/instruction.rs"); - println!( - "cargo::rerun-if-changed={}", - ghidra_cpp_path().to_str().unwrap() - ); + for src in rust_bridges { + println!("cargo::rerun-if-changed={}", src.to_str().unwrap()); + } } fn copy_sources() { - fs::create_dir(cpp_src_path()).unwrap(); - for path in fs::read_dir(ghidra_cpp_path()).unwrap().flatten() { - if let Some(name) = path.file_name().to_str() { - if name.ends_with(".cc") || name.ends_with(".hh") || name.ends_with(".h") { - let mut result = cpp_src_path(); - result.push(name); - copy(path.path().as_path(), result.as_path()).unwrap(); - println!("Copying {}", name) + copy_cpp_sources( + ghidra_sleigh_path(), + sleigh_path(), + SLEIGH_SOURCES, + SLEIGH_HEADERS, + ); + copy_cpp_sources(ghidra_zlib_path(), zlib_path(), ZLIB_SOURCES, ZLIB_HEADERS); +} + +fn copy_cpp_sources, E: AsRef>( + inpath: T, + outpath: E, + sources: &[&str], + headers: &[&str], +) { + let _ = fs::create_dir(&outpath); + for direntry in fs::read_dir(inpath).unwrap().flatten() { + let path = direntry.path(); + let filename = path.file_name(); + if let Some(filename) = filename { + let 
filename = filename.to_str().unwrap(); + if sources.contains(&filename) || headers.contains(&filename) { + let mut result = PathBuf::from(outpath.as_ref()); + result.push(filename); + copy(direntry.path(), result.as_path()).unwrap(); + println!( + "Copying {} ({} => {})", + filename, + direntry.path().to_str().unwrap(), + result.to_str().unwrap() + ); } } } } -fn cpp_src_path() -> PathBuf { +fn ffi_rs_path() -> PathBuf { let mut p = PathBuf::new(); p.push("src"); p.push("ffi"); + p +} + +fn ffi_cpp_path() -> PathBuf { + let mut p = ffi_rs_path(); p.push("cpp"); + p +} + +fn sleigh_path() -> PathBuf { + let mut p = ffi_cpp_path(); p.push("sleigh"); p } +fn zlib_path() -> PathBuf { + let mut p = ffi_cpp_path(); + p.push("zlib"); + p +} + fn submod_path() -> PathBuf { let mut p = PathBuf::new(); p.push("ghidra"); p } -fn ghidra_cpp_path() -> PathBuf { - let mut p = PathBuf::new(); - p.push(submod_path()); +fn ghidra_sleigh_path() -> PathBuf { + let mut p = submod_path(); p.push("Ghidra"); p.push("Features"); p.push("Decompiler"); @@ -120,3 +229,14 @@ fn ghidra_cpp_path() -> PathBuf { p.push("cpp"); p } + +fn ghidra_zlib_path() -> PathBuf { + let mut p = submod_path(); + p.push("Ghidra"); + p.push("Features"); + p.push("Decompiler"); + p.push("src"); + p.push("decompile"); + p.push("zlib"); + p +} diff --git a/jingle_sleigh/ghidra b/jingle_sleigh/ghidra index ec868c1..5faf793 160000 --- a/jingle_sleigh/ghidra +++ b/jingle_sleigh/ghidra @@ -1 +1 @@ -Subproject commit ec868c12b688636db73ab569fb24599a8cf9d470 +Subproject commit 5faf79368040e33dc385af7a5bc8afc6ca1f5339 diff --git a/jingle_sleigh/src/context/builder/image/gimli.rs b/jingle_sleigh/src/context/builder/image/gimli.rs deleted file mode 100644 index 8597b86..0000000 --- a/jingle_sleigh/src/context/builder/image/gimli.rs +++ /dev/null @@ -1,70 +0,0 @@ -use crate::context::builder::image::Perms; -use crate::context::{Image, ImageSection}; -use crate::JingleSleighError; -use 
crate::JingleSleighError::ImageLoadError; -use object::elf::{PF_R, PF_W, PF_X}; -use object::macho::{VM_PROT_EXECUTE, VM_PROT_READ, VM_PROT_WRITE}; -use object::{Architecture, Endianness, File, Object, ObjectSegment, SegmentFlags}; - -impl<'d> TryFrom> for Image { - type Error = JingleSleighError; - fn try_from(value: File) -> Result { - let mut img: Image = Image { sections: vec![] }; - for x in value.segments() { - let base_address = x.address(); - let data = x.data().map_err(|_| ImageLoadError)?.to_vec(); - let perms = map_flags(&x.flags()); - img.sections.push(ImageSection { - perms, - data, - base_address: base_address as usize, - }) - } - Ok(img) - } -} - -fn map_flags(flags: &SegmentFlags) -> Perms { - match flags { - SegmentFlags::Elf { p_flags } => Perms { - exec: (p_flags & PF_X) == PF_X, - write: (p_flags & PF_W) == PF_W, - read: (p_flags & PF_R) == PF_R, - }, - SegmentFlags::MachO { flags, .. } => Perms { - exec: (flags & VM_PROT_EXECUTE) == VM_PROT_EXECUTE, - write: (flags & VM_PROT_WRITE) == VM_PROT_WRITE, - read: (flags & VM_PROT_READ) == VM_PROT_READ, - }, - _ => Perms { - read: false, - write: false, - exec: false, - }, - } -} -pub fn map_gimli_architecture(file: &File) -> Option<&'static str> { - match &file.architecture() { - Architecture::Unknown => None, - Architecture::Aarch64 => match file.endianness() { - Endianness::Little => Some("AARCH64:LE:64:v8A"), - Endianness::Big => Some("AARCH64:BE:64:v8A"), - }, - Architecture::Aarch64_Ilp32 => match file.endianness() { - Endianness::Little => Some("AARCH64:LE:32:ilp32"), - Endianness::Big => Some("AARCH64:BE:32:ilp32"), - }, - Architecture::Arm => match file.endianness() { - Endianness::Little => Some("ARM:LE:32:v8"), - Endianness::Big => Some("ARM:BE:32:v8"), - }, - Architecture::I386 => Some("x86:LE:32:default"), - Architecture::X86_64 => Some("x86:LE:64:default"), - - Architecture::Xtensa => match file.endianness() { - Endianness::Little => Some("Xtensa:LE:32:default"), - Endianness::Big => 
Some("Xtensa:BE:32:default"),
-        },
-        _ => None,
-    }
-}
diff --git a/jingle_sleigh/src/context/builder/image/mod.rs b/jingle_sleigh/src/context/builder/image/mod.rs
deleted file mode 100644
index ec3d764..0000000
--- a/jingle_sleigh/src/context/builder/image/mod.rs
+++ /dev/null
@@ -1,61 +0,0 @@
-#[cfg(feature = "elf")]
-pub mod elf;
-#[cfg(feature = "gimli")]
-pub mod gimli;
-
-pub use crate::ffi::image::bridge::{Image, ImageSection, Perms};
-use std::ops::Range;
-
-impl Image {
-    pub fn get_range(&self) -> Option<Range<usize>> {
-        let min = self.sections.iter().map(|s| s.base_address).min();
-        let max = self
-            .sections
-            .iter()
-            .map(|s| s.base_address + s.data.len())
-            .max();
-        min.zip(max).map(|(min, max)| min..max)
-    }
-
-    pub fn sections(&self) -> &[ImageSection] {
-        &self.sections
-    }
-
-    pub fn contains_address(&self, addr: usize) -> bool {
-        self.sections
-            .iter()
-            .any(|s| s.base_address <= addr && (s.base_address + s.data.len()) >= addr)
-    }
-}
-
-impl From<&[u8]> for Image {
-    fn from(value: &[u8]) -> Self {
-        Self {
-            sections: vec![ImageSection {
-                data: value.to_vec(),
-                perms: Perms {
-                    read: true,
-                    write: true,
-                    exec: true,
-                },
-                base_address: 0,
-            }],
-        }
-    }
-}
-
-impl From<Vec<u8>> for Image {
-    fn from(value: Vec<u8>) -> Self {
-        Self {
-            sections: vec![ImageSection {
-                data: value,
-                perms: Perms {
-                    read: true,
-                    write: true,
-                    exec: true,
-                },
-                base_address: 0,
-            }],
-        }
-    }
-}
diff --git a/jingle_sleigh/src/context/builder/language_def.rs b/jingle_sleigh/src/context/builder/language_def.rs
index 03bcdf7..9582556 100644
--- a/jingle_sleigh/src/context/builder/language_def.rs
+++ b/jingle_sleigh/src/context/builder/language_def.rs
@@ -12,6 +12,7 @@ pub enum SleighEndian {
     Big,
 }
 
+#[allow(unused)]
 #[derive(Clone, Debug, Deserialize)]
 pub struct Compiler {
     pub name: String,
@@ -19,12 +20,14 @@ pub struct Compiler {
     pub id: String,
 }
 
+#[allow(unused)]
 #[derive(Clone, Debug, Deserialize)]
 pub struct ExternalName {
     pub tool: String,
     pub name: String,
 }
 
+#[allow(unused)]
 #[derive(Clone, Debug, Deserialize)]
 pub struct LanguageDefinition {
     pub processor: String,
@@ -32,7 +35,7 @@ pub struct LanguageDefinition {
     pub variant: String,
     pub version: String,
     #[serde(rename = "slafile")]
-    pub sla_file: String,
+    pub sla_file: PathBuf,
     #[serde(rename = "processorspec")]
     pub processor_spec: PathBuf,
     #[serde(rename = "manualindexfile")]
diff --git a/jingle_sleigh/src/context/builder/mod.rs b/jingle_sleigh/src/context/builder/mod.rs
index 6be2d01..0e39307 100644
--- a/jingle_sleigh/src/context/builder/mod.rs
+++ b/jingle_sleigh/src/context/builder/mod.rs
@@ -1,22 +1,19 @@
-use crate::context::builder::image::Image;
 use crate::context::builder::language_def::{parse_ldef, LanguageDefinition};
 use crate::context::builder::processor_spec::parse_pspec;
 use crate::context::SleighContext;
 use crate::error::JingleSleighError;
-use crate::error::JingleSleighError::{InvalidLanguageId, LanguageSpecRead, NoImageProvided};
+use crate::error::JingleSleighError::{InvalidLanguageId, LanguageSpecRead};
 use std::fmt::Debug;
 use std::fs;
 use std::path::{Path, PathBuf};
 use tracing::{event, instrument, Level};
 
-pub mod image;
 pub(crate) mod language_def;
 pub(crate) mod processor_spec;
 
 #[derive(Debug, Default, Clone)]
 pub struct SleighContextBuilder {
     defs: Vec<(LanguageDefinition, PathBuf)>,
-    image: Option<Image>,
 }
 
 impl SleighContextBuilder {
@@ -28,26 +25,30 @@ impl SleighContextBuilder {
         self.defs.iter().find(|(p, _)| p.id.eq(id))
     }
     #[instrument(skip_all, fields(%id))]
-    pub fn build(mut self, id: &str) -> Result<SleighContext, JingleSleighError> {
-        let image = self.image.take().ok_or(NoImageProvided)?;
+    pub fn build(&self, id: &str) -> Result<SleighContext, JingleSleighError> {
         let (lang, path) = self.get_language(id).ok_or(InvalidLanguageId)?;
-        let sla_path = path.join(&lang.sla_file);
-        let mut context = SleighContext::new(&sla_path, image)?;
+        let mut context = SleighContext::new(lang, path)?;
         event!(Level::INFO, "Created sleigh context");
         let pspec_path = path.join(&lang.processor_spec);
         let pspec =
parse_pspec(&pspec_path)?;
-        for set in pspec.context_data.context_set.sets {
-            context.set_initial_context(&set.name, set.value as u32)
+        if let Some(ctx_sets) = pspec.context_data.and_then(|d| d.context_set) {
+            for set in ctx_sets.sets {
+                // todo: gross hack
+                if set.value.starts_with("0x") {
+                    context.set_initial_context(
+                        &set.name,
+                        u32::from_str_radix(&set.value[2..], 16).unwrap(),
+                    )?;
+                } else {
+                    context.set_initial_context(&set.name, set.value.parse::<u32>().unwrap())?;
+                }
+            }
         }
-
         Ok(context)
     }
     pub fn load_folder<T: AsRef<Path>>(path: T) -> Result<Self, JingleSleighError> {
         let ldef = SleighContextBuilder::_load_folder(path.as_ref())?;
-        Ok(SleighContextBuilder {
-            defs: ldef,
-            image: None,
-        })
+        Ok(SleighContextBuilder { defs: ldef })
     }
 
     fn _load_folder(path: &Path) -> Result<Vec<(LanguageDefinition, PathBuf)>, JingleSleighError> {
@@ -56,11 +57,15 @@ impl SleighContextBuilder {
         if !path.is_dir() {
             return Err(LanguageSpecRead);
         }
-        let ldef_path = find_ldef(&path)?;
-        let defs = parse_ldef(ldef_path.as_path())?;
-        let defs = defs
+        let ldef_paths = find_ldef(&path)?;
+        let defs: Vec<(LanguageDefinition, PathBuf)> = ldef_paths
             .iter()
-            .map(|f| (f.clone(), path.to_path_buf()))
+            .flat_map(|ldef_path| {
+                let defs: Vec<LanguageDefinition> = parse_ldef(ldef_path.as_path()).unwrap();
+                defs.iter()
+                    .map(|f| (f.clone(), path.to_path_buf()))
+                    .collect::<Vec<(LanguageDefinition, PathBuf)>>()
+            })
             .collect();
         Ok(defs)
     }
@@ -78,24 +83,23 @@ impl SleighContextBuilder {
                 defs.extend(d);
             }
         }
-        Ok(SleighContextBuilder { defs, image: None })
-    }
-
-    pub fn set_image(mut self, img: Image) -> Self {
-        self.image = Some(img);
-        self
+        Ok(SleighContextBuilder { defs })
     }
 }
 
-fn find_ldef(path: &Path) -> Result<PathBuf, JingleSleighError> {
+fn find_ldef(path: &Path) -> Result<Vec<PathBuf>, JingleSleighError> {
+    let mut ldefs = vec![];
     for entry in (fs::read_dir(path).map_err(|_| LanguageSpecRead)?).flatten() {
         if let Some(e) = entry.path().extension() {
             if e == "ldefs" {
-                return Ok(entry.path().clone());
+                ldefs.push(entry.path().clone());
             }
         }
     }
-    Err(LanguageSpecRead)
+    if ldefs.is_empty() {
+        return Err(LanguageSpecRead);
+    }
+
Ok(ldefs)
 }
 
 #[cfg(test)]
diff --git a/jingle_sleigh/src/context/builder/processor_spec.rs b/jingle_sleigh/src/context/builder/processor_spec.rs
index ebbc61f..ded59a7 100644
--- a/jingle_sleigh/src/context/builder/processor_spec.rs
+++ b/jingle_sleigh/src/context/builder/processor_spec.rs
@@ -9,8 +9,9 @@ use std::path::Path;
 pub struct ContextSet {
     pub name: String,
     #[serde(rename = "val")]
-    pub value: u64,
+    pub value: String,
 }
+#[allow(unused)]
 #[derive(Debug, Deserialize)]
 #[serde(rename = "context_set")]
 pub struct ContextSetSpace {
@@ -21,7 +22,9 @@ pub struct ContextSetSpace {
 
 #[derive(Debug, Deserialize)]
 pub struct ContextData {
-    pub context_set: ContextSetSpace,
+    pub context_set: Option<ContextSetSpace>,
+    #[allow(unused)]
+    pub tracked_set: Option,
 }
 
 #[derive(Debug, Deserialize)]
@@ -29,7 +32,7 @@ pub struct ContextData {
 pub struct ProcessorSpec {
     // TODO: Properties
     // properties: Properties
-    pub context_data: ContextData,
+    pub context_data: Option<ContextData>,
 }
 
 pub(super) fn parse_pspec(path: &Path) -> Result<ProcessorSpec, JingleSleighError> {
diff --git a/jingle_sleigh/src/context/builder/image/elf.rs b/jingle_sleigh/src/context/image/elf.rs
similarity index 98%
rename from jingle_sleigh/src/context/builder/image/elf.rs
rename to jingle_sleigh/src/context/image/elf.rs
index 974312c..d952dae 100644
--- a/jingle_sleigh/src/context/builder/image/elf.rs
+++ b/jingle_sleigh/src/context/image/elf.rs
@@ -44,8 +44,8 @@ mod tests {
     use elf::endian::AnyEndian;
     use elf::ElfBytes;
 
-    #[test]
-    fn test_elf() {
+    // #[test]
+    fn _test_elf() {
         let path = std::path::PathBuf::from("../bin/vuln");
         let file_data = std::fs::read(path).unwrap();
         let slice = file_data.as_slice();
diff --git a/jingle_sleigh/src/context/image/gimli.rs b/jingle_sleigh/src/context/image/gimli.rs
new file mode 100644
index 0000000..a2ac846
--- /dev/null
+++ b/jingle_sleigh/src/context/image/gimli.rs
@@ -0,0 +1,185 @@
+use crate::context::image::{ImageProvider, ImageSection, ImageSectionIterator, Perms};
+use crate::{JingleSleighError, VarNode};
+use
object::{Architecture, Endianness, File, Object, ObjectSection, Section, SectionKind};
+use std::cmp::{max, min};
+
+#[derive(Debug, PartialEq, Eq)]
+pub struct OwnedSection {
+    data: Vec<u8>,
+    perms: Perms,
+    base_address: usize,
+}
+
+impl<'a> From<&'a OwnedSection> for ImageSection<'a> {
+    fn from(value: &'a OwnedSection) -> Self {
+        ImageSection {
+            data: value.data.as_slice(),
+            perms: value.perms.clone(),
+            base_address: value.base_address,
+        }
+    }
+}
+
+impl TryFrom<Section<'_, '_>> for OwnedSection {
+    type Error = JingleSleighError;
+
+    fn try_from(value: Section) -> Result<Self, Self::Error> {
+        let data = value
+            .data()
+            .map_err(|_| JingleSleighError::ImageLoadError)?
+            .to_vec();
+        Ok(OwnedSection {
+            data,
+            perms: map_sec_kind(&value.kind()),
+            base_address: value.address() as usize,
+        })
+    }
+}
+
+#[derive(Debug)]
+pub struct OwnedFile {
+    sections: Vec<OwnedSection>,
+}
+
+impl OwnedFile {
+    pub fn new(file: &File) -> Result<Self, JingleSleighError> {
+        let mut sections = vec![];
+        for x in file.sections().filter(|f| f.kind() == SectionKind::Text) {
+            sections.push(x.try_into()?);
+        }
+        Ok(Self { sections })
+    }
+}
+
+impl ImageProvider for OwnedFile {
+    fn load(&self, vn: &VarNode, output: &mut [u8]) -> usize {
+        let mut written = 0;
+        output.fill(0);
+        let output_start_addr = vn.offset as usize;
+        let output_end_addr = output_start_addr + vn.size;
+        if let Some(x) = self.get_section_info().find(|s| {
+            output_start_addr >= s.base_address
+                && output_start_addr < (s.base_address + s.data.len())
+        }) {
+            let input_start_addr = x.base_address;
+            let input_end_addr = input_start_addr + x.data.len();
+            let start_addr = max(input_start_addr, output_start_addr);
+            let end_addr = max(min(input_end_addr, output_end_addr), start_addr);
+            if end_addr > start_addr {
+                let i_s = start_addr - x.base_address;
+                let i_e = end_addr - x.base_address;
+                let o_s = start_addr - vn.offset as usize;
+                let o_e = end_addr - vn.offset as usize;
+                let out_slice = &mut output[o_s..o_e];
+                let in_slice = &x.data[i_s..i_e];
+
out_slice.copy_from_slice(in_slice);
+                written += end_addr - start_addr;
+            }
+        }
+        written
+    }
+
+    fn has_full_range(&self, vn: &VarNode) -> bool {
+        self.get_section_info().any(|s| {
+            s.base_address <= vn.offset as usize
+                && (s.base_address + s.data.len()) >= (vn.offset as usize + vn.size)
+        })
+    }
+
+    fn get_section_info(&self) -> ImageSectionIterator {
+        ImageSectionIterator::new(self.sections.iter().map(ImageSection::from))
+    }
+}
+
+impl ImageProvider for File<'_> {
+    fn load(&self, vn: &VarNode, output: &mut [u8]) -> usize {
+        let mut written = 0;
+        output.fill(0);
+        let output_start_addr = vn.offset as usize;
+        let output_end_addr = output_start_addr + vn.size;
+        if let Some(x) = self.sections().find(|s| {
+            output_start_addr >= s.address() as usize
+                && output_start_addr < (s.address() + s.size()) as usize
+        }) {
+            if let Ok(data) = x.data() {
+                let input_start_addr = x.address() as usize;
+                let input_end_addr = input_start_addr + data.len();
+                let start_addr = max(input_start_addr, output_start_addr);
+                let end_addr = max(min(input_end_addr, output_end_addr), start_addr);
+                if end_addr > start_addr {
+                    let i_s = start_addr - x.address() as usize;
+                    let i_e = end_addr - x.address() as usize;
+                    let o_s = start_addr - vn.offset as usize;
+                    let o_e = end_addr - vn.offset as usize;
+                    let out_slice = &mut output[o_s..o_e];
+                    let in_slice = &data[i_s..i_e];
+                    out_slice.copy_from_slice(in_slice);
+                    written += end_addr - start_addr;
+                }
+            }
+        }
+        written
+    }
+
+    fn has_full_range(&self, vn: &VarNode) -> bool {
+        self.sections().any(|s| {
+            s.address() <= vn.offset && (s.address() + s.size()) >= (vn.offset + vn.size as u64)
+        })
+    }
+
+    fn get_section_info(&self) -> ImageSectionIterator {
+        ImageSectionIterator::new(self.sections().filter_map(|s| {
+            if let Ok(data) = s.data() {
+                Some(ImageSection {
+                    data,
+                    base_address: s.address() as usize,
+                    perms: map_sec_kind(&s.kind()),
+                })
+            } else {
+                None
+            }
+        }))
+    }
+}
+
+pub fn map_gimli_architecture(file: &File)
-> Option<&'static str> {
+    match &file.architecture() {
+        Architecture::Unknown => None,
+        Architecture::Aarch64 => match file.endianness() {
+            Endianness::Little => Some("AARCH64:LE:64:v8A"),
+            Endianness::Big => Some("AARCH64:BE:64:v8A"),
+        },
+        Architecture::Aarch64_Ilp32 => match file.endianness() {
+            Endianness::Little => Some("AARCH64:LE:32:ilp32"),
+            Endianness::Big => Some("AARCH64:BE:32:ilp32"),
+        },
+        Architecture::Arm => match file.endianness() {
+            Endianness::Little => Some("ARM:LE:32:v8"),
+            Endianness::Big => Some("ARM:BE:32:v8"),
+        },
+        Architecture::I386 => Some("x86:LE:32:default"),
+        Architecture::X86_64 => Some("x86:LE:64:default"),
+        Architecture::PowerPc64 => match file.endianness() {
+            Endianness::Little => Some("PowerPC:LE:64:default"),
+            Endianness::Big => Some("PowerPC:BE:64:default"),
+        },
+        Architecture::Xtensa => match file.endianness() {
+            Endianness::Little => Some("Xtensa:LE:32:default"),
+            Endianness::Big => Some("Xtensa:BE:32:default"),
+        },
+        _ => None,
+    }
+}
+
+fn map_sec_kind(kind: &SectionKind) -> Perms {
+    match kind {
+        SectionKind::Unknown => Perms::RWX,
+        SectionKind::Text => Perms::RX,
+        SectionKind::Data => Perms::RW,
+        SectionKind::ReadOnlyData => Perms::R,
+        SectionKind::ReadOnlyDataWithRel => Perms::R,
+        SectionKind::ReadOnlyString => Perms::R,
+        SectionKind::UninitializedData => Perms::RW,
+        _ => Perms::NONE,
+    }
+}
diff --git a/jingle_sleigh/src/context/image/mod.rs b/jingle_sleigh/src/context/image/mod.rs
new file mode 100644
index 0000000..90435cd
--- /dev/null
+++ b/jingle_sleigh/src/context/image/mod.rs
@@ -0,0 +1,176 @@
+use crate::VarNode;
+use std::cmp::min;
+use std::iter::once;
+use std::ops::Range;
+
+#[cfg(feature = "gimli")]
+pub mod gimli;
+
+pub trait ImageProvider {
+    fn load(&self, vn: &VarNode, output: &mut [u8]) -> usize;
+
+    fn has_full_range(&self, vn: &VarNode) -> bool;
+    fn get_section_info(&self) -> ImageSectionIterator;
+
+    fn get_bytes(&self, vn: &VarNode) -> Option<Vec<u8>> {
+        let mut vec =
vec![0u8; vn.size];
+        let size = self.load(vn, &mut vec);
+        if size < vn.size {
+            None
+        } else {
+            Some(vec)
+        }
+    }
+}
+
+pub struct ImageSectionIterator<'a> {
+    iter: Box<dyn Iterator<Item = ImageSection<'a>> + 'a>,
+}
+
+impl<'a> ImageSectionIterator<'a> {
+    pub fn new<T: Iterator<Item = ImageSection<'a>> + 'a>(iter: T) -> Self {
+        Self {
+            iter: Box::new(iter),
+        }
+    }
+}
+
+impl<'a> Iterator for ImageSectionIterator<'a> {
+    type Item = ImageSection<'a>;
+
+    fn next(&mut self) -> Option<Self::Item> {
+        self.iter.next()
+    }
+}
+impl ImageProvider for &[u8] {
+    fn load(&self, vn: &VarNode, output: &mut [u8]) -> usize {
+        //todo: check the space. Ignoring for now
+        let vn_range: Range<usize> = Range::from(vn);
+        let vn_range = Range {
+            start: vn_range.start,
+            end: min(vn_range.end, self.len()),
+        };
+        if let Some(s) = self.get(vn_range) {
+            if let Some(o) = output.get_mut(0..s.len()) {
+                o.copy_from_slice(s)
+            }
+            let o_len = output.len();
+            if let Some(o) = output.get_mut(s.len()..o_len) {
+                o.fill(0);
+            }
+            s.len()
+        } else {
+            output.fill(0);
+            0
+        }
+    }
+
+    fn has_full_range(&self, vn: &VarNode) -> bool {
+        let vn_range: Range<usize> = Range::from(vn);
+        vn_range.start < self.len() && vn_range.end <= self.len()
+    }
+
+    fn get_section_info(&self) -> ImageSectionIterator {
+        ImageSectionIterator::new(once(ImageSection {
+            data: self,
+            base_address: 0,
+            perms: Perms {
+                read: true,
+                write: false,
+                exec: true,
+            },
+        }))
+    }
+}
+
+impl ImageProvider for Vec<u8> {
+    fn load(&self, vn: &VarNode, output: &mut [u8]) -> usize {
+        self.as_slice().load(vn, output)
+    }
+
+    fn has_full_range(&self, vn: &VarNode) -> bool {
+        self.as_slice().has_full_range(vn)
+    }
+
+    fn get_section_info(&self) -> ImageSectionIterator {
+        ImageSectionIterator::new(once(ImageSection {
+            data: self,
+            base_address: 0,
+            perms: Perms {
+                read: true,
+                write: false,
+                exec: true,
+            },
+        }))
+    }
+}
+
+impl<T: ImageProvider> ImageProvider for &T {
+    fn load(&self, vn: &VarNode, output: &mut [u8]) -> usize {
+        (*self).load(vn, output)
+    }
+
+    fn has_full_range(&self, vn: &VarNode) -> bool {
+
(*self).has_full_range(vn)
+    }
+
+    fn get_section_info(&self) -> ImageSectionIterator {
+        (*self).get_section_info()
+    }
+}
+
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub struct Perms {
+    pub read: bool,
+    pub write: bool,
+    pub exec: bool,
+}
+
+impl Perms {
+    pub const RWX: Perms = Perms {
+        read: true,
+        write: true,
+        exec: true,
+    };
+    pub const RX: Perms = Perms {
+        read: true,
+        write: false,
+        exec: true,
+    };
+
+    pub const RW: Perms = Perms {
+        read: true,
+        write: true,
+        exec: false,
+    };
+    pub const R: Perms = Perms {
+        read: true,
+        write: false,
+        exec: false,
+    };
+
+    pub const NONE: Perms = Perms {
+        read: false,
+        write: false,
+        exec: false,
+    };
+}
+
+#[derive(Debug, Clone, PartialEq)]
+pub struct ImageSection<'a> {
+    pub data: &'a [u8],
+    pub base_address: usize,
+    pub perms: Perms,
+}
+
+#[cfg(test)]
+mod tests {
+    use crate::context::image::{ImageProvider, ImageSection};
+
+    #[test]
+    fn test_vec_sections() {
+        let data: Vec<u8> = vec![1, 2, 3];
+        let sections: Vec<ImageSection> = data.get_section_info().collect();
+        assert_ne!(sections, vec![])
+    }
+}
diff --git a/jingle_sleigh/src/context/instruction_iterator.rs b/jingle_sleigh/src/context/instruction_iterator.rs
new file mode 100644
index 0000000..09c5465
--- /dev/null
+++ b/jingle_sleigh/src/context/instruction_iterator.rs
@@ -0,0 +1,90 @@
+use crate::context::SleighContext;
+use crate::Instruction;
+
+pub struct SleighContextInstructionIterator<'a> {
+    sleigh: &'a SleighContext,
+    remaining: usize,
+    offset: u64,
+    terminate_branch: bool,
+    already_hit_branch: bool,
+}
+
+impl<'a> SleighContextInstructionIterator<'a> {
+    pub(crate) fn new(
+        sleigh: &'a SleighContext,
+        offset: u64,
+        remaining: usize,
+        terminate_branch: bool,
+    ) -> Self {
+        SleighContextInstructionIterator {
+            sleigh,
+            remaining,
+            offset,
+            terminate_branch,
+            already_hit_branch: false,
+        }
+    }
+}
+
+impl Iterator for SleighContextInstructionIterator<'_> {
+    type Item = Instruction;
+
+    fn next(&mut self) -> Option<Self::Item> {
+        if
self.remaining == 0 {
+            return None;
+        }
+        if self.terminate_branch && self.already_hit_branch {
+            return None;
+        }
+        let instr = self
+            .sleigh
+            .ctx
+            .get_one_instruction(self.offset)
+            .map(Instruction::from)
+            .ok()?;
+        self.already_hit_branch = instr.terminates_basic_block();
+        self.offset += instr.length as u64;
+        self.remaining -= 1;
+        Some(instr)
+    }
+}
+
+#[cfg(test)]
+mod test {
+    use crate::context::builder::SleighContextBuilder;
+    use crate::pcode::PcodeOperation;
+    use crate::{Instruction, SpaceManager};
+
+    use crate::tests::SLEIGH_ARCH;
+    use crate::varnode;
+
+    #[test]
+    fn get_one() {
+        let mov_eax_0: [u8; 6] = [0xb8, 0x00, 0x00, 0x00, 0x00, 0xc3];
+        let ctx_builder =
+            SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
+        let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
+        let sleigh = sleigh.initialize_with_image(mov_eax_0.as_slice()).unwrap();
+        let instr = sleigh.read(0, 1).last().unwrap();
+        assert_eq!(instr.length, 5);
+        assert!(instr.disassembly.mnemonic.eq("MOV"));
+        assert!(!instr.ops.is_empty());
+        varnode!(&sleigh, #0:4).unwrap();
+        let _op = PcodeOperation::Copy {
+            input: varnode!(&sleigh, #0:4).unwrap(),
+            output: varnode!(&sleigh, "register"[0]:4).unwrap(),
+        };
+        assert!(matches!(&instr.ops[0], _op))
+    }
+
+    #[test]
+    fn stop_at_branch() {
+        let mov_eax_0: Vec<u8> = vec![0x90, 0x90, 0x90, 0x90];
+        let ctx_builder =
+            SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
+        let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
+        let sleigh = sleigh.initialize_with_image(mov_eax_0).unwrap();
+        let instr: Vec<Instruction> = sleigh.read(0, 5).collect();
+        assert_eq!(instr.len(), 4);
+    }
+}
diff --git a/jingle_sleigh/src/context/loaded.rs b/jingle_sleigh/src/context/loaded.rs
new file mode 100644
index 0000000..adad049
--- /dev/null
+++ b/jingle_sleigh/src/context/loaded.rs
@@ -0,0 +1,265 @@
+use crate::context::image::{ImageProvider, ImageSection};
+use
crate::context::instruction_iterator::SleighContextInstructionIterator;
+use crate::context::SleighContext;
+use crate::ffi::context_ffi::ImageFFI;
+use crate::JingleSleighError::ImageLoadError;
+use crate::{Instruction, JingleSleighError, RegisterManager, SpaceInfo, SpaceManager, VarNode};
+use std::fmt::{Debug, Formatter};
+use std::ops::{Deref, DerefMut};
+use std::pin::Pin;
+
+/// A guard type representing a sleigh context initialized with an image.
+/// In addition to the methods in [SleighContext], is able to
+/// query bytes for address ranges from its source image, as well
+/// as ISA instructions (and associated `p-code`).
+pub struct LoadedSleighContext<'a> {
+    /// A handle to `sleigh`. By construction, this context is initialized with an image
+    sleigh: SleighContext,
+    /// A handle to the image source being queried by the [SleighContext].
+    img: Pin<Box<ImageFFI<'a>>>,
+}
+
+impl Debug for LoadedSleighContext<'_> {
+    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
+        self.sleigh.fmt(f)
+    }
+}
+impl Deref for LoadedSleighContext<'_> {
+    type Target = SleighContext;
+
+    fn deref(&self) -> &Self::Target {
+        &self.sleigh
+    }
+}
+
+impl DerefMut for LoadedSleighContext<'_> {
+    fn deref_mut(&mut self) -> &mut Self::Target {
+        &mut self.sleigh
+    }
+}
+
+impl<'a> LoadedSleighContext<'a> {
+    /// Consumes a [SleighContext] and an image provider, initializes
+    /// sleigh with the image provider, and combines them into a single
+    /// [LoadedSleigh*Context] guard value.
+    pub(crate) fn new<T: ImageProvider + 'a>(
+        sleigh_context: SleighContext,
+        img: T,
+    ) -> Result<Self, JingleSleighError> {
+        let img = Box::pin(ImageFFI::new(img, sleigh_context.get_code_space_idx()));
+        let mut s = Self {
+            sleigh: sleigh_context,
+            img,
+        };
+        let (ctx, img) = s.borrow_parts();
+        ctx.ctx
+            .pin_mut()
+            .setImage(img)
+            .map_err(|_| ImageLoadError)?;
+        Ok(s)
+    }
+    /// Query `sleigh` for the instruction associated with the given offset in the default code
+    /// space.
+    /// todo: consider using a varnode instead of a raw offset.
+    pub fn instruction_at(&self, offset: u64) -> Option<Instruction> {
+        let instr = self
+            .ctx
+            .get_one_instruction(offset)
+            .map(Instruction::from)
+            .ok()?;
+        let vn = VarNode {
+            space_index: self.sleigh.get_code_space_idx(),
+            size: instr.length,
+            offset,
+        };
+        if self.img.has_range(&vn) {
+            Some(instr)
+        } else {
+            None
+        }
+    }
+
+    /// Read an iterator of at most `max_instrs` [`Instruction`]s from `offset` in the default code
+    /// space.
+    /// todo: consider using a varnode instead of a raw offset
+    pub fn read(&self, offset: u64, max_instrs: usize) -> SleighContextInstructionIterator {
+        SleighContextInstructionIterator::new(self, offset, max_instrs, false)
+    }
+
+    /// Read the byte range specified by the given [`VarNode`] from the configured image provider.
+    pub fn read_bytes(&self, vn: &VarNode) -> Option<Vec<u8>> {
+        if vn.space_index == self.get_code_space_idx() {
+            self.img.provider.get_bytes(&self.adjust_varnode_vma(vn))
+        } else {
+            None
+        }
+    }
+
+    /// Read an iterator of at most `max_instrs` [`Instruction`]s from `offset` in the default code
+    /// space, terminating if a branch is encountered.
+    /// todo: consider using a varnode instead of a raw offset
+    pub fn read_until_branch(
+        &self,
+        offset: u64,
+        max_instrs: usize,
+    ) -> SleighContextInstructionIterator {
+        SleighContextInstructionIterator::new(self, offset, max_instrs, true)
+    }
+
+    /// Re-initialize `sleigh` with a new image, without re-parsing the `.sla` definitions. This
+    /// is _much_ faster than generating a new context.
+    pub fn set_image<T: ImageProvider + 'a>(
+        &mut self,
+        img: T,
+    ) -> Result<(), JingleSleighError> {
+        let (sleigh, img_ref) = self.borrow_parts();
+        *img_ref = ImageFFI::new(img, sleigh.get_code_space_idx());
+        sleigh
+            .ctx
+            .pin_mut()
+            .setImage(img_ref)
+            .map_err(|_| ImageLoadError)
+    }
+
+    /// Returns an iterator of entries describing the sections of the configured image provider.
+    pub fn get_sections(&self) -> impl Iterator<Item = ImageSection> {
+        self.img.provider.get_section_info().map(|mut s| {
+            s.base_address += self.get_base_address() as usize;
+            s
+        })
+    }
+
+    fn borrow_parts<'b>(&'b mut self) -> (&'b mut SleighContext, &'b mut ImageFFI<'a>) {
+        (&mut self.sleigh, &mut self.img)
+    }
+
+    /// Rebase the loaded image to `offset`
+    pub fn set_base_address(&mut self, offset: u64) {
+        self.img.set_base_address(offset);
+    }
+
+    /// Get the current base address
+    pub fn get_base_address(&self) -> u64 {
+        self.img.get_base_address()
+    }
+
+    // todo: properly account for spaces with non-byte-based indexing
+    fn adjust_varnode_vma(&self, vn: &VarNode) -> VarNode {
+        VarNode {
+            space_index: vn.space_index,
+            size: vn.size,
+            offset: vn.offset.wrapping_sub(self.get_base_address()),
+        }
+    }
+}
+
+impl SpaceManager for LoadedSleighContext<'_> {
+    fn get_space_info(&self, idx: usize) -> Option<&SpaceInfo> {
+        self.sleigh.get_space_info(idx)
+    }
+
+    fn get_all_space_info(&self) -> &[SpaceInfo] {
+        self.sleigh.get_all_space_info()
+    }
+
+    fn get_code_space_idx(&self) -> usize {
+        self.sleigh.get_code_space_idx()
+    }
+}
+
+impl RegisterManager for LoadedSleighContext<'_> {
+    fn get_register(&self, name: &str) -> Option<VarNode> {
+        self.sleigh.get_register(name)
+    }
+
+    fn get_register_name(&self, location: &VarNode) -> Option<&str> {
+        self.sleigh.get_register_name(location)
+    }
+
+    fn get_registers(&self) -> Vec<(VarNode, String)> {
+        self.sleigh.get_registers()
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use crate::context::SleighContextBuilder;
+    use crate::tests::SLEIGH_ARCH;
+    use crate::PcodeOperation::Branch;
+    use crate::VarNode;
+
+    #[test]
+    fn test_adjust_vma() {
+        let ctx_builder =
+            SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
+        let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
+        let img: [u8; 5] = [0x55, 1, 2, 3, 4];
+        let mut loaded = sleigh.initialize_with_image(img.as_slice()).unwrap();
+        let first = loaded
+
.read_bytes(&VarNode {
+                space_index: 3,
+                size: 5,
+                offset: 0,
+            })
+            .unwrap();
+        assert_eq!(first.as_slice(), img.as_slice());
+        let instr1 = loaded.instruction_at(0).unwrap();
+        assert_eq!(instr1.disassembly.mnemonic, "PUSH");
+        loaded.set_base_address(100);
+        assert!(loaded.instruction_at(0).is_none());
+        assert_eq!(
+            loaded.read_bytes(&VarNode {
+                space_index: 3,
+                size: 5,
+                offset: 0
+            }),
+            None
+        );
+        let second = loaded
+            .read_bytes(&VarNode {
+                space_index: 3,
+                size: 5,
+                offset: 100,
+            })
+            .unwrap();
+        assert_eq!(second.as_slice(), img.as_slice());
+        let instr2 = loaded.instruction_at(100).unwrap();
+        assert_eq!(instr2.disassembly.mnemonic, "PUSH");
+        for (a, b) in instr2.ops.iter().zip(instr1.ops) {
+            assert_eq!(a.opcode(), b.opcode())
+        }
+    }
+
+    #[test]
+    pub fn relative_addresses() {
+        let ctx_builder =
+            SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
+        let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
+        // JMP $+5
+        let img: [u8; 2] = [0xeb, 0x05];
+        let mut loaded = sleigh.initialize_with_image(img.as_slice()).unwrap();
+        let instr = loaded.instruction_at(0).unwrap();
+        assert_eq!(
+            instr.ops[0],
+            Branch {
+                input: VarNode {
+                    space_index: 3,
+                    size: 8,
+                    offset: 7
+                }
+            }
+        );
+        loaded.set_base_address(0x100);
+        let instr2 = loaded.instruction_at(0x100).unwrap();
+        assert_eq!(
+            instr2.ops[0],
+            Branch {
+                input: VarNode {
+                    space_index: 3,
+                    size: 8,
+                    offset: 0x107
+                }
+            }
+        );
+    }
+}
diff --git a/jingle_sleigh/src/context/mod.rs b/jingle_sleigh/src/context/mod.rs
index faf5416..62dff14 100644
--- a/jingle_sleigh/src/context/mod.rs
+++ b/jingle_sleigh/src/context/mod.rs
@@ -1,18 +1,20 @@
 mod builder;
+pub mod image;
+mod instruction_iterator;
+pub mod loaded;
 
 use crate::error::JingleSleighError;
 use crate::error::JingleSleighError::{LanguageSpecRead, SleighInitError};
 use crate::ffi::addrspace::bridge::AddrSpaceHandle;
 use crate::ffi::context_ffi::bridge::ContextFFI;
-use
crate::instruction::Instruction;
 use crate::space::{RegisterManager, SpaceInfo, SpaceManager};
-#[cfg(feature = "gimli")]
-pub use builder::image::gimli::map_gimli_architecture;
-pub use builder::image::{Image, ImageSection};
 pub use builder::SleighContextBuilder;
 
+use crate::context::builder::language_def::LanguageDefinition;
+use crate::context::image::ImageProvider;
+use crate::context::loaded::LoadedSleighContext;
 use crate::ffi::context_ffi::CTX_BUILD_MUTEX;
-use crate::ffi::instruction::bridge::VarnodeInfoFFI;
+use crate::JingleSleighError::{ImageLoadError, SleighCompilerMutexError};
 use crate::VarNode;
 use cxx::{SharedPtr, UniquePtr};
 use std::fmt::{Debug, Formatter};
@@ -21,12 +23,13 @@ use std::path::Path;
 pub struct SleighContext {
     ctx: UniquePtr<ContextFFI>,
     spaces: Vec<SpaceInfo>,
-    pub image: Image,
+    language_id: String,
+    registers: Vec<(VarNode, String)>,
 }
 
 impl Debug for SleighContext {
     fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
-        write!(f, "Sleigh {{image: {:?}}}", self.image)
+        write!(f, "Sleigh {{arch: {}}}", self.language_id)
     }
 }
 
@@ -50,52 +53,65 @@ impl SpaceManager for SleighContext {
 
 impl RegisterManager for SleighContext {
     fn get_register(&self, name: &str) -> Option<VarNode> {
-        self.ctx.getRegister(name).map(|f| VarNode::from(f)).ok()
+        self.registers
+            .iter()
+            .find(|(_, reg_name)| reg_name.as_str() == name)
+            .map(|(vn, _)| vn.clone())
     }
 
-    fn get_register_name(&self, location: VarNode) -> Option<&str> {
-        let space = self.ctx.getSpaceByIndex(location.space_index as i32);
-        self.ctx
-            .getRegisterName(VarnodeInfoFFI {
-                space,
-                offset: location.offset,
-                size: location.size,
-            })
-            .ok()
+    fn get_register_name(&self, location: &VarNode) -> Option<&str> {
+        self.registers
+            .iter()
+            .find(|(vn, _)| vn == location)
+            .map(|(_, name)| name.as_str())
     }
 
     fn get_registers(&self) -> Vec<(VarNode, String)> {
-        self.ctx
-            .getRegisters()
-            .iter()
-            .map(|b| (VarNode::from(&b.varnode), b.name.clone()))
-            .collect()
+        self.registers.clone()
     }
 }
 
 impl
SleighContext {
-    pub(crate) fn new(path: &Path, image: Image) -> Result<Self, JingleSleighError> {
+    pub(crate) fn new<T: AsRef<Path>>(
+        language_def: &LanguageDefinition,
+        base_path: T,
+    ) -> Result<Self, JingleSleighError> {
+        let path = base_path.as_ref().join(&language_def.sla_file);
         let abs = path.canonicalize().map_err(|_| LanguageSpecRead)?;
         let path_str = abs.to_str().ok_or(LanguageSpecRead)?;
         match CTX_BUILD_MUTEX.lock() {
             Ok(make_context) => {
-                let ctx = make_context(path_str, image.clone()).map_err(|_| SleighInitError)?;
+                let ctx = make_context(path_str).map_err(|e| SleighInitError(e.to_string()))?;
                 let mut spaces: Vec<SpaceInfo> = Vec::with_capacity(ctx.getNumSpaces() as usize);
                 for idx in 0..ctx.getNumSpaces() {
                     spaces.push(SpaceInfo::from(ctx.getSpaceByIndex(idx)));
                 }
-                Ok(Self { image, ctx, spaces })
+                let registers = ctx
+                    .getRegisters()
+                    .iter()
+                    .map(|b| (VarNode::from(&b.varnode), b.name.clone()))
+                    .collect();
+
+                Ok(Self {
+                    ctx,
+                    spaces,
+                    language_id: language_def.id.clone(),
+                    registers,
+                })
             }
-            Err(_) => Err(SleighInitError),
+            Err(_) => Err(SleighCompilerMutexError),
         }
     }
 
-    pub(crate) fn set_initial_context(&mut self, name: &str, value: u32) {
-        self.ctx.pin_mut().set_initial_context(name, value);
-    }
-
-    pub fn read(&self, offset: u64, max_instrs: usize) -> SleighContextInstructionIterator {
-        SleighContextInstructionIterator::new(self, offset, max_instrs)
+    pub(crate) fn set_initial_context(
+        &mut self,
+        name: &str,
+        value: u32,
+    ) -> Result<(), JingleSleighError> {
+        self.ctx
+            .pin_mut()
+            .set_initial_context(name, value)
+            .map_err(|_| ImageLoadError)
     }
 
     pub fn spaces(&self) -> Vec<SharedPtr<AddrSpaceHandle>> {
@@ -105,74 +121,81 @@ impl SleighContext {
         }
         spaces
     }
-}
 
-pub struct SleighContextInstructionIterator<'a> {
-    sleigh: &'a SleighContext,
-    remaining: usize,
-    offset: u64,
-}
+    pub fn get_language_id(&self) -> &str {
+        &self.language_id
+    }
 
-impl<'a> SleighContextInstructionIterator<'a> {
-    pub(crate) fn new(sleigh: &'a SleighContext, offset: u64, remaining: usize) -> Self {
-        SleighContextInstructionIterator {
-
sleigh,
-            remaining,
-            offset,
-        }
+    pub fn initialize_with_image<'b, T: ImageProvider + 'b>(
+        self,
+        img: T,
+    ) -> Result<LoadedSleighContext<'b>, JingleSleighError> {
+        LoadedSleighContext::new(self, img)
     }
 }
 
-impl<'a> Iterator for SleighContextInstructionIterator<'a> {
-    type Item = Instruction;
+#[cfg(test)]
+mod test {
+    use crate::context::SleighContextBuilder;
+    use crate::tests::SLEIGH_ARCH;
+    use crate::{RegisterManager, VarNode};
 
-    fn next(&mut self) -> Option<Self::Item> {
-        if self.remaining == 0 {
-            return None;
-        }
-        if !self.sleigh.image.contains_address(self.offset as usize) {
-            return None;
+    #[test]
+    fn get_regs() {
+        let ctx_builder =
+            SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
+        let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
+        assert_ne!(sleigh.get_registers(), vec![]);
+    }
+
+    #[test]
+    fn get_register_name() {
+        let ctx_builder =
+            SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
+        let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
+        for (vn, name) in sleigh.get_registers() {
+            let addr = sleigh.get_register(&name);
+            assert_eq!(addr, Some(vn));
         }
-        let instr = self
-            .sleigh
-            .ctx
-            .get_one_instruction(self.offset)
-            .map(Instruction::from)
-            .ok()?;
-        self.offset += instr.length as u64;
-        self.remaining -= 1;
-        Some(instr)
     }
-}
 
-#[cfg(test)]
-mod test {
-    use crate::context::builder::image::Image;
-    use crate::context::builder::SleighContextBuilder;
-    use crate::pcode::PcodeOperation;
-    use crate::SpaceManager;
+    #[test]
+    fn get_invalid_register_name() {
+        let ctx_builder =
+            SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
+        let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
+        assert_eq!(sleigh.get_register("fake"), None);
+    }
 
-    use crate::tests::SLEIGH_ARCH;
-    use crate::varnode;
+    #[test]
+    fn get_valid_register() {
+        let ctx_builder =
+            SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
+        let sleigh =
ctx_builder.build(SLEIGH_ARCH).unwrap(); + + assert_eq!( + sleigh.get_register_name(&VarNode { + space_index: 4, + offset: 512, + size: 1 + }), + Some("CF") + ); + } #[test] - fn get_one() { - let mov_eax_0: [u8; 6] = [0xb8, 0x00, 0x00, 0x00, 0x00, 0xc3]; + fn get_invalid_register() { let ctx_builder = SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap(); - let ctx = ctx_builder - .set_image(Image::from(mov_eax_0.as_slice())) - .build(SLEIGH_ARCH) - .unwrap(); - let instr = ctx.read(0, 1).last().unwrap(); - assert_eq!(instr.length, 5); - assert!(instr.disassembly.mnemonic.eq("MOV")); - assert!(!instr.ops.is_empty()); - varnode!(&ctx, #0:4).unwrap(); - let _op = PcodeOperation::Copy { - input: varnode!(&ctx, #0:4).unwrap(), - output: varnode!(&ctx, "register"[0]:4).unwrap(), - }; - assert!(matches!(&instr.ops[0], _op)) + let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap(); + + assert_eq!( + sleigh.get_register_name(&VarNode { + space_index: 40, + offset: 5122, + size: 1 + }), + None + ); } } diff --git a/jingle_sleigh/src/error.rs b/jingle_sleigh/src/error.rs index 99f2daf..c2aacaa 100644 --- a/jingle_sleigh/src/error.rs +++ b/jingle_sleigh/src/error.rs @@ -5,7 +5,7 @@ use thiserror::Error; pub enum JingleSleighError { /// The sleigh compiler was run against a language definition that had some missing files. /// Probably indicates that the path to the language specification was wrong - #[error("missing files needed to init sleigh. 
Could be sla or ldef or pspec")] + #[error("Unable to parse sleigh language!")] LanguageSpecRead, /// A language specification existed, but was unable to be parsed #[error("failed to parse sleigh language definition")] @@ -15,14 +15,10 @@ pub enum JingleSleighError { InvalidLanguageId, /// Attempted to initialize sleigh but something went wrong #[error("Something went wrong putting bytes into sleigh")] - SleighInitError, + SleighInitError(String), /// Unable to load the provided binary image for sleigh #[error("Something went wrong putting bytes into sleigh")] ImageLoadError, - /// Unable to parse the provided elf for sleigh - #[cfg(feature = "elf")] - #[error("Trouble loading an elf")] - ElfLoadError(#[from] elf::ParseError), /// Attempted to initialize sleigh with an empty image #[error("You didn't provide any bytes to sleigh")] NoImageProvided, @@ -33,6 +29,11 @@ pub enum JingleSleighError { /// A [`VarNode`](crate::VarNode) was constructed referencing a non-existent space #[error("A varnode was constructed referencing a non-existent space")] InvalidSpaceName, + /// Attempted to construct an [Instruction](crate::Instruction) from an empty slice of instructions + #[error("Attempted to construct an instruction from an empty slice of instructions")] + EmptyInstruction, + #[error("Failure to acquire mutex to sleigh FFI function")] + SleighCompilerMutexError, } impl From for std::fmt::Error { diff --git a/jingle_sleigh/src/ffi/addrspace.rs b/jingle_sleigh/src/ffi/addrspace.rs index 566b8da..77f5d85 100644 --- a/jingle_sleigh/src/ffi/addrspace.rs +++ b/jingle_sleigh/src/ffi/addrspace.rs @@ -1,10 +1,10 @@ #[cxx::bridge] pub(crate) mod bridge { - #[rust_name = "SpaceType"] + #[cxx_name = "spacetype"] #[namespace = "ghidra"] #[derive(Debug, Hash, Clone, Copy, Eq, PartialEq, Serialize, Deserialize)] #[repr(u32)] - pub enum spacetype { + pub enum SpaceType { ///< Special space to represent constants IPTR_CONSTANT = 0, ///< Normal spaces modelled by processor @@ -37,9 
+37,10 @@ pub(crate) mod bridge { } unsafe extern "C++" { + include!("jingle_sleigh/src/ffi/cpp/sleigh/space.hh"); #[namespace = "ghidra"] - #[rust_name = "SpaceType"] - type spacetype; + #[cxx_name = "spacetype"] + type SpaceType; } unsafe extern "C++" { diff --git a/jingle_sleigh/src/ffi/compile.rs b/jingle_sleigh/src/ffi/compile.rs deleted file mode 100644 index 5c08394..0000000 --- a/jingle_sleigh/src/ffi/compile.rs +++ /dev/null @@ -1,94 +0,0 @@ -use crate::ffi::compile::bridge::{CompileDefine, CompileParams}; -use std::collections::BTreeMap; -use std::path::Path; - -pub struct SleighCompileParams { - defines: BTreeMap, - unnecessary_pcode_warning: bool, - lenient_conflict: bool, - all_collision_warning: bool, - all_nop_warning: bool, - dead_temp_warning: bool, - enforce_local_keyword: bool, - large_temporary_warning: bool, - case_sensitive_register_names: bool, -} - -pub fn compile( - in_path: impl AsRef, - out_path: impl AsRef, - params: Option, -) { - let _hi = Path::new("hi"); - if let Some(in_path) = in_path.as_ref().to_str() { - if let Some(out_path) = out_path.as_ref().to_str() { - bridge::compile(in_path, out_path, params.unwrap_or_default().into()) - } - } -} - -impl Default for SleighCompileParams { - fn default() -> Self { - Self { - defines: BTreeMap::new(), - unnecessary_pcode_warning: false, - lenient_conflict: true, - all_collision_warning: false, - all_nop_warning: false, - dead_temp_warning: false, - enforce_local_keyword: false, - large_temporary_warning: false, - case_sensitive_register_names: false, - } - } -} - -impl From for CompileParams { - fn from(value: SleighCompileParams) -> Self { - Self { - defines: value - .defines - .iter() - .map(|(name, val)| CompileDefine { - name: name.clone(), - value: val.clone(), - }) - .collect(), - unnecessary_pcode_warning: value.unnecessary_pcode_warning, - lenient_conflict: value.lenient_conflict, - all_collision_warning: value.all_collision_warning, - all_nop_warning: value.all_nop_warning, - 
dead_temp_warning: value.dead_temp_warning, - enforce_local_keyword: value.enforce_local_keyword, - large_temporary_warning: value.large_temporary_warning, - case_sensitive_register_names: value.case_sensitive_register_names, - } - } -} - -#[cxx::bridge] -mod bridge { - struct CompileDefine { - name: String, - value: String, - } - - struct CompileParams { - defines: Vec, - unnecessary_pcode_warning: bool, - lenient_conflict: bool, - all_collision_warning: bool, - all_nop_warning: bool, - dead_temp_warning: bool, - enforce_local_keyword: bool, - large_temporary_warning: bool, - case_sensitive_register_names: bool, - } - - unsafe extern "C++" { - include!("jingle_sleigh/src/ffi/cpp/compile.h"); - - fn compile(inFile: &str, outFile: &str, params: CompileParams); - - } -} diff --git a/jingle_sleigh/src/ffi/context_ffi.rs b/jingle_sleigh/src/ffi/context_ffi.rs index e324527..3237757 100644 --- a/jingle_sleigh/src/ffi/context_ffi.rs +++ b/jingle_sleigh/src/ffi/context_ffi.rs @@ -1,16 +1,23 @@ +// This is necessary due to a change circa rust 1.83.0 that +// flags the lifetime in ImageFFI as needed for elision. +// Could probably be fixed with a change in CXX. 
+#![allow(clippy::needless_lifetimes)] + +use crate::context::image::ImageProvider; use crate::ffi::context_ffi::bridge::makeContext; +use crate::ffi::instruction::bridge::VarnodeInfoFFI; +use crate::VarNode; use bridge::ContextFFI; -use cxx::{Exception, UniquePtr}; +use cxx::{Exception, ExternType, UniquePtr}; use std::sync::Mutex; -pub(crate) static CTX_BUILD_MUTEX: Mutex< - fn(&str, bridge::Image) -> Result, Exception>, -> = Mutex::new(makeContext); +type ContextGeneratorFp = fn(&str) -> Result, Exception>; + +pub(crate) static CTX_BUILD_MUTEX: Mutex = Mutex::new(makeContext); #[cxx::bridge] pub(crate) mod bridge { unsafe extern "C++" { - type Image = crate::context::Image; type InstructionFFI = crate::ffi::instruction::bridge::InstructionFFI; type VarnodeInfoFFI = crate::ffi::instruction::bridge::VarnodeInfoFFI; @@ -18,6 +25,7 @@ pub(crate) mod bridge { type AddrSpaceHandle = crate::ffi::addrspace::bridge::AddrSpaceHandle; type RegisterInfoFFI = crate::ffi::instruction::bridge::RegisterInfoFFI; + } unsafe extern "C++" { @@ -25,17 +33,86 @@ pub(crate) mod bridge { include!("jingle_sleigh/src/ffi/cpp/exception.h"); pub(crate) type ContextFFI; - pub(super) fn makeContext(slaPath: &str, img: Image) -> Result>; - pub(crate) fn set_initial_context(self: Pin<&mut ContextFFI>, name: &str, value: u32); + pub(super) fn makeContext(slaPath: &str) -> Result>; + pub(crate) fn set_initial_context( + self: Pin<&mut ContextFFI>, + name: &str, + value: u32, + ) -> Result<()>; pub(crate) fn get_one_instruction(&self, offset: u64) -> Result; pub(crate) fn getSpaceByIndex(&self, idx: i32) -> SharedPtr; pub(crate) fn getNumSpaces(&self) -> i32; - pub(crate) fn getRegister(&self, name: &str) -> Result; - pub(crate) fn getRegisterName(&self, name: VarnodeInfoFFI) -> Result<&str>; + // pub(crate) fn getRegister(&self, name: &str) -> Result; + // pub(crate) fn getRegisterName(&self, name: VarnodeInfoFFI) -> Result<&str>; pub(crate) fn getRegisters(&self) -> Vec; + + pub(crate) fn 
setImage(self: Pin<&mut ContextFFI>, img: &ImageFFI) -> Result<()>; + } + + extern "Rust" { + include!("jingle_sleigh/src/ffi/instruction.rs.h"); + type ImageFFI<'a>; + fn load(self: &ImageFFI, vn: &VarnodeInfoFFI, out: &mut [u8]) -> usize; + } + impl Vec {} +} + +pub(crate) struct ImageFFI<'a> { + /// A thing that has bytes at addresses + pub(crate) provider: Box, + /// The current virtual base address for the image loaded by this context. + pub(crate) base_offset: u64, + /// The space that this image is attached to. For now, always the + /// default code space. + pub(crate) space_index: usize, +} + +impl<'a> ImageFFI<'a> { + pub(crate) fn new(provider: T, idx: usize) -> Self { + Self { + provider: Box::new(provider), + base_offset: 0, + space_index: idx, + } } + pub(crate) fn load(&self, vn: &VarnodeInfoFFI, out: &mut [u8]) -> usize { + let addr = VarNode::from(vn); + if addr.space_index != self.space_index { + return 0; + } + let adjusted = self.adjust_varnode_vma(&addr); + self.provider.load(&adjusted, out) + } + + pub(crate) fn has_range(&self, vn: &VarNode) -> bool { + if vn.space_index != self.space_index { + return false; + } + self.provider.has_full_range(&self.adjust_varnode_vma(vn)) + } + + pub(crate) fn get_base_address(&self) -> u64 { + self.base_offset + } + + pub(crate) fn set_base_address(&mut self, offset: u64) { + self.base_offset = offset + } + // todo: properly account for spaces with non-byte-based indexing + fn adjust_varnode_vma(&self, vn: &VarNode) -> VarNode { + VarNode { + space_index: vn.space_index, + size: vn.size, + offset: vn.offset.wrapping_sub(self.base_offset), + } + } +} + +unsafe impl ExternType for ImageFFI<'_> { + type Id = cxx::type_id!("ImageFFI"); + type Kind = cxx::kind::Opaque; } diff --git a/jingle_sleigh/src/ffi/cpp/.gitignore b/jingle_sleigh/src/ffi/cpp/.gitignore index faa8791..c5401cf 100644 --- a/jingle_sleigh/src/ffi/cpp/.gitignore +++ b/jingle_sleigh/src/ffi/cpp/.gitignore @@ -1 +1,2 @@ -sleigh/** \ No newline at 
end of file +sleigh/** +zlib/** \ No newline at end of file diff --git a/jingle_sleigh/src/ffi/cpp/context.cpp b/jingle_sleigh/src/ffi/cpp/context.cpp index abe904e..b3afdf8 100644 --- a/jingle_sleigh/src/ffi/cpp/context.cpp +++ b/jingle_sleigh/src/ffi/cpp/context.cpp @@ -1,170 +1,96 @@ #include "context.h" -#include -#include +#include "jingle_assembly_emitter.h" +#include "jingle_pcode_emitter.h" #include "jingle_sleigh/src/ffi/instruction.rs.h" +#include "sleigh/globalcontext.hh" #include "sleigh/loadimage.hh" +#include "sleigh/xml.hh" +#include "varnode_translation.h" +#include "rust_load_image.h" +#include +#include -class PcodeCacher : public ghidra::PcodeEmit { -public: - rust::Vec ops; - - PcodeCacher() = default; - - void dump(const ghidra::Address &addr, ghidra::OpCode opc, ghidra::VarnodeData *outvar, ghidra::VarnodeData *vars, - ghidra::int4 isize) override { - RawPcodeOp op; - op.op = opc; - op.has_output = false; - if (outvar != nullptr && outvar->space != nullptr) { - op.has_output = true; - op.output.offset = outvar->offset; - op.output.size = outvar->size; - op.output.space = std::make_unique(AddrSpaceHandle(outvar->space)); - outvar->space->getType(); - } - op.inputs.reserve(isize); - for (int i = 0; i < isize; i++) { - VarnodeInfoFFI info; - info.space = std::make_unique(vars[i].space); - info.size = vars[i].size; - info.offset = vars[i].offset; - op.space = std::make_unique(addr.getSpace()); - op.inputs.emplace_back(std::move(info)); - } - ops.emplace_back(op); - - } -}; - -class AssemblyCacher : public ghidra::AssemblyEmit { -public: - rust::String mnem; - rust::String body; - - AssemblyCacher() : mnem(""), body("") { - - }; - - void dump(const ghidra::Address &addr, const std::string &mnem, const std::string &body) override { - this->mnem = mnem; - this->body = body; - } -}; - -DummyLoadImage::DummyLoadImage() : ghidra::LoadImage("jingle") { - img = Image{}; -} - -DummyLoadImage::DummyLoadImage(Image image) : ghidra::LoadImage("jingle") { - 
img = std::move(image); -} - -void DummyLoadImage::loadFill(ghidra::uint1 *ptr, ghidra::int4 size, const ghidra::Address &addr) { - size_t offset = addr.getOffset(); - for (const auto &section: img.sections) { - size_t start = section.base_address; - size_t end = start + section.data.size(); - if (start <= offset && offset < end) { - size_t len = std::min((size_t) size, (size_t) end - (size_t) offset); - size_t start_idx = offset - start; - std::memcpy(ptr, &section.data[start_idx], len); - offset = offset + len; - } - } - for (size_t i = offset; i < size; ++i) { - ptr[i] = 0; - } -} - -void DummyLoadImage::adjustVma(long adjust) {} - -std::string DummyLoadImage::getArchType() const { - return "placeholder"; -} +ContextFFI::ContextFFI(rust::Str slaPath) + : sleigh(new DummyLoadImage(), &c_db) { + ghidra::AttributeId::initialize(); + ghidra::ElementId::initialize(); -ContextFFI::ContextFFI(rust::Str slaPath, Image image) { - ghidra::AttributeId::initialize(); - ghidra::ElementId::initialize(); + ghidra::DocumentStorage documentStorage = ghidra::DocumentStorage(); - this->img = DummyLoadImage(std::move(image)); - documentStorage = ghidra::DocumentStorage(); - ghidra::Document *doc = documentStorage.openDocument(slaPath.operator std::string()); - ghidra::Element *root = doc->getRoot(); - documentStorage.registerTag(root); - sleigh = std::make_unique(&img, &contextDatabase); - sleigh->initialize(documentStorage); + std::stringstream sleighfilename; + sleighfilename << "<sleigh>"; + sleighfilename << slaPath; + sleighfilename << "</sleigh>"; + ghidra::Document *doc = documentStorage.parseDocument(sleighfilename); + ghidra::Element *root = doc->getRoot(); + documentStorage.registerTag(root); + sleigh.initialize(documentStorage); } void ContextFFI::set_initial_context(rust::Str name, uint32_t val) { - sleigh->setContextDefault(name.operator std::string(), val); + sleigh.setContextDefault(name.operator std::string(), val); } -InstructionFFI ContextFFI::get_one_instruction(uint64_t offset) const { 
- PcodeCacher pcode; - AssemblyCacher assembly; - ghidra::Address a = ghidra::Address(sleigh->getDefaultCodeSpace(), offset); - sleigh->printAssembly(assembly, a); - sleigh->oneInstruction(pcode, a); - size_t length = sleigh->instructionLength(a); - InstructionFFI i; - Disassembly d; - i.ops = std::move(pcode.ops); - d.args = std::move(assembly.body); - d.mnemonic = std::move(assembly.mnem); - i.disassembly = std::move(d); - i.address = offset; - i.length = length; - return i; +std::shared_ptr +ContextFFI::getSpaceByIndex(ghidra::int4 idx) const { + return std::make_shared(sleigh.getSpace(idx)); } - -std::shared_ptr ContextFFI::getSpaceByIndex(ghidra::int4 idx) const { - return std::make_shared(sleigh->getSpace(idx)); -} - -ghidra::int4 ContextFFI::getNumSpaces() const { - return sleigh->numSpaces(); -} +ghidra::int4 ContextFFI::getNumSpaces() const { return sleigh.numSpaces(); } VarnodeInfoFFI ContextFFI::getRegister(rust::Str name) const { - ghidra::VarnodeData vn = sleigh->getRegister(name.operator std::string()); - VarnodeInfoFFI info; - info.space = std::make_unique(vn.space); - info.size = vn.size; - info.offset = vn.offset; - return info; + ghidra::VarnodeData vn = sleigh.getRegister(name.operator std::string()); + VarnodeInfoFFI info; + info.space = std::make_unique(vn.space); + info.size = vn.size; + info.offset = vn.offset; + return info; }; rust::Str ContextFFI::getRegisterName(VarnodeInfoFFI vn) const { - std::string name = sleigh->getRegisterName(vn.space->getRaw(), vn.offset, vn.size); - return {name}; + std::string name = + sleigh.getRegisterName(vn.space->getRaw(), vn.offset, vn.size); + return {name}; } -std::unique_ptr makeContext(rust::Str slaPath, Image img) { - return std::make_unique(slaPath, std::move(img)); +rust::Vec ContextFFI::getRegisters() const { + std::map reglist; + rust::Vec v; + sleigh.getAllRegisters(reglist); + v.reserve(reglist.size()); + for (auto const &vn : reglist) { + v.emplace_back(collectRegInfo(vn)); + } + return v; } 
-VarnodeInfoFFI varnodeToFFI(ghidra::VarnodeData vn) { - VarnodeInfoFFI info; - info.space = std::make_unique(vn.space); - info.size = vn.size; - info.offset = vn.offset; - return info; +void ContextFFI::setImage(ImageFFI const &img) { + sleigh.reset(new RustLoadImage(img), &c_db); + ghidra::DocumentStorage documentStorage = ghidra::DocumentStorage(); + sleigh.initialize(documentStorage); } -RegisterInfoFFI collectRegInfo(std::tuple el) { - VarnodeInfoFFI varnode = varnodeToFFI(std::get<0>(el)); - rust::String name = std::get<1>(el); - return {varnode, name}; +InstructionFFI ContextFFI::get_one_instruction(uint64_t offset) const { + JinglePcodeEmitter pcode; + JingleAssemblyEmitter assembly; + ghidra::Address a = ghidra::Address(sleigh.getDefaultCodeSpace(), offset); + sleigh.printAssembly(assembly, a); + sleigh.oneInstruction(pcode, a); + size_t length = sleigh.instructionLength(a); + InstructionFFI i; + Disassembly d; + i.ops = std::move(pcode.ops); + d.args = std::move(assembly.body); + d.mnemonic = std::move(assembly.mnem); + i.disassembly = std::move(d); + i.address = offset; + i.length = length; + return i; } -rust::Vec ContextFFI::getRegisters() const { - std::map reglist; - rust::Vec v; - sleigh->getAllRegisters(reglist); - std::transform(reglist.begin(), reglist.end(), std::back_inserter(v), collectRegInfo); - return v; -} \ No newline at end of file +std::unique_ptr makeContext(rust::Str slaPath) { + return std::make_unique(slaPath); +} diff --git a/jingle_sleigh/src/ffi/cpp/context.h b/jingle_sleigh/src/ffi/cpp/context.h index 4a1295d..3110d0f 100644 --- a/jingle_sleigh/src/ffi/cpp/context.h +++ b/jingle_sleigh/src/ffi/cpp/context.h @@ -1,44 +1,30 @@ #ifndef JINGLE_SLEIGH_CONTEXT_H #define JINGLE_SLEIGH_CONTEXT_H +class ContextFFI; +#include "jingle_sleigh/src/ffi/context_ffi.rs.h" #include "rust/cxx.h" #include "sleigh/types.h" #include "addrspace_handle.h" #include "jingle_sleigh/src/ffi/instruction.rs.h" #include "sleigh/globalcontext.hh" #include 
"sleigh/sleigh.hh" -#include "jingle_sleigh/src/ffi/image.rs.h" #include "sleigh/loadimage.hh" - -class DummyLoadImage : public ghidra::LoadImage { - Image img; -public: - DummyLoadImage(); - - DummyLoadImage(Image img); - - void loadFill(ghidra::uint1 *ptr, ghidra::int4 size, const ghidra::Address &addr) override; - - std::string getArchType(void) const override; - - void adjustVma(long adjust) override; - -}; - +#include "dummy_load_image.h" class ContextFFI { - DummyLoadImage img; - ghidra::DocumentStorage documentStorage; - ghidra::ContextInternal contextDatabase; - std::unique_ptr sleigh; + ghidra::Sleigh sleigh; + ghidra::ContextInternal c_db; + DummyLoadImage image; public: - explicit ContextFFI(rust::Str slaPath, Image img); + explicit ContextFFI(rust::Str slaPath); void set_initial_context(rust::Str name, uint32_t val); - InstructionFFI get_one_instruction(uint64_t offset) const; + void setImage(ImageFFI const&img); + InstructionFFI get_one_instruction(uint64_t offset) const; [[nodiscard]] std::shared_ptr getSpaceByIndex(ghidra::int4 idx) const; @@ -51,6 +37,10 @@ class ContextFFI { rust::Vec getRegisters() const; }; -std::unique_ptr makeContext(rust::Str slaPath, Image img); +RegisterInfoFFI collectRegInfo(std::tuple el); + +VarnodeInfoFFI varnodeToFFI(ghidra::VarnodeData vn); + +std::unique_ptr makeContext(rust::Str slaPath); #endif //JINGLE_SLEIGH_CONTEXT_H diff --git a/jingle_sleigh/src/ffi/cpp/dummy_load_image.cpp b/jingle_sleigh/src/ffi/cpp/dummy_load_image.cpp new file mode 100644 index 0000000..4c472a5 --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/dummy_load_image.cpp @@ -0,0 +1,18 @@ +#include "dummy_load_image.h" + +DummyLoadImage::DummyLoadImage() : ghidra::LoadImage("jingle") { +} + + +void DummyLoadImage::loadFill(ghidra::uint1 *ptr, ghidra::int4 size, + const ghidra::Address &addr) { + ghidra::ostringstream errmsg; + errmsg << "Unable to load " << std::dec << size << " bytes at " + << addr.getShortcut(); + addr.printRaw(errmsg); + throw 
ghidra::DataUnavailError(errmsg.str()); +} + +void DummyLoadImage::adjustVma(long adjust) {} + +std::string DummyLoadImage::getArchType() const { return "placeholder"; } diff --git a/jingle_sleigh/src/ffi/cpp/dummy_load_image.h b/jingle_sleigh/src/ffi/cpp/dummy_load_image.h new file mode 100644 index 0000000..1657040 --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/dummy_load_image.h @@ -0,0 +1,21 @@ + +#ifndef JINGLE_SLEIGH_DUMMY_LOAD_IMAGE_H +#define JINGLE_SLEIGH_DUMMY_LOAD_IMAGE_H + + +#include "sleigh/loadimage.hh" + +class DummyLoadImage : public ghidra::LoadImage { +public: + + DummyLoadImage(); + + void loadFill(ghidra::uint1 *ptr, ghidra::int4 size, const ghidra::Address &addr) override; + + std::string getArchType(void) const override; + + void adjustVma(long adjust) override; + +}; + +#endif //JINGLE_SLEIGH_DUMMY_LOAD_IMAGE_H diff --git a/jingle_sleigh/src/ffi/cpp/exception.h b/jingle_sleigh/src/ffi/cpp/exception.h index fcedb24..8151c57 100644 --- a/jingle_sleigh/src/ffi/cpp/exception.h +++ b/jingle_sleigh/src/ffi/cpp/exception.h @@ -3,17 +3,20 @@ #define JINGLE_EXCEPTION_H #include "sleigh/error.hh" +#include "sleigh/xml.hh" namespace rust { namespace behavior { - template + template static void trycatch(Try &&func, Fail &&fail) noexcept try { - func(); + func(); } catch (const ghidra::LowlevelError &e) { - fail(e.explain); + fail(e.explain); + } catch (const ghidra::DecoderError &e) { + fail(e.explain); } catch (const std::exception &e) { - fail(e.what()); + fail(e.what()); } } } diff --git a/jingle_sleigh/src/ffi/cpp/jingle_assembly_emitter.cpp b/jingle_sleigh/src/ffi/cpp/jingle_assembly_emitter.cpp new file mode 100644 index 0000000..5a0502e --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/jingle_assembly_emitter.cpp @@ -0,0 +1,11 @@ +// +// Created by toolCHAINZ on 10/14/24. 
+// + +#include "jingle_assembly_emitter.h" + +void JingleAssemblyEmitter::dump(const ghidra::Address &addr, const std::string &mnem, const std::string &body) { + this->mnem = mnem; + this->body = body; + +} diff --git a/jingle_sleigh/src/ffi/cpp/jingle_assembly_emitter.h b/jingle_sleigh/src/ffi/cpp/jingle_assembly_emitter.h new file mode 100644 index 0000000..517ccd7 --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/jingle_assembly_emitter.h @@ -0,0 +1,21 @@ +// +// Created by toolCHAINZ on 10/14/24. +// + +#ifndef JINGLE_SLEIGH_JINGLE_ASSEMBLY_EMITTER_H +#define JINGLE_SLEIGH_JINGLE_ASSEMBLY_EMITTER_H + +#include "sleigh/translate.hh" +#include "rust/cxx.h" + +class JingleAssemblyEmitter : public ghidra::AssemblyEmit { + + + void dump(const ghidra::Address &addr, const std::string &mnem, const std::string &body) override; + +public: + rust::String body; + rust::String mnem; +}; + +#endif //JINGLE_SLEIGH_JINGLE_ASSEMBLY_EMITTER_H diff --git a/jingle_sleigh/src/ffi/cpp/jingle_pcode_emitter.cpp b/jingle_sleigh/src/ffi/cpp/jingle_pcode_emitter.cpp new file mode 100644 index 0000000..78798f5 --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/jingle_pcode_emitter.cpp @@ -0,0 +1,33 @@ + +// +// Created by toolCHAINZ on 10/14/24. 
+// + +#include "jingle_pcode_emitter.h" +#include "addrspace_handle.h" + +void JinglePcodeEmitter::dump(const ghidra::Address &addr, ghidra::OpCode opc, ghidra::VarnodeData *outvar, + ghidra::VarnodeData *vars, ghidra::int4 isize) { + RawPcodeOp op; + op.op = opc; + op.has_output = false; + if (outvar != nullptr && outvar->space != nullptr) { + op.has_output = true; + op.output.offset = outvar->offset; + op.output.size = outvar->size; + op.output.space = std::make_unique(AddrSpaceHandle(outvar->space)); + outvar->space->getType(); + } + op.inputs.reserve(isize); + for (int i = 0; i < isize; i++) { + VarnodeInfoFFI info; + info.space = std::make_unique(vars[i].space); + info.size = vars[i].size; + info.offset = vars[i].offset; + op.space = std::make_unique(addr.getSpace()); + op.inputs.emplace_back(std::move(info)); + } + ops.emplace_back(op); + + +} diff --git a/jingle_sleigh/src/ffi/cpp/jingle_pcode_emitter.h b/jingle_sleigh/src/ffi/cpp/jingle_pcode_emitter.h new file mode 100644 index 0000000..11bf90f --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/jingle_pcode_emitter.h @@ -0,0 +1,20 @@ +// +// Created by toolCHAINZ on 10/14/24. +// + +#ifndef JINGLE_SLEIGH_JINGLE_PCODE_EMITTER_H +#define JINGLE_SLEIGH_JINGLE_PCODE_EMITTER_H + +#include "sleigh/translate.hh" +#include "jingle_sleigh/src/ffi/instruction.rs.h" + +class JinglePcodeEmitter : public ghidra::PcodeEmit { + + void dump(const ghidra::Address &addr, ghidra::OpCode opc, ghidra::VarnodeData *outvar, ghidra::VarnodeData *vars, + ghidra::int4 isize) override; + +public: + rust::Vec ops; +}; + +#endif //JINGLE_SLEIGH_JINGLE_PCODE_EMITTER_H diff --git a/jingle_sleigh/src/ffi/cpp/rust_load_image.cpp b/jingle_sleigh/src/ffi/cpp/rust_load_image.cpp new file mode 100644 index 0000000..15771e7 --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/rust_load_image.cpp @@ -0,0 +1,28 @@ +// +// Created by toolCHAINZ on 10/15/24. 
+// + +#include "rust_load_image.h" +#include "sleigh/pcoderaw.hh" +#include "varnode_translation.h" + +void RustLoadImage::loadFill(ghidra::uint1 *ptr, ghidra::int4 size, const ghidra::Address &addr) { + ghidra::VarnodeData vn = {addr.getSpace(), addr.getOffset(), static_cast(size)}; + + size_t result = img.load(varnodeToFFI(vn), rust::Slice(ptr, size)); + if(result == 0){ + ghidra::ostringstream errmsg; + errmsg << "Unable to load " << std::dec << size << " bytes at " + << addr.getShortcut(); + addr.printRaw(errmsg); + throw ghidra::DataUnavailError(errmsg.str()); + } +} + +std::string RustLoadImage::getArchType(void) const { + return "placeholder"; +} + +void RustLoadImage::adjustVma(long adjust) { + +} diff --git a/jingle_sleigh/src/ffi/cpp/rust_load_image.h b/jingle_sleigh/src/ffi/cpp/rust_load_image.h new file mode 100644 index 0000000..6f076b3 --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/rust_load_image.h @@ -0,0 +1,24 @@ +// +// Created by toolCHAINZ on 10/15/24. +// + +#ifndef JINGLE_SLEIGH_RUST_LOAD_IMAGE_H +#define JINGLE_SLEIGH_RUST_LOAD_IMAGE_H + +#include "context.h" +#include "sleigh/loadimage.hh" + +class RustLoadImage : public ghidra::LoadImage { + ImageFFI const &img; +public: + RustLoadImage(ImageFFI const& img) : LoadImage("placeholder"), img(img) {}; + + void loadFill(ghidra::uint1 *ptr, ghidra::int4 size, const ghidra::Address &addr) override; + + std::string getArchType(void) const override; + + void adjustVma(long adjust) override; + +}; + +#endif //JINGLE_SLEIGH_RUST_LOAD_IMAGE_H diff --git a/jingle_sleigh/src/ffi/cpp/varnode_translation.cpp b/jingle_sleigh/src/ffi/cpp/varnode_translation.cpp new file mode 100644 index 0000000..b8d8488 --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/varnode_translation.cpp @@ -0,0 +1,16 @@ +#include "varnode_translation.h" +#include "addrspace_handle.h" + +VarnodeInfoFFI varnodeToFFI(ghidra::VarnodeData vn) { + VarnodeInfoFFI info; + info.space = std::make_unique(vn.space); + info.size = vn.size; + 
info.offset = vn.offset; + return info; +} + +RegisterInfoFFI collectRegInfo(std::tuple el) { + VarnodeInfoFFI varnode = varnodeToFFI(std::get<0>(el)); + rust::String name = std::get<1>(el); + return {varnode, name}; +} \ No newline at end of file diff --git a/jingle_sleigh/src/ffi/cpp/varnode_translation.h b/jingle_sleigh/src/ffi/cpp/varnode_translation.h new file mode 100644 index 0000000..2e8ab86 --- /dev/null +++ b/jingle_sleigh/src/ffi/cpp/varnode_translation.h @@ -0,0 +1,13 @@ + +#ifndef JINGLE_SLEIGH_VARNODE_TRANSLATION_H +#define JINGLE_SLEIGH_VARNODE_TRANSLATION_H +#include "sleigh/types.h" +#include "sleigh/translate.hh" +#include "jingle_sleigh/src/ffi/instruction.rs.h" + + +VarnodeInfoFFI varnodeToFFI(ghidra::VarnodeData vn); + +RegisterInfoFFI collectRegInfo(std::tuple el); + +#endif //JINGLE_SLEIGH_VARNODE_TRANSLATION_H diff --git a/jingle_sleigh/src/ffi/image.rs b/jingle_sleigh/src/ffi/image.rs deleted file mode 100644 index a478a92..0000000 --- a/jingle_sleigh/src/ffi/image.rs +++ /dev/null @@ -1,21 +0,0 @@ -#[cxx::bridge] -pub(crate) mod bridge { - #[derive(Debug, Clone)] - pub struct Perms { - pub(crate) read: bool, - pub(crate) write: bool, - pub(crate) exec: bool, - } - - #[derive(Debug, Clone)] - pub struct ImageSection { - pub(crate) data: Vec, - pub(crate) base_address: usize, - pub(crate) perms: Perms, - } - - #[derive(Debug, Clone)] - pub struct Image { - pub sections: Vec, - } -} diff --git a/jingle_sleigh/src/ffi/instruction.rs b/jingle_sleigh/src/ffi/instruction.rs index 6524766..8c76ef1 100644 --- a/jingle_sleigh/src/ffi/instruction.rs +++ b/jingle_sleigh/src/ffi/instruction.rs @@ -36,7 +36,7 @@ pub(crate) mod bridge { /// A display-friendly representation of an instruction, generated by SLEIGH. Sleigh does not /// provide any tokenization of instruction operands, so they are all contained as a string /// representation in the [args](Self::args) attribute. 
- #[derive(Clone, Debug, Serialize, Deserialize)] + #[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)] pub struct Disassembly { /// SLEIGH's name for an ISA instruction pub mnemonic: String, diff --git a/jingle_sleigh/src/ffi/mod.rs b/jingle_sleigh/src/ffi/mod.rs index de8a0da..3b18d7d 100644 --- a/jingle_sleigh/src/ffi/mod.rs +++ b/jingle_sleigh/src/ffi/mod.rs @@ -1,14 +1,12 @@ pub(crate) mod addrspace; -#[cfg(compile)] -pub(crate) mod compile; pub(crate) mod context_ffi; -pub(crate) mod image; pub(crate) mod instruction; pub(crate) mod opcode; #[cfg(test)] mod tests { - use crate::context::{Image, SleighContextBuilder}; + use crate::context::SleighContextBuilder; + use crate::tests::SLEIGH_ARCH; #[test] fn test_callother_decode() { @@ -16,11 +14,9 @@ mod tests { let builder = SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap(); - let bin_sleigh = builder - .set_image(Image::try_from(bytes.as_slice()).unwrap()) - .build("x86:LE:64:default") - .unwrap(); - let _lib = bin_sleigh.read(0, 1).next().unwrap(); + let sleigh = builder.build("x86:LE:64:default").unwrap(); + let sleigh = sleigh.initialize_with_image(bytes.as_slice()).unwrap(); + sleigh.instruction_at(0).unwrap(); } #[test] fn test_callother_decode2() { @@ -28,10 +24,28 @@ mod tests { let builder = SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap(); - let bin_sleigh = builder - .set_image(Image::try_from(bytes.as_slice()).unwrap()) - .build("x86:LE:64:default") - .unwrap(); - let _lib = bin_sleigh.read(0, 1).next().unwrap(); + let sleigh = builder.build("x86:LE:64:default").unwrap(); + let sleigh = sleigh.initialize_with_image(bytes.as_slice()).unwrap(); + sleigh.instruction_at(0).unwrap(); + } + + #[test] + fn test_two_images() { + let mov_eax_0: [u8; 4] = [0x0f, 0x05, 0x0f, 0x05]; + let nops: [u8; 9] = [0x90, 0x90, 0x90, 0x90, 0x0f, 0x05, 0x0f, 0x05, 0x0f]; + let ctx_builder = + 
SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap(); + let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap(); + let mut sleigh = sleigh.initialize_with_image(mov_eax_0.as_slice()).unwrap(); + let instr1 = sleigh.instruction_at(0); + sleigh.set_image(nops.as_slice()).unwrap(); + let instr2 = sleigh.instruction_at(0); + assert_ne!(instr1, instr2); + assert_ne!(instr1, None); + let instr2 = sleigh.instruction_at(4); + assert_ne!(instr1, instr2); + assert_ne!(instr2, None); + let instr3 = sleigh.instruction_at(8); + assert_eq!(instr3, None); } } diff --git a/jingle_sleigh/src/instruction.rs b/jingle_sleigh/src/instruction.rs index f1763c3..28ff03b 100644 --- a/jingle_sleigh/src/instruction.rs +++ b/jingle_sleigh/src/instruction.rs @@ -1,15 +1,13 @@ use crate::error::JingleSleighError; pub use crate::ffi::instruction::bridge::Disassembly; use crate::ffi::instruction::bridge::InstructionFFI; -use crate::pcode::display::PcodeOperationDisplay; use crate::pcode::PcodeOperation; -use crate::space::SpaceManager; +use crate::JingleSleighError::EmptyInstruction; use crate::OpCode; use serde::{Deserialize, Serialize}; -use std::fmt::{Display, Formatter}; /// A rust representation of a SLEIGH assembly instruction -#[derive(Clone, Debug, Serialize, Deserialize)] +#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)] pub struct Instruction { pub disassembly: Disassembly, /// The PCODE semantics of this instruction @@ -21,28 +19,7 @@ pub struct Instruction { pub address: u64, } -/// A helper structure allowing displaying an instruction and its semantics -/// without requiring lots of pcode metadata to be stored in the instruction itself -pub struct InstructionDisplay<'a, T: SpaceManager> { - pub disassembly: Disassembly, - pub ops: Vec>, -} - impl Instruction { - pub fn display<'a, T: SpaceManager>( - &'a self, - ctx: &'a T, - ) -> Result, JingleSleighError> { - let mut ops: Vec> = Vec::with_capacity(self.ops.len()); - for x in &self.ops { - 
ops.push(x.display(ctx)?) - } - Ok(InstructionDisplay { - disassembly: self.disassembly.clone(), - ops, - }) - } - pub fn next_addr(&self) -> u64 { self.address + self.length as u64 } @@ -60,17 +37,6 @@ impl Instruction { .any(|o| o.opcode() == OpCode::CPUI_CALLOTHER) } } - -impl<'a, T: SpaceManager> Display for InstructionDisplay<'a, T> { - fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { - writeln!(f, "{} {}", self.disassembly.mnemonic, self.disassembly.args)?; - for x in &self.ops { - writeln!(f, "{}", x)?; - } - Ok(()) - } -} - impl From<InstructionFFI> for Instruction { fn from(value: InstructionFFI) -> Self { let ops = value.ops.into_iter().map(PcodeOperation::from).collect(); @@ -82,3 +48,30 @@ impl From<InstructionFFI> for Instruction { } } } + +/// todo: this is a gross placeholder until I refactor stuff into a proper +/// trace +impl TryFrom<&[Instruction]> for Instruction { + type Error = JingleSleighError; + fn try_from(value: &[Instruction]) -> Result<Self, Self::Error> { + if value.is_empty() { + return Err(EmptyInstruction); + } + if value.len() == 1 { + return Ok(value[0].clone()); + } + let ops: Vec<PcodeOperation> = value.iter().flat_map(|i| i.ops.iter().cloned()).collect(); + let length = value.iter().map(|i| i.length).reduce(|a, b| a + b).unwrap(); + let address = value[0].address; + let disassembly = Disassembly { + mnemonic: "".to_string(), + args: "".to_string(), + }; + Ok(Self { + ops, + length, + address, + disassembly, + }) + } +} diff --git a/jingle_sleigh/src/pcode/branch.rs b/jingle_sleigh/src/pcode/branch.rs new file mode 100644 index 0000000..06233cc --- /dev/null +++ b/jingle_sleigh/src/pcode/branch.rs @@ -0,0 +1,27 @@ +use crate::pcode::branch::PcodeBranchDestination::{ + Branch, Conditional, IndirectBranch, IndirectCall, Return, +}; +use crate::{IndirectVarNode, PcodeOperation, VarNode}; + +pub enum PcodeBranchDestination { + Branch(VarNode), + Call(VarNode), + Conditional(VarNode), + IndirectBranch(IndirectVarNode), + IndirectCall(IndirectVarNode), + Return(IndirectVarNode), +} +impl 
PcodeOperation { + pub fn branch_destination(&self) -> Option { + match self { + PcodeOperation::Branch { input } | PcodeOperation::Call { input } => { + Some(Branch(input.clone())) + } + PcodeOperation::CBranch { input0, .. } => Some(Conditional(input0.clone())), + PcodeOperation::BranchInd { input } => Some(IndirectBranch(input.clone())), + PcodeOperation::CallInd { input } => Some(IndirectCall(input.clone())), + PcodeOperation::Return { input } => Some(Return(input.clone())), + _ => None, + } + } +} diff --git a/jingle_sleigh/src/pcode/display.rs b/jingle_sleigh/src/pcode/display.rs index 9526460..5e0c470 100644 --- a/jingle_sleigh/src/pcode/display.rs +++ b/jingle_sleigh/src/pcode/display.rs @@ -1,284 +1,35 @@ use crate::pcode::PcodeOperation; -use crate::pcode::PcodeOperation::{ - Branch, BranchInd, CBranch, Call, CallInd, CallOther, Copy, Int2Comp, IntAdd, IntAnd, IntCarry, - IntEqual, IntLeftShift, IntLess, IntLessEqual, IntNegate, IntNotEqual, IntOr, IntRightShift, - IntSExt, IntSignedBorrow, IntSignedCarry, IntSignedLess, IntSignedLessEqual, IntSub, IntXor, - IntZExt, Load, PopCount, Return, Store, -}; -use crate::space::SpaceManager; +use crate::RegisterManager; use std::fmt::{Display, Formatter}; -pub struct PcodeOperationDisplay<'a, T: SpaceManager> { +pub struct PcodeOperationDisplay<'a, T: RegisterManager> { pub(crate) op: PcodeOperation, - pub(crate) spaces: &'a T, + pub(crate) ctx: &'a T, } -impl<'a, T> Display for PcodeOperationDisplay<'a, T> +impl PcodeOperationDisplay<'_, T> {} + +impl Display for PcodeOperationDisplay<'_, T> where - T: SpaceManager, + T: RegisterManager, { fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { - match &self.op { - Copy { input, output } => { - write!( - f, - "{} = {}", - output.display(self.spaces)?, - input.display(self.spaces)? - ) - } - PopCount { input, output } => { - write!( - f, - "{} = popcount({})", - output.display(self.spaces)?, - input.display(self.spaces)? 
- ) - } - IntZExt { input, output } => { - write!( - f, - "{} = zext({})", - output.display(self.spaces)?, - input.display(self.spaces)? - ) - } - IntSExt { input, output } => { - write!( - f, - "{} = sext({})", - output.display(self.spaces)?, - input.display(self.spaces)? - ) - } - Store { output, input } => { - write!( - f, - "{} = {}", - output.display(self.spaces)?, - input.display(self.spaces)? - ) - } - Load { input, output } => { - write!( - f, - "{} = {}", - output.display(self.spaces)?, - input.display(self.spaces)? - ) - } - IntCarry { - output, - input0, - input1, - } => write!( - f, - "{} = carry({}, {})", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)? - ), - IntSignedCarry { - output, - input0, - input1, - } => write!( - f, - "{} = s.carry({}, {})", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)? - ), - IntSignedBorrow { - output, - input0, - input1, - } => write!( - f, - "{} = s.borrow({}, {})", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)? - ), - Int2Comp { output, input } => write!( - f, - "{} = -{}", - output.display(self.spaces)?, - input.display(self.spaces)? 
- ), - IntAdd { - output, - input0, - input1, - } => write!( - f, - "{} = {} + {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntSub { - output, - input0, - input1, - } => write!( - f, - "{} = {} - {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntAnd { - output, - input0, - input1, - } => write!( - f, - "{} = {} & {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntOr { - output, - input0, - input1, - } => write!( - f, - "{} = {} v {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntXor { - output, - input0, - input1, - } => write!( - f, - "{} = {} ^ {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntRightShift { - output, - input0, - input1, - } => write!( - f, - "{} = {} >> {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntLeftShift { - output, - input0, - input1, - } => write!( - f, - "{} = {} << {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntLess { - output, - input0, - input1, - } => write!( - f, - "{} = {} < {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntLessEqual { - output, - input0, - input1, - } => write!( - f, - "{} = {} <= {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntSignedLess { - output, - input0, - input1, - } => write!( - f, - "{} = {} s< {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntSignedLessEqual { - output, - input0, - input1, - } => write!( - f, - "{} = {} s<= {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - 
input1.display(self.spaces)?, - ), - IntEqual { - output, - input0, - input1, - } => write!( - f, - "{} = {} == {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - IntNotEqual { - output, - input0, - input1, - } => write!( - f, - "{} = {} != {}", - output.display(self.spaces)?, - input0.display(self.spaces)?, - input1.display(self.spaces)?, - ), - CallOther { output, inputs } => { - if let Some(output) = output { - write!(f, "{} = ", output.display(self.spaces)?)?; - } - write!(f, "userop(")?; - let mut args = Vec::with_capacity(inputs.len()); - for i in inputs { - args.push(format!("{}", i.display(self.spaces)?)); - } - write!(f, "{}", args.join(", "))?; - write!(f, ")") - } - CallInd { input } => write!(f, "call [{}]", input.display(self.spaces)?), - Return { input } => write!(f, "return [{}]", input.display(self.spaces)?), - Branch { input } => write!(f, "branch {}", input.display(self.spaces)?), - CBranch { input0, input1 } => write!( - f, - "if {} branch {}", - input1.display(self.spaces)?, - input0.display(self.spaces)? - ), - BranchInd { input } => write!(f, "branch [{}]", input.display(self.spaces)?), - Call { input } => write!(f, "call {}", input.display(self.spaces)?), - IntNegate { input, output } => write!( - f, - "{} = ~{}", - output.display(self.spaces)?, - input.display(self.spaces)? 
- ), - _ => write!(f, " {:?}", self.op), + if let Some(o) = self.op.output() { + write!(f, "{} = ", o.display(self.ctx)?)?; + } + write!(f, "{} ", self.op.opcode())?; + let mut args: Vec = vec![]; + for x in self.op.inputs() { + args.push(format!("{}", x.display(self.ctx)?)); } + write!(f, "{}", args.join(", "))?; + Ok(()) + } +} + +impl Display for crate::ffi::opcode::bridge::OpCode { + fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { + let d = format!("{:?}", self); + write!(f, "{}", &d[5..]) } } diff --git a/jingle_sleigh/src/pcode/mod.rs b/jingle_sleigh/src/pcode/mod.rs index fe13179..e8ce24a 100644 --- a/jingle_sleigh/src/pcode/mod.rs +++ b/jingle_sleigh/src/pcode/mod.rs @@ -1,3 +1,4 @@ +pub mod branch; pub mod display; use crate::pcode::PcodeOperation::{ @@ -16,13 +17,12 @@ use crate::error::JingleSleighError; use crate::ffi::instruction::bridge::RawPcodeOp; pub use crate::ffi::opcode::OpCode; use crate::pcode::display::PcodeOperationDisplay; -use crate::space::SpaceManager; use crate::varnode::{IndirectVarNode, VarNode}; -use crate::GeneralizedVarNode; +use crate::{GeneralizedVarNode, RegisterManager}; use serde::{Deserialize, Serialize}; use std::fmt::Debug; -#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize)] +#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize)] pub enum PcodeOperation { Copy { input: VarNode, @@ -387,16 +387,234 @@ impl PcodeOperation { ) } - pub fn display<'a, T: SpaceManager>( + pub fn display<'a, T: RegisterManager>( &self, ctx: &'a T, ) -> Result, JingleSleighError> { Ok(PcodeOperationDisplay { op: self.clone(), - spaces: ctx, + ctx, }) } + pub fn inputs(&self) -> Vec { + match self { + Copy { input, .. } => { + vec![input.into()] + } + Load { input, .. } => { + vec![input.into()] + } + Store { input, .. } => { + vec![input.into()] + } + Branch { input, .. } => { + vec![input.into()] + } + CBranch { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + BranchInd { input, .. 
} => { + vec![input.into()] + } + Call { input, .. } => { + vec![input.into()] + } + CallInd { input, .. } => { + vec![input.into()] + } + CallOther { inputs, .. } => inputs.iter().map(|i| i.into()).collect(), + Return { input, .. } => { + vec![input.into()] + } + IntEqual { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntNotEqual { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntSignedLess { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntSignedLessEqual { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntLess { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntLessEqual { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntSExt { input, .. } => { + vec![input.into()] + } + IntZExt { input, .. } => { + vec![input.into()] + } + IntAdd { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntSub { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntCarry { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntSignedCarry { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntSignedBorrow { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + Int2Comp { input, .. } => { + vec![input.into()] + } + IntNegate { input, .. } => { + vec![input.into()] + } + IntXor { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntAnd { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntOr { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntLeftShift { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntRightShift { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntSignedRightShift { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntMult { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntDiv { input0, input1, .. 
} => { + vec![input0.into(), input1.into()] + } + IntSignedDiv { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntRem { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + IntSignedRem { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + BoolNegate { input, .. } => { + vec![input.into()] + } + BoolXor { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + BoolAnd { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + BoolOr { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatEqual { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatNotEqual { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatLess { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatLessEqual { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatNaN { input, .. } => { + vec![input.into()] + } + FloatAdd { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatDiv { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatMult { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatSub { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + FloatNeg { input, .. } => { + vec![input.into()] + } + FloatAbs { input, .. } => { + vec![input.into()] + } + FloatSqrt { input, .. } => { + vec![input.into()] + } + FloatIntToFloat { input, .. } => { + vec![input.into()] + } + FloatFloatToFloat { input, .. } => { + vec![input.into()] + } + FloatTrunc { input, .. } => { + vec![input.into()] + } + FloatCeil { input, .. } => { + vec![input.into()] + } + FloatFloor { input, .. } => { + vec![input.into()] + } + FloatRound { input, .. } => { + vec![input.into()] + } + MultiEqual { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + Indirect { input0, input1, .. 
} => { + vec![input0.into(), input1.into()] + } + Piece { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + SubPiece { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + Cast { input, .. } => { + vec![input.into()] + } + PtrAdd { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + PtrSub { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + SegmentOp { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + CPoolRef { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + New { input, .. } => { + vec![input.into()] + } + Insert { input0, input1, .. } => { + vec![input0.into(), input1.into()] + } + Extract { input0, .. } => { + vec![input0.into()] + } + PopCount { input, .. } => { + vec![input.into()] + } + LzCount { input, .. } => { + vec![input.into()] + } + } + } pub fn output(&self) -> Option { match self { Copy { output, .. } => Some(GeneralizedVarNode::from(output)), diff --git a/jingle_sleigh/src/space.rs b/jingle_sleigh/src/space.rs index a227e28..3bdb0d5 100644 --- a/jingle_sleigh/src/space.rs +++ b/jingle_sleigh/src/space.rs @@ -121,8 +121,7 @@ pub trait RegisterManager: SpaceManager { fn get_register(&self, name: &str) -> Option; /// Given a [`VarNode`], get the name of the corresponding architectural register, if one exists - - fn get_register_name(&self, location: VarNode) -> Option<&str>; + fn get_register_name(&self, location: &VarNode) -> Option<&str>; /// Get a listing of all register name/[`VarNode`] pairs fn get_registers(&self) -> Vec<(VarNode, String)>; diff --git a/jingle_sleigh/src/varnode/display.rs b/jingle_sleigh/src/varnode/display.rs index d1ab1bd..13c9267 100644 --- a/jingle_sleigh/src/varnode/display.rs +++ b/jingle_sleigh/src/varnode/display.rs @@ -1,9 +1,14 @@ use crate::ffi::addrspace::bridge::SpaceType; use crate::space::SpaceInfo; -use std::fmt::{Display, Formatter}; +use std::fmt::{Debug, Display, Formatter}; #[derive(Clone, Debug)] 
-pub struct VarNodeDisplay { +pub enum VarNodeDisplay { + Raw(RawVarNodeDisplay), + Register(String), +} +#[derive(Clone, Debug)] +pub struct RawVarNodeDisplay { pub offset: u64, pub size: usize, pub space_info: SpaceInfo, @@ -22,7 +27,7 @@ pub enum GeneralizedVarNodeDisplay { Indirect(IndirectVarNodeDisplay), } -impl Display for VarNodeDisplay { +impl Display for RawVarNodeDisplay { fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { if self.space_info._type == SpaceType::IPTR_CONSTANT { write!(f, "{:x}:{:x}", self.offset, self.size) @@ -35,6 +40,18 @@ impl Display for VarNodeDisplay { } } } +impl Display for VarNodeDisplay { + fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { + match self { + VarNodeDisplay::Raw(r) => { + write!(f, "{}", r) + } + VarNodeDisplay::Register(a) => { + write!(f, "{}", a) + } + } + } +} impl Display for IndirectVarNodeDisplay { fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { diff --git a/jingle_sleigh/src/varnode/mod.rs b/jingle_sleigh/src/varnode/mod.rs index 324853c..cfde425 100644 --- a/jingle_sleigh/src/varnode/mod.rs +++ b/jingle_sleigh/src/varnode/mod.rs @@ -7,8 +7,10 @@ use crate::space::SpaceManager; pub use crate::varnode::display::{ GeneralizedVarNodeDisplay, IndirectVarNodeDisplay, VarNodeDisplay, }; +use crate::{RawVarNodeDisplay, RegisterManager}; use serde::{Deserialize, Serialize}; use std::fmt::Debug; +use std::ops::Range; /// A [`VarNode`] is `SLEIGH`'s generalization of an address. It describes a sized-location in /// a given memory space. 
@@ -33,14 +35,23 @@ pub struct VarNode { } impl VarNode { - pub fn display<T: SpaceManager>(&self, ctx: &T) -> Result<VarNodeDisplay, JingleSleighError> { - ctx.get_space_info(self.space_index) - .map(|space_info| VarNodeDisplay { - size: self.size, - offset: self.offset, - space_info: space_info.clone(), - }) - .ok_or(JingleSleighError::InvalidSpaceName) + pub fn display<T: RegisterManager>( + &self, + ctx: &T, + ) -> Result<VarNodeDisplay, JingleSleighError> { + if let Some(name) = ctx.get_register_name(self) { + Ok(VarNodeDisplay::Register(name.to_string())) + } else { + ctx.get_space_info(self.space_index) + .map(|space_info| { + VarNodeDisplay::Raw(RawVarNodeDisplay { + size: self.size, + offset: self.offset, + space_info: space_info.clone(), + }) + }) + .ok_or(JingleSleighError::InvalidSpaceName) + } } pub fn covers(&self, other: &VarNode) -> bool { @@ -53,6 +64,23 @@ impl VarNode { } } +impl From<&VarNode> for Range<u64> { + fn from(value: &VarNode) -> Self { + Range { + start: value.offset, + end: value.offset + value.size as u64, + } + } +} + +impl From<&VarNode> for Range<usize> { + fn from(value: &VarNode) -> Self { + Range { + start: value.offset as usize, + end: value.offset as usize + value.size, + } + } +} #[macro_export] macro_rules! 
varnode { ($ctx:expr, #$offset:literal:$size:literal) => { @@ -89,7 +117,7 @@ pub struct IndirectVarNode { } impl IndirectVarNode { - pub fn display<T: SpaceManager>( + pub fn display<T: RegisterManager>( &self, ctx: &T, ) -> Result<IndirectVarNodeDisplay, JingleSleighError> { @@ -114,7 +142,7 @@ pub enum GeneralizedVarNode { } impl GeneralizedVarNode { - pub fn display<T: SpaceManager>( + pub fn display<T: RegisterManager>( &self, ctx: &T, ) -> Result<GeneralizedVarNodeDisplay, JingleSleighError> { @@ -182,7 +210,7 @@ mod tests { space_index: 0, size: 4, }; - let tests = vec![ + let tests = [ VarNode { offset: 0, space_index: 0, @@ -214,6 +242,6 @@ mod tests { size: 1, }, ]; - assert!(tests.iter().all(|v| vn1.covers(&v))) + assert!(tests.iter().all(|v| vn1.covers(v))) } } From 68e00780dedfea25d7e2b607c2fb7525fae589a1 Mon Sep 17 00:00:00 2001 From: Mark Date: Fri, 24 Jan 2025 15:56:42 +0000 Subject: [PATCH 2/2] Update readme --- README.md | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/README.md b/README.md index c58646a..e39e2da 100644 --- a/README.md +++ b/README.md @@ -30,6 +30,43 @@ related crates: expose APIs for constructing or reasoning about control-flow graphs. A more robust analysis is forthcoming, depending on my research needs. +## Requirements + +### Building + +If you're working directly with the `jingle` source distribution, +you will need to manually download a copy of the `ghidra` source tree +in order to build `jingle` or `jingle_sleigh`. + +If you're working with `git`, this can be done using the existing submodule. +Simply run + +```shell +git submodule init && git submodule update +``` + +If you are for some reason using a zipped source distribution, +then you can run the following: + +```shell +cd jingle_sleigh +git clone https://github.com/NationalSecurityAgency/ghidra.git +``` + +If you are using `jingle` as a cargo `git` or `crates.io` dependency, +this step is not necessary. `cargo` will handle all this in the `git` case +and we will vendor the necessary `ghidra` sources into all `crates.io` releases. 
+ +### Running + +While `jingle` can be configured to work with a single set `sleigh` architecture, +the default way to use it is to point it to an existing `ghidra` installation. +[Install ghidra](https://ghidra-sre.org) and, if you are using `jingle` programmatically, +point it at the top-level folder of the installation. If you are using the [CLI](./jingle), +then provide the path to ghidra as an argument in your first run. + +The only thing ghidra is used for here is as a standardized folder layout for `sleigh` architectures. +`jingle` has no ghidra dependency outside of the bundled `sleigh` C++ code. ## Usage In order to use `jingle`, include it in your `Cargo.toml` as usual: