Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add ability to set module data in scan callback #147

Merged
merged 2 commits into from
May 6, 2024

Conversation

vthib
Copy link
Contributor

@vthib vthib commented Apr 30, 2024

Some modules in YARA need to be fed data to be usable, notably the cuckoo module. This works by setting the module data in the "import module" callback, as can be seen here:

https://github.com/VirusTotal/yara/blob/923368eab/cli/yara.c#L1200

This MR adds bindings to be able to do exactly this: the object related to this callback msg is wrapped in a YrModuleImport object, which exposes two functions:

  • one to retrieve the module name
  • one to set the module data

This makes the code looks like this:

let report = r#"{ "network": ... }"#;
let res = yara_scanner.scan_mem_callback(b"", |msg| {
    if let yara::CallbackMsg::ImportModule(mut module) = msg {
        if module.name() == Some(b"cuckoo") {
            // Safety: report is alive for longer than the scan.
            unsafe {
                module.set_module_data(
                    report.as_mut_ptr().cast(),
                    report.len(),
                );
            }
        }
    }
    yara::CallbackReturn::Continue
});

I haven't added a test for it, because the only module that uses this is the cuckoo module, and to use it, the module-cuckoo feature must be enabled and the libjansson-dev needs to be installed. If you prefer to have a test, I can try to update the CI to have a test like this working.

@vthib
Copy link
Contributor Author

vthib commented Apr 30, 2024

Looks like the macos-latest runner was updated from 12.7 to 14.4 and it no longer works well, i don't think it's related to my changes

/// The caller must guarantee that:
/// - `ptr` is valid for reads of `size` bytes.
/// - `ptr` stays valid for the full duration of the scan.
pub unsafe fn set_module_data(&mut self, ptr: *mut c_void, size: u64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vthib hello! Are you sure to make a raw pointer as a public interface?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see any other way, making this function safe would mean changing this pointer to a borrowed ref, but it would need to be tied to a lifetime that outlives the borrow on Rules/Scanner, which i don't think would be easy to express here. And even then, there isn't really a type to expose, since the type depends on the module.

I feel like exposing the raw pointer, with a clear unsafe function, is the best trade-off here. This is a very niche use-case, and leaving it to the user to properly set it makes the API changes minimal and still make it possible to set this module data if needed

@vthib vthib force-pushed the set-module-data branch from 07ae17a to 1a44772 Compare May 1, 2024 12:46
vthib added 2 commits May 1, 2024 14:59
Some modules in YARA need to be fed data to be usable, notably the
cuckoo module. This works by setting the module data in the "import
module" callback, as can be seen here:

<https://github.com/VirusTotal/yara/blob/923368eab/cli/yara.c#L1200>

This MR adds bindings to be able to do exactly this: the object related
to this callback msg is wrapped in a YrModuleImport object, which
exposes two functions:

- one to retrieve the module name
- one to set the module data

This makes the code looks like this:

```rust
let res = yara_scanner.scan_mem_callback(b"", |msg| {
    if let yara::CallbackMsg::ImportModule(mut module) = msg {
        if module.name() == Some(b"cuckoo") {
            // Safety: report is alive for longer than the scan.
            unsafe {
                module.set_module_data(
                    report.as_mut_ptr().cast(),
                    report.len() as u64,
                );
            }
        }
    }
    yara::CallbackReturn::Continue
});
```

I haven't added a test for it, because the only module that uses this is
the cuckoo module, and to use it, the module-cuckoo feature must be
enabled and the libjansson-dev needs to be installed. If you prefer to
have a test, I can try to update the CI to have a test like this
working.
the macos-14 runner is M1 based and thus aarch64, which makes several
matrix jobs fail due to missing vendored bindings.

Fix the jobs by hardcoding the macos version to the one used before the
update of macos-latest to macos-14.
@vthib vthib force-pushed the set-module-data branch from 95b60cd to 6aa7cfa Compare May 1, 2024 12:59
@Hugal31 Hugal31 merged commit cda5060 into Hugal31:master May 6, 2024
28 checks passed
@Hugal31
Copy link
Owner

Hugal31 commented May 6, 2024

I also don't see a better way to expose the setter for module data, without asking for a box or some complicated lifetimes stunts.
LGTM, thanks!

@vthib vthib deleted the set-module-data branch May 6, 2024 13:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants