new(driver/modern_bpf,userspace/libpman): support multiple programs for each event #2255

Open: wants to merge 8 commits into base master

Contributor

@FedeDP FedeDP commented Jan 23, 2025

What type of PR is this?

/kind feature

Any specific area of the project related to this PR?

/area driver-modern-bpf
/area libpman

Does this PR require a change in the driver versions?

What this PR does / why we need it:

Allow specifying multiple program names for each event type, and try to inject each of them in turn until one succeeds.
This lets us inject the bpf_loop-based sendmmsg and recvmmsg programs where bpf_loop is supported, and fall back to a program that just sends the first message where it isn't.
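
A minimal sketch of the fallback idea, not the PR's actual code: each event owns an ordered list of candidate programs, and the first one whose requirement is met wins. The struct layout, `pick_program`, the probe callbacks, and `FEAT_LOOP` are hypothetical names used for illustration; only the program names `recvmmsg_x` and `recvmmsg_old_x` come from the build log quoted further down in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholder for "this program needs bpf_loop". */
enum { FEAT_LOOP = 1 };

/* One candidate program for an event: its name and the feature it requires.
 * A requirement of 0 means "no special requirement". */
struct prog_candidate {
	const char *name;
	int required_feat;
};

/* Hypothetical stand-in for a kernel feature probe. */
typedef bool (*feat_probe_fn)(int feat);

/* Return the name of the first candidate whose requirement is met,
 * or NULL if no candidate is usable. Candidates are tried in order. */
static const char *pick_program(const struct prog_candidate *cands,
                                size_t n,
                                feat_probe_fn probe) {
	assert(cands != NULL && probe != NULL);
	for(size_t i = 0; i < n; i++) {
		if(cands[i].name == NULL) {
			break; /* unused trailing slot */
		}
		if(cands[i].required_feat == 0 || probe(cands[i].required_feat)) {
			return cands[i].name;
		}
	}
	return NULL;
}

/* Two toy probes: a kernel with no optional helpers, and one with all of them. */
static bool probe_nothing(int feat) {
	(void)feat;
	return false;
}

static bool probe_everything(int feat) {
	(void)feat;
	return true;
}
```

With a list such as `{{"recvmmsg_x", FEAT_LOOP}, {"recvmmsg_old_x", 0}}`, `pick_program` returns the first name under `probe_everything` and the second under `probe_nothing`.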

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

This supersedes #2233

Does this PR introduce a user-facing change?:

new(driver/modern_bpf,userspace/libpman): support multiple programs for each event


github-actions bot commented Jan 23, 2025

Perf diff from master - unit tests

    10.41%     -0.60%  [.] sinsp_parser::reset
     2.84%     +0.48%  [.] sinsp_evt::load_params
     8.01%     +0.46%  [.] sinsp_evt::get_type
     2.02%     -0.39%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     3.99%     -0.38%  [.] gzfile_read
     1.19%     +0.33%  [.] libsinsp::sinsp_suppress::process_event
     1.35%     +0.29%  [.] scap_event_decode_params
     2.52%     +0.28%  [.] is_conversion_needed
     2.71%     +0.28%  [.] sinsp_thread_manager::find_thread
    12.07%     +0.28%  [.] sinsp::next

Heap diff from master - unit tests

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Benchmarks diff from master

Comparing gbench_data.json to /root/actions-runner/_work/libs/libs/build/gbench_data.json
Benchmark                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------
BM_sinsp_split_mean                                            -0.0370         -0.0369           152           147           152           147
BM_sinsp_split_median                                          -0.0449         -0.0448           153           146           153           146
BM_sinsp_split_stddev                                          -0.1877         -0.1879             2             1             2             1
BM_sinsp_split_cv                                              -0.1564         -0.1567             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_mean                  -0.0611         -0.0610            62            58            62            58
BM_sinsp_concatenate_paths_relative_path_median                -0.0589         -0.0588            62            58            62            58
BM_sinsp_concatenate_paths_relative_path_stddev                -0.4351         -0.4345             1             1             1             1
BM_sinsp_concatenate_paths_relative_path_cv                    -0.3984         -0.3978             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_mean                     +0.0305         +0.0305            24            25            24            25
BM_sinsp_concatenate_paths_empty_path_median                   +0.0304         +0.0304            24            25            24            25
BM_sinsp_concatenate_paths_empty_path_stddev                   +0.3122         +0.3090             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_cv                       +0.2734         +0.2703             0             0             0             0
BM_sinsp_concatenate_paths_absolute_path_mean                  -0.0533         -0.0533            61            58            61            58
BM_sinsp_concatenate_paths_absolute_path_median                -0.0707         -0.0707            62            58            62            58
BM_sinsp_concatenate_paths_absolute_path_stddev                -0.6986         -0.6990             2             1             2             1
BM_sinsp_concatenate_paths_absolute_path_cv                    -0.6816         -0.6820             0             0             0             0
BM_sinsp_split_container_image_mean                            -0.0507         -0.0506           404           384           404           384
BM_sinsp_split_container_image_median                          -0.0498         -0.0498           405           384           405           384
BM_sinsp_split_container_image_stddev                          -0.0261         -0.0271             2             2             2             2
BM_sinsp_split_container_image_cv                              +0.0259         +0.0247             0             0             0             0

@FedeDP FedeDP changed the title new(driver/modern_bpf,userspace/libpman): support multiple programs for each event wip: new(driver/modern_bpf,userspace/libpman): support multiple programs for each event Jan 23, 2025

codecov bot commented Jan 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.29%. Comparing base (6c46ed3) to head (7c19adb).
Report is 57 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2255      +/-   ##
==========================================
+ Coverage   75.16%   75.29%   +0.12%     
==========================================
  Files         278      279       +1     
  Lines       34478    34389      -89     
  Branches     5922     5878      -44     
==========================================
- Hits        25916    25893      -23     
+ Misses       8562     8496      -66     
Flag Coverage Δ
libsinsp 75.29% <ø> (+0.12%) ⬆️

github-actions bot commented Jan 23, 2025

X64 kernel testing matrix

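| KERNEL | CMAKE-CONFIGURE | KMOD BUILD | KMOD SCAP-OPEN | BPF-PROBE BUILD | BPF-PROBE SCAP-OPEN | MODERN-BPF SCAP-OPEN |
|---|---|---|---|---|---|---|
| amazonlinux2-4.19 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| amazonlinux2-5.10 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| amazonlinux2-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| amazonlinux2-5.4 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| amazonlinux2022-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| amazonlinux2023-6.1 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| archlinux-6.0 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| archlinux-6.7 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| centos-3.10 | 🟢 | 🟢 | 🟢 | 🟡 | 🟡 | 🟡 |
| centos-4.18 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| centos-5.14 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| fedora-5.17 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| fedora-5.8 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| fedora-6.2 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| oraclelinux-3.10 | 🟢 | 🟢 | 🟢 | 🟡 | 🟡 | 🟡 |
| oraclelinux-4.14 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| oraclelinux-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| oraclelinux-5.4 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| ubuntu-4.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| ubuntu-5.8 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| ubuntu-6.5 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |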

ARM64 kernel testing matrix

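| KERNEL | CMAKE-CONFIGURE | KMOD BUILD | KMOD SCAP-OPEN | BPF-PROBE BUILD | BPF-PROBE SCAP-OPEN | MODERN-BPF SCAP-OPEN |
|---|---|---|---|---|---|---|
| amazonlinux2-5.4 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| amazonlinux2022-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| fedora-6.2 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| oraclelinux-4.14 | 🟢 | 🟢 | 🟢 | 🟡 | 🟡 | 🟡 |
| oraclelinux-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| ubuntu-6.5 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |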

@FedeDP FedeDP force-pushed the new/support_recvmmsg_sendmmsg_bpf_loop branch from 6096d33 to 92df24d Compare January 23, 2025 14:10

Please double check driver/API_VERSION file. See versioning.

/hold

@FedeDP FedeDP force-pushed the new/support_recvmmsg_sendmmsg_bpf_loop branch from 3bbfcfd to aa354b6 Compare January 24, 2025 10:46
…or each event.

Try to inject each of them until success.
This allows us to inject `bpf_loop` sendmmsg and recvmmsg programs where supported,
and fallback at just sending first message where it isn't.

Signed-off-by: Federico Di Pierro <[email protected]>
@FedeDP FedeDP force-pushed the new/support_recvmmsg_sendmmsg_bpf_loop branch from aa354b6 to 24581f6 Compare January 24, 2025 10:54
@FedeDP FedeDP changed the title wip: new(driver/modern_bpf,userspace/libpman): support multiple programs for each event new(driver/modern_bpf,userspace/libpman): support multiple programs for each event Jan 24, 2025
Contributor Author

FedeDP commented Jan 24, 2025

Removed wip since the impl is complete.
Kernel-testing matrix is now fully green. The only remaining issues are:

  • s390x is not seeing BPF_FUNC_loop symbol in events_prog_names.c
  • amd64 runner is correctly picking up sendmmsg_x and recvmmsg_x (with bpf_loop) but then it fails with:
libbpf: prog 'sendmmsg_x': BPF program load failed: Permission denied
libbpf: prog 'sendmmsg_x': -- BEGIN PROG LOAD LOG --
combined stack size of 2 calls is 576. Too large

This seems like a bug in the verifier, since I cannot reproduce it locally, in the arm64 CI, or in the kernel testing matrix.
Arm64 runner uses kernel 6.8.0-1020-azure while amd64 uses 6.5.0-1025-azure. Will try to investigate further.

Contributor Author

FedeDP commented Jan 24, 2025

In the last commit I tried to split the sendmmsg and recvmmsg programs by chaining a tail call when ret < 0.
Still no luck. I also tried to drop the __always_inline from the handle_exit function and it failed with the same error: https://github.com/falcosecurity/libs/actions/runs/12950456593/job/36123287152

Note also that the exact same code for the two BPF programs wasn't failing some weeks ago (i.e., at commit 67975da).

@FedeDP FedeDP force-pushed the new/support_recvmmsg_sendmmsg_bpf_loop branch from 0323c41 to 24581f6 Compare January 24, 2025 15:47
Contributor Author

FedeDP commented Jan 24, 2025

/milestone next-driver

@poiana poiana added this to the next-driver milestone Jan 24, 2025
@FedeDP FedeDP force-pushed the new/support_recvmmsg_sendmmsg_bpf_loop branch from b000c6f to d930932 Compare January 30, 2025 13:48
@FedeDP FedeDP force-pushed the new/support_recvmmsg_sendmmsg_bpf_loop branch from bb4321f to ffb1207 Compare January 31, 2025 09:03
Contributor Author

FedeDP commented Jan 31, 2025

Oh, I should test with the same clang version we use in the CI: clang amd64 1:14.0-55~exp2

@Apteryks

Hi,

I've rebased my #1842 PR on this branch and tried to build it with Clang 13 in the environment. I'm getting an undeclared-identifier error:

cd /tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/build/libpman && /gnu/store/86fc8bi3mciljxz7c79jx8zr4wsx7xw8-gcc-11.4.0/bin/gcc -DHAVE_SYS_SYSMACROS_H -DPLATFORM_NAME=\"Linux\" -DSCAP_HOSTNAME_ENV_VAR=\"SCAP_HOSTNAME\" -DSCAP_HOST_ROOT_ENV_VAR_NAME=\"HOST_ROOT\" -DSCAP_KERNEL_MODULE_NAME=\"scap\" -D__STDC_FORMAT_MACROS -Dpman_EXPORTS -I/gnu/store/li8wwfm5izk2qwmgm7yvb7bxrvc28wp6-googletest-1.12.1/include/gtest -I/tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/source/userspace/libpman/include -I/tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/source/userspace/libpman/src -I/tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/source -I/tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/source/userspace -I/tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/build/skel_dir -I/tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/build -I/tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/source/userspace/libscap/linux -Wall -ggdb -O2 -g -DNDEBUG -fPIC -MD -MT libpman/CMakeFiles/pman.dir/src/events_prog_table.c.o -MF CMakeFiles/pman.dir/src/events_prog_table.c.o.d -o CMakeFiles/pman.dir/src/events_prog_table.c.o -c /tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/source/userspace/libpman/src/events_prog_table.c
/tmp/guix-build-falcosecurity-libs-0.20.0.drv-0/source/userspace/libpman/src/events_prog_table.c:253:52: error: ‘BPF_FUNC_loop’ undeclared here (not in a function); did you mean ‘BPF_FUNC_bind’?
  253 |         [PPME_SOCKET_RECVMMSG_X] = {{"recvmmsg_x", BPF_FUNC_loop}, {"recvmmsg_old_x", 0}},
      |                                                    ^~~~~~~~~~~~~
      |                                                    BPF_FUNC_bind
make[2]: *** [libpman/CMakeFiles/pman.dir/build.make:191: libpman/CMakeFiles/pman.dir/src/events_prog_table.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....

Contributor Author

FedeDP commented Jan 31, 2025

Which libbpf version are you using?

Contributor Author

FedeDP commented Feb 3, 2025

That's weird, you are using the correct versions; it seems like the build is picking up an older libbpf version though, since its headers do not contain the BPF_FUNC_loop symbol.
The same happened in the s390x CI: d4e2e2a

…_dynamic_snaplen at each bpf_loop iteration for sendmmsg and recvmmsg.

This also fixes a verifier issue on clang 14, related to stack length.

Signed-off-by: Federico Di Pierro <[email protected]>
@FedeDP FedeDP force-pushed the new/support_recvmmsg_sendmmsg_bpf_loop branch from e83a441 to 3ab30fc Compare February 3, 2025 14:11
@FedeDP FedeDP force-pushed the new/support_recvmmsg_sendmmsg_bpf_loop branch from cdd8980 to acbd21c Compare February 3, 2025 15:27
@@ -55,7 +55,7 @@ jobs:
kernelrelease: 6.4.1-1.el9.elrepo.aarch64
target: centos
kernelurls: https://download.falco.org/fixtures/libs/kernel-ml-devel-6.4.1-1.el9.elrepo.aarch64.rpm
- runs-on: ubuntu-latest
+ runs-on: ubuntu-24.04-arm
Contributor Author

Improvement: run driverkit arm64 CI on arm64 gh runners.

@@ -220,8 +220,9 @@ jobs:
cd src && make install
cd ../../
git clone https://github.com/libbpf/libbpf.git --branch v1.3.0 --single-branch
- cd libbpf/src && BUILD_STATIC_ONLY=y DESTDIR=/ make install
+ cd libbpf/src && BUILD_STATIC_ONLY=y DESTDIR=/ make install install_uapi_headers
Contributor Author

Fix: s390x was using system libbpf headers, not the one built by us in the CI.

@@ -676,6 +676,7 @@ static __always_inline void auxmap__store_socktuple_param(struct auxiliary_map *
switch(socket_family) {
case AF_INET: {
struct inet_sock *inet = (struct inet_sock *)sk;
struct sockaddr_in usrsockaddr_in = {};
Contributor Author

Multiple similar fixes: do not keep a reference to an out-of-scope variable.

Member

Maybe I'm missing something, but why is this a fix? It seems to be used only inside the `if` below, to read some data from the kernel.

@@ -1556,6 +1558,13 @@ static __always_inline void apply_dynamic_snaplen(struct pt_regs *regs,
*/
unsigned long args[5] = {0};
struct sockaddr *sockaddr = NULL;
typedef union {
Contributor Author

Multiple similar improvements: don't reserve stack space for several variables when only one of them is needed at a time.
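
To illustrate why a union helps here, the following userspace sketch compares the storage needed for three separate address variables against a union of the same types. It is only an illustration under assumed names: `separate_addrs`, `any_addr`, and `addr_len_for` are hypothetical, and the real driver code works on kernel types rather than these libc headers. The motivation matches the verifier log quoted earlier, which reports a combined stack size of 576, above the kernel's 512-byte BPF stack limit.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Separate variables: the frame must reserve room for all three at once,
 * even though a given socket only ever uses one of them. */
struct separate_addrs {
	struct sockaddr_in v4;
	struct sockaddr_in6 v6;
	struct sockaddr_un un;
};

/* Union: the three views share the same bytes, so the frame only pays
 * for the largest member. */
typedef union {
	struct sockaddr_in v4;
	struct sockaddr_in6 v6;
	struct sockaddr_un un;
} any_addr;

/* Checked at compile time: the union is strictly smaller. */
static_assert(sizeof(any_addr) < sizeof(struct separate_addrs),
              "union should need less storage than separate variables");

/* Number of bytes of the union actually used for a given socket family. */
static size_t addr_len_for(int family) {
	switch(family) {
	case AF_INET:
		return sizeof(struct sockaddr_in);
	case AF_INET6:
		return sizeof(struct sockaddr_in6);
	case AF_UNIX:
		return sizeof(struct sockaddr_un);
	default:
		return 0;
	}
}
```

Because only one address family applies per socket, reading through the union member that matches the family gives the same result as a dedicated variable while using less stack.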

// in dynamic_snaplen_args.
// This also gives a small perf boost while using `bpf_loop` because we don't need
// to re-fetch first 3 syscall args at every iteration.
__builtin_memcpy(args, input_args->mm_args, 3 * sizeof(unsigned long));
Contributor Author

sendmmsg/recvmmsg improvement: since we are going to call apply_dynamic_snaplen for each bpf_loop iteration, just call extract__network_args once (in the sendmmsg/recvmmsg main program) and then reference it.
This also fixes a verifier issue on clang-14 about stack size too large.
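
The pattern described above can be sketched in plain userspace C: fetch the arguments once, store them in a context, and let every loop iteration read them from there. Everything in this sketch is an assumption made for illustration: `loop_ctx`, `fetch_args`, `handle_msg`, `run_loop`, and `process_call` are hypothetical names, and `run_loop` is only a stand-in that mimics the callback convention of `bpf_loop` (return 0 to continue, 1 to stop).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical per-call context, filled once before looping. */
struct loop_ctx {
	unsigned long mm_args[3]; /* first three syscall arguments, fetched once */
	unsigned int fetches;     /* how many times the arguments were fetched */
	unsigned int handled;     /* how many messages the callback processed */
};

/* Stand-in for the argument extraction done in the main program. */
static void fetch_args(struct loop_ctx *ctx, const unsigned long *regs) {
	memcpy(ctx->mm_args, regs, sizeof(ctx->mm_args));
	ctx->fetches++;
}

/* Callback shaped like a bpf_loop callback: 0 = continue, 1 = stop. */
static long handle_msg(unsigned int index, void *data) {
	struct loop_ctx *ctx = data;
	unsigned long args[3];
	/* Reuse the cached arguments instead of fetching them again. */
	memcpy(args, ctx->mm_args, sizeof(args));
	if(index >= args[2]) { /* args[2] plays the role of vlen */
		return 1;
	}
	ctx->handled++;
	return 0;
}

/* Userspace stand-in for bpf_loop(): calls cb up to n times. */
static void run_loop(unsigned int n, long (*cb)(unsigned int, void *), void *data) {
	for(unsigned int i = 0; i < n; i++) {
		if(cb(i, data) != 0) {
			return;
		}
	}
}

/* Fetch once, then iterate: the callback never re-reads the registers. */
static struct loop_ctx process_call(const unsigned long *regs, unsigned int max_iters) {
	assert(regs != NULL);
	struct loop_ctx ctx = {0};
	fetch_args(&ctx, regs);
	run_loop(max_iters, handle_msg, &ctx);
	return ctx;
}
```

However many messages are handled, `fetches` stays at 1, which is the saving the comment above refers to; the callback's own frame also stays small because it no longer performs the extraction itself.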

Contributor Author

FedeDP commented Feb 3, 2025

/cc @Molter73 @Andreagit97

Contributor

@Molter73 Molter73 left a comment

Hey @FedeDP! PR is looking really good! Just have a couple minor comments and questions.

driver/modern_bpf/helpers/store/auxmap_store_params.h (2 threads, outdated and resolved)
userspace/libpman/src/lifecycle.c (4 threads, outdated and resolved)
Use anonymous unions in modern bpf driver. Moreover, add some debug prints to `pman_prepare_progs_before_loading`,
and always disable all unused programs autoload.

Signed-off-by: Federico Di Pierro <[email protected]>

Co-authored-by: Mauro Ezequiel Moltrasio <[email protected]>
Contributor Author

FedeDP commented Feb 5, 2025

Hey @Molter73, I should've addressed everything! Let me know :) Thanks for the in-depth review btw!

Contributor

@Molter73 Molter73 left a comment

There's just one more tiny comment, but it is more of a readability thing, everything else looks good.

/lgtm

progs[idx].name,
progs[idx].feat);
pman_print_msg(FALCOSECURITY_LOG_SEV_DEBUG, (const char *)msg);
chosen_idx = idx;
Contributor

Could we just break out of the inner loop here and use idx directly in the next block? It would also mean we don't need should_disable because the final block in the loop will only be reached from the other branch of this if-else block.
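
One way to read this suggestion, sketched with hypothetical names (`cand`, `select_and_disable`, and the probe callbacks are not from the PR): the search loop stops at the first usable candidate, and its index is then used directly to decide which programs keep autoload enabled, so no separate flag is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical candidate entry. */
struct cand {
	const char *name; /* NULL marks an unused slot */
	int feat;         /* 0 = no requirement */
	bool autoload;    /* set by select_and_disable() */
};

/* Hypothetical stand-in for a kernel feature probe. */
typedef bool (*probe_fn)(int feat);

/* Walk candidates in priority order, stop at the first usable one, then
 * switch autoload off for every other candidate.
 * Returns the chosen index, or -1 if nothing is usable. */
static int select_and_disable(struct cand *c, size_t n, probe_fn probe) {
	assert(c != NULL && probe != NULL);
	size_t idx;
	for(idx = 0; idx < n && c[idx].name != NULL; idx++) {
		if(c[idx].feat == 0 || probe(c[idx].feat)) {
			break; /* idx now names the chosen program */
		}
	}
	bool found = idx < n && c[idx].name != NULL;
	for(size_t i = 0; i < n && c[i].name != NULL; i++) {
		c[i].autoload = found && i == idx;
	}
	return found ? (int)idx : -1;
}

/* Toy probes: a kernel without optional helpers, and one with all of them. */
static bool no_helpers(int feat) {
	(void)feat;
	return false;
}

static bool all_helpers(int feat) {
	(void)feat;
	return true;
}
```

The trade-off is a second short pass over the candidates in exchange for dropping the extra state carried through the first loop.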

Contributor

poiana commented Feb 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FedeDP, Molter73

Member

@Andreagit97 Andreagit97 left a comment

Great job, thanks!

bool should_disable = chosen_idx != -1;
if(!should_disable) {
if(progs[idx].feat > 0 &&
libbpf_probe_bpf_helper(BPF_PROG_TYPE_RAW_TRACEPOINT, progs[idx].feat, NULL) ==
Member

The programs we actually use in the modern eBPF driver are of type BPF_PROG_TYPE_TRACING, although it shouldn't change much.

Suggested change
- libbpf_probe_bpf_helper(BPF_PROG_TYPE_RAW_TRACEPOINT, progs[idx].feat, NULL) ==
+ libbpf_probe_bpf_helper(BPF_PROG_TYPE_TRACING, progs[idx].feat, NULL) ==
