new(savefile): introduce scap-file converter skeleton #2168

Merged


Andreagit97
Member

What type of PR is this?

/kind feature

Any specific area of the project related to this PR?

/area libscap-engine-savefile

/area libscap

/area tests

Does this PR require a change in the driver versions?

No

What this PR does / why we need it:

This PR implements the skeleton for the scap-file converter. The idea is to have a declarative converter: we just fill in a table and the code does the rest for us. A complete example of how the converter could work is available in this branch: https://github.com/falcosecurity/libs/compare/master...Andreagit97:libs:remove_sys_enter_new?expand=1

I'm sharing some highlights directly extracted from that branch here.

The conversion table is keyed by {evt_type, num_params}; the value is the action to take plus, if needed, some instructions. The idea is to reconstruct the history of our event table from the very beginning and apply each conversion in turn until we reach the final event version that we have today in our event table.

static std::unordered_map<conversion_key, conversion_info> g_conversion_table = {
        ////////////////////////////
        // BRK
        ////////////////////////////
        // It is useless to convert it to `PPME_SYSCALL_BRK_4_E`: the parameters of the 2
        // events are not the same, so we would just add a parameter set to 0.
        {{PPME_SYSCALL_BRK_1_E, 1}, {.action = C_ACTION_SKIP}},
        {{PPME_SYSCALL_BRK_1_X, 1},
         {.action = C_ACTION_CHANGE_TYPE,
          .desired_type = PPME_SYSCALL_BRK_4_X,
          .instr = {{C_INSTR_FROM_OLD, 0},
                    {C_INSTR_FROM_DEFAULT, 1},
                    {C_INSTR_FROM_DEFAULT, 2},
                    {C_INSTR_FROM_DEFAULT, 3}}}},
        {{PPME_SYSCALL_BRK_4_E, 1}, {.action = C_ACTION_STORE}},
        {{PPME_SYSCALL_BRK_4_X, 4},
         {.action = C_ACTION_ADD_PARAMS, .instr = {{C_INSTR_FROM_ENTER, 0}}}},
};

// New event version introduced in the sys_enter/sys_exit work. The `addr` parameter is
// what we have today in the `PPME_SYSCALL_BRK_4_E` event.
[PPME_SYSCALL_BRK_4_X] = {"brk",
                          EC_MEMORY | EC_SYSCALL,
                          EF_TMP_CONVERTER_MANAGED,
                          5,
                          {{"res", PT_UINT64, PF_HEX},
                           {"vm_size", PT_UINT32, PF_DEC},
                           {"vm_rss", PT_UINT32, PF_DEC},
                           {"vm_swap", PT_UINT32, PF_DEC},
                           {"addr", PT_UINT64, PF_HEX}}},

Let's consider the BRK syscall case:

  • {PPME_SYSCALL_BRK_1_E, 1} -> carries information that we no longer use today, so we can skip it (C_ACTION_SKIP).
  • {PPME_SYSCALL_BRK_1_X, 1} -> we need to convert it to the next version, {PPME_SYSCALL_BRK_4_X, 4} (C_ACTION_CHANGE_TYPE). The number of parameters is derived from the number of instructions: for each parameter, an instruction says where its value comes from.
  • {PPME_SYSCALL_BRK_4_E, 1} -> we store it (C_ACTION_STORE) because we will need its parameter in the new PPME_SYSCALL_BRK_4_X with 5 parameters (the new event version we will introduce in the sys_enter/sys_exit work).
  • {PPME_SYSCALL_BRK_4_X, 4} -> we need to add a parameter (C_ACTION_ADD_PARAMS) because the new event version has 5 parameters; we take it from the previously stored enter event.
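
To make the four cases above concrete, here is a minimal, self-contained sketch of how a single table-driven conversion step could work. It is not the code from this PR: every type and name below (`event`, `convert_once`, `step_result`, the shortened enum values) is a hypothetical simplification, parameters are all modelled as `uint64_t`, and the enter-event storage is a single slot. The rule used to choose between "continue" and "completed" (look up the resulting key in the table again) is an assumption that happens to match the tests shown later.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins for the real libscap definitions.
enum evt_type : uint16_t { BRK_1_E, BRK_1_X, BRK_4_E, BRK_4_X };
enum action { ACTION_SKIP, ACTION_STORE, ACTION_ADD_PARAMS, ACTION_CHANGE_TYPE };
enum instr_source { FROM_OLD, FROM_DEFAULT, FROM_ENTER };
enum step_result { STEP_SKIP, STEP_CONTINUE, STEP_COMPLETED };

struct instr {
	instr_source source;
	uint8_t index;  // index of the parameter to read in the source event
};

struct conversion_key {
	evt_type type;
	uint8_t num_params;
	bool operator==(const conversion_key& o) const {
		return type == o.type && num_params == o.num_params;
	}
};

// std::unordered_map needs a hash for the custom key type.
struct conversion_key_hash {
	std::size_t operator()(const conversion_key& k) const {
		return (static_cast<std::size_t>(k.type) << 8) | k.num_params;
	}
};

struct conversion_info {
	action act;
	evt_type desired_type;  // only meaningful for ACTION_CHANGE_TYPE
	std::vector<instr> instrs;
};

// Assumption: every parameter is modelled as a uint64_t for brevity.
struct event {
	evt_type type;
	std::vector<uint64_t> params;
};

using conversion_table = std::unordered_map<conversion_key, conversion_info, conversion_key_hash>;

static const conversion_table g_table = {
        {{BRK_1_E, 1}, {ACTION_SKIP, BRK_1_E, {}}},
        {{BRK_1_X, 1},
         {ACTION_CHANGE_TYPE,
          BRK_4_X,
          {{FROM_OLD, 0}, {FROM_DEFAULT, 1}, {FROM_DEFAULT, 2}, {FROM_DEFAULT, 3}}}},
        {{BRK_4_E, 1}, {ACTION_STORE, BRK_4_E, {}}},
        {{BRK_4_X, 4}, {ACTION_ADD_PARAMS, BRK_4_X, {{FROM_ENTER, 0}}}},
};

// Resolve one instruction to a parameter value.
static uint64_t fetch_param(const instr& in,
                            const event& old_evt,
                            const std::optional<event>& enter) {
	switch(in.source) {
	case FROM_OLD:
		return old_evt.params.at(in.index);
	case FROM_ENTER:
		// Fall back to the default value when no enter event was stored.
		return enter ? enter->params.at(in.index) : 0;
	case FROM_DEFAULT:
	default:
		return 0;
	}
}

// Apply one conversion step. `in` and `out` must be distinct objects.
step_result convert_once(const event& in, event& out, std::optional<event>& stored_enter) {
	const conversion_key key{in.type, static_cast<uint8_t>(in.params.size())};
	const conversion_info& info = g_table.at(key);  // throws if no entry is registered
	if(info.act == ACTION_SKIP) {
		return STEP_SKIP;
	}
	if(info.act == ACTION_STORE) {
		stored_enter = in;
		return STEP_SKIP;
	}
	out.type = (info.act == ACTION_CHANGE_TYPE) ? info.desired_type : in.type;
	// ADD_PARAMS keeps the old parameters; CHANGE_TYPE rebuilds them from the instructions.
	out.params = (info.act == ACTION_ADD_PARAMS) ? in.params : std::vector<uint64_t>{};
	for(const instr& i : info.instrs) {
		out.params.push_back(fetch_param(i, in, stored_enter));
	}
	// Assumption: another step is needed if the result is itself a key of the table.
	const conversion_key next{out.type, static_cast<uint8_t>(out.params.size())};
	return g_table.count(next) ? STEP_CONTINUE : STEP_COMPLETED;
}
```

With this sketch, adding support for a new historical event version means adding a row to `g_table`; `convert_once` itself does not change, which is the point of the declarative design.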

The idea is to test each conversion with its own test. Here is an example for the BRK syscall:

////////////////////////////
// BRK
////////////////////////////

TEST_F(convert_event_test, PPME_SYSCALL_BRK_1_E_skip) {
	uint64_t ts = 12;
	int64_t tid = 25;
	uint32_t size = 0;

	// The brk enter event should be skipped.
	assert_single_conversion_skip(create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_1_E, 1, size));
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_1_X_to_PPME_SYSCALL_BRK_4_X) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t res = 178;

	// These will always be 0 because we are creating them with the default values
	uint32_t vm_size = 0;
	uint32_t vm_rss = 0;
	uint32_t vm_swap = 0;

	assert_single_conversion_success(conversion_result::CONVERSION_CONTINUE,
	                                 create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_1_X, 1, res),
	                                 create_safe_scap_event(ts,
	                                                        tid,
	                                                        PPME_SYSCALL_BRK_4_X,
	                                                        4,
	                                                        res,
	                                                        vm_size,
	                                                        vm_rss,
	                                                        vm_swap));
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_4_E_store) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t addr = 178;

	// we need to keep the memory alive until we check the storage presence
	auto evt = create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_4_E, 1, addr);
	assert_single_conversion_skip(evt);
	assert_event_storage_presence(evt);
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_4_X_to_5_params_no_enter) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t res = 178;
	uint32_t vm_size = 14;
	uint32_t vm_rss = 28;
	uint32_t vm_swap = 39;

	// Address is zero because in this scenario we don't retrieve the enter event
	uint64_t addr = 0;

	assert_single_conversion_success(
	        conversion_result::CONVERSION_COMPLETED,
	        create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_4_X, 4, res, vm_size, vm_rss, vm_swap),
	        create_safe_scap_event(ts,
	                               tid,
	                               PPME_SYSCALL_BRK_4_X,
	                               5,
	                               res,
	                               vm_size,
	                               vm_rss,
	                               vm_swap,
	                               addr));
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_4_X_to_5_params_with_enter) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t res = 178;
	uint32_t vm_size = 14;
	uint32_t vm_rss = 28;
	uint32_t vm_swap = 39;

	// We should retrieve the correct `addr` in the final event.
	uint64_t addr = 17;

	// After the first conversion we should have the storage
	auto evt = create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_4_E, 1, addr);
	assert_single_conversion_skip(evt);
	assert_event_storage_presence(evt);

	assert_single_conversion_success(
	        conversion_result::CONVERSION_COMPLETED,
	        create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_4_X, 4, res, vm_size, vm_rss, vm_swap),
	        create_safe_scap_event(ts,
	                               tid,
	                               PPME_SYSCALL_BRK_4_X,
	                               5,
	                               res,
	                               vm_size,
	                               vm_rss,
	                               vm_swap,
	                               addr));
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_1_X_full_conversion) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t res = 178;
	// They should all be 0 since they all default to 0
	uint32_t vm_size = 0;
	uint32_t vm_rss = 0;
	uint32_t vm_swap = 0;
	uint64_t addr = 0;

	assert_full_conversion(create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_1_X, 1, res),
	                       create_safe_scap_event(ts,
	                                              tid,
	                                              PPME_SYSCALL_BRK_4_X,
	                                              5,
	                                              res,
	                                              vm_size,
	                                              vm_rss,
	                                              vm_swap,
	                                              addr));
}
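
The last test exercises the whole chain: conversions are applied one after another until the event reaches its latest version. A stripped-down sketch of that loop is shown below. It is purely illustrative and deliberately simpler than the real converter: the names (`simple_event`, `step`, `g_steps`, `full_conversion`) are hypothetical, and each step only appends zero-valued default parameters, which is enough to reproduce the `PPME_SYSCALL_BRK_1_X` case where no enter event is available.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical event model: a type id plus its parameters.
struct simple_event {
	int type;
	std::vector<uint64_t> params;
};

// One conversion step: the target type and how many default (zero) parameters to append.
struct step {
	int target_type;
	int defaults_to_append;
};

constexpr int BRK_1_X = 1;
constexpr int BRK_4_X = 2;

// Key: {event type, number of parameters}, mirroring the table described in this PR.
static const std::map<std::pair<int, std::size_t>, step> g_steps = {
        {{BRK_1_X, 1}, {BRK_4_X, 3}},  // 1 parameter  -> 4 parameters
        {{BRK_4_X, 4}, {BRK_4_X, 1}},  // 4 parameters -> 5 parameters
};

// Apply steps until no table entry matches the current {type, num_params}.
// Returns the number of steps that were applied.
int full_conversion(simple_event& evt) {
	int applied = 0;
	auto it = g_steps.find({evt.type, evt.params.size()});
	while(it != g_steps.end()) {
		evt.type = it->second.target_type;
		for(int i = 0; i < it->second.defaults_to_append; i++) {
			evt.params.push_back(0);
		}
		applied++;
		it = g_steps.find({evt.type, evt.params.size()});
	}
	return applied;
}
```

Because every step in this table grows the parameter list, the loop always terminates; a real table would need the same property, namely that no chain of conversions leads back to a key it has already visited.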

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE


github-actions bot commented Nov 22, 2024

Please double check driver/SCHEMA_VERSION file. See versioning.

/hold


github-actions bot commented Nov 22, 2024

Perf diff from master - unit tests

     3.07%     -0.76%  [.] sinsp_thread_manager::get_thread_ref
     2.00%     -0.74%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     4.90%     +0.69%  [.] sinsp_evt::get_type
     3.43%     +0.60%  [.] gzfile_read
     1.20%     -0.58%  [.] std::vector<sinsp_evt_param, std::allocator<sinsp_evt_param> >::emplace_back<sinsp_evt*, unsigned int&, char const*, unsigned long&>
     0.26%     +0.53%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::find
     0.05%     +0.53%  [.] std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<long const, std::shared_ptr<sinsp_fdinfo> >, false> > >::_M_allocate_node<long&, std::unique_ptr<sinsp_fdinfo, std::default_delete<sinsp_fdinfo> > >
     4.10%     +0.51%  [.] sinsp_parser::process_event
     1.10%     +0.49%  [.] sinsp_evt::get_ts
     9.16%     -0.48%  [.] sinsp_parser::reset

Heap diff from master - unit tests

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Benchmarks diff from master

Comparing gbench_data.json to /root/actions-runner/_work/libs/libs/build/gbench_data.json
Benchmark                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------
BM_sinsp_split_mean                                            -0.0281         -0.0281           148           144           148           144
BM_sinsp_split_median                                          -0.0276         -0.0276           148           144           148           144
BM_sinsp_split_stddev                                          +0.1002         +0.1001             1             1             1             1
BM_sinsp_split_cv                                              +0.1321         +0.1319             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_mean                  -0.0045         -0.0045            57            57            57            57
BM_sinsp_concatenate_paths_relative_path_median                -0.0046         -0.0047            57            57            57            57
BM_sinsp_concatenate_paths_relative_path_stddev                +0.3178         +0.3204             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_cv                    +0.3237         +0.3263             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_mean                     +0.0376         +0.0376            24            25            24            25
BM_sinsp_concatenate_paths_empty_path_median                   +0.0391         +0.0391            24            25            24            25
BM_sinsp_concatenate_paths_empty_path_stddev                   -0.0032         -0.0030             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_cv                       -0.0393         -0.0391             0             0             0             0
BM_sinsp_concatenate_paths_absolute_path_mean                  +0.0028         +0.0028            56            56            56            56
BM_sinsp_concatenate_paths_absolute_path_median                +0.0017         +0.0017            56            56            56            56
BM_sinsp_concatenate_paths_absolute_path_stddev                +2.0994         +2.0982             0             1             0             1
BM_sinsp_concatenate_paths_absolute_path_cv                    +2.0906         +2.0894             0             0             0             0
BM_sinsp_split_container_image_mean                            -0.0025         -0.0025           383           383           383           383
BM_sinsp_split_container_image_median                          -0.0021         -0.0021           384           383           384           383
BM_sinsp_split_container_image_stddev                          +0.0645         +0.0632             2             2             2             2
BM_sinsp_split_container_image_cv                              +0.0672         +0.0659             0             0             0             0


codecov bot commented Nov 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.06%. Comparing base (512f9b7) to head (205a2fa).
Report is 17 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2168      +/-   ##
==========================================
+ Coverage   74.77%   75.06%   +0.28%     
==========================================
  Files         254      255       +1     
  Lines       33505    33552      +47     
  Branches     5747     5736      -11     
==========================================
+ Hits        25054    25186     +132     
+ Misses       8451     8366      -85     
Flag Coverage Δ
libsinsp 75.06% <ø> (+0.28%) ⬆️


@Andreagit97 Andreagit97 force-pushed the implement_scap_file_converter branch from fd56382 to 5984e0f Compare November 22, 2024 16:14
@Andreagit97 Andreagit97 force-pushed the implement_scap_file_converter branch from 562b48a to 1775efd Compare November 22, 2024 16:37
Contributor

@FedeDP FedeDP left a comment

Great job andre, i really like the declarative approach and the shiny new tests (also thanks for remembering to drop the print_scap_event logic from scap-open :D)
Left a bunch of comments!

Signed-off-by: Andrea Terzolo <[email protected]>
Co-authored-by: Federico Di Pierro <[email protected]>
@Andreagit97 Andreagit97 force-pushed the implement_scap_file_converter branch from d408cf7 to 205a2fa Compare November 27, 2024 11:42
Contributor

@FedeDP FedeDP left a comment

/approve

@poiana
Contributor

poiana commented Nov 27, 2024

LGTM label has been added.

Git tree hash: f815a4c99aea2a9926b6a69cf5a750768e2d1460

@poiana
Contributor

poiana commented Nov 27, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, FedeDP


@FedeDP
Contributor

FedeDP commented Nov 29, 2024

/unhold

No schema was touched.

@poiana poiana merged commit 5094053 into falcosecurity:master Nov 29, 2024
60 of 61 checks passed
4 participants