update docs
yunwei37 committed Jan 26, 2024
1 parent 84c533f commit f59aac6
Showing 8 changed files with 336 additions and 45 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -84,7 +84,8 @@ Android:

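- [eBPF development practice: asynchronously sending information to the kernel with the user ring buffer](src/35-user-ringbuf/README.md)
- [Userspace eBPF runtimes: in-depth analysis and practical applications](src/36-userspace-ebpf/README.md)

- [With eBPF and BTF, userspace can also compile once and run everywhere](src/38-btf-uprobe/README.md)

Continuously updating...

## Why write this tutorial?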
1 change: 1 addition & 0 deletions README_en.md
@@ -72,6 +72,7 @@ Other:

- [Using user ring buffer to send information to the kernel](src/35-user-ringbuf/README.md)
- [Userspace eBPF Runtimes: Overview and Applications](src/36-userspace-ebpf/README.md)
- [Compile Once, Run Everywhere for userspace trace with eBPF and BTF](src/38-btf-uprobe/README.md)

Continuously updating...

4 changes: 4 additions & 0 deletions src/37-uprobe-rust/README.md
@@ -139,6 +139,10 @@ Attaching 1 probe...
Function hello-world called 6
```

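This may be because Rust does not have a stable ABI. Rust, as it has existed so far, reserves the right to order those struct members any way it wants. So the compiled version of the callee might order the members exactly as above, while the compiled version of the program calling into the library might assume they are actually laid out like this:

TODO: Further analysis (to be continued)

## References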

- <https://doc.rust-lang.org/rustc/symbol-mangling/index.html>
4 changes: 4 additions & 0 deletions src/37-uprobe-rust/README_en.md
@@ -139,6 +139,10 @@ Attaching 1 probe...
Function hello-world called 6
```

This may be because Rust does not have a stable ABI. Rust, as it has existed so far, reserves the right to order those struct members any way it wants. So the compiled version of the callee might order the members exactly as above, while the compiled version of the program calling into the library might assume they are actually laid out like this:

TODO: Further analysis (to be continued)

## References

- <https://doc.rust-lang.org/rustc/symbol-mangling/index.html>
97 changes: 54 additions & 43 deletions src/38-btf-uprobe/README.md
@@ -1,40 +1,45 @@
# Use BTF for userspace application CO-RE
# Compile Once, Run Everywhere for userspace with eBPF and BTF

eBPF, short for extended Berkeley Packet Filter, is a powerful and versatile technology used in modern Linux systems. It allows for the running of sandboxed programs in a virtual machine-like environment within the kernel, providing a safe way to extend the capabilities of the kernel without the risk of crashing the system or compromising security.
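In modern Linux systems, eBPF (extended Berkeley Packet Filter) is a powerful and flexible technology. It runs sandboxed programs inside the kernel, much like a virtual machine environment, and so offers a way to extend kernel functionality without crashing the system or introducing security risks.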

The term "co-re" in the context of eBPF stands for "Compile Once, Run Everywhere". This is a key feature of eBPF that addresses a major challenge: the compatibility of eBPF programs across different kernel versions.
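In eBPF, "CO-RE" stands for "Compile Once, Run Everywhere". It is one of eBPF's key features and addresses a major challenge: the compatibility of eBPF programs across kernel versions. With CO-RE, the same eBPF program can run on different kernel versions without recompilation.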

### The Challenge
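With eBPF's uprobe support, you can trace userspace applications and access their internal data structures. However, CO-RE practice for userspace applications is still immature. This article introduces a new approach that uses CO-RE to keep an eBPF program compatible across different versions of a userspace application, which avoids compiling it multiple times. For example, when capturing SSL/TLS plaintext from encrypted traffic, you may not need to maintain a separate eBPF program for every version of OpenSSL.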

- **Kernel Dependencies**: Traditional eBPF programs are tightly coupled with the specific Linux kernel version they are compiled for. This is because they rely on specific internal data structures and kernel APIs which can change between kernel versions.
- **Portability Issues**: If you wanted to run an eBPF program on different Linux systems with different kernel versions, you'd traditionally have to recompile the eBPF program for each kernel version, which is a cumbersome and inefficient process.
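To bring eBPF's "Compile Once, Run Everywhere" (Co-RE) property to userspace applications, we use the BPF Type Format (BTF) to overcome some limitations of traditional eBPF programs. The key is to give userspace programs the same kind of type information and compatibility support that the kernel has, so that eBPF programs can adapt more flexibly to different versions of userspace applications and libraries.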

### The Co-RE Solution
This article is part of the eBPF developer tutorial; more details are available [here](https://eunomia.dev/tutorials/). The source code is available in the [GitHub repository](https://github.com/eunomia-bpf/bpf-developer-tutorial).

- **Abstracting Kernel Dependencies**: Co-RE enables eBPF programs to be more portable by abstracting away specific kernel dependencies. This is achieved through the use of BPF Type Format (BTF) and relocations.
- **BPF Type Format (BTF)**: BTF provides rich type information about data structures and functions in the kernel. This metadata allows eBPF programs to understand the layout of kernel structures at runtime.
- **Relocations**: eBPF programs compiled with Co-RE support contain relocations that are resolved at load time. These relocations adjust the program's references to kernel data structures and functions according to the actual layout and addresses in the running kernel.
## Why do we need CO-RE?

### Advantages of Co-RE
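- **Kernel dependencies**: Traditional eBPF programs are tightly coupled to the specific Linux kernel version they were compiled for, because they depend on the kernel's internal data structures and APIs, which may change between kernel versions.
- **Portability issues**: To run an eBPF program on Linux systems with different kernel versions, you would usually have to recompile it for each kernel version, which is cumbersome and inefficient.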

1. **Write Once, Run Anywhere**: eBPF programs compiled with Co-RE can run on different kernel versions without the need for recompilation. This greatly simplifies the deployment and maintenance of eBPF programs in diverse environments.
2. **Safety and Stability**: Co-RE maintains the safety guarantees of eBPF, ensuring that programs do not crash the kernel and adhere to security constraints.
3. **Ease of Development**: Developers don't need to worry about the specifics of each kernel version, which simplifies the development of eBPF programs.
### The Co-RE solution

## Problem: userspace application CO-RE
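- **Abstracting kernel dependencies**: Co-RE makes eBPF programs more portable by using the BPF Type Format (BTF) and relocations to abstract away specific kernel dependencies.
- **BPF Type Format (BTF)**: BTF provides rich type information about the data structures and functions in the kernel. This metadata lets eBPF programs understand the layout of kernel structures at runtime.
- **Relocations**: eBPF programs compiled with Co-RE support contain relocations that are resolved at load time. These relocations adjust the program's references to kernel data structures and functions to match the actual layout and addresses of the running kernel.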
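
The load-time relocation step can be pictured with a toy model in plain C. This is only a sketch under stated assumptions: `field_info`, `field_reloc`, and `resolve_field_offset` are hypothetical names invented for illustration, not libbpf APIs, and real CO-RE matching handles many more cases (nested types, arrays, renamed or missing fields). The idea shown is simply that an access is recorded by field name, and the offset is looked up again in the target's type information when the program is loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for BTF: a field name and its byte offset in one struct. */
struct field_info {
    const char *name;
    size_t offset;
};

/* Toy stand-in for a CO-RE relocation record: which field an instruction
 * accesses, and the offset the compiler assumed at build time. */
struct field_reloc {
    const char *field_name;
    size_t compiled_offset;
};

/* Look the field up by name in the target's type information and return
 * the offset that should be patched into the instruction.
 * Returns -1 if the target type has no field with that name. */
static long resolve_field_offset(const struct field_reloc *reloc,
                                 const struct field_info *target,
                                 size_t target_len)
{
    for (size_t i = 0; i < target_len; i++) {
        if (strcmp(target[i].name, reloc->field_name) == 0)
            return (long)target[i].offset;
    }
    return -1;
}
```

For a target whose fields `a`, `b`, `c`, `d` sit at offsets 0, 4, 8, 12, a relocation for `c` that was compiled against offset 4 resolves to 8, so the patched access reads the intended field.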

eBPF also supports tracing userspace applications. A uprobe is a userspace probe that allows dynamic instrumentation of userspace programs. Probe locations include function entry, specific offsets, and function return.
### Advantages of Co-RE

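BTF is designed for the kernel and is generated from vmlinux; it helps eBPF programs stay compatible with different kernel versions.
1. **Write once, run anywhere**: eBPF programs compiled with Co-RE run on different kernel versions without recompilation, which greatly simplifies deploying and maintaining eBPF programs across diverse environments.
2. **Safety and stability**: Co-RE preserves eBPF's safety guarantees: programs cannot crash the kernel and must obey its security constraints.
3. **Ease of development**: Developers do not need to track the details of every kernel version, which simplifies eBPF development.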

Userspace applications, however, also need CO-RE. For example, SSL/TLS uprobes are widely used to capture plaintext data from encrypted traffic. They attach to userspace libraries such as OpenSSL, GnuTLS, and NSS. These applications and libraries also come in many versions, and compiling and maintaining an eBPF program for each version would be complex.
## The problem: CO-RE for userspace applications

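Here are some new tools and methods that help enable CO-RE for userspace applications.
eBPF also supports tracing userspace applications. A uprobe is a userspace probe that allows dynamic instrumentation of userspace programs. Probe locations include function entry, specific offsets, and function return.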

## No BTF for userspace program
BTF is designed for the kernel and is generated from vmlinux; it helps eBPF programs stay compatible with different kernel versions.

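This is a simple uprobe example that captures calls to the `add_test` function in the userspace program, along with its arguments. Adding `#define BPF_NO_PRESERVE_ACCESS_INDEX` in `uprobe.bpf.c` ensures the eBPF program is compiled without BTF for `struct data`.
Userspace applications, however, also need CO-RE. For example, SSL/TLS uprobes are widely used to capture plaintext data from encrypted traffic. They attach to userspace libraries such as OpenSSL, GnuTLS, and NSS. Userspace applications and libraries also come in many versions, and compiling and maintaining an eBPF program for each version would be complex.

Below are some new tools and methods that help enable CO-RE for userspace applications.

## BTF for userspace programs

This is a simple uprobe example that captures calls to the `add_test` function in the userspace program, along with its arguments. Adding `#define BPF_NO_PRESERVE_ACCESS_INDEX` in `uprobe.bpf.c` ensures the eBPF program can be compiled without BTF for `struct data`.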

```c
#define BPF_NO_GLOBAL_DATA
@@ -62,9 +67,9 @@ int BPF_UPROBE(add_test, struct data *d)
char LICENSE[] SEC("license") = "Dual BSD/GPL";
```
Then, we have two different versions of the userspace program, `examples/btf-base` and `examples/btf-base-new`. The struct `data` is different in the two versions.
Next, we have two different versions of the userspace program, `examples/btf-base` and `examples/btf-base-new`. The struct `data` differs between the two versions.
`examples/btf-base`:
`examples/btf-base`
```c
// use a different struct
@@ -85,7 +90,7 @@ int main(int argc, char **argv) {
}
```

`examples/btf-base-new`:
`examples/btf-base-new`

```c
struct data {
@@ -106,57 +111,57 @@ int main(int argc, char **argv) {
}
```
We can use pahole and clang to generate the BTF for each version. Build the examples and generate the BTF:
We can use pahole and clang to generate the BTF for each version. Build the examples and generate the BTF files:
```sh
make -C example # it's like: pahole --btf_encode_detached base.btf btf-base.o
```

Then we execute the eBPF program with the userspace program. For `btf-base`:
Then we run the eBPF program against the userspace program. For `btf-base`:

```sh
sudo ./uprobe examples/btf-base
```

Then run the userspace program:
Then run the userspace program as well:

```console
$ examples/btf-base
add_test(&d) = 4
```

We will see:
We will see:

```console
$ sudo cat /sys/kernel/debug/tracing/trace_pipe
<...>-25458 [000] ...11 27694.081465: bpf_trace_printk: add_test(&d) 1 + 3 = 4
```

For `btf-base-new`:
For `btf-base-new`:

```sh
sudo ./uprobe examples/btf-base-new
```

Then run the userspace program:
Then run the userspace program as well:

```console
$ examples/btf-base-new
add_test(&d) = 4
```

But we will see:
But this time we see:

```console
$ sudo cat /sys/kernel/debug/tracing/trace_pipe
<...>-25809 [001] ...11 27828.314224: bpf_trace_printk: add_test(&d) 1 + 2 = 3
```

The result is different because the struct `data` differs between the two versions. The eBPF program is not compatible across versions of the userspace program.
The result differs because the struct `data` is different in the two versions. The eBPF program cannot stay compatible with different versions of the userspace program.
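
The mismatch can be reproduced in plain userspace C, without eBPF. The sketch below is only an illustration: the two layouts (`data_v1`, `data_v2`) and the helper names are assumptions chosen to be consistent with the trace output shown here, since the actual example sources are elided in this diff. It reads fields at offsets fixed at compile time, which is roughly what a probe built with `BPF_NO_PRESERVE_ACCESS_INDEX` ends up doing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts, assumed only for illustration. */
struct data_v1 { int a; int c; int d; };          /* assumed older layout */
struct data_v2 { int a; int b; int c; int d; };   /* assumed newer layout */

/* Read an int at a fixed byte offset from a raw pointer, the way a probe
 * compiled against one layout reads memory of the traced process. */
static int read_int_at(const void *base, size_t offset)
{
    int value;
    memcpy(&value, (const char *)base + offset, sizeof(value));
    return value;
}

/* Compute "a + c" with offsets baked in from data_v1, whatever the real
 * layout of the object is. */
static int sum_with_v1_offsets(const void *obj)
{
    return read_int_at(obj, offsetof(struct data_v1, a)) +
           read_int_at(obj, offsetof(struct data_v1, c));
}

/* Compute "a + c" with the offsets of data_v2, the layout a relocated
 * program would use for the newer binary. */
static int sum_with_v2_offsets(const void *obj)
{
    return read_int_at(obj, offsetof(struct data_v2, a)) +
           read_int_at(obj, offsetof(struct data_v2, c));
}
```

With an object `{1, 3, 4}` of the assumed older layout, `sum_with_v1_offsets` yields 4. With `{1, 2, 3, 4}` of the assumed newer layout it yields 3, because the second slot now holds `b` rather than `c`, while `sum_with_v2_offsets` yields 4 again.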

## Use BTF for userspace program
## Using BTF for userspace programs

Comment out the `#define BPF_NO_PRESERVE_ACCESS_INDEX` in `uprobe.bpf.c` so that the eBPF program is compiled with BTF for `struct data`.
In `uprobe.bpf.c`, comment out `#define BPF_NO_PRESERVE_ACCESS_INDEX` to make sure the eBPF program is compiled with BTF for `struct data`.

```c
#define BPF_NO_GLOBAL_DATA
@@ -179,7 +184,6 @@ struct data {
#pragma clang attribute pop
#endif


SEC("uprobe/examples/btf-base:add_test")
int BPF_UPROBE(add_test, struct data *d)
{
@@ -193,15 +197,15 @@ int BPF_UPROBE(add_test, struct data *d)
char LICENSE[] SEC("license") = "Dual BSD/GPL";
```
The record of `struct data` is preserved in the eBPF program. Then, we can use the `btf-base.btf` to compile the eBPF program.
The record of `struct data` is preserved in the eBPF program. We can then use `btf-base.btf` to compile the eBPF program.
Merge user btf with kernel btf, so we have a complete btf for the kernel and userspace:
Merge the user BTF with the kernel BTF, so that we have one complete BTF covering both the kernel and userspace:
```sh
./merge-btf /sys/kernel/btf/vmlinux examples/base.btf target-base.btf
```

Then we execute the eBPF program with the userspace program. For `btf-base`:
Then we run the eBPF program against the userspace program. For `btf-base`:

```console
$ sudo ./uprobe examples/btf-base target-base.btf
@@ -213,16 +217,15 @@ libbpf: prog 'add_test': relo #2: patched insn #11 (ALU/ALU64) imm 4 -> 4
...
```

Execute the userspace program and get the result:
Run the userspace program and check the result:

```console
$ sudo cat /sys/kernel/debug/tracing/trace_pipe
[sudo] password for yunwei37:
<...>-26740 [001] ...11 28180.156220: bpf_trace_printk: add_test(&d) 1 + 3 = 4

```

We also do the same for the other version of the userspace program, `btf-base-new`:
The same steps can be applied to the other version of the userspace program, `btf-base-new`:

```console
$ ./merge-btf /sys/kernel/btf/vmlinux examples/base-new.btf target-base-new.btf
@@ -244,12 +247,20 @@ libbpf: elf: symbol address match for 'add_test' in 'examples/btf-base-new': 0x1
Successfully started! Press Ctrl+C to stop.
```

The result is correct:
The result is now correct:

```console
$ sudo cat /sys/kernel/debug/tracing/trace_pipe
[sudo] password for yunwei37:
<...>-26740 [001] ...11 28180.156220: bpf_trace_printk: add_test(&d) 1 + 3 = 4
```

## Conclusion
## Conclusion

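- **Flexibility and compatibility**: Using BTF for userspace eBPF programs greatly improves their flexibility and compatibility across different versions of userspace applications and libraries.
- **Reduced complexity**: This approach significantly reduces the complexity of maintaining eBPF programs for different versions of a userspace application, because it removes the need for multiple program versions.
- **Broader applications**: Although the motivating example is SSL/TLS monitoring, the approach applies more broadly to performance monitoring, security, and debugging of userspace applications.

This example demonstrates an important practical advance for eBPF, extending its capabilities to handle userspace applications in Linux environments more dynamically. For software engineers and system administrators dealing with the complexity of modern Linux systems, it is a compelling solution.

If you want to learn more about eBPF knowledge and practice, visit our tutorial repository at <https://github.com/eunomia-bpf/bpf-developer-tutorial> or our website at <https://eunomia.dev/tutorials/> for more examples and the complete tutorial.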