Support splice in http blind tunnel #11890

Open: YIHONG-JIN wants to merge 1 commit into base: master

@YIHONG-JIN YIHONG-JIN commented Dec 4, 2024

  1. Add TS_USE_LINUX_SPLICE as a compilation option.
  2. Make MIOBuffer and MIOBufferReader polymorphic classes by making their member functions virtual.
  3. Create PipeIOBuffer and PipeIOBufferReader as derived classes, encapsulating Linux pipe.
  4. Use dynamic_cast to enable logic switch in state machines and continuations.
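
Steps 2 to 4 can be pictured with a minimal sketch. The classes `Buffer` and `PipeBuffer` and the helper `can_splice` below are hypothetical stand-ins for illustration only; they are not the ATS classes or the code in this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class standing in for a memory-backed buffer.
class Buffer
{
public:
  virtual ~Buffer() = default;
  // Virtual so a derived buffer can describe its own backing store.
  virtual std::string kind() const { return "memory"; }
};

// Hypothetical derived class standing in for a pipe-backed buffer.
class PipeBuffer : public Buffer
{
public:
  std::string kind() const override { return "pipe"; }
};

// A caller that only holds a Buffer* uses dynamic_cast to decide
// whether the zero-copy path applies.
inline bool
can_splice(Buffer *buf)
{
  return dynamic_cast<PipeBuffer *>(buf) != nullptr;
}
```

The cast only works because the base class has at least one virtual member, which is why step 2 has to come before step 4.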

Documentation:
ATS_splice_runbook.md
ATS Performance Benchmark.pdf
ATS_splice_design_doc.pdf

@YIHONG-JIN YIHONG-JIN marked this pull request as ready for review December 4, 2024 22:11
@YIHONG-JIN YIHONG-JIN marked this pull request as draft December 4, 2024 22:18
@YIHONG-JIN
Author

Passes unit tests and manual integration tests on Debian 12. The performance benchmark shows that splice can improve maximum blind tunnel throughput from 300 MB/s to 575 MB/s and reduce latency by 40% for MB-level payloads on a C6in.large EC2 instance. Feel free to benchmark this CR.

@YIHONG-JIN YIHONG-JIN marked this pull request as ready for review December 20, 2024 02:07
Contributor

@moonchen moonchen left a comment

Thank you for the PR. Overall I think this is a clever idea to introduce zero-copy in a way that fits the existing architecture. I found a few small issues that I hope you can address before merging.

@@ -7274,6 +7274,32 @@ HttpSM::setup_blind_tunnel(bool send_response_hdr, IOBufferReader *initial)
// header buffer into new buffer
client_request_body_bytes += from_ua_buf->write(_ua.get_txn()->get_remote_reader());

#if TS_USE_LINUX_SPLICE
MIOBuffer *from_ua_pipe_buf = new_PipeIOBuffer(BUFFER_SIZE_INDEX_32K);
Contributor

BUFFER_SIZE_INDEX_32K is equal to 8. Is this the capacity of the pipe that we're requesting from the kernel?

Author

Yes, it is the same as the MIOBuffer size we use for the blind tunnel. The default Linux pipe size is 16 pages, so I am trying to save memory by requesting only 8 pages here. However, the system admin still needs to raise the pipe-user-pages-soft limit to avoid exceptions.
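
For readers unfamiliar with pipe sizing, here is a minimal sketch of how a pipe capacity can be requested on Linux with `fcntl(F_SETPIPE_SZ)`. The helper name `make_sized_pipe` is hypothetical and this is not the PR's `new_PipeIOBuffer` implementation; it assumes a Linux system where `pipe2` and `F_SETPIPE_SZ` are available.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: creates a non-blocking pipe and asks the kernel for
// `requested_bytes` of capacity. Returns the capacity the kernel reports,
// or -1 if the pipe could not be created or resized.
inline int
make_sized_pipe(int fds[2], int requested_bytes)
{
  if (pipe2(fds, O_NONBLOCK | O_CLOEXEC) != 0) {
    return -1;
  }
  // The kernel may round the request up; the resize can also be refused,
  // for example when a pipe size limit would be exceeded.
  if (fcntl(fds[1], F_SETPIPE_SZ, requested_bytes) < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  return fcntl(fds[1], F_GETPIPE_SZ);
}
```

With 4 KiB pages, a request of 32 KiB corresponds to the 8 pages mentioned above.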


if (r <= 0) {
// Temporary Unavailable, Non-Blocking I/O
if (r == -EAGAIN || r == -ENOTCONN) {
Contributor

Is it possible that we get EAGAIN here because the pipe is at capacity? How do we handle that case?

Author

A full pipe cannot actually cause EAGAIN here. We disable the read VIO after each successful read and only re-enable it once its corresponding pipe is empty again.

However, we can still get EAGAIN because the socket is temporarily unavailable. In that case, we wait for the next epoll edge trigger, the same as in the logic without zero copy.
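
The EAGAIN case discussed here can be sketched as follows. This is a hypothetical helper (`splice_socket_to_pipe` and `ReadResult` are names invented for illustration), not the code in `net_read_io`. It calls `splice(2)` directly and checks `errno`, whereas the PR compares against negated error codes such as `-EAGAIN`.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum class ReadResult { Moved, WouldBlock, Eof, Error };

// Hypothetical helper: moves up to `max_bytes` from a socket into the write
// end of a pipe without copying through user space.
inline ReadResult
splice_socket_to_pipe(int sock_fd, int pipe_wr_fd, size_t max_bytes, ssize_t &moved)
{
  moved = splice(sock_fd, nullptr, pipe_wr_fd, nullptr, max_bytes, SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
  if (moved > 0) {
    return ReadResult::Moved;
  }
  if (moved == 0) {
    return ReadResult::Eof; // peer closed the connection
  }
  if (errno == EAGAIN || errno == EWOULDBLOCK) {
    // Nothing to read right now: the caller waits for the next readiness event.
    return ReadResult::WouldBlock;
  }
  return ReadResult::Error;
}
```

In this sketch a `WouldBlock` result could in principle also come from a full pipe; the scheme described above avoids that by reading only while the pipe is empty.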


#if TS_USE_LINUX_SPLICE

class PipeIOBufferReader : public IOBufferReader
Contributor

Managing header dependency in ATS (specifically iocore) has been an ongoing challenge. One of the reasons is that we often combine multiple classes into a single header file. Please move the new classes to their own header and source files.

@@ -508,6 +508,81 @@ UnixNetVConnection::net_read_io(NetHandler *nh)
read_disable(nh, this);
return;
}
#if TS_USE_LINUX_SPLICE
Contributor

This function is getting quite long, with two independent code paths to read from socket to buffer. I would prefer using polymorphism to handle the different socket-to-buffer copies, but at least we should split this function up so that it's more readable.

Author

@YIHONG-JIN YIHONG-JIN Jan 23, 2025

I considered using polymorphism here, but it looks impossible without significant refactoring because this function exposes low-level I/O operations (we would need a new member function in both MIOBuffer and PipeIOBuffer). I will split this function up for now.
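
As a rough illustration of the "new member function" idea, the sketch below gives each buffer type its own way of filling itself from a file descriptor. All names (`Buffer`, `PipeBuffer`, `fill_from`, `read_into`) are hypothetical, and the real `MIOBuffer` interface is far larger, so this only shows the shape of the refactoring, not a drop-in design.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/types.h>
#include <unistd.h>

class Buffer
{
public:
  virtual ~Buffer() = default;
  // Copy path: read(2) into user-space memory.
  virtual ssize_t
  fill_from(int fd, size_t max_bytes)
  {
    std::string chunk(max_bytes, '\0');
    ssize_t     n = read(fd, &chunk[0], max_bytes);
    if (n > 0) {
      data_.append(chunk, 0, static_cast<size_t>(n));
    }
    return n;
  }
  const std::string &data() const { return data_; }

private:
  std::string data_;
};

class PipeBuffer : public Buffer
{
public:
  PipeBuffer()
  {
    if (pipe2(fds_, O_NONBLOCK) != 0) {
      fds_[0] = fds_[1] = -1;
    }
  }
  ~PipeBuffer() override
  {
    if (fds_[0] >= 0) close(fds_[0]);
    if (fds_[1] >= 0) close(fds_[1]);
  }
  PipeBuffer(const PipeBuffer &)            = delete;
  PipeBuffer &operator=(const PipeBuffer &) = delete;

  // Zero-copy path: splice(2) keeps the bytes inside the kernel.
  ssize_t
  fill_from(int fd, size_t max_bytes) override
  {
    return splice(fd, nullptr, fds_[1], nullptr, max_bytes, SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
  }
  int read_fd() const { return fds_[0]; }

private:
  int fds_[2] = {-1, -1};
};

// The caller no longer branches on the buffer type.
inline ssize_t
read_into(Buffer &buf, int fd)
{
  return buf.fill_from(fd, 4096);
}
```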

@@ -656,6 +656,15 @@ new_MIOBuffer_internal(const char *location, int64_t size_index)
TS_INLINE void
free_MIOBuffer(MIOBuffer *mio)
{
#if TS_USE_LINUX_SPLICE
// check if mio is PipeIOBuffer using dynamic_cast
PipeIOBuffer *pipe_mio = dynamic_cast<PipeIOBuffer *>(mio);
Contributor

Relying on a runtime type check with dynamic_cast to select the proper deallocation mechanism is generally considered an anti-pattern. Instead, ClassAllocator has a Destruct_on_free_ parameter that allows you to use a virtual destructor to clean up properly depending on the underlying type.
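
The general idea can be sketched in plain C++. This example does not use ATS's `ClassAllocator` or its `Destruct_on_free_` parameter; `Buffer`, `PipeBuffer`, and `free_buffer` are hypothetical names that only show how a virtual destructor removes the need for a type check at the point of deallocation.

```cpp
#include <cassert>
#include <unistd.h>

// Hypothetical base class with a virtual destructor.
class Buffer
{
public:
  virtual ~Buffer() = default;
};

// Hypothetical pipe-backed buffer that owns two file descriptors.
class PipeBuffer : public Buffer
{
public:
  PipeBuffer()
  {
    if (pipe(fds_) != 0) {
      fds_[0] = fds_[1] = -1;
    }
  }
  // Runs automatically when the object is destroyed through a Buffer*.
  ~PipeBuffer() override
  {
    if (fds_[0] >= 0) close(fds_[0]);
    if (fds_[1] >= 0) close(fds_[1]);
  }
  PipeBuffer(const PipeBuffer &)            = delete;
  PipeBuffer &operator=(const PipeBuffer &) = delete;

  int read_fd() const { return fds_[0]; }

private:
  int fds_[2] = {-1, -1};
};

// No dynamic_cast needed: the most-derived destructor is selected at runtime.
inline void
free_buffer(Buffer *buf)
{
  delete buf;
}
```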

@YIHONG-JIN
Author

Thanks for the comments, @moonchen. I will start resolving them.
