-
Notifications
You must be signed in to change notification settings - Fork 0
High performance event‐based stackless coroutine
Date: 2023-10-1
Auther: Zhao Yun Shan
Project: Programming Language C++
|introduction| motivation | priorities| awaitable | event | task | memory | references | advantages |
This paper proposes an event-based mechanism for C++20 stackless coroutines, elaborates on the features of this mechanism, and explains the design idea of the event-based encapsulation.
In the field of modern programming and concurrency, coroutines are considered an extremely important feature. They effectively solve the synchronization issues present in traditional thread models. Coroutines allow programs to pause and resume execution and yield the CPU while waiting for I/O, thus improving system utilization. Moreover, coroutines make the logical of programs clearer, greatly simplifying the code and enhancing its readability and maintainability.
C++20 coroutines
library provides a powerful solution for high-performance coroutines; however, its widespread adoption has been limited due to the lack of user-friendly wrappers. To address this issue, the author introduces a set of event-basded coroutine wrappers. These wrappers decouple coroutines from other modules, allowing developers to easily transform existing libraries and programs into coroutine-based programs, thereby enhancing code readability and maintainability.
- Improve C++20 coroutines by providing a carefully designed wrappers that make coroutines easier to understand and use.
- Construct a unified event-based coroutine mechanism that allows code to switch between thread scope and coroutine scope through events. This enables developers to use coroutines with a consistent mechanism, simplifying program design and enhancing code readability.
C++20 coroutines offer several built-in functions for suspending execution: initial_suspend
, final_suspend
, return_value
, yield_value
, return_void
, yield_void
and so on. These functions are the points where the execution of the coroutine may be suspended, they are all member functions of promise_type
, initial_suspend
invoked when the coroutine_handle
is created and final_suspend
invoked before the coroutine_handle
destroyed. This may seem very flexible, but it causes significant confusion for users, leaving them unsure how to use these features.
For users, the await_ready
function suspends the coroutine if the data is not yet ready. It attempt to guide users to encapsulate I/O context and coroutines together. Many existing open-source libraries follow this approach, the coupling of I/O context and coroutines increases the difficulty of program design, and different coroutine libraries are not mutually compatible. Therefore, decoupling I/O context from coroutines is very important.
Coroutine have been described as "functions whose execution you can pause". This paper starts from this perspective, treating a coroutine as a function, which is the first principle. The author introduces a class named awaitable
, which is a template class and serves as the return value of a function, indicating that it is a coroutine. The awaitable
function has temporary variables and return values of different types, which allows users to pass parameters to the called awaitable
function. There is one important thing: when an awaitable function returns a value, it resumes the caller's coroutine_handle, and continues the caller's execution.
awaitable<int> sleep_for_seconds(int t)
{
co_await sleep_for(t);
co_return 0;
}
awaitable<void> do_something()
{
co_await sleep_for_seconds(5);
}
When the awaitable
function is invoked, co_await
operation creates an coroutine_handle
object associated with a awaitable
, and the initial_suspend
method is invoked. However, it is not necessary to suspend execution at these points, as no one wants a function to suspend at the entry or exit. The function needs to continue executing, which is why the initial_suspend
method returns suspend_never
. Similarly, when the function exits, the final_suspend
method should also return suspend_never
, ensuring that the caller's awaitable
function is resumed if the caller is available."
Therefore, to achieve the goal of "treating a coroutine as a function", all functions of the promise_type
return suspend_never
.
template<class T>
struct promise_type
{
std::suspend_never return_value(T&&);
std::suspend_never yield_value(T&&)=delete;
std::suspend_never return_value(const T&);
std::suspend_never yield_value(const T&)=delete;
std::suspend_never initial_suspend();
std::suspend_never final_suspend() noexcept;
};
co_yield
is another important feature, most commonly used in std::generator
. The member functions yield_value
, and yield_void
in promise_type
control the suspension of the coroutine_handle
. co_await
and co_yield
are applied in two different scenarios, co_yield
can be used for creating Abstract Syntax Trees (AST) and finite-state machine (FSM), although, no one has ever done this. Therefore, the use of co_yield
is disabled in awaitable
functions.
The await_ready
function within the awaitable
is also a point where execution might be suspended. Since the awaitable
is decoupled from any other module, there is no necessary to prepare data, yet the awaitable
function needs to execute the awaitable
function code. Therefore, it should return false here to allow the coroutine to execute.
template <class T, class... R>
class awaitable final : public tasknotify
{
public:
struct promise_value;
struct promise_void;
struct promise_tuple;
struct promise_type : std::conditional_t<(sizeof...(R) > 0), promise_tuple,
std::conditional_t<std::is_void_v<T>, promise_void,
promise_value>>
{
awaitable *m_awaitable = nullptr;
~promise_type()
{
if (m_awaitable)
{
m_awaitable->m_state = STATUS_RESUMED;
auto _awaitable = m_awaitable;
m_awaitable = nullptr;
_awaitable->resume();
}
}
};
awaitable() = default;
awaitable(std::coroutine_handle<promise_type> h)
bool done() { return m_callee ? m_callee.done() : true; }
auto await_resume();
void destroy();
void await_suspend(std::coroutine_handle<> caller);
bool await_ready() { return m_state == STATUS_RESUMED; }
void resume();
private:
Std::coroutine_handle<promise_type> m_callee = nullptr; // the coroutine_handle of this awaitable
std::coroutine_handle<> m_caller = nullptr; // the caller's coroutine_handle of awaitable
std::atomic_int m_state{STATUS_INIT};
};
When the callee's coroutine function finishes executing, the coroutine_handle
is destroyed, at which point the caller's coroutine_handle
is resumed, returning to the caller’s suspension point and continuing the execution of the caller's coroutine function.
Decoupling the awaitable
from other modules is a very wise choice, significantly reducing the complexity of understanding and designing awaitable functions. It also makes the code much clearer. Users can embed the awaitable
into any network library, swiftly transforming asynchronous execution into coroutine execution.
An awaitable function continuously executes code and, in fact, does not suspend. If an awaitable function needs to suspend, you need to use an event
and async
.
The event
is the smallest fundamental coroutine, it does not create a coroutine_handle
, and the async
is actually a list. They work together to implement the suspension and resumption of execution. Taking I/O read/write as an example, when waiting for I/O, invoke "wait_for(g_listener)", the event
is inserted into the async
listener, and the current awaitable
function is suspended, yielding the CPU and transition to thread scope.
struct event final : chain // chain is a simplified list
{
std::atomic_int m_status{STATUS_INIT};
std::coroutine_handle<> m_awaitable = nullptr;// the caller's coroutine_handle of awaitable
event(chain *_eventchain);
virtual ~event();
void await_resume();
bool await_ready();
void await_suspend(std::coroutine_handle<> awaitable);
void resume();
};
event wait_for(async &_this)
{
return event(&_this);
}
async g_listener;
awaitable<void> wait_for_something()
{
co_await wait_for(g_listener);
}
When the data is ready for read/write, invoke listener.notify()
, this will return to the point of "wait_for(g_listener)" and resume the awaitable
function to continue execution, transition to coroutine scope.
Obviously, the awaitable
function will keep executing until it encounters a call of "wait_for((async)listener)". at which point the awaitable
function is suspended. The awaitable
coroutine resumes execution upon calling the listener's notify()
function. This mechanism is very easy to understand and can be embed into existing code with minimal changes.
In a typical function call, the relationship is maintained on the stack. For awaitable functions, the call chain is maintained through a linked list of callers within the awaitable.
Sometimes, it is necessary to wait for multiple coroutines to complete simultaneously. The author introduce a named class task
that used simultaneously waiting for multiple awaitable
coroutines. It is a list that stores tasknotify
instances and hold awaitable
coroutines.
The wait_for_all function waits for multiple awaitable functions to complete. When all the awaited coroutines have finished, wait_for_all returns.
template <class... T>
awaitable<void> wait_for_all(T &&..._task)
{
task w;
(w.insert(&_task), ...);
while (!w.empty())
co_await wait_for(&w);
co_return 0;
}
awaiter co_wait_complated()
{
co_await wait_for_all(sleep_for(1), sleep_for(2));
}
wait_for_any
function waits for any of the awaitable
functions to complete. When one of the awaitable
functions finishes, wait_for_any
returns.
template <class... T>
awaitable<int> wait_for_any(T &&..._task)
{
task w;
(w.insert(&_task), ...);
co_await wait_for(&w);
w.destroy();
co_return 0;
}
awaitable<int> co_wait_incomplated()
{
co_await wait_for_any(sleep_for(1), sleep_for(2));
}
Each time an awaitable function is invoked, memory is dynamically allocated. The size of this memory is calculated at compile time. Temporary variables within awaitable functions increase the size of the coroutine_handle
and can be passed as parameters to the called function, which greatly facilitates user convenience. Moreover, since these temporary variables are allocated once, they reside at contiguous memory addresses.
awaitable<int> co_wait()
{
char word[128];
int array[100];
co_return 0;
}
All coroutines are functions, and their characteristics are similar to those of functions. They can pass parameters to other functions, make recursive calls, serve as member functions, and also call regular functions, ensuring design flexibility. When waiting something, using wait_for((async)listener) can wait for any event, time or anything else. When we receive the expected "things", we can manually trigger the resume to continue the program execution.
The author has provided a complete implementation, the source code is at https://github.com/dou1984/coev/tree/main/src/coev.