C++ Coroutines
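This article is more of a personal note: I want to jot down some details about coroutines so that I can recall them later. If you want to learn coroutines from scratch, I recommend this series instead; this article merely collects some key points from it.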
Awaiter
A type that supports the co_await operator is called an Awaitable type.
An Awaiter type is a type that implements the three special methods that are called as part of a co_await expression: await_ready, await_suspend and await_resume.
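When co_await expr is evaluated, the compiler first tries to obtain an awaitable from expr, and then obtains an awaiter from that awaitable. The code the compiler generates is roughly the following: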
// Step 1: expr -> awaitable. P is the promise type of the enclosing coroutine.
// The has_..._v traits are illustrative only; they are not standard library traits.
template<typename P, typename T>
decltype(auto) get_awaitable(P& promise, T&& expr)
{
    if constexpr (has_any_await_transform_member_v<P>)
        return promise.await_transform(static_cast<T&&>(expr));
    else
        return static_cast<T&&>(expr);
}

// Step 2: awaitable -> awaiter, preferring a member operator co_await,
// then a non-member one, and otherwise using the awaitable itself.
template<typename Awaitable>
decltype(auto) get_awaiter(Awaitable&& awaitable)
{
    if constexpr (has_member_operator_co_await_v<Awaitable>)
        return static_cast<Awaitable&&>(awaitable).operator co_await();
    else if constexpr (has_non_member_operator_co_await_v<Awaitable&&>)
        return operator co_await(static_cast<Awaitable&&>(awaitable));
    else
        return static_cast<Awaitable&&>(awaitable);
}
With these helpers in place, a co_await <expr> expression is translated into (roughly) the following:
{
    auto&& value = <expr>;
    auto&& awaitable = get_awaitable(promise, static_cast<decltype(value)>(value));
    auto&& awaiter = get_awaiter(static_cast<decltype(awaitable)>(awaitable));
    if (!awaiter.await_ready())
    {
        // Coroutines TS spelling; in C++20 this is std::coroutine_handle<P> from <coroutine>.
        using handle_t = std::experimental::coroutine_handle<P>;

        using await_suspend_result_t =
            decltype(awaiter.await_suspend(handle_t::from_promise(p)));

        <suspend-coroutine>

        if constexpr (std::is_void_v<await_suspend_result_t>)
        {
            awaiter.await_suspend(handle_t::from_promise(p));
            <return-to-caller-or-resumer>
        }
        else
        {
            // C++20 additionally allows await_suspend() to return a coroutine_handle.
            static_assert(
                std::is_same_v<await_suspend_result_t, bool>,
                "await_suspend() must return 'void' or 'bool'.");

            if (awaiter.await_suspend(handle_t::from_promise(p)))
            {
                <return-to-caller-or-resumer>
            }
        }

        <resume-point>
    }

    return awaiter.await_resume();
}
At the <suspend-coroutine> point the compiler generates some code to save the current state of the coroutine and prepare it for resumption. This includes storing the location of the <resume-point> as well as spilling any values currently held in registers into the coroutine frame memory.
At the <return-to-caller-or-resumer> point execution is transferred back to the caller or resumer, popping the local stack frame but keeping the coroutine frame alive.
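Once the coroutine has been suspended, the argument passed to await_suspend is a coroutine_handle. Calling coroutine_handle.resume() resumes the coroutine at the <resume-point>, and that call to .resume() returns when the coroutine next reaches a <return-to-caller-or-resumer> point.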
Promise
When you write a coroutine function that has a body, <body-statements>, which contains one of the coroutine keywords (co_return, co_await, co_yield) then the body of the coroutine is transformed to something (roughly) like the following:
{
    co_await promise.initial_suspend();
    try
    {
        <body-statements>
    }
    catch (...)
    {
        promise.unhandled_exception();
    }
FinalSuspend:
    co_await promise.final_suspend();
}
When a coroutine function is called there are a number of steps that are performed prior to executing the code in the source of the coroutine body that are a little different to regular functions.
Here is a summary of the steps:
- Allocate a coroutine frame using operator new (optional).
- Copy any function parameters to the coroutine frame.
- Call the constructor for the promise object of type, P.
- Call the promise.get_return_object() method to obtain the result to return to the caller when the coroutine first suspends. Save the result as a local variable.
- Call the promise.initial_suspend() method and co_await the result.
- When the co_await promise.initial_suspend() expression resumes (either immediately or asynchronously), then the coroutine starts executing the coroutine body statements that you wrote.
Some additional steps are executed when execution reaches a co_return statement:
- Call promise.return_void() or promise.return_value(<expr>)
- Destroy all variables with automatic storage duration in reverse order they were created.
- Call promise.final_suspend() and co_await the result.
If instead, execution leaves <body-statements> due to an unhandled exception then:
- Catch the exception and call promise.unhandled_exception() from within the catch-block.
- Call promise.final_suspend() and co_await the result.
The compiler is free to elide the call to operator new as an optimisation if:
- it is able to determine that the lifetime of the coroutine frame is strictly nested within the lifetime of the caller; and
- the compiler can see the size of coroutine frame required at the call-site.
Note that if execution runs off the end of a coroutine without a co_return statement then this is equivalent to having a co_return; at the end of the function body. In this case, if the promise_type does not have a return_void() method then the behaviour is undefined.
Note that while it is allowed to have a coroutine not suspend at the final_suspend point, it is recommended that you structure your coroutines so that they do suspend at final_suspend where possible. This is because this forces you to call .destroy() on the coroutine from outside of the coroutine (typically from some RAII object destructor) and this makes it much easier for the compiler to determine when the scope of the lifetime of the coroutine-frame is nested inside the caller. This in turn makes it much more likely that the compiler can elide the memory allocation of the coroutine frame.
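There are several benefits of starting the coroutine lazily (that is, having initial_suspend() always suspend, so that the body only starts running once the returned task is awaited):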
- It means that we can attach the continuation’s std::coroutine_handle before starting execution of the coroutine. This means we don’t need to use thread-synchronisation to arbitrate the race between attaching the continuation later and the coroutine running to completion.
- It means that the task destructor can unconditionally destroy the coroutine frame - we don’t need to worry about whether the coroutine is potentially executing on another thread since the coroutine will not start executing until we await it, and while it is executing the calling coroutine is suspended and so won’t attempt to call the task destructor until the coroutine finishes executing. This gives the compiler a much better chance at inlining the allocation of the coroutine frame into the frame of the caller. See P0981R0 to read more about the Heap Allocation eLision Optimisation (HALO).
- It also improves the exception-safety of your coroutine code. If you don’t immediately co_await the returned task and do something else that can throw an exception that causes the stack to unwind and the task destructor to run then we can safely destroy the coroutine since we know it hasn’t started yet. We aren’t left with the difficult choice between detaching, potentially leaving dangling references, blocking in the destructor, terminating or undefined-behaviour. This is something that I cover in a bit more detail in my CppCon 2019 talk on Structured Concurrency.