Doc. no:  D1134R0
Audience: LEWG
Date:     2018-06-21
Reply-To: Vinnie Falco (vinnie.falco@gmail.com)

An Elegant Coroutine Abstraction

Contents

Abstract

Coroutines enable authors to write code which may be suspended and resumed at well defined locations without inverting the flow of control. A preponderance of experience shows the need to add this as a language feature, as library-only solutions cannot achieve comparable performance and clarity of expression for some use cases. Unfortunately, the leading proposal for introducing coroutines to the language suffers from significant design flaws. In this paper we describe the ideal coroutine abstraction, expose the design flaws with Coroutines TS, and provide an alternative which implements the ideal abstraction.

Introduction

A coroutine is a thread of execution containing well-defined suspend points from which the flow of control may be paused and resumed. A stackful coroutine preserves the stack of the thread of execution starting from the entry point to the suspend point. A stackless coroutine preserves a constant amount of state known at compile time. Stackful coroutines are uncontroversial, many performant implementations have existed as library-only solutions for decades. This paper discusses stackless coroutines exclusively; henceforth they are referred to simply as "coroutines".

All coroutine implementations share a common algorithm, called the coroutine transformation (henceforth termed "transformation"). This algorithm takes ordinary code annotated with suspend points and inserts additional code allowing the thread of execution to be paused at each suspend point and later resumed. At a minimum, the transformation must store sufficient state to identify the last suspend point from which the coroutine should be resumed. An implementation may also choose to preserve the values of some or all local variables and function arguments. The following program shows an example of a coroutine:

#include "coroutine.hpp"

#include <iostream>

struct hello : coroutine
{
    void operator()()
    {
        co_enter(*this)
        {
            co_yield std::cout << "Hello, ";
            co_yield std::cout << "world!";
            co_yield std::cout.flush();
        }
    }
};

int main()
{
    hello c;

    while(! c.await_ready())
        c();
}

In this example, which prints "Hello, world!", the coroutine is represented by the function object hello. The coroutine starts out in the suspended state, and is resumed by invoking the function call operator. The co_enter keyword denotes the entry point into the coroutine body, while the co_yield keyword marks a suspension point. The thread of execution will suspend (return to the caller) after the statement or block following the co_yield keyword executes.

It may be surprising to discover that the coroutine interface used above is implemented entirely as a library solution, without help from the compiler. The source code for the library header file "coroutine.hpp" may be seen in Listing 1. The co_enter and co_yield keywords are defined as macros. Performing the macro substitutions in the function call operator definition above gives us insight into the structure of the transformation:

void operator()()
{
    switch (::detail::coroutine_ref _coro_value = *this)
    case -1: if (_coro_value)
    {
        goto terminate_coroutine;
        terminate_coroutine:
        _coro_value = -1;
        goto bail_out_of_coroutine;
        bail_out_of_coroutine:
        break;
    } else case 0: {
        for (_coro_value = (10);;) if (_coro_value == 0) {
    case (10): ; break; } else
        switch (_coro_value ? 0 : 1) for (;;)
        case -1: if (_coro_value) goto terminate_coroutine; else for (;;)
        case  1: if (_coro_value) goto bail_out_of_coroutine; else
        case  0:
    std::cout << "Hello, ";
        for (_coro_value = (11);;) if (_coro_value == 0) {
    case (11): ; break; } else
        switch (_coro_value ? 0 : 1) for (;;)
        case -1: if (_coro_value) goto terminate_coroutine; else for (;;)
        case  1: if (_coro_value) goto bail_out_of_coroutine; else
        case  0:Coroutines TS
    std::cout << "world!";
        for (_coro_value = (12);;) if (_coro_value == 0) {
    case (12): ; break; } else
        switch (_coro_value ? 0 : 1) for (;;)
        case -1: if (_coro_value) goto terminate_coroutine; else for (;;)
        case  1: if (_coro_value) goto bail_out_of_coroutine; else
        case  0:
    std::cout.flush();
    }
}

The code in Listing 1 is adapted from the stackless coroutine implementation from Boost.Asio[1]. There are some notable limitations. The values of local variables are not automatically preserved; they must be manually saved by using class data members instead. Furthermore, due to the inserted goto and switch statements, code blocks must be arranged to avoid jumps across variable initializations.

Despite these limitations, this simple macro-based approach has a number of strengths. The function is written without inverting the flow of control. The coroutine is a plain struct. No dynamic allocations are performed, and the user decides where and how the object is placed. A coroutine may be part of a larger object which is itself a coroutine and also contains other coroutines. This mechanism has been used in popular libraries such as Boost.Beast[2], making complex asynchronous operations easier to express and reason about. To be clear, this paper does not propose macro solutions to coroutines.

This particular implementation is presented to demonstrate that assistance from the compiler is not required to perform the coroutine transformation. It can be expressed using existing C++ language constructs. A transformation needs to implement a set of required features, otherwise the functionality of a coroutine cannot be achieved. A transformation may provide additional features besides the required ones, but it is not necessary to do so. Those features are as follows:

We use the term coroutine to also refer to the object produced by the transformation, and the term coroutine abstraction (henceforth termed "abstraction") to refer to the interface offered by a particular implementation for creating and interacting with the coroutine. While an abstraction may include additional features (such retaining local variables across suspend points), these features are not required as demonstrated above. An abstraction containing only the features above is termed a minimal abstraction

While a minimal abstraction is functional, it is not something which typical users would want to interact with directly. Most users want higher-level abstractions such as futures, generators, and iterators. There is also great interest in using coroutines with libraries that perform asynchronous networking or other I/O. These higher level abstractions may be part of a coroutine abstraction, or they can be provided as library components built on a minimal abstraction. We use the term coroutine middleware (or just "middleware") to describe additional language or library elements whose functionality goes beyond the minimal abstraction described above.

Coroutines TS

As of this writing, the leading proposal for introducing coroutines into the C++ language is the Coroutines TS described in n4723[3]. The proposal introduces several new keywords including co_await, co_yield, and co_resume. We abbreviate the name of Coroutines TS by referring it to just "TS" or "co-await." This section highlights some of the notable parts of the TS, by starting with an example program written against it:

#include "generator.hpp"
#include <experimental/coroutine>
#include <cstdio>

generator<const char>
hello(const char* p)
{
    while (*p) {
        co_yield *p++;
    }
}

int main()
{
    for (const auto c : hello("Hello, world")) {
        std::putchar(c);
    }
}

After compilation using the latest Visual C++ compiler and the command line switch "/await" to enable co-await features, the executable produces the following output when launched:

Hello, world

The appearance of the co_yield keyword identifies the function hello as a coroutine. Use of the co_yield keyword allows the coroutine to suspend and deliver an intermediate result to the caller. Since the type of the expression *p++ is char const&, it might not be immediately clear how this value interacts with the return type of the function to give the loop in main a value. The TS prescribes a rich set of rules for converting the yielded value into executable code. In this case, the compiler looks to the return type of the function and checks for the nested type promise_type. If found, a local variable p of the promise type is added to the generated coroutine code. The emitted code for the expression co_yield e then becomes p.yield_value(e). The generator class template is an example of coroutine middleware. The full header file is provided in Listing 2.

References

[1] https://www.boost.org/doc/libs/1_67_0/doc/html/boost_asio/overview/core/coroutine.html

[2] https://github.com/boostorg/beast/blob/93c35524a6125db3e6aeefc8abc826a810dd5d61/include/boost/beast/websocket/impl/read.ipp#L177

[2] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/n4723.pdf

Acknowledgements 1

We give thanks to the following authors for their previous work, which contributed to the content of this paper:

Listing 1

//
// Filename: "coroutine.hpp"
//

namespace detail {

struct coroutine_ref;

} // detail

struct coroutine
{
    coroutine() : line_(0) {}
    bool await_ready() const { return line_ == -1; }

private:
    friend struct detail::coroutine_ref;
    int line_;
};

namespace detail {

struct coroutine_ref
{
    coroutine_ref(coroutine& c) : line_(c.line_), modified_(false) {}
    ~coroutine_ref() { if (!modified_) line_ = -1; }
    operator int() const { return line_; }
    int& operator=(int v) { modified_ = true; return line_ = v; }

private:
    void operator=(coroutine_ref const&);

    int& line_;
    bool modified_;
};

} // detail

#define co_enter(c)                                  \
    switch (::detail::coroutine_ref _coro_value = c) \
    case -1: if (_coro_value)                        \
    {                                                \
        goto terminate_coroutine;                    \
        terminate_coroutine:                         \
        _coro_value = -1;                            \
        goto bail_out_of_coroutine;                  \
        bail_out_of_coroutine:                       \
        break;                                       \
    } else case 0:

#define __co_yield_impl(n)                                                \
        for (_coro_value = (n);;) if (_coro_value == 0) {                 \
    case (n): ; break; } else                                             \
        switch (_coro_value ? 0 : 1) for (;;)                             \
        case -1: if (_coro_value) goto terminate_coroutine; else for (;;) \
        case  1: if (_coro_value) goto bail_out_of_coroutine; else        \
        case  0:

#define co_yield _co_yield_impl(__LINE__)

Listing 2

///////////////////////////////////////////////////////////////////////////////
// Copyright (c) Lewis Baker
// Licenced under MIT license. See LICENSE.txt for details.
///////////////////////////////////////////////////////////////////////////////

/*
Original file located at
https://github.com/lewissbaker/cppcoro/blob/2492c0782911cdf7fe803642040efff3faa3c28a/include/cppcoro/generator.hpp

This is a derivative work
*/

#ifndef CPPCORO_GENERATOR_HPP_INCLUDED
#define CPPCORO_GENERATOR_HPP_INCLUDED

#include <experimental/coroutine>
#include <type_traits>
#include <utility>

template<typename T>
class generator;

namespace detail {

template<typename T>
class generator_promise
{
    pointer_type m_value;

public:
    using value_type = std::remove_reference_t<T>;
    using reference_type = std::conditional_t<std::is_reference_v<T>, T, T&>;
    using pointer_type = value_type*;

    generator_promise() = default;

    generator<T> get_return_object() noexcept;

    constexpr std::experimental::suspend_always initial_suspend() const { return {}; }
    constexpr std::experimental::suspend_always final_suspend() const { return {}; }

    template<
        typename U,
        typename = std::enable_if_t<std::is_same<U, T>::value>>
    std::experimental::suspend_always yield_value(U& value) noexcept
    {
        m_value = std::addressof(value);
        return {};
    }

    std::experimental::suspend_always yield_value(T&& value) noexcept
    {
        m_value = std::addressof(value);
        return {};
    }

    void unhandled_exception()
    {
        std::rethrow_exception(std::current_exception());
    }

    void return_void()
    {
    }

    reference_type value() const noexcept
    {
        return *m_value;
    }

    // Don't allow any use of 'co_await' inside the generator coroutine.
    template<typename U>
    std::experimental::suspend_never await_transform(U&& value) = delete;
};

template<typename T>
class generator_iterator
{
    using coroutine_handle = std::experimental::coroutine_handle<generator_promise<T>>;
    coroutine_handle m_coroutine;

public:
    using iterator_category = std::input_iterator_tag;
    using difference_type = std::size_t;
    using value_type = std::remove_reference_t<T>;
    using reference = value_type&;
    using pointer = value_type*;

    explicit generator_iterator(std::nullptr_t) noexcept
        : m_coroutine(nullptr)
    {}

    explicit generator_iterator(coroutine_handle coroutine) noexcept
        : m_coroutine(coroutine)
    {}

    bool operator==(const generator_iterator& other) const noexcept
    {
        return m_coroutine == other.m_coroutine;
    }

    bool operator!=(const generator_iterator& other) const noexcept
    {
        return !(*this == other);
    }

    generator_iterator& operator++()
    {
        m_coroutine.resume();
        if (m_coroutine.done())
        {
            m_coroutine = nullptr;
        }

        return *this;
    }

    // Don't support post-increment as that would require taking a
    // copy of the old value into the returned iterator as there
    // are no guarantees it's still going to be valid after the
    // increment is executed.
    generator_iterator operator++(int) = delete;

    reference operator*() const noexcept
    {
        return m_coroutine.promise().value();
    }

    pointer operator->() const noexcept
    {
        return std::addressof(operator*());
    }
};

} // detail

template<typename T>
class generator
{
public:

    using promise_type = detail::generator_promise<T>;
    using iterator = detail::generator_iterator<T>;

    generator() noexcept
        : m_coroutine(nullptr)
    {}

    generator(generator&& other) noexcept
        : m_coroutine(other.m_coroutine)
    {
        other.m_coroutine = nullptr;
    }

    generator(const generator& other) = delete;

    ~generator()
    {
        if (m_coroutine)
        {
            m_coroutine.destroy();
        }
    }

    generator& operator=(generator other) noexcept
    {
        swap(other);
        return *this;
    }

    iterator begin()
    {
        if (m_coroutine)
        {
            m_coroutine.resume();
            if (!m_coroutine.done())
            {
                return iterator{ m_coroutine };
            }
        }

        return iterator{ nullptr };
    }

    iterator end() noexcept
    {
        return iterator{ nullptr };
    }

    void swap(generator& other) noexcept
    {
        std::swap(m_coroutine, other.m_coroutine);
    }

private:

    friend class detail::generator_promise<T>;

    explicit generator(std::experimental::coroutine_handle<promise_type> coroutine) noexcept
        : m_coroutine(coroutine)
    {}

    std::experimental::coroutine_handle<promise_type> m_coroutine;

};

template<typename T>
void swap(generator<T>& a, generator<T>& b)
{
    a.swap(b);
}

namespace detail {

template<typename T>
generator<T> generator_promise<T>::get_return_object() noexcept
{
    using coroutine_handle = std::experimental::coroutine_handle<generator_promise<T>>;
    return generator<T>{ coroutine_handle::from_promise(*this) };
}

} // detail

#endif