Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-119786: added InternalDocs/generators.md #128524

Merged
merged 4 commits into from
Jan 16, 2025
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion InternalDocs/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ Runtime Objects

- [Code Objects](code_objects.md)

- [Generators (coming soon)](generators.md)
- [Generators](generators.md)

- [Frames](frames.md)

Expand Down
103 changes: 98 additions & 5 deletions InternalDocs/generators.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,101 @@

Generators and Coroutines
=========================

Generators
==========
----------

The implementation of generators in CPython consists of instances of `PyGenObject`
iritkatriel marked this conversation as resolved.
Show resolved Hide resolved
and bytecode instructions that operate on instances of this type.
iritkatriel marked this conversation as resolved.
Show resolved Hide resolved

A generator object is invoked in a [`frame`](frames.md), like a function.
iritkatriel marked this conversation as resolved.
Show resolved Hide resolved
The difference is that a function returns to the calling frame only once,
while a generator "returns" to the caller every time it emits a new item
with a
[`yield` expression](https://docs.python.org/dev/reference/expressions.html#yield-expressions).
This is implemented by the
[`YIELD_VALUE`](https://docs.python.org/dev/library/dis.html#opcode-YIELD_VALUE)
bytecode, which is similar to
[`RETURN_VALUE`](https://docs.python.org/dev/library/dis.html#opcode-RETURN_VALUE)
in the sense that it puts a value on the stack and returns execution to the
calling frame, but it also needs to perform additional work to leave the generator
frame in a state that allows it to be resumed. In particular, it updates the frame's
instruction pointer and stores the interpreter's exception state on the generator
object. When the generator is resumed, this exception state is copied back to the
interpreter state.

The `frame` of a generator is embedded in the generator object struct as a
[`_PyInterpreterFrame`](frames.md) (see `_PyGenObject_HEAD` in
[`pycore_genobject.h`](../Include/internal/pycore_genobject.h)).
This means that we can get the frame from the generator or the generator
from the frame (see `_PyGen_GetGeneratorFromFrame` in the same file).
Other fields of the generator struct include metadata (such as the name of
the generator function) and runtime state information (such as whether its
frame is executing, suspended, cleared, etc.).

Generator Object Creation and Destruction
-----------------------------------------

The bytecode of a generator function begins with a
[`RETURN_GENERATOR`](https://docs.python.org/dev/library/dis.html#opcode-RETURN_GENERATOR)
instruction, which creates a generator object, along with its embedded frame.
iritkatriel marked this conversation as resolved.
Show resolved Hide resolved
The generator's frame is initialized as a copy of the frame in which
`RETURN_GENERATOR` is executing, but its `owner` field is overwritten to indicate
that it is owned by a generator. Finally, `RETURN_GENERATOR` pushes the new generator
object to the stack and returns to the caller of the generator function. When the
generator is next resumed by [`gen_send_ex2()`](../Objects/genobject.c),
`_PyEval_EvalFrame()` is called to continue executing the generator function,
in the frame that is embedded in the generator object.
iritkatriel marked this conversation as resolved.
Show resolved Hide resolved

When a generator object is destructed in [`gen_dealloc`](../Objects/genobject.c),
iritkatriel marked this conversation as resolved.
Show resolved Hide resolved
its embedded `_PyInterpreterFrame` field may need to be preserved, if it is exposed
to Python as part of a [`PyFrameObject`](frames.md#frame-objects). This is detected
in [`_PyFrame_ClearExceptCode`](../Python/frame.c) by the fact that the interpreter
frame's `frame_obj` field is set, and the frame object it points to has refcount
greater than 1. If so, the `take_ownership()` function is called to create a new
copy of the interpreter frame and transfer ownership of it from the generator to
the frame object.

Iteration
---------

The [`FOR_ITER`](https://docs.python.org/dev/library/dis.html#opcode-FOR_ITER)
instruction calls `__next__` on the iterator which is on the top of the stack,
and pushes the result to the stack. It has [`specializations`](adaptive.md)
for a few common iterator types, including `FOR_ITER_GEN`, for iterating over
a generator.

iritkatriel marked this conversation as resolved.
Show resolved Hide resolved
Chained Generators
------------------

A `yield from` expression creates a generator that efficiently yields the
sequence created by another generator. This is implemented with the
[`SEND` instruction](https://docs.python.org/dev/library/dis.html#opcode-SEND),
which pushes the value of its arg to the stack of the generator's frame, sets
the exception state on this frame, and resumes execution of the chained generator.
On return from `SEND`, the value at the top of the stack is sent back up
the generator chain with a `YIELD_VALUE`. This sequence of `SEND` followed by
`YIELD_VALUE` is repeated in a loop, until a `StopIteration` exception is
raised to indicate that the generator has no more values to emit.

The [`CLEANUP_THROW`](https://docs.python.org/dev/library/dis.html#opcode-CLEANUP_THROW)
instruction is used to handle exceptions raised from the send-yield loop.
Exceptions of type `StopIteration` is handled, their `value` field hold the
value to be returned by the generator's `close()` function. Any other
exception is re-raised by `CLEANUP_THROW`.

Coroutines
----------

Coming soon.
Coroutines are generators that use the value returned from a `yield` expression,
i.e., the argument that was passed to the `.send()` call that resumed it after
it yielded. This makes it possible for data to flow in both directions: from
the generator to the caller via the argument of the `yield` expression, and
from the caller to the generator via the send argument to the `send()` call.
A `yield from` expression passes the `send` argument to the chained generator,
so this data flow works along the chain (see `gen_send_ex2()` in
[`genobject.c`](../Objects/genobject.c)).

<!--
- Generators, async functions, async generators, and ``yield from`` (next, send, throw, close; and await; and how this code breaks the interpreter abstraction)
-->
Recall that a generator's `__next__` function simply calls `self.send(None)`,
so all this works the same in generators and coroutines, but only coroutines
use the value of the argument to `send`.
Loading