From 07844e01f32aab0f382f51a47a0460521b5ef454 Mon Sep 17 00:00:00 2001
From: Nathan Goldbaum
Date: Wed, 16 Oct 2024 09:38:51 -0600
Subject: [PATCH 1/3] update porting guide

---
 docs/porting.md | 166 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 148 insertions(+), 18 deletions(-)

diff --git a/docs/porting.md b/docs/porting.md
index 3addf96..5af16dc 100644
--- a/docs/porting.md
+++ b/docs/porting.md
@@ -1,10 +1,11 @@
-# Porting Extension Modules to Support Free-Threading
-
-Many Python extension modules are not thread-safe in the free-threaded build as
-of mid-2024. Up until now, the GIL has added implicit locking around any
-operation in Python or C that holds the GIL, and the GIL must be explicitly
-dropped before many thread-safety issues become problematic. Also, because of
-the GIL, attempting to parallelize many workflows using the Python
+# Porting Python Packages to Support Free-Threading
+
+Many Python packages, particularly packages relying on C extension modules, are
+not thread-safe in the free-threaded build as of mid-2024. Up until now, the GIL
+has added implicit locking around any operation in Python or C that holds the
+GIL, and the GIL must be explicitly dropped before many thread-safety issues
+become problematic. Also, because of the GIL, attempting to parallelize many
+workflows using the Python
 [threading](https://docs.python.org/3/library/threading.html) module will not
 produce any speedups, so thread-safety issues that are possible even with the
 GIL are not hit often since users do not make use of threading as much as other
@@ -12,9 +13,11 @@ parallelization strategies. This means many codebases have threading bugs that
 up-until-now have only been theoretical or present in niche use cases. With
 free-threading, many more users will want to use Python threads.
 
-This means we must analyze the codebases of extension modules to identify
-thread-safety issues and make changes to thread-unsafe low-level code,
-including C, C++, and Cython code exposed to Python.
+This means we must analyze Python codebases, particularly in low-level extension
+modules, to identify thread-safety issues and make changes to thread-unsafe
+low-level code, including C, C++, and Cython code exposed to Python.
+
+## Updating Extension Modules
 
 ### Declaring free-threaded support
 
@@ -189,6 +192,15 @@ repeatedly call a function simultaneously. You can also use the `threading`
 module directly. Adding a `threading.Barrier` before your test code is a good
 way to synchronize workers and encourage a race condition.
 
+You can also look at
+[pytest-run-parallel](https://github.com/Quansight-Labs/pytest-run-parallel) as
+well as
+[pytest-freethreaded](https://github.com/tonybaloney/pytest-freethreaded), which
+both offer pytest plugins to enable running tests in an existing PyTest test
+suite simultaneously in many threads. See the section below on [global state in
+tests](porting.md#dealing-with-global-state-in-tests) for more information
+about updating test suites to work with the free-threaded build.
+
 Many C and C++ extensions assume the GIL serializes access to state shared
 between threads, introducing the possibility of data races and race conditions
 that were impossible before when the GIL is disabled.
@@ -240,13 +252,21 @@ free-threaded build is used more. The goal for now should be to enable further
 refinement and experimentation by fixing issues that prevent using the library
 at all.
 
-### Locking and Synchronization Primitives
+## Locking and Synchronization Primitives
+
+### Native mutexes
 
 If your extension is written in C++, Rust, or another modern language that
 exposes locking primitives in the standard library, you should consider using
 the locking primitives provided by your language or framework to add locks when
 needed.
 
+If you need to call arbitrary Python code while the lock is held, care
+should be taken to avoid creating deadlocks with the GIL on the GIL-enabled
+build.
+
+### `PyMutex`
+
 For C code or C-like C++ code, the CPython 3.13 C API exposes
 [`PyMutex`](https://docs.python.org/3.13/c-api/init.html#c.PyMutex), a
 high-performance locking primitive that supports static allocation. As of
@@ -254,7 +274,50 @@ CPython 3.13, the mutex requires only one byte for storage, but future versions
 of CPython may change that, so you should not rely on the size of `PyMutex` in
 your code.
 
-## Global state
+You can use `PyMutex` in both the free-threaded and GIL-enabled build of Python
+3.13 or newer. `PyMutex` is hooked into the CPython runtime, so that if a thread
+tries to acquire the mutex and ends up blocked, garbage collection can still
+proceed and, in the GIL-enabled build, the blocked thread releases the GIL,
+allowing other threads to continue running. This implies that it is impossible
+to create a deadlock between a `PyMutex` and the GIL. For this reason, it is not
+necessary to add code for the GIL-enabled to ensure the GIL is released before
+acquiring a `PyMutex`. If you do not call into the CPython C API while holding
+the lock `PyMutex` has no special advantages over other mutexes, besides
+low-level details like performance or the size of the mutex object in memory.
+
+See the section on [dealing with thread-unsafe low-level
+libraries](porting.md#dealing-with-thread-unsafe-libraries) below for an example
+using PyMutex to lock around a thread-unsafe C library.
+
+### Critical Sections
+
+Python 3.13 or newer also a [critical section
+API](https://docs.python.org/3/c-api/init.html#python-critical-section-api) that
+is useful for locking either a single object or a pair of objects during a
+low-level operation. The critical section API is intended to provide weaker, but
+still useful locking guarantees compared to directly locking access to object
+using a mutex. This provides similar guarantees to the GIL and avoids
+the risk of deadlocks introduced by locking individual objects.
+
+The main difference compared with using a per-object lock is that active
+critical sections are suspended if a thread calls `PyEval_SaveThread` (e.g. when
+the GIL is released on the GIL-enabled build), and then restored when the thread
+calls `PyEval_RestoreThread` (e.g. when the GIL is re-acquired on the
+GIL-enabled build). This means that while the critical sections are suspended,
+it's possible for any thread to re-acquire a thread state and mutate the locked
+object. This can also happen with the GIL, since the GIL is a re-entrant lock,
+and extensions are allowed to recursively release and acquire it in an
+interleaved manner.
+
+Critical sections are most useful when implementing the low-level internals of a
+custom object that you fully control. You can apply critical sections around
+modification of internal state to effectively serialize access to that state.
+
+See the section below on [dealing with thread-unsafe
+objects](porting.md#dealing-with-thread-unsafe-objects) for an example using
+the critical section API.
+
+## Dealing with global state
 
 Many CPython C extensions make strong assumptions about the GIL. For example,
 before NumPy 2.1.0, the C code in NumPy made extensive use of C static global
@@ -264,11 +327,11 @@ it is not possible for two C threads to simultaneously see the state of the C
 global variables, so no data races are possible.
 
 In free-threaded Python, global state like this is no longer safe against data
-races and undefined behavior in C code. A cache of `PyObject`s stored
-in a C global pointer array can be overwritten simultaneously by multiple
-Python threads, leading to memory corruption and segfaults.
+races and undefined behavior in C code. A cache of `PyObject` pointers stored in
+a C global array can be overwritten simultaneously by multiple Python threads,
+leading to memory corruption and segfaults.
 
-### Converting to thread local state
+### Converting global state to thread local state
 
 Often the easiest way to fix data races due to global state is to convert the
 global state to thread local state.
@@ -406,7 +469,74 @@ int function_accessing_the_cache(void) {
 [this NumPy PR](https://github.com/numpy/numpy/pull/27011) that ports all
 `PyThread_type_lock` usages to `PyMutex`.
 
-### Dealing with thread-unsafe libraries
+## Fixing thread-unsafe tests.
+
+Many existing tests are written using global state. This is not a problem if the
+test only runs once, but if you would like to use your tests to check for
+possible thread safety issues by running existing tests on many threads, you
+will likely need to update the tests to eliminate use of global state.
+
+Since tests using global state are inherently racey, this means that test
+failures associated with these tests are also inherently flakey. If you see
+tests failing intermittently, you should not discount that you are using global
+state in a test, or even inadvertantly using global state in PyTest itself.
+
+#### PyTest is not thread-safe
+
+See [the PyTest
+docs](https://docs.pytest.org/en/stable/explanation/flaky.html#thread-safety)
+for more information about this. While tests can manage their own threads, you
+should not assume that functionality provided by PyTest is thread-safe.
+
+Functionality that is known not to be thread-safe includes:
+
+* [`pytest.warns`](https://docs.pytest.org/en/stable/reference/reference.html#pytest.warns),
+  it relies on `warnings.catch_warnings`, which is not thread safe.
+* The [`tmp_path`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmp_path) and [`tmpdir`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmpdir) fixtures, since they rely on the filesystem
+* The [`capsys`
+  fixture](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-capsys),
+  because of shared use of `sys.stdout` and `sys.stderr`.
+* The [`monkeypatch`
+  fixture](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-monkeypatch).
+
+Note that the PyTest maintainers have explicitly ruled out making PyTest
+thread-safe, please do not open issues asking to fix thread safety issues in
+PyTest itself.
+
+#### The `warnings` module is not thread safe
+
+Many tests carefully ensure that warnings will be seen by the user in cases
+where the library author intends users to see them. These tests inevintably make
+use of the [`warnings`
+module](https://docs.python.org/3/library/warnings.html). As noted in [the
+documentation for
+`warnings.catch_warnings`](https://docs.python.org/3/library/warnings.html#available-context-managers),
+the functionality provided by Python to track warnings is inherently
+thread-unsafe. This means tests that check for warnings should be marked as
+thread-unsafe and should be skipped when running tests on many threads
+simultaneously, since they will randomly pass or fail depending on thread
+timing.
+
+Hopefully in the future it will be possible for Python to write a scalable
+infrastructure for tracking warnings to fix this issue once and for all. See [the
+CPython issue](https://github.com/python/cpython/issues/91505) tracking this
+problem for more information.
+
+#### File system thread safety
+
+Many tests make use of the file system, either via a temporary file, or by
+simply directly writing to the folder running the test. If the filename used by
+the test is a constant or it is ever shared between instances of the test, the
+filesystem becomes shared global state, and the test will not be thread-safe.
+
+The easiest way to fix this is to use
+[`tempfile`](https://docs.python.org/3/library/tempfile.html), which
+automatically handles generating file handles in a thread-safe manner. If for
+some reason this isn't practical, consider forcing the filenames used in tests
+to be unique, for example by appending a
+[UUID](https://docs.python.org/3/library/uuid.html) to the filename.
+
+## Dealing with thread-unsafe libraries
 
 Many C, C++, and Fortran libraries are not written in a thread-safe manner.
 It is still possible to call these libraries from free-threaded Python, but
@@ -462,7 +592,7 @@ int call_library_function(int *argument) {
 Any other wrapped function needs similar locking around each call into the
 library.
 
-### Dealing with thread-unsafe objects
+## Dealing with thread-unsafe objects
 
 Similar to the section above, objects may need locking or atomics if they can
 be concurrently modified from multiple threads. CPython 3.13

From ae17365cd6a5ac6bd3df145216cb898fae538d8c Mon Sep 17 00:00:00 2001
From: Nathan Goldbaum
Date: Fri, 18 Oct 2024 13:52:46 -0600
Subject: [PATCH 2/3] respond to review comments

---
 docs/porting.md | 60 +++++++++++++++++++++++++------------------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/docs/porting.md b/docs/porting.md
index 5af16dc..17dc8c2 100644
--- a/docs/porting.md
+++ b/docs/porting.md
@@ -3,18 +3,18 @@
 Many Python packages, particularly packages relying on C extension modules, are
 not thread-safe in the free-threaded build as of mid-2024. Up until now, the GIL
 has added implicit locking around any operation in Python or C that holds the
-GIL, and the GIL must be explicitly dropped before many thread-safety issues
+GIL, and the GIL must be explicitly dropped before many thread safety issues
 become problematic. Also, because of the GIL, attempting to parallelize many
 workflows using the Python
 [threading](https://docs.python.org/3/library/threading.html) module will not
-produce any speedups, so thread-safety issues that are possible even with the
+produce any speedups, so thread safety issues that are possible even with the
 GIL are not hit often since users do not make use of threading as much as other
 parallelization strategies. This means many codebases have threading bugs that
 up-until-now have only been theoretical or present in niche use cases. With
 free-threading, many more users will want to use Python threads.
 
 This means we must analyze Python codebases, particularly in low-level extension
-modules, to identify thread-safety issues and make changes to thread-unsafe
+modules, to identify thread safety issues and make changes to thread-unsafe
 low-level code, including C, C++, and Cython code exposed to Python.
 
 ## Updating Extension Modules
@@ -159,7 +159,7 @@ after importing a module that does not support the GIL.
 === "f2py"
     Starting with NumPy 2.1.0 (only available via the nightly wheels or the
     `main` branch as of right now), extension modules containing f2py-wrapped
-    Fortran code can declare they are thread safe and support free-threading
+    Fortran code can declare they are thread-safe and support free-threading
     using the
     [`--freethreading-compatible`](https://numpy.org/devdocs/f2py/usage.html#extension-module-construction)
     command-line argument:
@@ -176,7 +176,7 @@ to also add support for the free-threaded build.
 
 ## Suggested Plan of Attack
 
-Put priority on thread-safety issues surfaced by real-world testing. Run the
+Put priority on thread safety issues surfaced by real-world testing. Run the
 test suite for your project and fix any failures that occur only with the GIL
 disabled. Some issues may be due to changes in Python 3.13 that are not
 specific to the free-threaded build.
@@ -196,7 +196,7 @@ You can also look at
 [pytest-run-parallel](https://github.com/Quansight-Labs/pytest-run-parallel) as
 well as
 [pytest-freethreaded](https://github.com/tonybaloney/pytest-freethreaded), which
-both offer pytest plugins to enable running tests in an existing PyTest test
+both offer pytest plugins to enable running tests in an existing `pytest` test
 suite simultaneously in many threads. See the section below on [global state in
 tests](porting.md#dealing-with-global-state-in-tests) for more information
 about updating test suites to work with the free-threaded build.
@@ -280,9 +280,9 @@ tries to acquire the mutex and ends up blocked, garbage collection can still
 proceed and, in the GIL-enabled build, the blocked thread releases the GIL,
 allowing other threads to continue running. This implies that it is impossible
 to create a deadlock between a `PyMutex` and the GIL. For this reason, it is not
-necessary to add code for the GIL-enabled to ensure the GIL is released before
-acquiring a `PyMutex`. If you do not call into the CPython C API while holding
-the lock `PyMutex` has no special advantages over other mutexes, besides
+necessary to add code for the GIL-enabled build to ensure the GIL is released
+before acquiring a `PyMutex`. If you do not call into the CPython C API while
+holding the lock, `PyMutex` has no special advantages over other mutexes, besides
 low-level details like performance or the size of the mutex object in memory.
 
 See the section on [dealing with thread-unsafe low-level
@@ -291,11 +291,11 @@ using PyMutex to lock around a thread-unsafe C library.
 
 ### Critical Sections
 
-Python 3.13 or newer also a [critical section
+Python 3.13 or newer also offers a [critical section
 API](https://docs.python.org/3/c-api/init.html#python-critical-section-api) that
 is useful for locking either a single object or a pair of objects during a
 low-level operation. The critical section API is intended to provide weaker, but
-still useful locking guarantees compared to directly locking access to object
+still useful locking guarantees compared to directly locking access to an object
 using a mutex. This provides similar guarantees to the GIL and avoids
 the risk of deadlocks introduced by locking individual objects.
 
@@ -479,31 +479,33 @@ will likely need to update the tests to eliminate use of global state.
 Since tests using global state are inherently racey, this means that test
 failures associated with these tests are also inherently flakey. If you see
 tests failing intermittently, you should not discount that you are using global
-state in a test, or even inadvertantly using global state in PyTest itself.
+state in a test, or even inadvertently using global state in `pytest` itself.
 
-#### PyTest is not thread-safe
+#### `pytest` is not thread-safe
 
-See [the PyTest
+See [the `pytest`
 docs](https://docs.pytest.org/en/stable/explanation/flaky.html#thread-safety)
 for more information about this. While tests can manage their own threads, you
-should not assume that functionality provided by PyTest is thread-safe.
+should not assume that functionality provided by `pytest` is thread-safe.
 
 Functionality that is known not to be thread-safe includes:
 
 * [`pytest.warns`](https://docs.pytest.org/en/stable/reference/reference.html#pytest.warns),
-  it relies on `warnings.catch_warnings`, which is not thread safe.
-* The [`tmp_path`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmp_path) and [`tmpdir`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmpdir) fixtures, since they rely on the filesystem
+  it relies on `warnings.catch_warnings`, which is not thread-safe.
+* The [`tmp_path`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmp_path)
+  and [`tmpdir`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmpdir)
+  fixtures, since they rely on the filesystem
 * The [`capsys`
   fixture](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-capsys),
   because of shared use of `sys.stdout` and `sys.stderr`.
 * The [`monkeypatch`
   fixture](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-monkeypatch).
 
-Note that the PyTest maintainers have explicitly ruled out making PyTest
+Note that the `pytest` maintainers have explicitly ruled out making `pytest`
 thread-safe, please do not open issues asking to fix thread safety issues in
-PyTest itself.
+`pytest` itself.
 
-#### The `warnings` module is not thread safe
+#### The `warnings` module is not thread-safe
 
 Many tests carefully ensure that warnings will be seen by the user in cases
 where the library author intends users to see them. These tests inevintably make
@@ -623,13 +625,13 @@ A variant for locking two objects at once is also available. For more informatio
 about `Py_BEGIN_CRITICAL_SECTION`, please see the
 [Python C API documentation on critical sections](https://docs.python.org/3.13/c-api/init.html#python-critical-section-api).
 
-## Cython thread-safety
+## Cython thread safety
 
 If your extension is written in Cython, you can generally assume that
 "Python-level" code that compiles to CPython C API operations on Python objects
-is thread safe, but "C-level" code (e.g. code that will compile inside a
-`with nogil` block) may have thread-safety issues. Note that not all code outside
-`with nogil` blocks is thread safe. For example, a Python wrapper for a
+is thread-safe, but "C-level" code (e.g. code that will compile inside a
+`with nogil` block) may have thread safety issues. Note that not all code outside
+`with nogil` blocks is thread-safe. For example, a Python wrapper for a
 thread-unsafe C library is thread-unsafe if the GIL is disabled unless there is
 locking around uses of the thread-unsafe library. Another example: using
 thread-unsafe C-level constructs like a global variable is also thread-unsafe
@@ -640,19 +642,19 @@ if the GIL is disabled.
 In the free-threaded build it is possible for the reference count of an object
 to change "underneath" a running thread when it is mutated by another
 thread. This means that many APIs that assume reference counts cannot be
-updated by another thread while it is running are no longer thread safe. In
+updated by another thread while it is running are no longer thread-safe. In
 particular, C code returning "borrowed" references to Python objects in mutable
-containers like lists may introduce thread-safety issues. A borrowed reference
+containers like lists may introduce thread safety issues. A borrowed reference
 happens when a C API function does not increment the reference count of a Python
 object before returning the object to the caller. "New" references are safe to
 use until the owning thread releases the reference, as in non free-threaded
 code.
 
-Most direct uses of the CPython C API are thread safe. There is no need to add
+Most direct uses of the CPython C API are thread-safe. There is no need to add
 locking for scenarios that should be bugs in CPython. You can assume, for
 example, that the initializer for a Python object can only be called by one
 thread and the C-level implementation of a Python function can only be called on
-one thread. Accessing the arguments of a Python function is thread safe no
+one thread. Accessing the arguments of a Python function is thread-safe no
 matter what C API constructs are used and no matter whether the reference is
 borrowed or owned because two threads can't simultaneously call the same
 function with the same arguments from the same Python-level context. Of course
@@ -672,7 +674,7 @@ to the item being de-allocated while the borrowed reference is still
 PyObject *item = Py_NewRef(PyList_GetItem(list_object, 0))
 ```
 
-Is not thread safe, because in principle it's possible for the list item to be
+Is not thread-safe, because in principle it's possible for the list item to be
 de-allocated before `Py_NewRef` gets a chance to increment the reference count.
 
 For that reason, you should inspect Python C API code to look for patterns

From e38fad4b5882d0bd83180ebc7db51d22750e6f69 Mon Sep 17 00:00:00 2001
From: "pre-commit-ci-lite[bot]" <117423508+pre-commit-ci-lite[bot]@users.noreply.github.com>
Date: Fri, 18 Oct 2024 19:53:44 +0000
Subject: [PATCH 3/3] [pre-commit.ci lite] apply automatic fixes

---
 docs/porting.md | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/porting.md b/docs/porting.md
index 17dc8c2..1124354 100644
--- a/docs/porting.md
+++ b/docs/porting.md
@@ -490,16 +490,16 @@ should not assume that functionality provided by `pytest` is thread-safe.
 
 Functionality that is known not to be thread-safe includes:
 
-* [`pytest.warns`](https://docs.pytest.org/en/stable/reference/reference.html#pytest.warns),
-  it relies on `warnings.catch_warnings`, which is not thread-safe.
-* The [`tmp_path`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmp_path)
-  and [`tmpdir`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmpdir)
-  fixtures, since they rely on the filesystem
-* The [`capsys`
-  fixture](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-capsys),
-  because of shared use of `sys.stdout` and `sys.stderr`.
-* The [`monkeypatch`
-  fixture](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-monkeypatch).
+- [`pytest.warns`](https://docs.pytest.org/en/stable/reference/reference.html#pytest.warns),
+    it relies on `warnings.catch_warnings`, which is not thread-safe.
+- The [`tmp_path`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmp_path)
+    and [`tmpdir`](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-tmpdir)
+    fixtures, since they rely on the filesystem
+- The [`capsys`
+    fixture](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-capsys),
+    because of shared use of `sys.stdout` and `sys.stderr`.
+- The [`monkeypatch`
+    fixture](https://docs.pytest.org/en/stable/reference/reference.html#std-fixture-monkeypatch).
 
 Note that the `pytest` maintainers have explicitly ruled out making `pytest`
 thread-safe, please do not open issues asking to fix thread safety issues in