gh-126703: Add PyCFunction freelist #128692

eendebakpt · 2025-01-09T20:40:32Z

See #128368 and the corresponding issue for details.

Issue: Use freelist for range object, iterator objects and other often used objects #126703

ZeroIntensity

LGTM, with one comment.

ZeroIntensity · 2025-01-10T13:45:17Z

Include/internal/pycore_freelist_state.h

@@ -22,6 +22,8 @@ extern "C" {
 #  define Py_futureiters_MAXFREELIST 255
 #  define Py_object_stack_chunks_MAXFREELIST 4
 #  define Py_unicode_writers_MAXFREELIST 1
+#  define Py_pycfunctionobject_MAXFREELIST 16


How are these lengths being chosen? 16 seems fine for PyCFunction, but I would think that methods are created much more frequently.

eendebakpt · Jan 14, 2025

Choosing the number is a bit of an art (and a good number will depend on what kind of code is executed).

How well the freelist works depends on the dynamics of allocation and deallocation. For example, Py_unicode_writers_MAXFREELIST is only 1. I suspect this is because the typical use case is: create a writer, write some data, release it again, with no new unicode writers constructed while the data is being written.
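To make the point about allocation dynamics concrete, here is a toy Python model of a bounded freelist. It is only a sketch: the `FreeList` class and the two workload functions are hypothetical names invented for illustration and are not CPython's implementation. It compares a "create, use, release" pattern with a pattern where several objects are alive at once.

```python
class FreeList:
    """Toy bounded freelist (illustrative only, not CPython's code)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.items = []
        self.hits = 0       # allocations served from the freelist
        self.misses = 0     # allocations that needed a fresh object
        self.overflows = 0  # frees that found the freelist already full

    def alloc(self):
        if self.items:
            self.hits += 1
            return self.items.pop()
        self.misses += 1
        return object()

    def free(self, obj):
        if len(self.items) < self.maxsize:
            self.items.append(obj)
        else:
            self.overflows += 1


def create_use_release(fl, rounds):
    """Only one object is alive at a time."""
    for _ in range(rounds):
        obj = fl.alloc()
        fl.free(obj)


def burst(fl, rounds, live):
    """`live` objects are alive at once, then all are released."""
    for _ in range(rounds):
        objs = [fl.alloc() for _ in range(live)]
        for obj in objs:
            fl.free(obj)


single = FreeList(maxsize=1)
create_use_release(single, rounds=1000)
print(single.hits, single.misses, single.overflows)  # 999 1 0

small = FreeList(maxsize=1)
burst(small, rounds=100, live=8)
print(small.hits, small.misses, small.overflows)  # 99 701 700
```

In this model a freelist of size 1 serves almost every allocation when objects are released before the next one is created, while the same size is nearly useless once eight objects are alive at the same time.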

With the diff below you can see whether a PyCMethodObject is obtained from the freelist (if the size > 0), or whether a new one is allocated (when size = 0).

diff --git a/Include/internal/pycore_freelist.h b/Include/internal/pycore_freelist.h
index 84a5ab30f3e..90d6616cbfd 100644
--- a/Include/internal/pycore_freelist.h
+++ b/Include/internal/pycore_freelist.h
@@ -28,6 +28,10 @@ _Py_freelists_GET(void)
 #endif
 }
 
+#define _Py_FREELIST_FREE_PRINT(NAME, op, freefunc) \
+    _PyFreeList_Free_Print(&_Py_freelists_GET()->NAME, _PyObject_CAST(op), \
+                           Py_ ## NAME ## _MAXFREELIST, freefunc, #NAME)
+
 // Pushes `op` to the freelist, calls `freefunc` if the freelist is full
 #define _Py_FREELIST_FREE(NAME, op, freefunc) \
     _PyFreeList_Free(&_Py_freelists_GET()->NAME, _PyObject_CAST(op), \
@@ -69,6 +73,16 @@ _PyFreeList_Free(struct _Py_freelist *fl, void *obj, Py_ssize_t maxsize,
     }
 }
 
+static inline void
+_PyFreeList_Free_Print(struct _Py_freelist *fl, void *obj, Py_ssize_t maxsize,
+                       freefunc dofree, const char *name)
+{
+    if (!_PyFreeList_Push(fl, obj, maxsize)) {
+        printf(" %s: no space to store object\n", name);
+        dofree(obj);
+    }
+}
+
 static inline void *
 _PyFreeList_PopNoStats(struct _Py_freelist *fl)
 {
diff --git a/Objects/methodobject.c b/Objects/methodobject.c
index fc77055b0a2..58d928d78bf 100644
--- a/Objects/methodobject.c
+++ b/Objects/methodobject.c
@@ -86,6 +86,7 @@ PyCMethod_New(PyMethodDef *ml, PyObject *self, PyObject *module, PyTypeObject *c
                             "flag but no class");
             return NULL;
         }
+        printf("PyCMethodObject: try allocation (freelist size %d)\n", _Py_FREELIST_SIZE(pycmethodobject));
         PyCMethodObject *om = _Py_FREELIST_POP(PyCMethodObject, pycmethodobject);
         if (om == NULL) {
             om = PyObject_GC_New(PyCMethodObject, &PyCMethod_Type);
@@ -102,6 +103,7 @@ PyCMethod_New(PyMethodDef *ml, PyObject *self, PyObject *module, PyTypeObject *c
                             "but no METH_METHOD flag");
             return NULL;
         }
+        printf("PyCFunctionObject: try allocation (freelist size %d)\n", _Py_FREELIST_SIZE(pycfunctionobject));
         op = _Py_FREELIST_POP(PyCFunctionObject, pycfunctionobject);
         if (op == NULL) {
             op = PyObject_GC_New(PyCFunctionObject, &PyCFunction_Type);
@@ -180,11 +182,11 @@ meth_dealloc(PyObject *self)
     Py_XDECREF(m->m_module);
     if (m->m_ml->ml_flags & METH_METHOD) {
         assert(Py_IS_TYPE(self, &PyCMethod_Type));
-        _Py_FREELIST_FREE(pycmethodobject, m, PyObject_GC_Del);
+        _Py_FREELIST_FREE_PRINT(pycmethodobject, m, PyObject_GC_Del);
     }
     else {
         assert(Py_IS_TYPE(self, &PyCFunction_Type));
-        _Py_FREELIST_FREE(pycfunctionobject, m, PyObject_GC_Del);
+        _Py_FREELIST_FREE_PRINT(pycfunctionobject, m, PyObject_GC_Del);
     }
     Py_TRASHCAN_END;
 }

When I run this with python -m test test_operator, I see three phases:

At the start of the program there are many allocations, because nothing has been deallocated yet and so there is nothing on the freelist.

Then there is a dynamic section where objects are frequently allocated and deallocated. The freelist size changes a lot, but it does not reach the maximum size of 16.

At the end the Python interpreter shuts down, so many objects are deallocated.

Based on this there would be no need to increase the freelist size. On the other hand, increasing the size is almost free (the only memory cost is the objects held on the freelist).

I am fine with changing the size (anything between 4 and 400 would be fine with me), but I am not sure it matters or how to make a more informed decision.
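One way to make a more informed decision would be to summarize the debug output rather than read it by eye. The sketch below parses the two message formats printed by the diff above and reports, per freelist, how many allocations found the freelist empty, the largest size observed, and how often a free found no space. The `summarize` function and the sample lines are hypothetical and written for illustration; they are not taken from the linked gist.

```python
import re
from collections import defaultdict

# Message formats as printed by the debug diff in this thread.
ALLOC_RE = re.compile(r"^(\w+): try allocation \(freelist size (\d+)\)$")
FULL_RE = re.compile(r"^\s*(\w+): no space to store object$")


def summarize(lines):
    """Return per-freelist counters; unrelated output lines are ignored."""
    stats = defaultdict(
        lambda: {"allocs": 0, "misses": 0, "max_size": 0, "full": 0}
    )
    for line in lines:
        line = line.rstrip("\n")
        m = ALLOC_RE.match(line)
        if m:
            # The two messages spell the name differently
            # (PyCFunctionObject vs pycfunctionobject), so normalize case.
            entry = stats[m.group(1).lower()]
            size = int(m.group(2))
            entry["allocs"] += 1
            if size == 0:
                entry["misses"] += 1
            entry["max_size"] = max(entry["max_size"], size)
            continue
        m = FULL_RE.match(line)
        if m:
            stats[m.group(1).lower()]["full"] += 1
    return dict(stats)


# Made-up sample lines, for illustration only.
sample = [
    "PyCFunctionObject: try allocation (freelist size 0)",
    "PyCFunctionObject: try allocation (freelist size 3)",
    " pycfunctionobject: no space to store object",
    "some unrelated test runner output",
]
result = summarize(sample)
print(result["pycfunctionobject"])
```

A freelist whose `full` counter stays at zero during the dynamic phase of a workload would not benefit from a larger maximum for that workload.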

Full output is at:

https://gist.github.com/eendebakpt/7a867587450dca4689bb46271fb01ec2

vstinner · 2025-01-15T07:35:15Z

Would you mind rerunning the benchmark on this PR?

eendebakpt · 2025-01-15T10:06:49Z

Here are benchmarks against current main (Linux, PGO). The first benchmark tests the freelist from this PR, the other two are control benchmarks (they should not be affected):

bench_builtin_or_method: Mean +- std dev: [main0] 5.51 us +- 0.11 us -> [pr] 4.03 us +- 0.07 us: 1.37x faster
bench_property: Mean +- std dev: [main0] 1.68 us +- 0.02 us -> [pr] 1.70 us +- 0.03 us: 1.01x slower
bench_class_method: Mean +- std dev: [main0] 1.90 us +- 0.05 us -> [pr] 1.88 us +- 0.04 us: 1.01x faster

Geometric mean: 1.11x faster
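As a sanity check on the summary line, the geometric mean can be approximately reproduced from the rounded means reported above. This is only a sketch: pyperf computes its figure from the full measurement data, so the inputs here are an approximation.

```python
import math

# (main, pr) mean timings in microseconds, copied from the results above.
means_us = {
    "bench_builtin_or_method": (5.51, 4.03),
    "bench_property": (1.68, 1.70),
    "bench_class_method": (1.90, 1.88),
}

# Speedup > 1 means the PR is faster than main.
speedups = {name: base / pr for name, (base, pr) in means_us.items()}
geo_mean = math.prod(speedups.values()) ** (1 / len(speedups))

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
print(f"Geometric mean: {geo_mean:.2f}x faster")
```

Because the two control benchmarks are close to 1.0x, the geometric mean is roughly the cube root of the 1.37x speedup of the first benchmark.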

Script

# Quick benchmark for cpython freelists

import pyperf

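def bench_builtin_or_method(loops):
    range_it = iter(range(loops))
    tpl = tuple(range(50))

    lst = []
    it = iter(set([2, 3, 4]))
    t0 = pyperf.perf_counter()
    for ii in range_it:
        for ii in tpl:
            # Bare attribute access binds a builtin method object that is
            # dropped right away: the allocate/free pattern the freelist targets.
            lst.append
            it.__length_hint__
    return pyperf.perf_counter() - t0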


class A:
    def __init__(self, value):
        self.value = value

    def x(self):
        return self.value

    @property
    def v(self):
        return self.value


def bench_property(loops):
    range_it = iter(range(loops))
    tpl = tuple(range(50))

    t0 = pyperf.perf_counter()
    for ii in range_it:
        a = A(ii)
        for ii in tpl:
            _ = a.v
    return pyperf.perf_counter() - t0


def bench_class_method(loops):
    range_it = iter(range(loops))
    tpl = tuple(range(50))

    t0 = pyperf.perf_counter()
    for ii in range_it:
        a = A(ii)
        for ii in tpl:
            _ = a.x()
    return pyperf.perf_counter() - t0


if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.bench_time_func("bench_builtin_or_method", bench_builtin_or_method)
    runner.bench_time_func("bench_property", bench_property)
    runner.bench_time_func("bench_class_method", bench_class_method)

vstinner

LGTM

add pycfunction freelist

20cc6b9

eendebakpt requested a review from ericsnowcurrently as a code owner January 9, 2025 20:40

bedevere-app bot mentioned this pull request Jan 9, 2025

Use freelist for range object, iterator objects and other often used objects #126703

Open

bedevere-app bot added the awaiting review label Jan 9, 2025

blurb-it bot and others added 2 commits January 9, 2025 22:12

📜🤖 Added by blurb_it.

e86ccad

Merge branch 'main' into pycfunctionobject_freelist

a76224b

ZeroIntensity approved these changes Jan 10, 2025

bedevere-app bot added awaiting core review and removed awaiting review labels Jan 10, 2025

eendebakpt mentioned this pull request Jan 14, 2025

gh-126703: Add freelists for iterators and range, method and builtin_function_or_method objects #128368

Draft

Merge branch 'main' into pycfunctionobject_freelist

6c9d056

vstinner approved these changes Jan 15, 2025

bedevere-app bot added awaiting merge and removed awaiting core review labels Jan 15, 2025
