Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Commit 023b7d2

Browse filesBrowse files
authored
GH-126491: Lower heap size limit with faster marking (GH-127519)
* Faster marking of reachable objects * Changes calculation of work to do and work done. * Merges transitive closure calculations
1 parent 8b7c194 commit 023b7d2
Copy full SHA for 023b7d2

File tree

Expand file treeCollapse file tree

6 files changed

+208
-243
lines changed
Filter options
Expand file treeCollapse file tree

6 files changed

+208
-243
lines changed

‎InternalDocs/garbage_collector.md

Copy file name to clipboardExpand all lines: InternalDocs/garbage_collector.md
+44-6Lines changed: 44 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -199,22 +199,22 @@ unreachable:
199199

200200
```pycon
201201
>>> import gc
202-
>>>
202+
>>>
203203
>>> class Link:
204204
... def __init__(self, next_link=None):
205205
... self.next_link = next_link
206-
...
206+
...
207207
>>> link_3 = Link()
208208
>>> link_2 = Link(link_3)
209209
>>> link_1 = Link(link_2)
210210
>>> link_3.next_link = link_1
211211
>>> A = link_1
212212
>>> del link_1, link_2, link_3
213-
>>>
213+
>>>
214214
>>> link_4 = Link()
215215
>>> link_4.next_link = link_4
216216
>>> del link_4
217-
>>>
217+
>>>
218218
>>> # Collect the unreachable Link object (and its .__dict__ dict).
219219
>>> gc.collect()
220220
2
@@ -459,11 +459,11 @@ specifically in a generation by calling `gc.collect(generation=NUM)`.
459459
>>> # Create a reference cycle.
460460
>>> x = MyObj()
461461
>>> x.self = x
462-
>>>
462+
>>>
463463
>>> # Initially the object is in the young generation.
464464
>>> gc.get_objects(generation=0)
465465
[..., <__main__.MyObj object at 0x7fbcc12a3400>, ...]
466-
>>>
466+
>>>
467467
>>> # After a collection of the youngest generation the object
468468
>>> # moves to the old generation.
469469
>>> gc.collect(generation=0)
@@ -515,6 +515,44 @@ increment. All objects directly referred to from those stack frames are
515515
added to the working set.
516516
Then the above algorithm is repeated, starting from step 2.
517517

518+
Determining how much work to do
519+
-------------------------------
520+
521+
We need to do a certain amount of work to enusre that garbage is collected,
522+
but doing too much work slows down execution.
523+
524+
To work out how much work we need to do, consider a heap with `L` live objects
525+
and `G0` garbage objects at the start of a full scavenge and `G1` garbage objects
526+
at the end of the scavenge. We don't want the amount of garbage to grow, `G1 ≤ G0`, and
527+
we don't want too much garbage (say 1/3 of the heap maximum), `G0 ≤ L/2`.
528+
For each full scavenge we must visit all objects, `T == L + G0 + G1`, during which
529+
`G1` garbage objects are created.
530+
531+
The number of new objects created `N` must be at least the new garbage created, `N ≥ G1`,
532+
assuming that the number of live objects remains roughly constant.
533+
If we set `T == 4*N` we get `T > 4*G1` and `T = L + G0 + G1` => `L + G0 > 3G1`
534+
For a steady state heap (`G0 == G1`) we get `L > 2G0` and the desired garbage ratio.
535+
536+
In other words, to keep the garbage fraction to 1/3 or less we need to visit
537+
4 times as many objects as are newly created.
538+
539+
We can do better than this though. Not all new objects will be garbage.
540+
Consider the heap at the end of the scavenge with `L1` live objects and `G1`
541+
garbage. Also, note that `T == M + I` where `M` is the number of objects marked
542+
as reachable and `I` is the number of objects visited in increments.
543+
Everything in `M` is live, so `I ≥ G0` and in practice `I` is closer to `G0 + G1`.
544+
545+
If we choose the amount of work done such that `2*M + I == 6N` then we can do
546+
less work in most cases, but are still guaranteed to keep up.
547+
Since `I ≳ G0 + G1` (not strictly true, but close enough)
548+
`T == M + I == (6N + I)/2` and `(6N + I)/2 ≳ 4G`, so we can keep up.
549+
550+
The reason that this improves performance is that `M` is usually much larger
551+
than `I`. If `M == 10I`, then `T ≅ 3N`.
552+
553+
Finally, instead of using a fixed multiple of 8, we gradually increase it as the
554+
heap grows. This avoids wasting work for small heaps and during startup.
555+
518556

519557
Optimization: reusing fields to save memory
520558
===========================================

‎Lib/test/test_gc.py

Copy file name to clipboardExpand all lines: Lib/test/test_gc.py
+3-11Lines changed: 3 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1161,27 +1161,19 @@ def make_ll(depth):
11611161
return head
11621162

11631163
head = make_ll(1000)
1164-
count = 1000
1165-
1166-
# There will be some objects we aren't counting,
1167-
# e.g. the gc stats dicts. This test checks
1168-
# that the counts don't grow, so we try to
1169-
# correct for the uncounted objects
1170-
# This is just an estimate.
1171-
CORRECTION = 20
11721164

11731165
enabled = gc.isenabled()
11741166
gc.enable()
11751167
olds = []
11761168
initial_heap_size = _testinternalcapi.get_tracked_heap_size()
1177-
for i in range(20_000):
1169+
iterations = max(20_000, initial_heap_size)
1170+
for i in range(iterations):
11781171
newhead = make_ll(20)
1179-
count += 20
11801172
newhead.surprise = head
11811173
olds.append(newhead)
11821174
if len(olds) == 20:
11831175
new_objects = _testinternalcapi.get_tracked_heap_size() - initial_heap_size
1184-
self.assertLess(new_objects, 27_000, f"Heap growing. Reached limit after {i} iterations")
1176+
self.assertLess(new_objects, initial_heap_size/2, f"Heap growing. Reached limit after {i} iterations")
11851177
del olds[:]
11861178
if not enabled:
11871179
gc.disable()

‎Objects/dictobject.c

Copy file name to clipboardExpand all lines: Objects/dictobject.c
+1-3Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7064,9 +7064,7 @@ int
70647064
PyObject_VisitManagedDict(PyObject *obj, visitproc visit, void *arg)
70657065
{
70667066
PyTypeObject *tp = Py_TYPE(obj);
7067-
if((tp->tp_flags & Py_TPFLAGS_MANAGED_DICT) == 0) {
7068-
return 0;
7069-
}
7067+
assert(tp->tp_flags & Py_TPFLAGS_MANAGED_DICT);
70707068
if (tp->tp_flags & Py_TPFLAGS_INLINE_VALUES) {
70717069
PyDictValues *values = _PyObject_InlineValues(obj);
70727070
if (values->valid) {

‎Objects/genobject.c

Copy file name to clipboardExpand all lines: Objects/genobject.c
+3-66Lines changed: 3 additions & 66 deletions
Original file line numberDiff line numberDiff line change
@@ -882,25 +882,7 @@ PyTypeObject PyGen_Type = {
882882
gen_methods, /* tp_methods */
883883
gen_memberlist, /* tp_members */
884884
gen_getsetlist, /* tp_getset */
885-
0, /* tp_base */
886-
0, /* tp_dict */
887-
888-
0, /* tp_descr_get */
889-
0, /* tp_descr_set */
890-
0, /* tp_dictoffset */
891-
0, /* tp_init */
892-
0, /* tp_alloc */
893-
0, /* tp_new */
894-
0, /* tp_free */
895-
0, /* tp_is_gc */
896-
0, /* tp_bases */
897-
0, /* tp_mro */
898-
0, /* tp_cache */
899-
0, /* tp_subclasses */
900-
0, /* tp_weaklist */
901-
0, /* tp_del */
902-
0, /* tp_version_tag */
903-
_PyGen_Finalize, /* tp_finalize */
885+
.tp_finalize = _PyGen_Finalize,
904886
};
905887

906888
static PyObject *
@@ -1242,24 +1224,7 @@ PyTypeObject PyCoro_Type = {
12421224
coro_methods, /* tp_methods */
12431225
coro_memberlist, /* tp_members */
12441226
coro_getsetlist, /* tp_getset */
1245-
0, /* tp_base */
1246-
0, /* tp_dict */
1247-
0, /* tp_descr_get */
1248-
0, /* tp_descr_set */
1249-
0, /* tp_dictoffset */
1250-
0, /* tp_init */
1251-
0, /* tp_alloc */
1252-
0, /* tp_new */
1253-
0, /* tp_free */
1254-
0, /* tp_is_gc */
1255-
0, /* tp_bases */
1256-
0, /* tp_mro */
1257-
0, /* tp_cache */
1258-
0, /* tp_subclasses */
1259-
0, /* tp_weaklist */
1260-
0, /* tp_del */
1261-
0, /* tp_version_tag */
1262-
_PyGen_Finalize, /* tp_finalize */
1227+
.tp_finalize = _PyGen_Finalize,
12631228
};
12641229

12651230
static void
@@ -1464,7 +1429,6 @@ typedef struct _PyAsyncGenWrappedValue {
14641429
(assert(_PyAsyncGenWrappedValue_CheckExact(op)), \
14651430
_Py_CAST(_PyAsyncGenWrappedValue*, (op)))
14661431

1467-
14681432
static int
14691433
async_gen_traverse(PyObject *self, visitproc visit, void *arg)
14701434
{
@@ -1673,24 +1637,7 @@ PyTypeObject PyAsyncGen_Type = {
16731637
async_gen_methods, /* tp_methods */
16741638
async_gen_memberlist, /* tp_members */
16751639
async_gen_getsetlist, /* tp_getset */
1676-
0, /* tp_base */
1677-
0, /* tp_dict */
1678-
0, /* tp_descr_get */
1679-
0, /* tp_descr_set */
1680-
0, /* tp_dictoffset */
1681-
0, /* tp_init */
1682-
0, /* tp_alloc */
1683-
0, /* tp_new */
1684-
0, /* tp_free */
1685-
0, /* tp_is_gc */
1686-
0, /* tp_bases */
1687-
0, /* tp_mro */
1688-
0, /* tp_cache */
1689-
0, /* tp_subclasses */
1690-
0, /* tp_weaklist */
1691-
0, /* tp_del */
1692-
0, /* tp_version_tag */
1693-
_PyGen_Finalize, /* tp_finalize */
1640+
.tp_finalize = _PyGen_Finalize,
16941641
};
16951642

16961643

@@ -1935,16 +1882,6 @@ PyTypeObject _PyAsyncGenASend_Type = {
19351882
PyObject_SelfIter, /* tp_iter */
19361883
async_gen_asend_iternext, /* tp_iternext */
19371884
async_gen_asend_methods, /* tp_methods */
1938-
0, /* tp_members */
1939-
0, /* tp_getset */
1940-
0, /* tp_base */
1941-
0, /* tp_dict */
1942-
0, /* tp_descr_get */
1943-
0, /* tp_descr_set */
1944-
0, /* tp_dictoffset */
1945-
0, /* tp_init */
1946-
0, /* tp_alloc */
1947-
0, /* tp_new */
19481885
.tp_finalize = async_gen_asend_finalize,
19491886
};
19501887

‎Objects/typeobject.c

Copy file name to clipboardExpand all lines: Objects/typeobject.c
+13Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2355,6 +2355,16 @@ subtype_traverse(PyObject *self, visitproc visit, void *arg)
23552355
return 0;
23562356
}
23572357

2358+
2359+
static int
2360+
plain_object_traverse(PyObject *self, visitproc visit, void *arg)
2361+
{
2362+
PyTypeObject *type = Py_TYPE(self);
2363+
assert(type->tp_flags & Py_TPFLAGS_MANAGED_DICT);
2364+
Py_VISIT(type);
2365+
return PyObject_VisitManagedDict(self, visit, arg);
2366+
}
2367+
23582368
static void
23592369
clear_slots(PyTypeObject *type, PyObject *self)
23602370
{
@@ -4147,6 +4157,9 @@ type_new_descriptors(const type_new_ctx *ctx, PyTypeObject *type)
41474157
assert((type->tp_flags & Py_TPFLAGS_MANAGED_DICT) == 0);
41484158
type->tp_flags |= Py_TPFLAGS_MANAGED_DICT;
41494159
type->tp_dictoffset = -1;
4160+
if (type->tp_basicsize == sizeof(PyObject)) {
4161+
type->tp_traverse = plain_object_traverse;
4162+
}
41504163
}
41514164

41524165
type->tp_basicsize = slotoffset;

0 commit comments

Comments
0 (0)
Morty Proxy This is a proxified and sanitized view of the page, visit original site.