Improving GC collections: dynamic thresholds, single generation gc and time barriers #100403

Open
@pablogsal


In the pursuit of optimizing GC runs, it has been observed that the weak generational hypothesis may not apply that well to Python. The argument is that, with a mixed cycle GC + refcount strategy, young objects are mostly cleaned up by reference counting, not by the cycle GC. It is also important that there is no segregation between the two strategies: the cycle GC still has to deal with objects that, according to this argument, will mainly be cleaned up by reference counting alone.
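
To make the argument concrete, here is a minimal sketch (CPython-specific, and the `Node` class is just a hypothetical stand-in) showing that an acyclic object is reclaimed by reference counting as soon as its last reference goes away, while only a reference cycle actually needs the cycle GC:

```python
import gc
import weakref


class Node:
    """Hypothetical container object; instances are tracked by the cycle GC."""

    def __init__(self):
        self.other = None


was_enabled = gc.isenabled()
gc.disable()  # keep automatic collections out of the picture

# Acyclic object: the refcount drops to zero on `del`, so it is freed at once.
a = Node()
a_ref = weakref.ref(a)
del a
freed_by_refcount = a_ref() is None

# Reference cycle: refcounts never reach zero, so only the cycle GC can free it.
b, c = Node(), Node()
b.other, c.other = c, b
b_ref = weakref.ref(b)
del b, c
alive_before_gc = b_ref() is not None
gc.collect()
freed_by_gc = b_ref() is None

if was_enabled:
    gc.enable()

print(freed_by_refcount, alive_before_gc, freed_by_gc)
```

Both kinds of object are tracked by the cycle GC while they are alive, which is the "no segregation" point: the collector still has to traverse objects that reference counting would have freed anyway.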

This calls into question the utility of segregating the GC by generations, and indeed there is some evidence of this. I have been benchmarking the success rate of the different generations in some programs (such as black, mypy, and a bunch of HTTP servers), and the success rate of the lower generations is generally small. Here is an example of running black over all the standard library:

Statistics for generation 0

| Category | Value |
| --- | --- |
| count | 157917.000000 |
| mean | 1.192775 |
| std | 3.560391 |
| min | 0.000000 |
| 25% | 0.000000 |
| 50% | 0.000000 |
| 75% | 0.480192 |
| max | 86.407768 |

Statistics for generation 1

| Category | Value |
| --- | --- |
| count | 14346.000000 |
| mean | 2.670852 |
| std | 11.642815 |
| min | 0.000000 |
| 25% | 0.000000 |
| 50% | 0.000000 |
| 75% | 0.388794 |
| max | 97.406097 |

Statistics for generation 2

| Category | Value |
| --- | --- |
| count | 1280.000000 |
| mean | 45.698135 |
| std | 27.066735 |
| min | 0.000000 |
| 25% | 31.965862 |
| 50% | 54.618008 |
| 75% | 67.038842 |
| max | 90.592705 |
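
For anyone who wants to gather similar per-generation figures, here is a rough sketch using `gc.callbacks`. It is not the script used for the numbers above. It assumes "success rate" means objects collected divided by objects examined in that run, expressed as a percentage, and the names `success_rates` and `_track` are made up for illustration. Generation semantics also differ between CPython versions, so treat the per-generation split as approximate.

```python
import gc
import statistics
from collections import defaultdict

# generation -> list of success percentages, one entry per collection
success_rates = defaultdict(list)
_examined = {}  # objects examined by the collection currently in progress


def _track(phase, info):
    gen = info["generation"]
    if phase == "start":
        # Collecting generation N also examines every younger generation.
        _examined[gen] = sum(
            len(gc.get_objects(generation=g)) for g in range(gen + 1)
        )
    elif phase == "stop":
        examined = _examined.pop(gen, 0)
        if examined:
            success_rates[gen].append(100.0 * info["collected"] / examined)


gc.callbacks.append(_track)

# Toy workload: build self-referencing lists, keep them alive, then drop them
# all at once so the final full collection has something to reclaim.
garbage = []
for _ in range(10_000):
    cycle = []
    cycle.append(cycle)
    garbage.append(cycle)
del garbage, cycle
gc.collect()

gc.callbacks.remove(_track)

for gen in sorted(success_rates):
    rates = success_rates[gen]
    print(f"generation {gen}: runs={len(rates)} mean={statistics.mean(rates):.2f}%")
```

Calling `gc.get_objects()` inside the callback is expensive, so this is only suitable for offline measurements, not for production use.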

I am currently investigating whether a single generation with a dynamic threshold, similar to the strategy we currently use for the last generation, would give better performance in general.
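
As a rough illustration of what such a policy could look like, here is a toy model in Python. It is only a sketch, not a proposed implementation: the `SingleGenerationPolicy` class, its field names, and the floor of 700 are assumptions, and the 25% ratio is borrowed from the existing heuristic where a full collection is skipped unless the objects pending since the last full collection exceed a quarter of the objects that survived it.

```python
from dataclasses import dataclass


@dataclass
class SingleGenerationPolicy:
    """Toy model: one generation whose threshold grows with the live heap."""

    min_threshold: int = 700    # assumed floor for the threshold
    growth_ratio: float = 0.25  # assumption: mirrors the 25% full-collection rule
    survivors: int = 0          # objects that survived the last collection
    pending: int = 0            # objects allocated since the last collection

    @property
    def threshold(self) -> int:
        return max(self.min_threshold, int(self.survivors * self.growth_ratio))

    def on_allocation(self) -> bool:
        """Record one tracked allocation; return True if a collection is due."""
        self.pending += 1
        return self.pending > self.threshold

    def on_collection(self, freed: int) -> None:
        """Fold the survivors of a collection into the heap size estimate."""
        self.survivors = max(0, self.survivors + self.pending - freed)
        self.pending = 0


def simulate(allocations: int) -> list[int]:
    """Return the number of allocations between successive collections,
    for a workload where nothing is ever freed by the collector."""
    policy = SingleGenerationPolicy()
    intervals = []
    for _ in range(allocations):
        if policy.on_allocation():
            intervals.append(policy.pending)
            policy.on_collection(freed=0)
    return intervals


print(simulate(10_000))
```

In this model the gap between collections stays at the floor while the heap is small and then grows with the number of survivors, so the total GC work stays roughly proportional to the allocation rate instead of the heap size.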

What do you think?
