Reduce overhead of indy plumbing #8748

Merged
merged 1 commit into from
Apr 8, 2025
@headius headius commented Apr 4, 2025

JRuby 10 enables invokedynamic for all optimizations by default. Unfortunately, some of the plumbing around invokedynamic has never been seriously profiled or optimized. This can hurt application boot time, early execution performance, and warmup.

This PR is a pass over the key parts of the indy infrastructure with a goal of "harm reduction" for the common boot-time and early-runtime cases that hit this plumbing heavily.

Optimizations included:

  • Reduce overhead from hierarchy-based invalidation by right-sizing lists and avoiding re-invalidation of unused SwitchPoints.

@headius headius added this to the JRuby 10.0.0.0 milestone Apr 4, 2025
@headius headius force-pushed the indy_cost_reduction branch from a253915 to 513f869 Compare April 4, 2025 14:31
Profiling repeated high-ancestor method definitions revealed much
slower performance for all that invalidation when using indy and
SwitchPoint, around 4-6x slower than the simple generation-based
invalidator. Much of the overhead is in building the invalidator
list, SwitchPoint array, and invalidating those SwitchPoints over
and over.
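
For readers unfamiliar with the mechanism, here is a minimal sketch of what a SwitchPoint guard and its invalidation look like using the JDK's `java.lang.invoke.SwitchPoint`. It is not JRuby's code: the class `SwitchPointBasics` and the `cached`/`relookup` stand-ins are hypothetical names chosen only for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

class SwitchPointBasics {
    // Hypothetical stand-ins for a cached method body and the slow re-lookup path.
    static String cached() { return "cached"; }
    static String relookup() { return "relookup"; }

    // Builds a handle that runs cached() until sp is invalidated, then relookup().
    static MethodHandle guarded(SwitchPoint sp) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);
            MethodHandle fast = lookup.findStatic(SwitchPointBasics.class, "cached", type);
            MethodHandle slow = lookup.findStatic(SwitchPointBasics.class, "relookup", type);
            return sp.guardWithTest(fast, slow);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes a ()String handle, wrapping the checked Throwable.
    static String call(MethodHandle site) {
        try {
            return (String) site.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        SwitchPoint sp = new SwitchPoint();
        MethodHandle site = guarded(sp);
        System.out.println(call(site));            // fast path while sp is valid
        // Invalidation is a batch operation over an array of SwitchPoints.
        SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
        System.out.println(call(site));            // fallback path from now on
        System.out.println(sp.hasBeenInvalidated());
    }
}
```

A SwitchPoint cannot be reset once invalidated, so every invalidation that wants to keep guarding has to allocate a replacement, which is where the repeated cost described above comes from.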

This patch makes the following improvements:

* Guess at the right size for the invalidator list based on
  previous invalidation events. We use last size * 1.25 to give
  room for a bit of growth, because reallocation along this path
  was the top allocation in the benchmark.
* Only add SwitchPointInvalidator to list if it has an actively-
  used SwitchPoint. When invalidated, we do not immediately create
  a new SwitchPoint, instead replacing the just-invalidated SP with
  an invalid dummy SP. Only when the SP is directly requested do we
  populate it with a live SP. Dummy switchpoints do not need to be
  re-invalidated, so we avoid adding to the list.
* Avoid allocating zero-length SwitchPoint lists when no SPs are
  in use.
* Avoid allocating iterators along non-SP invalidation paths.
* Tweaks to the dummy logic to actually use the dummy whenever we
  invalidate or prepare for invalidation.
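
The ideas in the list above can be sketched roughly as follows. This is a simplified model, not the patch itself: `LazyInvalidator`, `HierarchyInvalidation`, `retire`, and `lastSize` are hypothetical names, and the sketch collects SwitchPoints directly rather than a list of invalidator objects. Only `java.lang.invoke.SwitchPoint` and the `java.util` collections are real APIs.

```java
import java.lang.invoke.SwitchPoint;
import java.util.ArrayList;
import java.util.List;

class LazyInvalidator {
    // One shared, already-invalidated SwitchPoint stands in for "no live SP".
    static final SwitchPoint DUMMY = new SwitchPoint();
    static { SwitchPoint.invalidateAll(new SwitchPoint[] { DUMMY }); }

    private SwitchPoint current = DUMMY;

    // Call sites request the SwitchPoint; only then do we pay for a live one.
    synchronized SwitchPoint switchPoint() {
        if (current == DUMMY) current = new SwitchPoint();
        return current;
    }

    // Returns the live SP to invalidate (or null if none) and reverts to the dummy.
    synchronized SwitchPoint retire() {
        SwitchPoint old = current;
        current = DUMMY;
        return old == DUMMY ? null : old;
    }
}

class HierarchyInvalidation {
    private int lastSize = 0;

    // Invalidates every live SP in the hierarchy; returns how many were live.
    int invalidateAll(List<LazyInvalidator> hierarchy) {
        // Size guess: previous count * 1.25 leaves a little room for growth.
        List<SwitchPoint> live = new ArrayList<>((int) (lastSize * 1.25));
        for (int i = 0; i < hierarchy.size(); i++) { // indexed loop, no iterator object
            SwitchPoint sp = hierarchy.get(i).retire();
            if (sp != null) live.add(sp);            // dummies never join the list
        }
        lastSize = live.size();
        if (!live.isEmpty()) {                       // no array at all when nothing is live
            SwitchPoint.invalidateAll(live.toArray(new SwitchPoint[live.size()]));
        }
        return live.size();
    }

    public static void main(String[] args) {
        LazyInvalidator used = new LazyInvalidator();
        LazyInvalidator unused = new LazyInvalidator();
        used.switchPoint(); // simulate a call site binding against this class
        HierarchyInvalidation inv = new HierarchyInvalidation();
        System.out.println(inv.invalidateAll(List.of(used, unused))); // prints 1
        System.out.println(inv.invalidateAll(List.of(used, unused))); // prints 0
    }
}
```

In this model a second invalidation with no intervening call-site activity touches no SwitchPoints and allocates no array, which is the boot-time case the benchmark below exercises.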

Given the following benchmark:

```ruby
require 'benchmark'
loop {
  puts Benchmark.measure {
    10000.times {
      module Kernel
        def foo1 = nil
        def foo2 = nil
        def foo3 = nil
        def foo4 = nil
        def foo5 = nil
      end
    }
  }
}
```

Performance goes from:

```
  7.450000   0.140000   7.590000 (  6.643895)
  6.590000   0.060000   6.650000 (  6.532706)
  6.570000   0.070000   6.640000 (  6.555341)
```

to:

```
  1.510000   0.040000   1.550000 (  1.165226)
  1.140000   0.010000   1.150000 (  1.074266)
  1.050000   0.000000   1.050000 (  1.034040)
```

Compared with non-indy:

```
  1.840000   0.050000   1.890000 (  1.138749)
  1.220000   0.000000   1.220000 (  1.064950)
  1.020000   0.010000   1.030000 (  1.013060)
```

Note that this optimization is most effective when few call sites
are being populated, such as early in boot when most invalidation
takes place. Heavy root invalidations at runtime while call sites
are active throughout the subhierarchy will still suffer from SP
churn and excess allocation of supporting structures. A move to
aggregate SP invalidation at the call site (gather all parent SP
as call site guard) will push the cost into initial and re-binds
of those sites, which can be expected to either stabilize or
give up eventually and use simple caching.
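
One possible reading of "gather all parent SP as call site guard" is sketched below: the call site chains a guard for each ancestor's SwitchPoint, so invalidating any one ancestor sends the site to its fallback. This is an assumption about the future direction, not code from this PR; `AggregateGuard` and `guardAll` are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.SwitchPoint;

class AggregateGuard {
    // Wraps target so that invalidating any one of the given SwitchPoints
    // routes the call to fallback instead.
    static MethodHandle guardAll(MethodHandle target, MethodHandle fallback,
                                 SwitchPoint... ancestors) {
        MethodHandle guarded = target;
        for (SwitchPoint sp : ancestors) {
            guarded = sp.guardWithTest(guarded, fallback);
        }
        return guarded;
    }

    // Invokes a ()String handle, wrapping the checked Throwable.
    static String call(MethodHandle site) {
        try {
            return (String) site.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        SwitchPoint object = new SwitchPoint(); // stand-in for one ancestor's SP
        SwitchPoint kernel = new SwitchPoint(); // stand-in for another ancestor's SP
        MethodHandle hit = MethodHandles.constant(String.class, "cached");
        MethodHandle miss = MethodHandles.constant(String.class, "rebind");
        MethodHandle site = guardAll(hit, miss, object, kernel);
        System.out.println(call(site));
        // Only the modified ancestor is invalidated; the site still falls back.
        SwitchPoint.invalidateAll(new SwitchPoint[] { kernel });
        System.out.println(call(site));
    }
}
```

The trade-off in such a design is that each ancestor only ever invalidates its own SwitchPoint, while the cost of assembling the guard chain is paid when a site binds or re-binds.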
@headius headius force-pushed the indy_cost_reduction branch from 513f869 to 0663077 Compare April 7, 2025 21:13
@headius headius merged commit e2909f9 into jruby:master Apr 8, 2025
72 checks passed
@headius headius deleted the indy_cost_reduction branch April 8, 2025 22:33