Conversation

@kfule (Contributor) commented Oct 12, 2025

Vnode.normalizeChildren now preallocates the output array to the input length and performs the key-consistency checks against the normalized values. These changes improve performance, check accuracy, and bundle size.

Description

This PR makes the following changes to Vnode.normalizeChildren:

  • Preallocates the output array to the input length.
    • Assigning normalized values immediately after preallocation improves performance on V8 (Node, Chrome, Edge; about 3% in “rerender same tree”).
  • Performs the key-consistency checks against the normalized values.
    • The checks now run in the same loop as normalization.
    • Checking the normalized values improves the accuracy of the checks in corner cases.

These changes do not break existing behavior (aside from slight differences in the timing or order in which the check errors are thrown), and the bundle size is slightly reduced, to 8,972 bytes gzipped.
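For reference, here is a minimal sketch of the single-pass shape of the change (variable names and the error message are illustrative; see the actual diff for the real code):

// Sketch only: preallocate the output array to the input length, then
// normalize and count keyed vnodes in a single pass.
Vnode.normalizeChildren = function(input) {
	var children = new Array(input.length)
	var numKeyed = 0
	for (var i = 0; i < input.length; i++) {
		children[i] = Vnode.normalize(input[i])
		if (children[i] !== null && children[i].key != null) numKeyed++
	}
	// Key-consistency check against the normalized values: children must
	// either all be keyed or all be unkeyed (the real error message differs).
	if (numKeyed !== 0 && numKeyed !== input.length) {
		throw new TypeError("Vnodes must either all have keys or none have keys.")
	}
	return children
}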

Result of npm run perf on my laptop:

construct large vnode tree x 27,820 ops/sec ±0.47% (129 runs sampled)
rerender identical vnode x 5,068,693 ops/sec ±0.96% (129 runs sampled)
rerender same tree x 103,402 ops/sec ±0.17% (126 runs sampled)
rerender same tree (with selector-generated attrs) x 134,861 ops/sec ±0.19% (127 runs sampled)
add large nested tree x 8,972 ops/sec ±1.50% (124 runs sampled)
mutate styles/properties x 145 ops/sec ±1.37% (99 runs sampled)
repeated add/removal x 3,013 ops/sec ±1.52% (122 runs sampled)

This lands roughly midway between #3041 and #3051 (about a 3% gain in “rerender same tree”).

Motivation and Context

This is essentially a re-opening of #3051.

Optimizing the loop alone resulted in only a small performance improvement, so #3051 was closed (#3051 (comment)). After further investigation, I found that combining it with pre-allocating the array length improved performance.

Unlike #3051, this PR makes its improvements without fully reusing arrays, so it avoids breaking or otherwise major changes.

How Has This Been Tested?

npm run test

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • My change requires a documentation update, and I've opened a pull request to update it already:
  • I have read https://mithril.js.org/contributing.html.

@kfule kfule requested a review from a team as a code owner October 12, 2025 13:49
@dead-claudia (Member) commented:

Quick question: how is memory usage impacted? Sparse arrays take more memory than equivalent-length dense arrays, so I just want a quick sanity check here on its implications.

Also, how does it compare to an explicit children.push(Vnode.normalize(vnode.children[i]))?

@kfule (Contributor, Author) commented Oct 13, 2025

@dead-claudia Thank you for your review.

I briefly verified the following.

how is memory usage impacted?

I logged process.memoryUsage() in onCycle during npm run perf. Because the results varied each run, I couldn’t tell whether there was any real difference between versions.
(I usually benchmark with npm run perf for speed only, so I don’t have much experience with memory-consumption benchmarks. If you can suggest a specific method, I’d be grateful.)
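Roughly, the logging hook looked like the following sketch (hypothetical wiring assuming a Benchmark.js-style suite; the actual npm run perf harness differs):

// Hypothetical sketch: log heap usage after each completed benchmark cycle.
var Benchmark = require("benchmark")

new Benchmark.Suite()
	.add("rerender same tree", function() {
		// ...benchmark body...
	})
	.on("cycle", function(event) {
		console.log(String(event.target), process.memoryUsage().heapUsed, "bytes of heap used")
	})
	.run()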

Also, how does it compare to an explicit children.push(Vnode.normalize(vnode.children[i]))?

If the push variant performed as well as this PR, I would favor it, but that doesn’t seem to be the case.
I tried the following variants, but none of them achieved speeds comparable to this PR.

// (1) About the same performance as v2.3.7
var children = []
var numKeyed = 0
for (var i = 0; i < input.length; i++) {
	children.push(Vnode.normalize(input[i]))
	if (children[i] != null && children[i].key != null) numKeyed++
}

// (2) It is slightly faster than v2.3.7, but it is not as good as this PR.
var children = []
var numKeyed = 0
var normalized
for (var i = 0; i < input.length; i++) {
	children.push(normalized = Vnode.normalize(input[i]))
	if (normalized != null && normalized.key != null) numKeyed++
}

// (3) Almost the same performance as (2)
var children = []
var numKeyed = 0
for (var i = 0; i < input.length; i++) {
	var normalized = Vnode.normalize(input[i])
	children.push(normalized) 
	if (normalized != null && normalized.key != null) numKeyed++
}

@dead-claudia (Member) commented:

(I usually benchmark by npm run perf for speed only, so I don’t have much experience with memory consumption benchmarks. If you can suggest a specific method, I’d be grateful.)

@kfule Spin up a local server from the repo root, open /performance/index.html in a Chromium browser, open the dev tools, and take a couple heap snapshots. From there, you can compare snapshots to see what all was newly allocated between runs, and that'll help you find the vnodes.

If you use Firefox, the process is similar, just with stuff in different locations.

@kfule (Contributor, Author) commented Oct 14, 2025

@dead-claudia
Thanks for your advice!
I tried looking for vnodes in the DevTools memory pane on my Edge (one of the Chromium browsers).

For clarity, I assigned the following vnode to this.largeVnodes and searched for it.
(I did this because this.cached is easier to find in the “rerender identical vnode” benchmark, and I wanted to check a large vnode.)

m(".foo.bar[data-foo=bar]", {p: 2},

// I assigned the vnode to `largeVnodes` like this
this.largeVnodes = m(".foo.bar[data-foo=bar]", {p: 2},

Screenshots of the results from the two branches (main, i.e. the v2.3.7 mithril.js bundle, and this PR's build) are below. Although they are indeed running different mithril.js scripts, they appear to show no difference.

current main branch (v2.3.7): [screenshot]

this PR: [screenshot]


Additionally, there were largeVnodes within _original as well, and there this PR appears to show lower memory consumption.

current main branch (v2.3.7): [screenshot]

this PR: [screenshot]

These do appear to capture the vnode results, but since I'm unfamiliar with this, there might be other areas I should be looking at.

Also, this PR may have an additional advantage with large arrays, since preallocation removes the reallocations that growing the array would otherwise cause.

@dead-claudia (Member) commented:

A long time ago, V8 used to always allocate new Array(len) as sparse, so I was a bit skeptical of the benefit at first. I did some digging, and this appears to have changed, and it changed longer ago than I thought.

Comment on lines 21 to 24
for (var i = 0; i < input.length; i++) {
	children[i] = Vnode.normalize(input[i])
	if (children[i] != null && children[i].key != null) numKeyed++
}
@dead-claudia (Member) commented:

Can this boost performance further?

Suggested change
- for (var i = 0; i < input.length; i++) {
- 	children[i] = Vnode.normalize(input[i])
- 	if (children[i] != null && children[i].key != null) numKeyed++
- }
+ for (var i = 0; i < input.length; i++) {
+ 	children[i] = Vnode.normalize(input[i])
+ 	if (children[i] !== null) numKeyed += children[i].key != null
+ }

And how does this compare against the above?

Suggested change
- for (var i = 0; i < input.length; i++) {
- 	children[i] = Vnode.normalize(input[i])
- 	if (children[i] != null && children[i].key != null) numKeyed++
- }
+ var i = 0
+ for (var child of input) {
+ 	children[i] = child = Vnode.normalize(child)
+ 	if (child !== null) numKeyed += child.key != null
+ 	i++
+ }

Engines do weird things here, but the idea is to reduce branches.

  • I've seen engines convert num += bool to branch-free code but not if (bool) num++, despite having the type info to do it (a small standalone illustration follows this list).
  • V8 can elide bounds checks more easily with for ... of, and Node's seen measurable perf gains from switching.
  • foo != null is technically sugar for foo !== undefined && foo !== null && !isHTMLDocumentAll(foo). Inline caches can reduce that, but foo !== null doesn't require an inline cache at all, just a branch predictor cache.
  • The foo.key requires a type cache no matter what. The JIT will figure out that Vnode.normalize can only return vnodes and null, and so it may be able to skip the undefined check entirely.
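For illustration, the += form relies on the boolean coercing to 0 or 1, so the counter update itself carries no branch. A tiny standalone example with made-up data:

// `children[i].key != null` evaluates to a boolean; `+=` coerces it to 0 or 1,
// so the counter update itself needs no extra branch.
var numKeyed = 0
var children = [{key: "a"}, {key: null}, null]
for (var i = 0; i < children.length; i++) {
	if (children[i] !== null) numKeyed += (children[i].key != null)
}
console.log(numKeyed) // 1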

@kfule (Contributor, Author) commented:

@dead-claudia
Thank you for your comment. I had overlooked the strict null checks, so I’ve pushed that version for now.

As for performance, the boolean-addition variant appears slightly slower than the current PR. However, note that the current benchmark doesn’t use keyed vnodes, so that might affect fairness.

The final for-of version appears slower than both of those.

In any case, I plan to summarize and share the comparison results with the boolean-addition version later. Since the difference seems small, I’ll try measuring it again on a non-laptop machine.

@kfule (Contributor, Author) commented:

Here are the results from running on a Mac mini (M2) with Node v22. The !== variant (current PR) appears to be slightly better.

# !== null
construct large vnode tree x 85,558 ops/sec ±0.13% (135 runs sampled)
rerender identical vnode x 16,952,524 ops/sec ±0.07% (138 runs sampled)
rerender same tree x 336,489 ops/sec ±0.16% (135 runs sampled)
rerender same tree (with selector-generated attrs) x 432,880 ops/sec ±0.10% (134 runs sampled)
add large nested tree x 31,277 ops/sec ±0.13% (136 runs sampled)
mutate styles/properties x 647 ops/sec ±1.83% (126 runs sampled)
repeated add/removal x 12,115 ops/sec ±2.49% (122 runs sampled)
Completed perf tests in 58038ms

# != null
construct large vnode tree x 85,002 ops/sec ±0.12% (133 runs sampled)
rerender identical vnode x 16,645,591 ops/sec ±0.12% (137 runs sampled)
rerender same tree x 335,902 ops/sec ±0.18% (136 runs sampled)
rerender same tree (with selector-generated attrs) x 423,986 ops/sec ±0.12% (135 runs sampled)
add large nested tree x 30,986 ops/sec ±0.08% (136 runs sampled)
mutate styles/properties x 647 ops/sec ±1.65% (126 runs sampled)
repeated add/removal x 12,234 ops/sec ±2.36% (123 runs sampled)
Completed perf tests in 58207ms

# += bool
construct large vnode tree x 84,533 ops/sec ±0.12% (136 runs sampled)
rerender identical vnode x 16,589,058 ops/sec ±0.09% (138 runs sampled)
rerender same tree x 332,196 ops/sec ±0.19% (135 runs sampled)
rerender same tree (with selector-generated attrs) x 428,660 ops/sec ±0.13% (132 runs sampled)
add large nested tree x 30,983 ops/sec ±0.12% (133 runs sampled)
mutate styles/properties x 639 ops/sec ±1.87% (127 runs sampled)
repeated add/removal x 12,257 ops/sec ±2.52% (123 runs sampled)
Completed perf tests in 58208ms

# for-of
construct large vnode tree x 78,225 ops/sec ±0.14% (137 runs sampled)
rerender identical vnode x 14,888,496 ops/sec ±0.09% (138 runs sampled)
rerender same tree x 324,334 ops/sec ±0.12% (137 runs sampled)
rerender same tree (with selector-generated attrs) x 422,006 ops/sec ±0.08% (136 runs sampled)
add large nested tree x 30,417 ops/sec ±0.24% (136 runs sampled)
mutate styles/properties x 638 ops/sec ±1.68% (127 runs sampled)
repeated add/removal x 12,251 ops/sec ±2.83% (123 runs sampled)
Completed perf tests in 58272ms

# v2.3.7
construct large vnode tree x 78,501 ops/sec ±0.17% (135 runs sampled)
rerender identical vnode x 15,177,095 ops/sec ±0.11% (136 runs sampled)
rerender same tree x 318,919 ops/sec ±0.14% (137 runs sampled)
rerender same tree (with selector-generated attrs) x 410,400 ops/sec ±0.08% (130 runs sampled)
add large nested tree x 29,534 ops/sec ±0.30% (135 runs sampled)
mutate styles/properties x 622 ops/sec ±1.95% (125 runs sampled)
repeated add/removal x 12,160 ops/sec ±2.75% (122 runs sampled)
Completed perf tests in 57669ms

…cy checks after normalization

This change preallocates the array to the input length and collapses multiple loops into a single pass. Assigning immediately after preallocation improves performance on V8 (generally neutral elsewhere).

Key checks are now performed on normalized vnodes, making the consistency validation more accurate and clarifying the correspondence between error messages and code.
Perf-sensitive comments have been clarified to reflect the original intent of commit 6c562d2.

Behavior is unchanged, except that the timing/order of related errors may differ slightly. All existing tests pass.
Additionally, bundle size is slightly reduced.
@kfule (Contributor, Author) commented Oct 15, 2025

@dead-claudia
Thank you for your review comments. I tried running npm run perf, and the current !== version seems slightly better.
Personally, since the bundle size doesn't seem to change much either, I think the current version, with its better readability, is good enough.

@kfule kfule requested a review from dead-claudia October 15, 2025 15:28
@dead-claudia dead-claudia merged commit 952846b into MithrilJS:main Oct 15, 2025
7 checks passed
@JAForbes JAForbes mentioned this pull request Oct 15, 2025