gh-133968: Use private unicode writer for json #133832
Conversation
They might have caused a performance regression compared to 3.13: faster-cpython/ideas#726
I don't think this is a good idea.

- `json` optimizations have been rejected in the past; use `ujson` or something like that if performance is critical.
- If we change how `PyUnicodeWriter` works, it adds more maintenance, especially if we use this as precedent for doing this elsewhere.
- We should aim for optimizing `PyUnicodeWriter` as a broader change, not speed up each individual case by using the private API.
Would it be possible to only replace `PyUnicodeWriter_WriteUTF8()` with `_PyUnicodeWriter_WriteASCIIString()`? Do you get similar performance in this case?
See also #133186.
I chose not to expose it, since it generates an invalid string if the input string contains non-ASCII characters. But yeah, maybe we should expose it. For best performance, the function only validates the input string in debug mode.
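To make the hazard concrete, here is a minimal Python sketch (the helper name is hypothetical, not CPython's API) of an ASCII-only write path that trusts its caller, with the validation that a debug build would perform expressed as an `assert`:

```python
def write_ascii_fast(buffer: bytearray, s: str) -> None:
    # Hypothetical sketch of an ASCII-only write path: it assumes every
    # code point fits in 7 bits and copies raw bytes.  Passing non-ASCII
    # text would corrupt the buffer, so the check below is the kind of
    # validation the real function performs only in debug builds, to
    # keep the release path fast.
    assert s.isascii(), "caller must guarantee ASCII input"
    buffer += s.encode("ascii")

buf = bytearray()
write_ascii_fast(buf, '{"key": "value"}')
print(buf.decode("ascii"))  # {"key": "value"}
```

In a release build the equivalent of the `assert` is compiled out, which is why exposing such a function publicly makes misuse easy to miss.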
Maybe, but my current benchmark has too much overhead to measure this accurately. I'll have to rewrite it first. I hope we can figure out how to get the performance of the public API very close to the private one, such that everyone feels comfortable using it. |
I updated the benchmark, but I don't understand why:
Does this have something to do with overallocation?
The private API doesn't enable overallocation by default.
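For context, overallocation is the usual amortization trick: grow the buffer geometrically so a long run of small writes triggers only logarithmically many reallocations instead of one per write. A rough Python sketch (the growth factor is illustrative, not CPython's exact policy):

```python
def count_resizes(n_writes: int, overallocate: bool) -> int:
    """Count buffer reallocations for n_writes one-character appends."""
    capacity, used, resizes = 0, 0, 0
    for _ in range(n_writes):
        if used + 1 > capacity:
            # Grow by ~25% when overallocating (illustrative factor);
            # otherwise resize to exactly the needed size every time.
            capacity = (used + 1) * 5 // 4 + 1 if overallocate else used + 1
            resizes += 1
        used += 1
    return resizes

print(count_resizes(10_000, overallocate=True))   # a few dozen resizes
print(count_resizes(10_000, overallocate=False))  # 10000 resizes
```

This is why a writer without overallocation can look faster on a single large write but much slower on many small appends.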
I replaced `PyUnicodeWriter_WriteUTF8()` with `_PyUnicodeWriter_WriteASCIIString()` in `Modules/_json.c` and ran a benchmark:

Benchmark hidden because not significant (15): encode 100 integers, encode 100 floats, encode 100 "ascii" strings, encode ascii string len=100, encode escaped string len=128, encode Unicode string len=100, encode 1000 integers, encode 1000 floats, encode 1000 "ascii" strings, encode ascii string len=1000, encode Unicode string len=1000, encode 10000 "ascii" strings, encode ascii string len=10000, encode escaped string len=9984, encode Unicode string len=10000

I built Python with

Benchmark code:

```python
import json

import pyperf

runner = pyperf.Runner()
for count in (100, 1_000, 10_000):
    runner.bench_func(f'encode {count} booleans', json.dumps, [True, False] * (count // 2))
    runner.bench_func(f'encode {count} integers', json.dumps, list(range(count)))
    runner.bench_func(f'encode {count} floats', json.dumps, [1.0] * count)
    runner.bench_func(f'encode {count} "ascii" strings', json.dumps, ['ascii'] * count)

    text = 'ascii'
    text *= (count // len(text) or 1)
    runner.bench_func(f'encode ascii string len={len(text)}', json.dumps, text)

    text = ''.join(chr(ch) for ch in range(128))
    text *= (count // len(text) or 1)
    runner.bench_func(f'encode escaped string len={len(text)}', json.dumps, text)

    text = 'abcd€'
    text *= (count // len(text) or 1)
    runner.bench_func(f'encode Unicode string len={len(text)}', json.dumps, text)
```
I also ran my benchmark on this PR:

Benchmark hidden because not significant (1): encode Unicode string len=1000
Yeah, but both the old code and the public API do, so it's not that. (And you really don't want to turn it off.)
This line is inefficient for exact string instances (`cpython/Objects/unicodeobject.c`, line 13936 at 86c1d43).
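The inefficiency comes from treating every `str` through the generic, subclass-safe path. `str` subclasses can override conversion, so only instances whose type is exactly `str` may safely skip it. A Python sketch of the exact-type fast path idea (names are illustrative, not CPython's):

```python
class FancyStr(str):
    def __str__(self) -> str:  # subclasses may customize conversion
        return "custom"

def write_str(parts: list, s) -> None:
    if type(s) is str:
        # Exact str: append directly, no method lookup or conversion.
        parts.append(s)
    else:
        # Subclass or arbitrary object: must honor __str__ overrides.
        parts.append(str(s))

parts = []
write_str(parts, "plain")
write_str(parts, FancyStr("ignored"))
print(parts)  # ['plain', 'custom']
```

An `isinstance()` check would wrongly send `FancyStr` down the fast path; the exact-type test is what makes the shortcut safe.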
Here's the comparison with a minimal PR:
@vstinner, could you add a fast path for exact string instances?
Merged. I confirmed with two benchmarks that this small optimization makes a big difference in some use cases, such as encoding short strings in JSON.
Ok, I created #133973 to add `PyUnicodeWriter_WriteASCII()`.
If that's merged, we would use this approach in 3.14, right?
@vstinner, it looks like the regression in
Are you sure about that? The implementation uses a freelist, which avoids the heap allocation in most cases.
Would you mind creating a separate PR just for that?
Are the benchmarks creating an unrealistic number of concurrent writers? That would starve the freelist and create some allocation overhead, but only on the benchmarks. |
You're right, it does seem to be using the only entry of the freelist. (I disabled the malloc to check) |
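For readers unfamiliar with the pattern: a freelist caches a small number of recently freed objects so the common create/destroy cycle skips the allocator, and with a single slot a second live writer misses the cache. A rough Python sketch (illustrative, not CPython's implementation):

```python
_freelist: list = []  # at most one cached writer: a size-1 freelist

class Writer:
    def __init__(self):
        self.parts = []

def acquire() -> tuple:
    """Return (writer, from_freelist)."""
    if _freelist:
        return _freelist.pop(), True   # reuse the cached slot: no allocation
    return Writer(), False             # freelist empty: allocate fresh

def release(w: Writer) -> None:
    if len(_freelist) < 1:             # freelist capacity is 1
        w.parts.clear()
        _freelist.append(w)

# Sequential, non-nested use hits the freelist after warm-up;
# a second concurrent writer starves it and falls back to allocation.
w1, hit1 = acquire()
release(w1)
w2, hit2 = acquire()   # reuses the single cached entry
w3, hit3 = acquire()   # freelist already drained: fresh allocation
print(hit2, hit3)  # True False
```

This matches the earlier point: only a benchmark (or nested encoding) that keeps several writers alive at once would see the allocation overhead.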
- pyperformance (with `--enable-optimizations` and `--with-lto`)
- jsonyx-performance-tests (with `--enable-optimizations` and `--with-lto`)