Re2 bench updates, revert #31 for UAs and OS #32
Merged
Also make both re2 and regex benches run using simplified / down-compiled regexes.

At this point, regex-filtered trounces FilteredRE2 on CPU *for the specific job of ua-parsing*, and its memory use has gotten quite reasonable (overhead compared to re2 is down to just 20%):

```sh
> /usr/bin/time -l target/bench_re2 \
    target/devices.regexes regex-filtered/samples/useragents.txt 100 -q
633 regexes 1630 atoms in 0.0200757s
prefilter built in 0.00527812s
75158 user agents in 0.0353107s
       49.67 real        48.88 user         0.35 sys
            43958272  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
                3499  page reclaims
                 289  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                   2  voluntary context switches
               28247  involuntary context switches
        600560455436  instructions retired
        157226396942  cycles elapsed
            35309696  peak memory footprint

> /usr/bin/time -l target/release/examples/bench_regex \
    target/devices.regexes regex-filtered/samples/useragents.txt -r 100 -q
633 regexes in 0.051890332s
75158 user agents in 0.007460291s
       38.93 real        38.52 user         0.22 sys
            43958272  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
                2798  page reclaims
                  98  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                   0  voluntary context switches
               17680  involuntary context switches
        372376225085  instructions retired
        123339367792  cycles elapsed
            42370560  peak memory footprint
```

Note: bench.rs was renamed to stop conflicting with the one in ua-parser, and to make the two bench programs easier to differentiate.

Also, one day I need to look into the difference between maximum RSS and peak memory footprint on macOS. It seems weird that RSS matches between the two programs, and that RSS and peak roughly match for Rust, but re2's peak is about 20% lower than its RSS.
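For reference, the relative figures can be recomputed from the two `/usr/bin/time -l` outputs. This is only a sketch of the arithmetic: the helper name `overhead_pct` is made up for illustration and is not part of either bench program.

```rust
/// Relative difference of `value` over `baseline`, in percent.
/// Positive means `value` is larger than `baseline`.
fn overhead_pct(value: f64, baseline: f64) -> f64 {
    (value / baseline - 1.0) * 100.0
}

fn main() {
    // Numbers copied from the devices runs (633 regexes, 100 repetitions).
    let re2_rss = 43_958_272.0;
    let re2_peak = 35_309_696.0;
    let regex_peak = 42_370_560.0;
    let (re2_real, regex_real) = (49.67, 38.93);
    let (re2_instr, regex_instr) = (600_560_455_436.0, 372_376_225_085.0);

    println!(
        "regex peak footprint vs re2 peak: {:+.1}%",
        overhead_pct(regex_peak, re2_peak)
    );
    println!(
        "re2 peak footprint vs re2 RSS:    {:+.1}%",
        overhead_pct(re2_peak, re2_rss)
    );
    println!(
        "regex wall time vs re2:           {:+.1}%",
        overhead_pct(regex_real, re2_real)
    );
    println!(
        "regex instructions vs re2:        {:+.1}%",
        overhead_pct(regex_instr, re2_instr)
    );
}
```

The first line is the "20% overhead" figure; the second is the RSS vs peak footprint gap on the re2 side.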
Trying it out confirms ua-parser#31, and the better introspectability of FilteredRE2 explains why: it turns out the data set has a pretty small number of atoms of length 2 with high discriminatory power.

Lowering the length to 2 increases the number of atoms from 1630 to just 1865 (+235, +14.4%), which explains why memory use is unaffected or even goes down (some regexes which match none of the samples are likely not even tried anymore), but performance improves *dramatically* (48s -> 27s for re2, 38s -> 24s for regex). This makes sense, as devices are also where ua-parser#31 got extreme bang for its buck.

It's a bit sad seeing re2 catch up so much with our hard work, but it makes sense if we assume `regex` has more optimised regex matching at the cost of memory: with better discrimination we drastically decrease the amount of regex matching, which benefits the package with the slower regex matching.

Although to be fair, the re2 bench could also be slower due to the use of an `re2::Set` instead of an aho-corasick automaton. In fact that's pretty likely.

However, the effect seems non-existent to slightly negative for UA and OS:

- At 3-atoms, UAs have 849 atoms for 362 regexes, and both re2 and regex run in about 10s (9.70~9.90 real). Interestingly, the RSS and memory footprint of regex are a lot lower there (25MB to 32~33 footprint).
- At 2-atoms, UAs have 874 atoms for 362 regexes, and both re2 and regex run a bit slower, around 10.50 for re2 and 10.40 for regex. Memory use is the same.
- OS is basically in between: going from 3-atoms to 2-atoms, the number of atoms increases by a hair, from 353 to 359 (for 201 regexes). The re2 performance remains stable (8.15~8.40) while regex seems to get a hair slower (from 7.10~7.20 to 7.60~7.70).

Note that this is all over 100 runs parsing 75158 user agents.

But that hints that maybe different configurations for the ua and device parsers would make sense...

Fixes ua-parser#30
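To illustrate why a handful of short atoms can matter so much, here is a toy model of atom prefiltering. It is not the regex-filtered or FilteredRE2 API: the `Pattern` struct, the sample patterns, and the plain substring check (standing in for the aho-corasick automaton or `re2::Set`) are all assumptions made for the sketch. The point is only that a pattern whose atoms are all shorter than the minimum length cannot be prefiltered, so it has to be tried against every input.

```rust
/// Hypothetical stand-in for a regex plus the literal atoms that must
/// all appear in the input for the regex to have any chance of matching.
struct Pattern {
    name: &'static str,
    atoms: &'static [&'static str],
}

/// Made-up device-like patterns, some of which only have 2-char atoms.
fn sample_patterns() -> Vec<Pattern> {
    vec![
        Pattern { name: "lg", atoms: &["lg"] },
        Pattern { name: "hp", atoms: &["hp"] },
        Pattern { name: "samsung", atoms: &["samsung"] },
        Pattern { name: "nokia", atoms: &["nokia"] },
    ]
}

/// Names of the patterns that would still need a real regex match.
/// Atoms shorter than `min_len` are ignored; a pattern left with no
/// usable atom is always a candidate (`all` on an empty iterator is true).
fn candidates(patterns: &[Pattern], haystack: &str, min_len: usize) -> Vec<&'static str> {
    patterns
        .iter()
        .filter(|p| {
            p.atoms
                .iter()
                .filter(|a| a.len() >= min_len)
                .all(|a| haystack.contains(*a))
        })
        .map(|p| p.name)
        .collect()
}

fn main() {
    let patterns = sample_patterns();
    let ua = "mozilla/5.0 (linux; android 9; samsung sm-g960f)";
    for min_len in [3, 2] {
        let c = candidates(&patterns, ua, min_len);
        println!("min atom length {min_len}: {} candidates {:?}", c.len(), c);
    }
}
```

With a minimum length of 3 the "lg" and "hp" patterns are unfiltered and get tried on every user agent; at 2 they are only tried when their atom is actually present, at the cost of a slightly larger atom set.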
Keep devices at 2 as it benefits *massively*, but UAs and OS seem to either not benefit or be slightly hampered by smaller atoms. Would be a good idea to investigate why eventually, and maybe make the bench scripts more flexible so they can be more easily run across the three categories and with varied atom lengths.
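A possible shape for that flexibility, as a sketch only: the `Category` enum, `min_atom_len`, and `bench_matrix` are hypothetical names, not something that exists in the bench programs. The per-category values encode what this change settles on (2 for devices, 3 for UAs and OS).

```rust
/// The three regex sets the benches get run against.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    UserAgent,
    Os,
    Device,
}

impl Category {
    const ALL: [Category; 3] = [Category::UserAgent, Category::Os, Category::Device];

    /// Minimum atom length per category: devices benefit massively
    /// from 2, UAs and OS stay at 3.
    fn min_atom_len(self) -> usize {
        match self {
            Category::Device => 2,
            Category::UserAgent | Category::Os => 3,
        }
    }
}

/// Every (category, atom length) pair to benchmark, so a driver script
/// could loop over all of them instead of being edited by hand.
fn bench_matrix(atom_lens: &[usize]) -> Vec<(Category, usize)> {
    Category::ALL
        .iter()
        .flat_map(|&c| atom_lens.iter().map(move |&len| (c, len)))
        .collect()
}

fn main() {
    for (category, len) in bench_matrix(&[2, 3]) {
        let marker = if len == category.min_atom_len() { " (current)" } else { "" };
        println!("{category:?} with min atom length {len}{marker}");
    }
}
```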
6259750 to adc016f
This was referenced May 11, 2025