[2.7] bpo-26544: Make platform.libc_ver() less slow #10868
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -194,7 +194,10 @@ def libc_ver(executable=sys.executable,lib='',version='', chunksize=2048): | |||||||
| binary = f.read(chunksize) | ||||||||
| pos = 0 | ||||||||
| while pos < len(binary): | ||||||||
| m = _libc_search.search(binary,pos) | ||||||||
| if 'libc' in binary or 'GLIBC' in binary: | ||||||||
|
Member
Maybe use
Suggested change
Member
Author
What is the difference between ('libc' in binary) and (binary.find('libc', pos) >= 0)? They are supposed to be equivalent, no? Last time I looked at micro-optimizations, an operator was faster than a method call.
Member
They are equivalent only when
Member
Author
If you know that the regex will never match before offset N, maybe we could use file.seek(N)? I don't know where the string is supposed to match, so I prefer to avoid making any assumptions. ... By the way, parsing a binary file to find a string in order to extract a version number is really ugly. I would prefer that the libc provide its own version at runtime. IMHO running "ldd --version" or directly "/lib64/libc.so.6" would be less ugly:
Member
Alternate suggestion:
Suggested change
Member
Author
I don't see the point of avoiding the two "in" if pos==0. Does it provide any speedup? This code comes from the master branch. If you have a clever optimization, maybe write it in the master branch first, no? This change already makes the function 16x faster; that should be enough, no?
Member
It avoids two "in" if If |
||||||||
| m = _libc_search.search(binary, pos) | ||||||||
| else: | ||||||||
| m = None | ||||||||
| if not m or m.end() == len(binary): | ||||||||
| chunk = f.read(chunksize) | ||||||||
| if chunk: | ||||||||
|
|
Wow. That must have been slow.
Honestly, I'm disappointed by the bad performance of re.search(). re should be faster, since it is supposed to search for the "GLIB" and "libc" patterns "at the same time". For example, it could use two bloom filters at the "same time". But no, doing the plain substring check first is 16x faster. I don't get it, but I never looked into _sre.c.
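For readers following the thread, here is a minimal sketch of the pre-check discussed above, written for Python 3 with bytes (the PR itself targets 2.7 and uses str). The pattern `_LIBC_SEARCH` and the helper `search_chunk` are hypothetical stand-ins for illustration, not the exact code in `platform.py`. The sketch also illustrates the point raised in review: `in` scans the whole chunk, while `find(..., pos)` only scans from `pos` onward, so the two checks can disagree once `pos` is greater than 0.

```python
import re

# Assumed, simplified stand-in for platform._libc_search (hypothetical name).
# Every alternative contains either b"libc" or b"GLIBC", which is what makes
# the substring pre-check below safe.
_LIBC_SEARCH = re.compile(
    rb"(__libc_init)"
    rb"|(GLIBC_([0-9.]+))"
    rb"|(libc(_\w+)?\.so(?:\.(\d[0-9.]*))?)"
)


def search_chunk(binary, pos):
    """Return a regex match at or after pos, or None (hypothetical helper)."""
    # Cheap substring test over the whole chunk: if neither marker is
    # present, the regex cannot match, so skip the regex call entirely.
    if b"libc" in binary or b"GLIBC" in binary:
        return _LIBC_SEARCH.search(binary, pos)
    return None


# A chunk of NUL bytes is rejected by the pre-check without running the regex.
no_marker = search_chunk(b"\0" * 2048, 0)

# A chunk that contains a version string still matches as before.
match = search_chunk(b"\0\0GLIBC_2.17\0", 0)

# `in` versus find(..., pos): the marker sits before pos, so `in` is True
# while find() from pos reports no occurrence.
chunk = b"xx libc.so.6 " + b"\0" * 32
pos = 20
found_anywhere = b"libc" in chunk
found_from_pos = chunk.find(b"libc", pos) >= 0
```

In the last case the pre-check passes but the regex search from `pos` still returns no match, so the result stays correct; the `in` form only gives up a little of the saving when the marker lies entirely before `pos`.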