-
Notifications
You must be signed in to change notification settings - Fork 66
Improve performance of C queries related to identifiers #242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This rule was highlighted as performing poorly as it created an effective cross product on declarations with the same name, which is expensive on some databases. The performance fix is to: (a) Create a predicate which represents all the conflicting identifiers by simply counting the number of declarations and confirming at least on such declaration is external. (b) Use this to create a result table of Declarations that conflict with those identifiers. (c) Implement a "non unique" external identifier class that provides a member predicate to get all the conflicting declarations. The key point here is to prevent the optimiser from doing a join between Declaration.getName() and Declaration.getName(). Instead, the join is between the names of the non-unique external identifiers and the much smaller table of declarations that conflict with at least one such entry.
This rule was identified as containing one of the slowest predicates in our C Coding Standards query suites, and this commit improves the performance. Performance is improved with the following changes: (a) Factoring out a predicate for the repeated calls to getTarget(), to avoid duplication and to create a semantically meaningful predicate which gets the target for a reference to an external identifier. (b) Use the factored out predicate to refine the reference class to only be references to external identifiers. This reduces the size of the class. (c) Create a predicate for computing a table of external identifiers, references to those identifiers and the translation units those exist in. We can then compute this table once, and use it in both the "find me a reference to this external identifier" case and in the "where the external identifier is not referenced in any other translation unit" case. Part (c) is the critical change. Without that, the optimizer was creating an expensive join order in the negation case, where it was effectively creating a cross product of all references to each external identifier, before later excluding on the negation. The use of our predicate in the negation case means we can first create a table of external identifiers and translation units, then apply that in the negation case without cross producting the references.
🤖 Beep Boop! Matrix Testing for this PR has been initiated. Please check back later for results. |
🤖 Beep Boop! clang/c/x86_64 Matrix Testing for this PR has been completed. See below for the results!
|
🤖 Beep Boop! clang/cpp/x86_64 Matrix Testing for this PR has been completed but I didn't find anything to test! |
🤖 Beep Boop! gcc/cpp/x86_64 Matrix Testing for this PR has been completed but I didn't find anything to test! |
🤖 Beep Boop! gcc/c/x86_64 Matrix Testing for this PR has been completed. See below for the results!
|
🤖 Beep Boop! Matrix Testing for this PR has been completed. If no reports were posted it means this PR does not contain things that need matrix testing! |
🤖 Beep Boop! Matrix Testing for this PR has been initiated. Please check back later for results. |
🤖 Beep Boop! clang/cpp/x86_64 Matrix Testing for this PR has been completed but I didn't find anything to test! |
🤖 Beep Boop! gcc/cpp/x86_64 Matrix Testing for this PR has been completed but I didn't find anything to test! |
🤖 Beep Boop! gcc/c/x86_64 Matrix Testing for this PR has been completed. See below for the results!
|
🤖 Beep Boop! clang/c/x86_64 Matrix Testing for this PR has been completed. See below for the results!
|
🤖 Beep Boop! Matrix Testing for this PR has been completed. If no reports were posted it means this PR does not contain things that need matrix testing! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and TIL!
Description
This addresses performance issues raised by our performance testing on two rules from MISRA C 2012.
Rule 5.8
Per the commit message:
Rule 8.7
Per the commit message:
Change request type
.ql
,.qll
,.qls
or unit tests)Rules with added or modified queries
Rule 5.8
Rule 8.7
Release change checklist
A change note (development_handbook.md#change-notes) is required for any pull request which modifies:
If you are only adding new rule queries, a change note is not required.
Author: Is a change note required?
🚨🚨🚨
Reviewer: Confirm that format of shared queries (not the .qll file, the
.ql file that imports it) is valid by running them within VS Code.
Reviewer: Confirm that either a change note is not required or the change note is required and has been added.
Query development review checklist
For PRs that add new queries or modify existing queries, the following checklist should be completed by both the author and reviewer:
Author
As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
Reviewer
As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.