Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Improve performance of C queries related to identifiers #242

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 29, 2023

Conversation

lcartey
Copy link
Collaborator

@lcartey lcartey commented Mar 8, 2023

Description

This addresses performance issues raised by our performance testing on two rules from MISRA C 2012.

Rule 5.8

Per the commit message:

 The performance fix is to:
     (a) Create a predicate which represents all the conflicting identifiers
         by simply counting the number of declarations and confirming at
         least one such declaration is external.
     (b) Use this to create a result table of Declarations that conflict
         with those identifiers.
     (c) Implement a "non unique" external identifier class that provides a
         member predicate to get all the conflicting declarations.
    
    The key point here is to prevent the optimiser from doing a join between
    Declaration.getName() and Declaration.getName(). Instead, the join is
    between the names of the non-unique external identifiers and the much
    smaller table of declarations that conflict with at least one such
    entry

Rule 8.7

Per the commit message:

    Performance is improved with the following changes:
     (a) Factoring out a predicate for the repeated calls to getTarget(), to
         avoid duplication and to create a semantically meaningful predicate
         which gets the target for a reference to an external identifier.
     (b) Use the factored out predicate to refine the reference class to
         only be references to external identifiers. This reduces the size
         of the class.
     (c) Create a predicate for computing a table of external identifiers,
         references to those identifiers and the translation units those
         exist in. We can then compute this table once, and use it in both
         the "find me a reference to this external identifier" case and in
         the "where the external identifier is not referenced in any other
         translation unit" case.
    
    Part (c) is the critical change. Without that, the optimizer was
    creating an expensive join order in the negation case, where it
    was effectively creating a cross product of all references to each
    external identifier, before later excluding on the negation. The use
    of our predicate in the negation case means we can first create a table
    of external identifiers and translation units, then apply that in the
    negation case without cross producting the references.

Change request type

  • Release or process automation (GitHub workflows, internal scripts)
  • Internal documentation
  • External documentation
  • Query files (.ql, .qll, .qls or unit tests)
  • External scripts (analysis report or other code shipped as part of a release)

Rules with added or modified queries

  • No rules added
  • Queries have been added for the following rules:
    • rule number here
  • Queries have been modified for the following rules:
    • Rule 5.8
    • Rule 8.7

Release change checklist

A change note (development_handbook.md#change-notes) is required for any pull request which modifies:

  • The structure or layout of the release artifacts.
  • The evaluation performance (memory, execution time) of an existing query.
  • The results of an existing query in any circumstance.

If you are only adding new rule queries, a change note is not required.

Author: Is a change note required?

  • Yes
  • No

🚨🚨🚨
Reviewer: Confirm that format of shared queries (not the .qll file, the
.ql file that imports it) is valid by running them within VS Code.

  • Confirmed

Reviewer: Confirm that either a change note is not required or the change note is required and has been added.

  • Confirmed

Query development review checklist

For PRs that add new queries or modify existing queries, the following checklist should be completed by both the author and reviewer:

Author

  • Have all the relevant rule package description files been checked in?
  • Have you verified that the metadata properties of each new query is set appropriately?
  • Do all the unit tests contain both "COMPLIANT" and "NON_COMPLIANT" cases?
  • Are the alert messages properly formatted and consistent with the style guide?
  • Have you run the queries on OpenPilot and verified that the performance and results are acceptable?
    As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
  • Does the query have an appropriate level of in-query comments/documentation?
  • Have you considered/identified possible edge cases?
  • Does the query not reinvent features in the standard library?
  • Can the query be simplified further (not golfed!)

Reviewer

  • Have all the relevant rule package description files been checked in?
  • Have you verified that the metadata properties of each new query is set appropriately?
  • Do all the unit tests contain both "COMPLIANT" and "NON_COMPLIANT" cases?
  • Are the alert messages properly formatted and consistent with the style guide?
  • Have you run the queries on OpenPilot and verified that the performance and results are acceptable?
    As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
  • Does the query have an appropriate level of in-query comments/documentation?
  • Have you considered/identified possible edge cases?
  • Does the query not reinvent features in the standard library?
  • Can the query be simplified further (not golfed!)

lcartey added 3 commits March 8, 2023 11:24
This rule was highlighted as performing poorly as it created an
effective cross product on declarations with the same name, which is
expensive on some databases.

The performance fix is to:
 (a) Create a predicate which represents all the conflicting identifiers
     by simply counting the number of declarations and confirming at
     least on such declaration is external.
 (b) Use this to create a result table of Declarations that conflict
     with those identifiers.
 (c) Implement a "non unique" external identifier class that provides a
     member predicate to get all the conflicting declarations.

The key point here is to prevent the optimiser from doing a join between
Declaration.getName() and Declaration.getName(). Instead, the join is
between the names of the non-unique external identifiers and the much
smaller table of declarations that conflict with at least one such
entry.
This rule was identified as containing one of the slowest predicates in
our C Coding Standards query suites, and this commit improves the
performance.

Performance is improved with the following changes:
 (a) Factoring out a predicate for the repeated calls to getTarget(), to
     avoid duplication and to create a semantically meaningful predicate
     which gets the target for a reference to an external identifier.
 (b) Use the factored out predicate to refine the reference class to
     only be references to external identifiers. This reduces the size
     of the class.
 (c) Create a predicate for computing a table of external identifiers,
     references to those identifiers and the translation units those
     exist in. We can then compute this table once, and use it in both
     the "find me a reference to this external identifier" case and in
     the "where the external identifier is not referenced in any other
     translation unit" case.

Part (c) is the critical change. Without that, the optimizer was
creating an expensive join order in the negation case, where it
was effectively creating a cross product of all references to each
external identifier, before later excluding on the negation. The use
of our predicate in the negation case means we can first create a table
of external identifiers and translation units, then apply that in the
negation case without cross producting the references.
@github-actions
Copy link

github-actions bot commented Mar 8, 2023

🤖 Beep Boop! Matrix Testing for this PR has been initiated. Please check back later for results.

💡 If you do not hear back from me please check my status! I will report even if this PR does not contain files eligible for matrix testing.

@jsinglet
Copy link
Contributor

jsinglet commented Mar 8, 2023

🤖 Beep Boop! clang/c/x86_64 Matrix Testing for this PR has been completed. See below for the results!


COMPILE_ERROR_OUTPUT : 
QUERY                : IdentifiersWithExternalLinkageNotUnique
PACKAGE              : Declarations6
TEST_PASS            : True
RULE                 : RULE-5-8
COMPILE_PASS         : True
TEST_DIFFERENCE      : 
SUITE                : MISRA-C-2012

COMPILE_ERROR_OUTPUT : 
QUERY                : ShouldNotBeDefinedWithExternalLinkage
PACKAGE              : Declarations6
TEST_PASS            : True
RULE                 : RULE-8-7
COMPILE_PASS         : True
TEST_DIFFERENCE      : 
SUITE                : MISRA-C-2012


@jsinglet
Copy link
Contributor

jsinglet commented Mar 8, 2023

🤖 Beep Boop! clang/cpp/x86_64 Matrix Testing for this PR has been completed but I didn't find anything to test!

@jsinglet
Copy link
Contributor

jsinglet commented Mar 8, 2023

🤖 Beep Boop! gcc/cpp/x86_64 Matrix Testing for this PR has been completed but I didn't find anything to test!

@jsinglet
Copy link
Contributor

jsinglet commented Mar 8, 2023

🤖 Beep Boop! gcc/c/x86_64 Matrix Testing for this PR has been completed. See below for the results!


COMPILE_PASS         : True
RULE                 : RULE-5-8
TEST_DIFFERENCE      : 
QUERY                : IdentifiersWithExternalLinkageNotUnique
SUITE                : MISRA-C-2012
PACKAGE              : Declarations6
TEST_PASS            : True
COMPILE_ERROR_OUTPUT : 

COMPILE_PASS         : True
RULE                 : RULE-8-7
TEST_DIFFERENCE      : 
QUERY                : ShouldNotBeDefinedWithExternalLinkage
SUITE                : MISRA-C-2012
PACKAGE              : Declarations6
TEST_PASS            : True
COMPILE_ERROR_OUTPUT : 


@jsinglet
Copy link
Contributor

jsinglet commented Mar 8, 2023

🤖 Beep Boop! Matrix Testing for this PR has been completed. If no reports were posted it means this PR does not contain things that need matrix testing!

@github-actions
Copy link

github-actions bot commented Mar 9, 2023

🤖 Beep Boop! Matrix Testing for this PR has been initiated. Please check back later for results.

💡 If you do not hear back from me please check my status! I will report even if this PR does not contain files eligible for matrix testing.

@jsinglet
Copy link
Contributor

jsinglet commented Mar 9, 2023

🤖 Beep Boop! clang/cpp/x86_64 Matrix Testing for this PR has been completed but I didn't find anything to test!

@jsinglet
Copy link
Contributor

jsinglet commented Mar 9, 2023

🤖 Beep Boop! gcc/cpp/x86_64 Matrix Testing for this PR has been completed but I didn't find anything to test!

@jsinglet
Copy link
Contributor

jsinglet commented Mar 9, 2023

🤖 Beep Boop! gcc/c/x86_64 Matrix Testing for this PR has been completed. See below for the results!


SUITE                : MISRA-C-2012
PACKAGE              : Declarations6
RULE                 : RULE-5-8
TEST_PASS            : True
TEST_DIFFERENCE      : 
COMPILE_PASS         : True
COMPILE_ERROR_OUTPUT : 
QUERY                : IdentifiersWithExternalLinkageNotUnique

SUITE                : MISRA-C-2012
PACKAGE              : Declarations6
RULE                 : RULE-8-7
TEST_PASS            : True
TEST_DIFFERENCE      : 
COMPILE_PASS         : True
COMPILE_ERROR_OUTPUT : 
QUERY                : ShouldNotBeDefinedWithExternalLinkage


@jsinglet
Copy link
Contributor

jsinglet commented Mar 9, 2023

🤖 Beep Boop! clang/c/x86_64 Matrix Testing for this PR has been completed. See below for the results!


PACKAGE              : Declarations6
COMPILE_PASS         : True
COMPILE_ERROR_OUTPUT : 
RULE                 : RULE-5-8
SUITE                : MISRA-C-2012
QUERY                : IdentifiersWithExternalLinkageNotUnique
TEST_PASS            : True
TEST_DIFFERENCE      : 

PACKAGE              : Declarations6
COMPILE_PASS         : True
COMPILE_ERROR_OUTPUT : 
RULE                 : RULE-8-7
SUITE                : MISRA-C-2012
QUERY                : ShouldNotBeDefinedWithExternalLinkage
TEST_PASS            : True
TEST_DIFFERENCE      : 


@jsinglet
Copy link
Contributor

jsinglet commented Mar 9, 2023

🤖 Beep Boop! Matrix Testing for this PR has been completed. If no reports were posted it means this PR does not contain things that need matrix testing!

@knewbury01 knewbury01 self-requested a review March 13, 2023 15:53
Copy link
Contributor

@mbaluda mbaluda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and TIL!

@lcartey lcartey added this pull request to the merge queue Mar 29, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Mar 29, 2023
@lcartey lcartey added this pull request to the merge queue Mar 29, 2023
Merged via the queue into main with commit d3d017b Mar 29, 2023
@lcartey lcartey deleted the lcartey/improve-identifier-performance branch March 29, 2023 22:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

3 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.