-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Dataflow: Add a language specific term to join
and branch
#12236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dataflow: Add a language specific term to join
and branch
#12236
Conversation
cpp/ql/lib/experimental/semmle/code/cpp/ir/dataflow/internal/DataFlowPrivate.qll
Fixed
Show fixed
Hide fixed
527a745
to
c575155
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as Python is concerned, this looks okay, I guess. 🙂
I'll be curious to see how this is actually used. At a glance, I would have no idea what kinds of integers to put in getAdditionalFlowIntoCallNodeTerm
.
Thanks for the quick 👀, Taus!
For C/C++ I intent to implement I'm not sure how relevant such a case would be for Python, since I |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved for Go, as it doesn't change anything. Switch statements are used a fair amount in Go, but we haven't noticed any particular slow-down caused by them.
Thanks for the approvals on this already 🙇! I had to merge in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
cpp/ql/lib/experimental/semmle/code/cpp/ir/dataflow/internal/DataFlowImpl.qll
Outdated
Show resolved
Hide resolved
50e8f5d
to
555444f
Compare
cpp/ql/lib/experimental/semmle/code/cpp/ir/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
cpp/ql/lib/experimental/semmle/code/cpp/ir/dataflow/internal/DataFlowImpl3.qll
Outdated
Show resolved
Hide resolved
cpp/ql/lib/experimental/semmle/code/cpp/ir/dataflow/internal/DataFlowImpl.qll
Outdated
Show resolved
Hide resolved
You've got yourself a tiny merge conflict now 😉 |
…ter and argument columns.
981392e
to
92ad099
Compare
@aschackmull a rebase + force-push was the easiest way for me to resolve this fun conflict. I think this PR has been revived now 🤞. |
In the process of adding more dataflow to C/C++ we've seen a large performance regression on a codebase that contains code like:
where the code inside
/* ... */
causes the dataflow library to create a giant number of access paths:As @aschackmull pointed out, this looks a lot like a virtual dispatch hierarchy like (sorry for my bad Java 😅):
and when written like that, the dataflow library recognizes that such calls to
foo
may have a high fan-in/fan-out.However, it's not clear how the dataflow library would recognize the
switch
pattern above to selectively disable field flow for such cases. Thus, this PR adds a new predicate that can be overwritten for each language to increase thejoin
andbranch
values when calculating the expected fan-in and fan-out.I've only implemented a "flow-into" version as this was the only thing that was necessary to fix the performance on C/C++.
Implementing this new predicate as
none()
(as I've done so in this PR) will preserve the current fan-in/fan-out estimates. I hope that makes this PR uncontroversial.Once we implement the predicate for C/C++ (which I'll do in a follow-up PR) the stats will be the much more manageable: