The Wayback Machine - https://web.archive.org/web/20211018133308/https://github.com/github/codeql/pull/5463
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: Add Header Injection query #5463

Merged
merged 26 commits into from Oct 18, 2021

Conversation

@jorgectf
Copy link
Contributor

@jorgectf jorgectf commented Mar 19, 2021

This PR introduces:

  • Modeling for every way of adding headers to a Flask, Django and Werkzeug response.
  • A query looking for user-input appended to that headers' name/value.
@jorgectf jorgectf marked this pull request as ready for review Apr 3, 2021
@jorgectf jorgectf requested a review from as a code owner Apr 3, 2021
@jorgectf jorgectf force-pushed the jorgectf/python/headerInjection branch from 18889a4 to b0c4986 Apr 8, 2021
@jorgectf
Copy link
Contributor Author

@jorgectf jorgectf commented Jun 18, 2021

The query is now ready for review.

Thanks @tausbn for pointing DefinitionNode out in the Slack group 😊

@github-actions
Copy link
Contributor

@github-actions github-actions bot commented Jun 18, 2021

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged. The following differences were found:

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,370,,,,,,,,
+    `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,417,,,,,,,,
-    Totals,,84,1575,181,13,6,6,,33,1,58
+    Totals,,84,1622,181,13,6,6,,33,1,58
  • Changes to framework-coverage-java.csv:
- org.apache.commons.lang3,,,370,,,,,,,,,,,,,,324,46
+ org.apache.commons.lang3,,,417,,,,,,,,,,,,,,324,93

@github-actions
Copy link
Contributor

@github-actions github-actions bot commented Jun 19, 2021

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged. The following differences were found:

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,370,,,,,,,,
+    `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,417,,,,,,,,
-    Totals,,84,1575,181,13,6,6,,33,1,58
+    Totals,,84,1622,181,13,6,6,,33,1,58
  • Changes to framework-coverage-java.csv:
- org.apache.commons.lang3,,,370,,,,,,,,,,,,,,324,46
+ org.apache.commons.lang3,,,417,,,,,,,,,,,,,,324,93

Copy link
Contributor

@tausbn tausbn left a comment

A few minor things to address, otherwise this looks really nice! 👍

You may also want to look at DataFlow::MethodCallNode, which can help clean up some of the AttrRead juggling that is otherwise needed.

python/ql/src/experimental/semmle/python/Concepts.qll Outdated Show resolved Hide resolved
python/ql/src/experimental/semmle/python/Concepts.qll Outdated Show resolved Hide resolved
| flask_bad.py:44:44:44:50 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| flask_bad.py:44:44:44:55 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute |
| flask_bad.py:44:44:44:69 | ControlFlowNode for Subscript | semmle.label | ControlFlowNode for Subscript |
#select
Copy link
Contributor

@tausbn tausbn Sep 6, 2021

It looks to me as if there are no results for django_bad.py. Is this intended?

Copy link
Contributor Author

@jorgectf jorgectf Sep 7, 2021

I don't really know why the results don't show when querying the whole taint config. My best guess is RemoteFlowSource does not support request.GET.get("rfs_header").

Copy link
Contributor

@tausbn tausbn Sep 20, 2021

Is this something you would like to try to fix? I'm happy to merge this PR without this test passing, but the query may be missing out on results.

Copy link
Contributor Author

@jorgectf jorgectf Sep 22, 2021

Of course, let me check what kind of RFS are for django and I'll change current ones with those 👍

@jorgectf jorgectf dismissed a stale review via 190bc2f Sep 7, 2021
@jorgectf
Copy link
Contributor Author

@jorgectf jorgectf commented Sep 7, 2021

You may also want to look at DataFlow::MethodCallNode, which can help clean up some of the AttrRead juggling that is otherwise needed.

I can't get what you mean by that. Most functions are DataFlow::Node because they may return different values, and so restricting them to DataFlow::MethodCallNode won't get many results. Does that make sense?

As a minor change, I'd like to know what's your view on #5463 (comment)

@tausbn
Copy link
Contributor

@tausbn tausbn commented Sep 20, 2021

Regarding the use of MethodCallNode, I was thinking of something along the lines of
https://github.com/github/codeql/blob/main/python/ql/lib/semmle/python/frameworks/MarkupSafe.qll#L71
for places where you want to identify a call to a method, for instance in your use of headerSetItemCall.

@jorgectf
Copy link
Contributor Author

@jorgectf jorgectf commented Sep 22, 2021

Regarding the use of MethodCallNode, I was thinking of something along the lines of
https://github.com/github/codeql/blob/main/python/ql/lib/semmle/python/frameworks/MarkupSafe.qll#L71
for places where you want to identify a call to a method, for instance in your use of headerSetItemCall.

I think I get what you mean. Should I refactor the modeling to be more alike that one?

@tausbn
Copy link
Contributor

@tausbn tausbn commented Sep 27, 2021

I think I get what you mean. Should I refactor the modeling to be more alike that one?

If you like, sure! In this PR, the difference it makes is fairly limited. I just thought I should make you aware of this class (so that you may add it to your toolbox 🙂).

@jorgectf
Copy link
Contributor Author

@jorgectf jorgectf commented Sep 27, 2021

I think I get what you mean. Should I refactor the modeling to be more alike that one?

If you like, sure! In this PR, the difference it makes is fairly limited. I just thought I should make you aware of this class (so that you may add it to your toolbox 🙂).

Thanks for that! I'm a bit busy lately so I think I'll leave it as it is this time, but I'll practice that way of modeling (looks very cool!) :)

@jorgectf
Copy link
Contributor Author

@jorgectf jorgectf commented Oct 16, 2021

Hi @tausbn, apologies for the delay (I've had some issues with my computer). To make the query detect the django stuff I've had to model django's way to get a get parameter from a request. It won't take all of django's RFS because they are not modeled (as far as I've seen).

tausbn
tausbn approved these changes Oct 18, 2021
@tausbn
Copy link
Contributor

@tausbn tausbn commented Oct 18, 2021

No worries about the delay. Thanks for the additional Django modelling! I think this looks good to merge now. 👍

@tausbn tausbn merged commit 8e68eae into github:main Oct 18, 2021
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.