-
-
Notifications
You must be signed in to change notification settings - Fork 2k
ci(docs-preview): Refactor to use pull_request_target
#4264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pinging @pwntester for review 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Context if it helps work through the review faster :)
workflow_call: | ||
inputs: | ||
preview-context: | ||
description: 'Preview Metadata (JSON)' | ||
required: true | ||
type: string | ||
secrets: | ||
NETLIFY_AUTH_TOKEN: | ||
required: true | ||
NETLIFY_SITE_ID: | ||
required: true | ||
|
||
env: | ||
BUILD_DIR: ${{ fromJSON( inputs.preview-context ).build_dir }} | ||
# PR head SHA (latest commit): | ||
PR_HEADSHA: ${{ fromJSON( inputs.preview-context ).pull_request.head_sha }} | ||
PR_NUMBER: ${{ fromJSON( inputs.preview-context ).pull_request.number }} | ||
# Deploy URL preview prefix (the site name for the prefix is managed at Netlify): | ||
PREVIEW_SITE_PREFIX: ${{ fromJSON( inputs.preview-context ).netlify.deploy_prefix }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both workflows should be fairly easy to grok now with inputs
/ env
hoisted up to the top of the workflow.
Let me know if you disagree with the JSON input approach. I could make each one a separate input instead, but that seemed redundant and would just add more noise?
# Common inputs shared between the two workflow jobs (`preview` + `deploy`). | ||
# Composed as JSON to pass as a single input which each called job will map into separate ENV for use. | ||
env: | ||
PREVIEW_CONTEXT: | | ||
{ | ||
"build_dir": "docs/site/", | ||
"netlify": { | ||
"site_name": "dms-doc-previews", | ||
"deploy_prefix": "pullrequest-${{ github.event.pull_request.number }}" | ||
}, | ||
"pull_request": { | ||
"head_repo": "${{ github.event.pull_request.head.repo.full_name }}", | ||
"head_sha": "${{ github.event.pull_request.head.sha }}", | ||
"number": "${{ github.event.pull_request.number }}" | ||
} | ||
} | ||
# Grant `secrets.GITHUB_TOKEN` only the minimum permissions needed by actions to function: | ||
# https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/controlling-permissions-for-github_token | ||
# NOTE: See the associated `preview` and `deploy` workflows called for when these permissions are needed. | ||
permissions: | ||
contents: read | ||
pull-requests: write |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the only "dynamic" config for the workflow. Not sure if the repo.full_name
context could be used here for manipulation?
permissions
has been limited accordingly. The prepare
workflow that later runs the PR branch to build the docs only has contents: read
available, but AFAIK should not have access to GITHUB_TOKEN
on the PR author's side of things and they no longer have influence over the workflow itself.
That only leaves build-docs.sh
as a concern, but with the restricted permissions for GITHUB_TOKEN
and preventing any secrets from being in scope via workflow_call
trigger, I think it's very close to the restricted environment of pull_request
event triggered workflows? (provided any future additions don't introduce risks in docs-preview-prepare.yml
)
# The `prepare` job is for running steps in an untrusted context (necessary to build the docs): | ||
# CAUTION: This runs a build script which the PR could modify for malicious purposes. | ||
prepare: | ||
needs: [create-context] | ||
uses: docker-mailserver/docker-mailserver/.github/workflows/docs-preview-prepare.yml@main | ||
with: | ||
preview-context: ${{ needs.create-context.outputs.preview-context }} | ||
|
||
# The `deploy` job is for running the remaining steps in a trusted context after building the PR branch: | ||
# CAUTION: Do not execute any content from untrusted sources (the PR branch or the retrieved artifact from the `prepare` job) | ||
deploy: | ||
needs: [create-context, prepare] | ||
uses: docker-mailserver/docker-mailserver/.github/workflows/docs-preview-deploy.yml@main | ||
secrets: | ||
NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }} | ||
NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }} | ||
with: | ||
preview-context: ${{ needs.create-context.outputs.preview-context }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presently the prepare
+ deploy
jobs here have uses
with an explicit <org>/<repo> ... @<branch>
value.
I could relax that to omit those. PRs for branches from forks would not run changes on those branches AFAIK so that'd be ok. Might be better for any future maintenance to run the PR change, but I'm not sure how that may affect PRs from forks, especially with pull_request_target
, it'd be the workflow from the PR base branch (which most of the time would be master
).
EDIT: Fixed the @main
refs to point to @master
via follow-up commit.
# Prevent `secrets.GITHUB_TOKEN` from being stored in a `.git/config` file, which could otherwise | ||
# be compromised when executing untrusted code from a PR (as is done in the build step below). | ||
persist-credentials: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this, my understanding is the equivalent of secrets.GITHUB_TOKEN
would be stored in .git/config
, so whatever permissions
grant, the PR author could leverage for malicious intent should they gain access to the token.
I'm not sure how many other actions may cache the token, or other secrets by default like actions/checkout
does implicitly. I don't think we would have many scenarios for workflows requiring to execute untrusted third-party content + access to secrets... presumably this is mostly a concern for PR oriented workflows when pull_request
isn't sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The token is available to an attacker by dumping the runner memory so you need to assume that the attacker will be able to get Pull Request write permissions
https://karimrahal.com/2023/01/05/github-actions-leaking-secrets/
The safest approach is to run all untrusted code in a pull_request
triggered workflow and pass any needed information to a workflow_run
privileged workflow but verifying this information to make sure its the expected one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The token is available to an attacker by dumping the runner memory
Are you saying that persist-credentials: false
is not worthwhile then?
EDIT: I have dumped memory as explained in the linked article and can confirm if those commands can be run then it's possible to get the GH token that way too 👍
you need to assume that the attacker will be able to get Pull Request write permissions
I don't understand how the attacker would get these permissions. pull_request_target
does have those permissions, but this prepare
job that calls the docs-preview-prepare.yml
workflow is via a separate runner instance and has permissions
set to only contents: read
.
So while it is grouped under the same workflow run ID it should not have the calling workflow (docs-preview.yml
) token in memory? Was this a misunderstanding during your review?
The safest approach is to run all untrusted code in a
pull_request
triggered workflow and pass any needed information to aworkflow_run
privileged workflow but verifying this information to make sure its the expected one.
I don't mind taking that approach, and I do agree with you that pull_request
+ workflow_run
is the approach I would generally advise anyone else due to the risks of pull_request_target
, but I would appreciate clarification on what I've done with this PR that does not mirror the equivalent pull_request
environment for running docs-preview-prepare.yml
?
The default permissions
for pull_request
when the PR is from a fork, to my knowledge is read
for all scopes, while secrets
is none
? Have I not secured docs-preview-prepare.yml
in the same way by restricting permissions
and secrets
via this separate workflow_call
workflow?
Is there something else that pull_request
does differently to be more secure?
With pull_request_target
the PR author cannot modify the workflow that is run, unlike pull_request
.
Presently that benefit isn't relevant due to build-docs.sh
, but otherwise the build process would be sandboxed within the Docker container used, thus even if the build process was manipulated (eg: compromise via Docker image or it's dependencies), it is very unlikely an attack could then affect something on the Docker host (runner) provided our only interaction with the build output is via artifact upload/download of the build directory.
# CAUTION: Security risk due to executing `build-docs.sh` (which a PR could modify to be malicious): | ||
# Care must be taken to run this workflow as if it were in an untrusted context (like the `on: pull_request` event trigger would). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason we allow this is so that an update to the Docker image tag via PR would trigger the workflow to detect any breakage to docs generation.
I could make another workflow_call
workflow file for this instead (docs-production-deploy.yml
requires calling build-docs.sh
too), and perhaps with uses:
excluding the repo and branch pin, we'd still be able to verify that version bump PR works correctly 🤔
AlternativeI recently came across how another project chose to resolve the PR metadata concern: zizmorcore/zizmor#156 (comment) From the links referenced diff:
- name: Save PR number
run: |
echo "${{ github.event.number }}" > .pr_number
- name: Validate and set PR number
run: |
PR_NUMBER=$(cat docs-preview/.pr_number)
if ! [[ "$PR_NUMBER" =~ ^[0-9]+$ ]]; then
echo "Invalid PR number: $PR_NUMBER"
exit 1
fi
echo "PR_NUMBER=$PR_NUMBER" >> $GITHUB_ENV Notes:
UPDATE: I assisted another project to apply this approach: nonebot/nonebot2#3124 Potential improvement. Instead of It may be worth considering the above as an alternative to what this PR proposes with
|
I had a somewhat superficial review because this is rather intricate, and I lack the time to perform an in-depth investigation. The changes look good to me, and when you resolve some of the discussions you started yourself (branch names, etc.), I think we can ship this. I cannot give an opinion on #4264 (comment). Unless you implement it because you know it's better, I'd go with the current set of changes. |
That alternative is to revert back to our original
I'm personally not a fan of the old approach we had.
However this is not an approach I'd contribute to other projects as there is some risk if maintainers make changes to the workflow without care. The To my knowledge |
# CAUTION: This runs a build script which the PR could modify for malicious purposes. | ||
prepare: | ||
needs: [create-context] | ||
uses: docker-mailserver/docker-mailserver/.github/workflows/docs-preview-prepare.yml@master |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This job is running untrusted code in the context of pull_request_target. There are no secrets shared with the job and only Pull Request write permissions so the risk is low, however there is still some risk. Cant this be moved to a pull_request
workflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This job is running untrusted code in the context of
pull_request_target
. There are no secrets shared with the job and only Pull Requestwrite
permissions so the risk is low,
It should only have contents: read
permission. I know that it inherits from this pull_request_target
permissions which have been limited, but the prepare workflow_call
itself is restricted to only contents: read
, pull_requests: write
is for the separate deploy workflow_call
.
The calling workflow docs-preview.yml
/ pull_request_target
must grant these permissions as the sub-workflows called can only reduce the permissions scope, not add extra permissions.
Unfortunately docs-preview-prepare.yml
cannot enforce secrets: none
, so there is potential risk there should a maintainer call the workflow with secrets: inherit
or pass a secret through.
Cant this be moved to a
pull_request
workflow?
That's what it was before this PR.
- This PR has 3 workflow files, but is run as a single workflow run.
- The approach prior to this is
pull_request
+workflow_run
, two separate workflow files, but also treated as two separate workflow runs on the actions tab, and that complicates the management a bit with PR context + commit status management.
If we change back to pull_request
, I need to run the deploy job with access to secrets and the PR metadata (the number to write the PR comment with preview link). Originally as you know we had a vulnerability there with how the PR metadata as passed via artifact and loaded into ENV, so there is still potential risk.
# Prevent `secrets.GITHUB_TOKEN` from being stored in a `.git/config` file, which could otherwise | ||
# be compromised when executing untrusted code from a PR (as is done in the build step below). | ||
persist-credentials: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The token is available to an attacker by dumping the runner memory so you need to assume that the attacker will be able to get Pull Request write permissions
https://karimrahal.com/2023/01/05/github-actions-leaking-secrets/
The safest approach is to run all untrusted code in a pull_request
triggered workflow and pass any needed information to a workflow_run
privileged workflow but verifying this information to make sure its the expected one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL:DR:
- The restricted workflow (
docs-preview-prepare.yml
) only hascontents: read
granted. As there is no secrets, it should roughly mirror thepull_request
workflow environment security wise? - Confirmed memory dump for accessing the GH token, provided PR author can run shell commands on the host.
While I'm fine to take a different approach than this PR proposes, I would greatly appreciate confirmation on pull_request_target
+ workflow_call
(as has been implemented here) as being secure, akin to a pull_request
event triggered for PRs from forks.
AFAIK with the proposed approach, the PR author has even less flexibility for an attack than they would with pull_request
.
# CAUTION: This runs a build script which the PR could modify for malicious purposes. | ||
prepare: | ||
needs: [create-context] | ||
uses: docker-mailserver/docker-mailserver/.github/workflows/docs-preview-prepare.yml@master |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This job is running untrusted code in the context of
pull_request_target
. There are no secrets shared with the job and only Pull Requestwrite
permissions so the risk is low,
It should only have contents: read
permission. I know that it inherits from this pull_request_target
permissions which have been limited, but the prepare workflow_call
itself is restricted to only contents: read
, pull_requests: write
is for the separate deploy workflow_call
.
The calling workflow docs-preview.yml
/ pull_request_target
must grant these permissions as the sub-workflows called can only reduce the permissions scope, not add extra permissions.
Unfortunately docs-preview-prepare.yml
cannot enforce secrets: none
, so there is potential risk there should a maintainer call the workflow with secrets: inherit
or pass a secret through.
Cant this be moved to a
pull_request
workflow?
That's what it was before this PR.
- This PR has 3 workflow files, but is run as a single workflow run.
- The approach prior to this is
pull_request
+workflow_run
, two separate workflow files, but also treated as two separate workflow runs on the actions tab, and that complicates the management a bit with PR context + commit status management.
If we change back to pull_request
, I need to run the deploy job with access to secrets and the PR metadata (the number to write the PR comment with preview link). Originally as you know we had a vulnerability there with how the PR metadata as passed via artifact and loaded into ENV, so there is still potential risk.
# Prevent `secrets.GITHUB_TOKEN` from being stored in a `.git/config` file, which could otherwise | ||
# be compromised when executing untrusted code from a PR (as is done in the build step below). | ||
persist-credentials: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The token is available to an attacker by dumping the runner memory
Are you saying that persist-credentials: false
is not worthwhile then?
EDIT: I have dumped memory as explained in the linked article and can confirm if those commands can be run then it's possible to get the GH token that way too 👍
you need to assume that the attacker will be able to get Pull Request write permissions
I don't understand how the attacker would get these permissions. pull_request_target
does have those permissions, but this prepare
job that calls the docs-preview-prepare.yml
workflow is via a separate runner instance and has permissions
set to only contents: read
.
So while it is grouped under the same workflow run ID it should not have the calling workflow (docs-preview.yml
) token in memory? Was this a misunderstanding during your review?
The safest approach is to run all untrusted code in a
pull_request
triggered workflow and pass any needed information to aworkflow_run
privileged workflow but verifying this information to make sure its the expected one.
I don't mind taking that approach, and I do agree with you that pull_request
+ workflow_run
is the approach I would generally advise anyone else due to the risks of pull_request_target
, but I would appreciate clarification on what I've done with this PR that does not mirror the equivalent pull_request
environment for running docs-preview-prepare.yml
?
The default permissions
for pull_request
when the PR is from a fork, to my knowledge is read
for all scopes, while secrets
is none
? Have I not secured docs-preview-prepare.yml
in the same way by restricting permissions
and secrets
via this separate workflow_call
workflow?
Is there something else that pull_request
does differently to be more secure?
With pull_request_target
the PR author cannot modify the workflow that is run, unlike pull_request
.
Presently that benefit isn't relevant due to build-docs.sh
, but otherwise the build process would be sandboxed within the Docker container used, thus even if the build process was manipulated (eg: compromise via Docker image or it's dependencies), it is very unlikely an attack could then affect something on the Docker host (runner) provided our only interaction with the build output is via artifact upload/download of the build directory.
UPDATE: #4267 As another alternative (Solution 4). I have recently documented This would be for # Minimum read permissions needed for `gh pr view`:
permissions:
pull-requests: read
contents: read
jobs:
# ...
steps:
- name: 'Get PR context'
id: pr-context
env:
# Token required for GH CLI:
GH_TOKEN: ${{ github.token }}
# Best practice for scripts is to reference via ENV at runtime. Avoid using the expression syntax in the script content directly:
PR_TARGET_REPO: ${{ github.repository }}
# If the PR is from a fork, prefix it with `<owner-login>:`, otherwise only the PR branch name is relevant:
PR_BRANCH: |-
${{
(github.event.workflow_run.head_repository.owner.login != github.event.workflow_run.repository.owner.login)
&& format('{0}:{1}', github.event.workflow_run.head_repository.owner.login, github.event.workflow_run.head_branch)
|| github.event.workflow_run.head_branch
}}
# Use GH CLI to query the PR branch for the PR number and assign it to the steps output:
# (`--jq` formats JSON to `key=value` pairs and renames `headRefOid` to `head-sha`)
run: |
gh pr view --repo "${PR_TARGET_REPO}" "${PR_BRANCH}" \
--json 'number,headRefOid' \
--jq '"number=\(.number)\nhead-sha=\(.headRefOid)"' \
>> $GITHUB_OUTPUT This then provides the PR context via outputs |
I think this is the safest approach. Unfortunately using |
Thanks for pointing this out! I had read similar caution from the GHA docs on A little unfortunate perhaps that we cannot opt-out of these attack vectors when we do not need the workflow to support that functionality. Technically if my concern with
Any specific reason for the Validating a PR number or SHA hash value are simple enough (I documented those here under "Solution A"), but gets a bit more complicated when the values aren't as simple to validate, and that validation alone does not ensure they're the expected values 🤔 Since these specific values can be reliably sourced via I'd really like to express my gratitude for your feedback and insights shared here. The articles you've linked are fantastic at demonstrating the dangers of While I do prefer this I will move forward with Solution B ( |
Closing in favor of #4267 Thank you so much @pwntester for your feedback! ❤️ I almost made an even bigger mistake here with If you engage with future projects, or someone from GHSL decides to write up another article on how to approach this workflow securely, I hope this reference I've pieced together can be helpful. It's the best I can do to give back :) |
Great write-up! Im planning a blog post about common pitfalls and the missing validations of artifacts in |
Description
This should restore our docs preview workflow to be compatible with PRs from forks.
pull_request_target
) discussed here: ci: Revisedocs-preview-deploy.yml
#4247 (comment)Context
Originally we had compatibility with PRs from forks with the
pull_request
event workflow, but relied on passing the PR metadata via ENV in a file that the PR author could manipulate which was a security risk.workflow_run
when PRs were from forks.pull_request_target
seems like the most appropriate choice, provided it's handled carefully by maintainers.Change overview
Changes from
pull_request
(untrusted) +workflow_run
(trusted) workflows, to use apull_request_target
workflow with separateprepare
(untrusted) +deploy
(trusted) re-usable workflows managed.docs-preview.yml
is now the entrypoint (viapull_request_target
event trigger).prepare
+deploy
receive any configurables and secrets via inputs (JSON) fromdocs-preview.yml
.pull_request_target
can safely be used to build the PR in a restricted trust environment. The only secret available isGITHUB_TOKEN
with minimal permissions, but this should not be accessible to the contributor as their PR cannot modify this workflow like they could withpull_request
event based workflows, and theactions/checkout
step usespersist-credentials: false
to ensure it's accidentally stored on disk.actions/downloads
step and removes the need for commit status management to have a check suite status for theworkflow_run
workflow previously used.docs-preview-prepare.yml
+docs-preview-deploy.yml
provide re-usable workflows (workflow_call
trigger) instead of inline jobs indocs-preview
to minimizepermissions
andsecrets
to each job accordingly. It also hopefully communicates the distinction of untrusted vs trusted contexts for maintainers.Type of change
Checklist
CHANGELOG.md