The key to this attack is: "The result of these settings is that, by default, any repository contributor can execute code on the self-hosted runner by submitting a malicious PR."
Problem: you need to be a "contributor" to the repo for your PR to trigger workflows without someone approving them first.
So: "We needed to be a contributor to the PyTorch repository to execute workflows, but we didn’t feel like spending time adding features to PyTorch. Instead, we found a typo in a markdown file and submitted a fix."
I really don't like this aspect of GitHub that people who have submitted a typo fix gain additional privileges on the repo by default. That's something GitHub can fix: I think "this user gets to trigger PRs without approval in the future" should be an active button repo administrators need to click, maybe in the PR flow there could be "Approve this run" and "Approve this run and all future runs by user X" buttons.
The vast majority of repos should be able to run CI on pull requests with no privileges at all. GitHub can manage any resource utilization issues on their end.
Is the issue here that a self-hosted runner was needed for some hardware tests?
The problem is that there are fundamentally 2 different kinds of builds, but the current tooling is weak:
* pre-merge builds on PRs. These should not have privileges, but making the distinction between the two cases requires a lot of care.
* official builds of "master" or a feature branch from the main repo. These "need" privileges to upload the resulting artifacts somewhere. Of course, if all it did was wake up some daemon elsewhere, which could download straight from the CI in a verified way based on the CI's notion of the job name, it would be secure without privileges, but most CI systems don't want to preserve huge artifacts, and maintaining the separate daemon is also annoying.
If you're using GHA to publish then you still need a trusted branch to provide secrets to. If you're not publishing using CI, then you can still upload to PyPI manually with an API Token.
> without needing to give microsoft total access to uploading to pypi
I assume you're referring to Trusted Publishers here? It's a per-project configuration using the industry standard of OIDC that you don't have to opt in to, so "total access" is a silly characterization. Also if you're insinuating that MS is going to generate fraudulent OIDC tokens to compromise a PyPI package, then you might want to start weaning yourself off the kool-aid.
Most likely the plan is to make it compulsory for all projects eventually, just like they made 2fa compulsory.
So less secure is considered as MORE secure by pypi :) Which is consistent with the idea that no PGP signature is more secure than signed uploads. Or the idea that a global token in a clear text file is somehow safer than a password that gets typed every time.
Ideally the builds for external PRs should not gain any access to any secret. But it is not practical. For example, you may want to use docker containers. Then you will need to download docker images from somewhere. For example, downloading ubuntu:20.04 from docker hub. That one requires a token, otherwise your request may get throttled. Even accessing most Github services requires a token. I agree with you that these things require a lot of care. In reality most dev teams are not willing to put enough time on securing their build system. That's why supply chain attacks are so common today.
Absolutely. The real difficulty is tests on PR are by definition remote code execution by an untrusted source, so a full risk analysis and hardening needs to be done.
Pytorch did use GH secrets for the valuables and you can see that this wasn't enough, right there in the OP, because the self-hosted runners are still shared
Yup! This is what makes this kind of attack scary and very unique to GitHub Actions. The baseline GITHUB_TOKEN just blows the door open on lateral movement via workflow_dispatch and and repository_dispatch events.
In several of our other operations, not just PyTorch, we leveraged workflow_dispatch to steal a PAT from another workflows. Developers tend to over-provision PATs so often. More often than not we'd end up with a PAT that has all scopes checked and org admin permissions. With that one could clean out all of the secrets from an organization in minutes using automated tools such as https://github.com/praetorian-inc/gato.
Correct. For fork PR workflows on the pull_request trigger the GITHUB_TOKEN has read only permissions, so you can’t do anything with it.
The key thing with a non-ephemeral runner is that (after obtaining persistence) you can grab the GITHUB_TOKEN from a subsequent non-fork PR build or a build on another trigger, which will have write permissions unless restricted by the repository maintainers.
I think 'environments' was meant, where diff GHA environments get diff secrets, and policies dictate who gets to run what actions with what envs.
But that is real work to setup, audit, and maintain. It'd be better if, like phone app capabilities, the default would be no privs, any privs are explicitly granted, and if they aren't being used, the system detects that and asks if you want to remove specific ones.
Ouch! This is why ephemeral runners should be used. Preferably virtual machines. On an infrastructure that can define security group rules to prevent lateral movement.
>The vast majority of repos should be able to run CI on pull requests with no privileges at all
When there are no side effects and no in-container secrets and the hosting is free or reasonably limited to prevent abusers, ideally yes.
Outside that, heck no, that'd be crazy. You're allowing randos to run arbitrary code on your budget. Locking it down until it's reviewed is like step 1, they can validate locally until then.
They are so difficult. I wanted to stop random people to run code on my repository… I don't have any secrets or write access or anything to exploit. Just to avoid burning quota.
The issue is that now the pull requests don't get tested at all. I have to manually, locally, get all the commits, make a branch on the main repository with them, and then the actions run.
That just plain sounds Bad™, yeah. In the buildkite setup I interact with, I can at least just hit the "continue" button on the paused pipeline, regardless of any other GitHub state... which feels like table stakes to me.
My prior is that GitHub is pretty competent in understanding malicious PRs to open source repos, and wouldn’t penalize the repo owner without other evidence of wrongdoing.
GitHub themselves don't seem to provide any mechanism to make runners ephemeral. It looks like all they allow you to do is flag a runner as ephemeral, meaning it will be de-registered once a job is completed - you need to write your own tooling to wipe it yourself (either via starting a whole new runner in a new environment and registering that or wiping the existing runner and re-registering it).
I've just made runs-on [1] for that purpose: self-hosted, ephemeral runners for GitHub Action workflows. Long-running self-hosted runners are simply too risky if your project is public.
(1) disclosure, maintainer
(2) zero implicit trust in this case = no open inbound ports on underlay; need to access via app-specific overlay which requires strong identity, authN, authZ
The default kubernetes implementation owned by github[1] assumes ephemeral runners by default. You can also specify what policies they should have using regular network policies provided by kubernetes. So, if you have a kubernetes cluster, that's the way to go.
They do care about this stuff though - a while ago I pointed out that `pr_closed` was run with the workflows from the PR(^); it was fixed promptly.
^So you could reject a PR that tries to add a bitcoin-mining workflow only to have it start bitcoin-mining on close. Or leaking secrets, releasing nonsense, etc. (I don't recall if non-/contributor status was also a factor. I think it was also ignoring that check.)
This is more subtle, but there is an “author_association”field within Actions event contexts that can be one of:
NONE, CONTRIBUTOR, COLLABORATOR, MEMBER, OWNER
There are some cases where people use checks for that as part of gating for workflows that run on pull_request_target/issue_comment, but might confuse contributor and collaborator (which requires explicitly adding someone to the repository). Ultimately this is a misconfiguration on part of the maintainer but another example where fixing a typo can play a part in an attack.
Problem: you need to be a "contributor" to the repo for your PR to trigger workflows without someone approving them first.
So: "We needed to be a contributor to the PyTorch repository to execute workflows, but we didn’t feel like spending time adding features to PyTorch. Instead, we found a typo in a markdown file and submitted a fix."
I really don't like this aspect of GitHub that people who have submitted a typo fix gain additional privileges on the repo by default. That's something GitHub can fix: I think "this user gets to trigger PRs without approval in the future" should be an active button repo administrators need to click, maybe in the PR flow there could be "Approve this run" and "Approve this run and all future runs by user X" buttons.