Being reminded that PyPI is a target for law enforcement makes me even more irked that they've removed end-to-end package signing without providing a replacement[0].
PGP signatures—even though rarely used—would allow someone to verify that a signed package was not modified by PyPI after being uploaded by its original author.
Without any sort of signing mechanism, we have to trust the U.S. Government to never demand that PyPI insert a backdoor, via a National Security Letter, FISA court order, or other kangaroo court process. Good luck with that.
The existing PGP signing mechanism had usability issues and security footguns[1], but was better than nothing. It's a shame they didn't roll out a more usable and secure alternative before removing the existing functionality.
If you want to start with tinfoil hat theories, think about this:
The PGP signatures were removed, nominally because few people used them... but the timing of the removal is quite the coincidence, no?
"You need to have a backdoor that lets us see who's downloading what packages and let us inject custom code to particular targets"
"That's technically impossible because of..."
"Here is a court order. Implementation is your problem. You're not allowed to tell anyone you even received a court order."
"...well, I guess signed packages have to go then..."
(:
I don't actually believe that, since PGP signing was, frankly, barely used, and there's hardly any meaningful difference between a PGP signature you can't verify (which was most of them) and no signature at all; in fact, the illusion of security is probably worse than having nothing.
...but still. As you say. It sucks there's no meaningful replacement for it.
PyPI is clearly a passion project for the team and the Python community in general, so I can't imagine that anyone would allow this, or die on this hill just to save their salary.
I've tried to dig around for any history or potential of a government stopping a company from ceasing operations, or stopping its staff from resigning, and honestly nothing came up that wasn't WWII-related. So I think it's pretty safe to rule out PyPI doing anything like this.
My comment was not meant to imply that PyPI admins would be OK with this, but the sad situation in the U.S. (and Australia, and other places) is that they'd probably face jail time if they refused to comply. You can't avoid complying with a court order by saying, "sorry, I quit." (And even if "sorry, I quit" was a valid response, you'd be facing tens of thousands of dollars in legal fees to justify it, with a gag order in place that meant you couldn't raise a legal defense fund.)
If you're looking for examples of what the NSL process is like, Nicholas Merrill's story[0] comes to mind.
Further, the fact that admins have this power—even if they'd never use it—makes them an attractive target for black hats. If backdooring packages were easier to detect, it'd be a less attractive option for those who might want to do so.
I'm still hopeful that they'll re-implement some sort of end-to-end signing mechanism, sooner rather than later. I trust PyPI and the people behind it, but I'd like to be able to verify.
Well, AFAIK it's not clear that in the US the courts have the right to compel someone to modify their software in that way. The FBI holds that they do, but so far it's been fought, and they've given up when they've tried it. I think if such a thing were to happen, the fundamental ability to secure any software goes out the window. Even package signing, etc. goes out the window, because they can just compel you to produce new software signed with your existing key.
But let's step back a moment and presume that they do have that ability to compel. The first step here is that none of the PyPI Administrators are the legal owners of PyPI, so such an order would not be sent to any of us, but rather to the PSF itself. The PSF would then be on the hook to either comply or fight said hypothetical order, but individual members of the administration team would not be, and would be free to quit. They may not be able to say why they've quit, but quitting AFAIK would be entirely possible.
The PSF, while not having Apple's war chest, does retain counsel for dealing with things like this, and I can say personally I'd spend myself broke before I'd be willing to comply.
We are going to be implementing signing, and I'm hoping we'll be able to make strong progress on that soon.
I'm the author of that post. There is absolutely no meaningful sense in which PyPI's previous PGP support provided (or could ever have provided) end-to-end package signing. At the absolute most, when used correctly (which, overwhelmingly, it was not), it provided one half of package signing.
The other half (key retrieval and identity binding) was never provided, because PGP as an ecosystem made doing so intractable. It was not better than nothing, because it was nothing; anything you could have done with it can be done with your own sidecar signatures.
PGP didn't make it intractable; the problem itself is intractable. You're referring to the public key infrastructure (PKI)[1] problem, which many have tried and failed to solve.
PGP can use the only known solution to the problem, which is letting the user configure several key servers from which to import keys (which can then be verified by checking the key fingerprint against another source that is "trusted", like the publisher's own website).
You can still import keys by physically exchanging trusted keys with others (so-called Key Signing Parties[2]), but that obviously cannot scale... or by using any innovative method you come up with, but no one has found a bulletproof way to do this that's usable.
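The fingerprint-checking workflow described above can be sketched as a toy trust-on-first-use key store. Everything here is illustrative: the `fingerprint` function just hashes raw key bytes with SHA-256, whereas real GPG fingerprints are computed over structured key material per RFC 4880, and the out-of-band verification step is the part no one has automated well.

```python
import hashlib

def fingerprint(key_bytes: bytes) -> str:
    # Toy stand-in for a real PGP fingerprint: here we just hash the
    # raw key bytes to illustrate the pinning logic.
    return hashlib.sha256(key_bytes).hexdigest()

class KeyStore:
    """Trust-on-first-use: pin a publisher's fingerprint the first
    time we see their key, and reject any later key that differs."""

    def __init__(self):
        self.pinned = {}  # publisher name -> pinned fingerprint

    def import_key(self, publisher: str, key_bytes: bytes) -> bool:
        fp = fingerprint(key_bytes)
        if publisher not in self.pinned:
            # First sight: the user is supposed to verify fp out of
            # band (publisher's website, conference slides, etc.).
            self.pinned[publisher] = fp
            return True
        return self.pinned[publisher] == fp

store = KeyStore()
assert store.import_key("alice", b"alice-key-v1")   # pinned on first use
assert store.import_key("alice", b"alice-key-v1")   # same key, accepted
assert not store.import_key("alice", b"evil-key")   # mismatch, rejected
```

The hard part, of course, is not the pinning logic but the "verify out of band" comment: that's the step PKI tries to systematize.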
But saying PGP only solves half the problem is wrong. It solves one problem: that of how to verify a publisher's artifacts were not modified, which is valuable.
The next problem to solve is how to obtain and vet public keys from publishers. The solution could work somewhat like TLS certificates (with certificate authorities playing the role of trusted key servers) or use a blockchain (perhaps a rare problem for which blockchain could actually be helpful), but both of these bring their own issues with them. If you know of a better solution, though, do bring it up instead of throwing the baby out with the bathwater!
I think it’d behoove you to read the original thread from yesterday: all, and more, of this was covered!
PKI is indeed hard, but it’s not even remotely intractable. The Web PKI is a functioning PKI; yesterday’s thread explains how the codesigning scheme we’re building for PyPI is going to look very similar to the Web PKI.
At the ecosystem level, PGP was not providing resource integrity to PyPI: too many of the keys involved were weak, and only a tiny proportion of packages were even signed. Even if that proportion was 100%, PGP would have been the wrong tool for that job: PyPI already has transport and resource integrity via the right tools: TLS and digests. Using an untrusted signature for resource integrity is using the wrong tool for the job.
The original thread contains multiple references to Sigstore, which is the scheme we’re planning on building on for PyPI.
Signing is basically hashing plus proof of who created the hash. You need both, or at least a way to find out which hash is correct according to someone (usually the owner of the artifact), and signing gives you exactly that.
Signing is only proof of identity if you (1) know the underlying identity, and (2) actually trust that identity for intelligible reasons (i.e., you can produce a formal description of the trust relationship).
Without those two conditions, a signature is a digest produced by an untrusted party. For PyPI, that means that PGP signatures are no better than (and in some senses worse than) PyPI's own digests, since PyPI at least is a currently trusted party.
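The distinction being drawn between a bare digest and a signature can be illustrated with a toy sketch. To be clear about the assumptions: HMAC below is a symmetric stand-in for real asymmetric signing (PGP, Ed25519), used only because it ships in Python's standard library; the point it demonstrates is that anyone can publish a matching digest for whatever bytes they serve, but only the key holder can produce a valid tag.

```python
import hashlib
import hmac

artifact = b"pkg-1.0.tar.gz contents"            # placeholder bytes
tampered = b"pkg-1.0.tar.gz contents + backdoor"

# Resource integrity: a digest detects accidental corruption, but an
# attacker who controls the distribution point can simply publish a
# fresh digest alongside the tampered artifact, and it "verifies".
attacker_digest = hashlib.sha256(tampered).hexdigest()
assert attacker_digest == hashlib.sha256(tampered).hexdigest()

# Proof of origin: only the holder of the publisher's key can produce
# a valid tag. (Real package signing uses asymmetric keys so that
# verifiers never hold the secret; HMAC is just the stdlib stand-in.)
key = b"publisher-secret"
tag = hmac.new(key, artifact, hashlib.sha256).hexdigest()

# Without the key, the attacker cannot forge a tag for the backdoored
# artifact that matches what the publisher would have produced.
forged = hmac.new(b"attacker-guess", tampered, hashlib.sha256).hexdigest()
assert not hmac.compare_digest(tag, forged)
```

Which is exactly why an unverifiable signature collapses back into a digest: if you can't bind `key` to an identity you trust, the tag proves nothing more than the digest did.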
A centralized host can't ever be the only reasonable option for trust. They can be manipulated, technically or socially, and that makes everything vulnerable at once.
The Web PKI is built around centralized roots of trust, and survives because of concerted efforts to make those roots resilient, trustworthy (in terms of underlying ownership), and publicly auditable (with mechanisms like CT).
To the best of my knowledge, there has never been a successful decentralized PKI. Even the most successful uses of PGP are not decentralized; they're essentially private PKIs maintained by a small set of presumed trustworthy maintainers.
PGP absolutely is decentralized - I can trust or distrust key X without communicating at all with any external PKI.
I agree that's not all that useful on a global scale - it essentially degrades to the current PKI setup then, because validating everything is expensive and doesn't need to be done by everyone every time to get nearly all of the benefit. But it is a significant difference for individuals making individual decisions.
But a hash gives the actual uploader proof of subsequent tampering: you cannot modify the artifact without the hash changing, and the originator noticing. I think that is enough.
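That tamper-evidence property can be sketched minimally with `hashlib` alone (the filenames and byte strings below are placeholders, not real PyPI artifacts):

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# At release time, the author records the digest of what they uploaded,
# out of band: in a release announcement, a git tag, their own website.
released = b"mypkg-1.0.tar.gz bytes"
recorded = sha256(released)

# Later, the author (or anyone holding the recorded digest) re-downloads
# the artifact from the index and compares.
served = b"mypkg-1.0.tar.gz bytes"
assert sha256(served) == recorded        # unmodified: check passes

backdoored = b"mypkg-1.0.tar.gz bytes + payload"
assert sha256(backdoored) != recorded    # tampering is detected
```

The catch, per the surrounding discussion, is the "out of band" step: if the recorded digest lives on the same compromised host as the artifact, it proves nothing.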
Every security design is built out of a matrix of factors, and some (but not all) of those factors can be made zero.
Being unable to verify your trusted identities in a PKI is one such “zero factor.” It makes the PKI strictly equivalent to (crappy) resource integrity at best, which is when everything is signed. PGP on PyPI didn’t even manage to clear that hurdle; it was worse than nothing by virtue of advertising properties that it was incapable of providing. That too is a zero-able factor in a security design.
Actually, it very commonly is an all-or-nothing process. It doesn't matter how robust the lock on your front door is if there is no lock on the back door, or if your window can be smashed. This is especially true when it comes to cryptographic security, which is the subject at hand.
I suspect the source of your confusion comes from the idea of differential security, which is approximately "I don't need the best lock; I just need a better lock than the other guy". Again, note that this does not apply to cryptographic signing of packages. Note also that the question of whether or not your system actually is more secure than the other guy's is very much a binary distinction: it either is or it isn't. You can quantify this quite easily by counting vulnerabilities, or by analyzing the degree of access gained for each vulnerability that is encountered.
So yeah, it's one of the few things that tends to be all-or-nothing (up to some threat model, of course).
Perhaps look at Gentoo's model of a single monolithic Git repository. It is possibly the largest and most distributed Merkle tree of software distribution signatures in existence. It is updated a few times every hour by a diverse community, and each commit has to be GPG-signed, so you have the opportunity to verify signatures by looking up developer websites, slides from FOSS conferences, etc., to confirm whether the keys have been widely published.
There are some caveats:
* Avoid -9999 packages, as you won't get any guarantee of authenticity of whatever will be obtained from the upstream repository, other than whatever trust you place in an X.509 certificate that in all likelihood is controlled by either Microsoft (GitHub) or otherwise accessible to Amazon, Google, etc. by nature of common open source project hosting arrangements.
* When syncing your local repository, verify all changes since your last sync. This could be as simple as syncing to a point n-days ago, after which numerous developers you know have signed more recent commits on top (you at least know those developers have been impacted too if the whole repository was compromised and the compromise is now on the public record).
* You don't really know how many people are using the packages you care about, and thus how many other people across the world are also exposed to (and possibly reporting problems with) signatures that Gentoo developers have committed.
In addition to relying on existing sources such as the Gentoo Git repository, an additional way to build trust is setting up software "looking glass" tools in different jurisdictions, to check that software downloaded via different carriers in different jurisdictions is identical.
At least with these measures the attacker has to compromise everyone and make that compromise a public record, rather than just silently compromising one target.
Trust in PGP land is end to end. The keyservers don't matter; they are only a place to pick up keys. Your software verifies that a key is unchanged by checking that its fingerprint is unchanged; otherwise it is treated as a separate key. Dead simple.
The confusion here comes from the confusion in the PyPI article about PGP. The article complained that many keys could not be found on keyservers as if that mattered.
The Debian web of trust is a good example of how this stuff actually works. Before you can submit packages to Debian you have to get an existing Debian developer to sign your PGP key. In Debian the trust flows downward from older developers to newer developers.
> Before you can submit packages to Debian you have to get an existing Debian developer to sign your PGP key. In Debian the trust flows downward from older developers to newer developers.
This is not how signing works in Debian at a technical level. Technically, uploading to Debian requires them to add your key to a list of keys maintained by the archive administrators. As a matter of policy, those administrators ask you to get your key signed by an existing Debian Developer, but at no point does their upload infrastructure check that or use the Web of Trust.
The keys on that list maintained by the archive administrators are signed by Debian Developers. That is how the archive admins can be sure that a key is in some sense legit. Otherwise, where would the root of trust be?
The root of trust for uploads is the list of keys maintained by the archive administrators, flat out.
The requirement for having individual keys signed by Debian Developers just makes it easier for the archive administrators to decide which keys they want to add to their root of trust. The upload system does not check those signatures at all; they do not need to exist in the slightest as far as the upload system is concerned.
This seems motivated by something ulterior to the topic, or like making a mountain out of a molehill for other reasons. The act of approval is done approximately manually at first, with automation supporting that decision over time. Perfect machines are in short supply, so to this day there is some manual aspect to this; faulting that in such a dire tone doesn't add up, based on my understanding.
[0]: https://news.ycombinator.com/item?id=36044543
[1]: https://blog.yossarian.net/2023/05/21/PGP-signatures-on-PyPI...