I don't think these questions are that interesting. sshd shared an address space with xz-utils because xz-utils provides a shared library, and that's how dynamic linking works. sshd uses libsystemd on platforms with systemd because systemd is the tool that manages daemons and services like sshd, and libsystemd is the designated way for daemons to talk to it (and, more importantly, it is already there in the distro, so you're not "adding a million-line dependency" so much as linking against a system library you need from the OS developers).
Linking against libsystemd on Debian is about as suspicious as linking against libSystem on macOS. It's a userspace library that you can hypothetically avoid, but you shouldn't.
As for why systemd links against xz, I don't know, and it's a bit surprising that an init system needs compression utils but not particularly surprising given the kitchen sink architecture of systemd.
> As for why systemd links against xz, I don't know, and it's a bit surprising that an init system needs compression utils
It's for the journal, which can optionally be compressed with zlib, lzma, or zstd. That library provides not only the sd_notify function that sshd needed, but also several functions for manipulating the journal.
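For scale, the piece of libsystemd that sshd actually needed is small: sd_notify(3) documents a readiness protocol that boils down to sending a datagram to the AF_UNIX socket named by $NOTIFY_SOCKET. A rough, illustrative sketch of that protocol in C (not OpenSSH's or systemd's actual code; abstract-namespace sockets are skipped):

```c
/* notify_sketch.c -- a minimal, illustrative take on the sd_notify(3)
 * readiness protocol: send "READY=1" as a datagram to the AF_UNIX socket
 * named by $NOTIFY_SOCKET. Not OpenSSH's or systemd's actual code. */
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

static int notify_ready(void)
{
    const char *path = getenv("NOTIFY_SOCKET");
    if (path == NULL || path[0] == '\0')
        return 0;                     /* not running under systemd */
    if (path[0] == '@')
        return -1;                    /* abstract sockets: skipped in this sketch */

    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    int fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
    if (fd < 0)
        return -1;

    const char *msg = "READY=1";
    ssize_t n = sendto(fd, msg, strlen(msg), 0,
                       (struct sockaddr *)&addr, sizeof(addr));
    close(fd);
    return n == (ssize_t)strlen(msg) ? 0 : -1;
}
```

The point of the sketch is just scale: the notification half of the dependency is a few dozen lines of socket code, while the journal functions are the bulk of the library.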
> and that's how dynamic linking works

Really ignorant comment.
Read the lobste.rs thread for some quotes on process separation and the Unix philosophy.
There are mechanisms other than dynamic linking -- this is precisely the question.
Also, Unix supported remote logins BEFORE dynamic linking existed.
---
What about sshd didn't work prior to 2015?
Was the dependency worth it?
> not particularly surprising given the kitchen sink architecture of systemd
That's exactly the point -- does systemd need a kitchen sink architecture?
---
The questions are interesting because they likely lead to simple and effective mitigations.
They are interesting because critical dependencies on poorly maintained projects may cause nation states to attack single maintainers with social engineering.
Solutions like "let's create a Big Tech-funded consortium for security" already exist (the Linux Foundation threw some money at bash in 2014 after Shellshock).
That can be part of the solution, but I doubt it's the most effective one.
I don't think it's acceptable to create a subprocess for what's effectively a library function call because it comes from a dependency.
The problem is the design of rtld and the dynamic linking model, where one shared library can detect and hijack another's function calls by using rtld's auditing features. Hardened environments already forbid LD_PRELOAD to block injection attacks like this, but they overlook audit hooks.
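For readers unfamiliar with the audit interface being referred to: glibc's rtld-audit(7) lets a shared object loaded via LD_AUDIT observe, and potentially redirect, every symbol binding the dynamic linker performs. A minimal sketch (the symbol name filtered here is just an example):

```c
/* audit_sketch.c -- a minimal sketch of glibc's rtld-audit interface
 * (see rtld-audit(7)). Loaded via LD_AUDIT, it is shown every symbol
 * binding the dynamic linker performs and could return a different
 * address, which is the interposition capability described above. */
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>
#include <string.h>

unsigned int la_version(unsigned int version)
{
    /* Tell the dynamic linker which audit interface version we speak. */
    return LAV_CURRENT;
}

unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    /* Ask to be notified about bindings to and from every loaded object. */
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

uintptr_t la_symbind64(Elf64_Sym *sym, unsigned int ndx,
                       uintptr_t *refcook, uintptr_t *defcook,
                       unsigned int *flags, const char *symname)
{
    /* Observe each binding; a hostile hook could return its own address. */
    if (strcmp(symname, "RSA_public_decrypt") == 0)
        fprintf(stderr, "audit: binding %s\n", symname);
    return sym->st_value;   /* hand back the real address: no hijack here */
}
```

Build it with `gcc -shared -fPIC -o audit_sketch.so audit_sketch.c` and run a program under `LD_AUDIT=./audit_sketch.so`. A hardening policy that only blocks LD_PRELOAD leaves this path open, which is the gap being pointed at.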
My point is that just saying we should use the Unix process model as the defense for supply chain attacks is like using a hammer to fix a problem that needs a scalpel.
I don't agree -- process isolation is a simple, effective, and traditional mechanism. It has downsides (e.g. parsing and serializing), but they don't apply here.
Many mitigations work at the process level, like ASLR, cgroups, and more.
There's a reason that Chrome doesn't allow parsing and rendering in the same process.
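To make the trade-off concrete, this is roughly what process isolation looks like for a library-sized job: stream the data through the external tool in a child process instead of calling the library in your own address space. An illustrative C sketch, assuming the xz CLI is present (not how sshd or systemd actually work):

```c
/* subprocess_sketch.c -- an illustrative sketch of "process isolation" for a
 * library-sized task: pipe compressed bytes through `xz -d -c` in a child
 * process instead of calling liblzma in the caller's address space.
 * Assumes the xz CLI is installed; error handling and partial-write loops
 * are trimmed for brevity. */
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Feed `len` compressed bytes to `xz -d -c`; the decompressed output goes to
 * the child's stdout (inherited from the parent). Returns 0 on success. */
static int decompress_out_of_process(const unsigned char *buf, size_t len)
{
    int to_child[2];
    if (pipe(to_child) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                              /* child: become xz */
        dup2(to_child[0], STDIN_FILENO);
        close(to_child[0]);
        close(to_child[1]);
        execlp("xz", "xz", "-d", "-c", (char *)NULL);
        _exit(127);                              /* exec failed */
    }

    close(to_child[0]);                          /* parent: write input */
    ssize_t written = write(to_child[1], buf, len);
    close(to_child[1]);

    int status = 0;
    waitpid(pid, &status, 0);
    return (written == (ssize_t)len &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The downsides conceded above (serialization, a clumsier interface) are visible here; the upside is that a compromised decompressor only ever sees the bytes it is handed, not the caller's memory.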
The fundamental difference is that processes are used to sandbox sections of the program that run untrusted code. A subroutine from a library isn't untrusted code.
Having written multi-process programs used for this purpose, I disagree that it's simple or traditional for the purposes of hardening against supply chain attacks in shared libraries. Only at the most naive level is it appropriate.
And on top of that, it would not have prevented the backdoor. It would only have changed the mechanism used to invoke it.
It's certainly a reasonable question to ask in this specific case. In hindsight Debian and Red Hat both bet badly when patching OpenSSH in a way that introduced the possibility of this specific supply chain attack.
> If so, we should be proactively removing and locking down other dependencies, because it will likely be an effective and simple mitigation.
I think this has always been important, and it remains so, but incidents like this really drive the point home. For any piece of software that has such a huge attack surface as ssh does, the stakes are even higher, and so the idea of introducing extra dependencies here should be approached with extreme caution indeed.
> It's certainly a reasonable question to ask in this specific case. In hindsight Debian and Red Hat both bet badly when patching OpenSSH in a way that introduced the possibility of this specific supply chain attack.
Notably, Debian bet badly again. They already had this mistake pointed out to them very publicly with the OpenSSL random-number generator fiasco, yet they chose to continue applying patches that are not accepted upstream while evidently not understanding the ramifications. Why? Shouldn't there have been a policy change in the Debian project to prevent this from happening again?
Another point relevant on the timeline is when downstream starts using binaries instead of source.
I think people are flying past that important piece of the hack. Without it, this would not have been possible. If there were a trusted party in the middle building the binaries, rather than just the single maintainer and the attacker, this attack would become extremely hard to slip past people.
I'm not familiar with how distros get the source code for upstream dependencies. I'm trying to understand what Andres meant when he said this:
> One portion of the backdoor is solely in the distributed tarballs
Is it that the tarball created and signed by Jia had the backdoor, but the backdoor wasn't present in the repo on GitHub? And the Debian (or any distro) maintainers use the source code from the tarball without comparing it against what is in the public GitHub repo? And how does that tarball get to Debian?
The threat actor had signed and uploaded the compromised source tarball to GitHub as a release artifact.
They then applied for an NMU (non-maintainer upload) with Debian, which got accepted, and that's how the tarball ended up on Debian's infrastructure.
Thanks for the extra explanation. I guess this is harder to protect against than I thought, and it's more that some distros got somewhat lucky than that Debian and Fedora did anything out of the ordinary.
That's not what happened. Downstream was building from source, that source just had malicious code in it.
One part was binary, the test file (pretty common), and it was checked into the repo. The other part was in the build config/script, and it was in the source tarball but not in the repo.
> Another point relevant on the timeline is when downstream starts using binaries instead of source.
No downstream was using binaries instead of source. Debian and Fedora rebuild everything from source; they don't use the binaries supplied by the maintainer. The backdoor was inserted into the build system.
It only happened in the last 10 years apparently.
Why do sshd and xz-utils share an address space?
When was the sshd -> systemd dependency introduced?
When was the systemd -> xz-utils dependency introduced?
---
To me this ARCHITECTURE issue is actually bigger than the social engineering, the details of the shell script, and the details of the payload.
I believe that for most of the life of xz-utils, it was a "harmless" command line tool.
In the last 10 years, a dependency was silently introduced on some distros, like Debian and Fedora.
Now maintainer Lasse Collin becomes a target of Jia Tan.
If the dependency didn't exist, then I don't think anyone would be bothering Collin.
---
I asked this same question here, and got some good answers:
https://lobste.rs/s/uihyvs/backdoor_upstream_xz_liblzma_lead...
Probably around 2015?
So it took ~9 years for attackers to find this avenue, develop an exploit, and create accounts for the social engineering?
If so, we should be proactively removing and locking down other dependencies, because it will likely be an effective and simple mitigation.