What to learn from the 'xz' incident // BLEEN.DEV

On March 29th, 2024, a backdoor was discovered in an obscure open source package called xz. xz is a library containing several compression algorithms, together with a few utilities that use that library. OpenSSH itself does not use xz for connection compression; on the affected distributions, sshd is patched to link against libsystemd, which in turn pulls in xz's liblzma, and that is how the backdoor ended up inside the SSH daemon.

The way the backdoor was inserted is a bit unconventional, to say the least. The payload is shipped as a binary blob that is supposedly used for testing, so the script that inserts the backdoor is not really visible in the code itself. Not even in the testing code.

Of course, having a backdoor in the most used remote access software is a really bad thing. While it’s unclear (at this point) whether and how the backdoor has been used, the more cautious people among us recommend reinstalling every system that had its ssh port open while the backdoor was present.

The aftermath for the xz project

As of now, the xz GitHub repository is disabled. There are forks still open, but those are somewhat out of date. I have not been able to find the actual exploit file (bad-3-corrupt_lzma2.xz) in the forks I looked at, probably because they are out of date.

There is also still a bit of confusion about which distributions are hit by this. The only thing I am sure about is that RHEL is not affected, because they always use ancient libraries and utilities (I’ll probably rant about their inability to provide even remotely recent versions of PHP another time).

How can we prevent vulnerabilities like this in the future?

While it is a good idea to test decompression/decryption by running the code on compressed/encrypted files, I always cringe at binary files in source repositories. There is both an encoder and a decoder in the xz package, so one method of testing would be to use the encoder to create the file synthetically and then decompress it. But since this file is supposed to be a bad file, i.e. some corruption should have occurred, that may not be an option here. What is an option is to create the corruption deliberately, e.g. with a small program that writes very visible content to the compressed/encrypted file at a predefined position. The other option would be to use a fuzzing tool to introduce that corruption, which is a technique that e.g. the OpenSSL project is using (and it is one reason why their CI runs go on and on for 45 minutes or so).
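To make the idea a bit more concrete, here is a minimal sketch in Python using the standard library's lzma module. It is not how the xz project tests its decoder. The function names, the marker text, the offset of 24 and the seed are all made up for illustration. The archive is created in memory, damaged at a known position with a marker anybody can spot in a hex dump, and the test then only checks that the decoder refuses it. The last helper is a crude stand-in for a fuzzing step, not a real fuzzer.

```python
import lzma
import random
from typing import Optional

# Hypothetical marker: deliberately human-readable so a reviewer can see it in a hex dump.
MARKER = b"CORRUPTED"


def make_archive(payload: bytes) -> bytes:
    """Create a valid .xz stream in memory instead of committing a binary file."""
    return lzma.compress(payload)


def corrupt_at(archive: bytes, offset: int, marker: bytes = MARKER) -> bytes:
    """Overwrite len(marker) bytes at a predefined offset with a visible marker."""
    if offset < 0 or offset + len(marker) > len(archive):
        raise ValueError("marker does not fit inside the archive")
    return archive[:offset] + marker + archive[offset + len(marker):]


def flip_random_byte(archive: bytes, rng: random.Random) -> bytes:
    """Crude fuzzing step: XOR one randomly chosen byte with a non-zero value."""
    pos = rng.randrange(len(archive))
    mutated = bytearray(archive)
    mutated[pos] ^= rng.randrange(1, 256)
    return bytes(mutated)


def try_decompress(archive: bytes) -> Optional[bytes]:
    """Return the decompressed data, or None if the decoder rejects the input."""
    try:
        return lzma.decompress(archive)
    except lzma.LZMAError:
        return None


if __name__ == "__main__":
    payload = b"hello xz test data\n" * 200
    good = make_archive(payload)

    # Deterministic corruption: the decoder must reject the damaged file.
    assert try_decompress(corrupt_at(good, offset=24)) is None

    # Seeded random mutations: the decoder may reject the input or, in rare
    # cases, still return the original data, but never different data.
    rng = random.Random(1234)
    for _ in range(200):
        result = try_decompress(flip_random_byte(good, rng))
        assert result is None or result == payload
```

The point of the design is that everything a reviewer needs to see is plain source code: the payload, the position of the damage and the content of the damage. Nothing opaque has to be checked in.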

The other problem is the vetting of open source contributors, and this incident will have a really bad impact on the already bad situation regarding recruitment of contributors. A case of one bad apple spoiling the harvest, so to speak. The contributor who introduced the exploit file into the codebase had been making contributions for years. How do you detect this kind of bad apple?

And then you have the problem of developer machines (and testing rigs) being connected to the internet. Can’t really do anything about that. Developers will have an internet connection, and most of the time they need it. They need external resources like github or similar, and most will not have a second machine to do their development-related web searches (insert stackoverflow meme here) or related stuff, nor will they run their tests in an isolated container (see the first reason this will almost never work).

One of the theories about who and how and why is that the bad contributor is a nation state actor (because they were slow-rolling and got discovered right before “the big launch”, just like your typical government IT project). But until a connection between this developer and a nation state can at least be shown to be plausible, I am holding my horses.

So, to put it in one sentence: no binary files in repo, synthetically create test data or use fuzzing tools to test.
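The "no binary files in repo" half of that rule could be enforced by a small check in CI. The following Python sketch is only a heuristic and rests on an assumption: a file counts as binary if a NUL byte shows up in its first few kilobytes. The function names and the sample size are made up for illustration. An .xz file is caught by this because its magic bytes contain a NUL, but so would a UTF-16 text file, so treat the result as a list of things to look at, not a verdict.

```python
from pathlib import Path
from typing import Iterator

# Hypothetical sample size: how many leading bytes to inspect per file.
SNIFF_BYTES = 8192


def looks_binary(path: Path) -> bool:
    """Heuristic: treat a file as binary if its first bytes contain a NUL byte."""
    with path.open("rb") as handle:
        return b"\x00" in handle.read(SNIFF_BYTES)


def find_binary_files(root: Path) -> Iterator[Path]:
    """Yield every regular file under root that looks binary, skipping .git."""
    for path in sorted(root.rglob("*")):
        if ".git" in path.relative_to(root).parts:
            continue  # repository metadata, not part of the source tree
        if path.is_file() and looks_binary(path):
            yield path


if __name__ == "__main__":
    offenders = list(find_binary_files(Path(".")))
    for offender in offenders:
        print(f"binary file in repository: {offender}")
    raise SystemExit(1 if offenders else 0)
```

Run from the repository root, the script exits with a non-zero status as soon as anything binary is found, which is enough to fail a CI job.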

Disclaimer

This has been written with minimal input. I’ve read the original report on ycombinator and watched Low Level Learning’s video on the subject (malicious backdoor found in widely used library) which I recommend for anybody wanting to learn a bit more about how it all worked. My job is not directly related to security, but as a system administrator/devops engineer security is very much near the top of my responsibility list, so yes, this problem will have an impact on my work.