An update on rust/coreutils

January 29th, 2022

TLDR: we are making progress on the Rust implementation of the GNU coreutils.

Well, it is an understatement to say my previous blog post interested many people. Many articles, blog posts and some podcasts talked about it! As we pushed coreutils 0.0.12 a few days ago and are getting closer to 10 000 stars on GitHub, it is now time to give an update!

This has brought a lot of new contributors to this project. Instead of 30 to 60 patches per month, we jumped to 400 to 472 patches every month. Similarly, we saw an increase in the number of contributors (from 3 to 8 per month to 20 to 50). Two new maintainers (Michael Debertol & Terts Diepraam) stepped in and have been doing a much better job than myself as reviewers! As a silly metric, according to GitHub, we had 5 561 clones of the repository over the last 2 weeks!

The new contributors focused on:

  • Performance. Some binaries are now significantly faster than their GNU counterparts (e.g. head, cut, etc.)
  • Adding missing binaries or options (see below)
  • Improving the testsuite: we grew the overall code coverage from 55% to 75% (in general, we consider that an 80% code coverage on a project is excellent).
  • Refactoring the code to simplify the maintenance. Examples:

    • Using the same code for permissions for chgrp and chown
    • Managing errors the same way in the various binaries - (Kudos to Jeffrey Finkelstein for the huge work)
    • Improving the GNU compatibility (thanks to Jan Verbeek, Jan Scheer, kimono-koans and many others)
    • Moving to clap 3. The upgrade, done by Terts, unblocks us on various problems.
  • ...

Closing the gap with GNU

As far as I know, we are only missing stty (change and print terminal line settings) as a program.

Thanks to some heroes, basenc, pr, chcon and runcon have been implemented. For example, for the last two programs, Koutheir Attouchi wrote new crates to manage SELinux properly. These crates have also been used for some other utilities like cp, ls or id.

Leveraging the GNU testsuite to test this implementation

Because the GNU testsuite is excellent, we now have a proper CI using it to run the tests. It is pretty long on the GitHub Actions CI (almost two hours to run it) but it is an amazing improvement to the way we work. It was a joint effort by a bunch of folks (James Robson, Roy Ivy III, etc). To achieve this, we also made it easier to run the GNU testsuite locally with the Rust implementation, and to ignore some tests or adjust some error messages (see build-gnu.sh and run-gnu-test.sh).

Following a suggestion from Brian G, a colleague at Mozilla (he did the same for a major Firefox change), we are now collecting the history of fail/pass/error results into a separate repository and generating a daily graph showing the evolution of regressions.

Evolution over time

At this date, we have, with GNU/Coreutils 9.0:

Total 611 tests
Pass 214
Skip 84
Fail 298
Error 15

We are now automatically identifying new passing tests and regressions in the CI.

For example:

Warning: Congrats! The gnu test tests/chmod/c-option is now passing!
Warning: Congrats! The gnu test tests/chmod/silent is now passing!
Warning: Congrats! The gnu test tests/chmod/umask-x is now passing!
Error: GNU test failed: tests/du/long-from-unreadable. tests/du/long-from-unreadable is passing on 'master'. Maybe you have to rebase?
[...]
Warning: Changes from master: PASS +4 / FAIL +0 / ERROR -4 / SKIP +0
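
The comparison behind messages like these can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not the project's actual CI script: the function name compare_gnu_results and the two input files (one passing test name per line, one file for the reference branch and one for the current change) are made up for the example, and the error message is shortened.

```shell
#!/bin/sh
# Hypothetical helper (not the real CI code): compare two lists of passing tests.
# $1: tests passing on the reference branch, one name per line
# $2: tests passing with the current change, one name per line
compare_gnu_results() {
    ref_sorted=$(mktemp)
    cur_sorted=$(mktemp)
    # comm requires sorted input.
    sort -u "$1" > "$ref_sorted"
    sort -u "$2" > "$cur_sorted"

    # Names only present in the current run: newly passing tests.
    comm -13 "$ref_sorted" "$cur_sorted" | while IFS= read -r t; do
        echo "Warning: Congrats! The gnu test $t is now passing!"
    done

    # Names only present in the reference run: regressions.
    comm -23 "$ref_sorted" "$cur_sorted" | while IFS= read -r t; do
        echo "Error: GNU test failed: $t"
    done

    rm -f "$ref_sorted" "$cur_sorted"
}
```

It could be called as compare_gnu_results master-pass.txt pr-pass.txt, where both file names are placeholders.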

This is also beneficial to GNU as, by implementing some options, Michael Debertol noticed some incorrect behaviors (with sort and cat) or an uninitialized variable (with chmod).

Documentation

Every day, we generate both the user documentation and the internal coreutils documentation.

User documentation: https://uutils.github.io/coreutils-docs/user/ Example: ls or cp

The internal documentation can be seen on: https://uutils.github.io/coreutils-docs/dev/uucore/
For example, the backup style is documented here: https://uutils.github.io/coreutils-docs/dev/uucore/backup_control/index.html

More?

Besides my work on Debian/Ubuntu, I have also noticed that more and more operating systems are starting to look at this:

In parallel, https://github.com/uutils/findutils/, a Rust drop-in replacement for find, is getting more attention lately! Here is the graph showing the evolution of the program using the BFS testsuite (much better than GNU's).

Evolution over time - BFS testsuite

What is next?

  1. stty needs to be implemented
  2. Improve the GNU compatibility on key programs and reduce the gap
  3. Investigate how to reduce the size of the binaries
  4. Allow Debian and Ubuntu to switch by default without tricky manipulation

How to help?

I have been maintaining a list of good first bugs for newcomers in the repo!

Don't hesitate to contribute, it is much easier than it seems and a terrific way to learn Rust!

 

Debian running on Rust coreutils

March 9th, 2021

tldr: Rust/coreutils ( https://github.com/uutils/coreutils/ ) is now available in Debian, good enough to boot a Debian with GNOME, install the top 1000 packages, build Firefox, the Linux Kernel and LLVM/Clang. Even if I wrote more than 100 patches to achieve that, it will probably be a bumpy ride for many other use cases.
It is also a terrific project to learn Rust. See the list of good first bugs.

Even if I see Rust code every day at Mozilla, I was looking for an actual personal project (i.e. this isn't a Mozilla project) to learn Rust during the various COVID lockdowns.

I started contributing to the alternative Coreutils developed in Rust. The project aims at proposing a drop-in replacement of the C-based GNU Coreutils, and I wanted to evaluate if this could be used to run a regular Debian. This is similar to what I did with clang.debian.net a few years ago (rebuilding the Debian archive using clang instead of gcc).

I expect that most readers know what the Coreutils are. It is a set of programs performing simple operations (copy/move file, change permissions/ownership, etc). Even if some commands are from the 70s, they are at the base of Linux, Unix and macOS. While different implementations can be found, they are trying to remain compatible in terms of arguments, options, etc. This implementation of Coreutils isn’t different!

If you want to learn more about the history of Unix, I recommend this great Corecursive podcast with Brian Kernighan.

While a lot of people contributed to this project, much was left to be done:

  • missing programs to be implemented. See https://github.com/uutils/coreutils#utilities
  • missing options in the various programs
  • code not following latest Rust best practices
  • lack of consistency in the code base (ex: functions with too many arguments)
  • lack of tests/low code coverage
  • Lots of failures when running the GNU Coreutils testsuite (141 tests pass out of 613) - Some trivial to fix, some others harder… A good way to start on a Rust project!

To start easy, I defined 4 goals for this work:

  1. Package Coreutils in Debian/Ubuntu
  2. Boot a Debian system with a Rust-based coreutils
  3. Install the top 1000 packages in Debian - including GNOME
  4. Build Firefox, the Linux Kernel and LLVM/Clang

Packaging of Coreutils in Debian

Packaging in Debian isn't a trivial or even simple task. It requires uploading independently all the dependencies in the archive. Rust, with its new ecosystem and small crates, is making this task significantly harder.

The package is called rust-coreutils - https://tracker.debian.org/pkg/rust-coreutils

For Debian/Ubuntu users, to have an idea of the complexity of packaging such applications, just run
debtree --build-dep rust-coreutils | dot -Tsvg > coreutils.svg (should be around 1M).

Since it isn't production ready, rust-coreutils is installable in parallel with coreutils. This package does NOT replace the GNU/coreutils files (yet?); the new files are installed in /usr/lib/cargo/bin/.

They can be used with:

export PATH=/usr/lib/cargo/bin/:$PATH

Or, uglier, overriding the files with the new ones.

Booting Debian with rust-coreutils

To achieve this, because I knew I would likely break the image a few times, I created a new project to quickly install a full Debian with PXE and preseed.

The project is available here:

https://github.com/opencollab/qemu-debian-install-pxe-preseed/

A script to create the full qemu image: build_qemu_debian_image.sh

A second script to boot on the newly created image: boot.sh

Then, building and installing coreutils on the system (yeah, it is ugly - don’t do that at home):

apt install rust-coreutils
cd /usr/lib/cargo/bin/
# Overwrite the GNU binaries in /usr/bin with the Rust ones
for f in *; do
    cp -f "$f" /usr/bin/
done

First surprise: unlike the old init.d init system, systemd does not rely on a series of scripts (it is mostly written in C), so replacing the coreutils had no impact. Therefore, I didn't experience any issue during the boot process.

Debian packages rely a lot on post-install scripts (stored in /var/lib/dpkg/info/*) to finalize and configure packages. They are (almost?) all using /bin/sh (or /bin/bash) to perform these actions. They call coreutils programs extensively.

For example, in /var/lib/dpkg/info/exim4-base.postinst, we can find:

install -d -oDebian-exim -gadm -m2750 /var/log/exim4

With an ugly script, we can test the installations of the 1000 most popular packages one by one.
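
Such a script could look roughly like the sketch below. It is not the script used for this work: the function name install_one_by_one, the INSTALL_CMD and LOG_DIR variables and the input file format (one package name per line) are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch: install packages one by one and record the failures.
# INSTALL_CMD and LOG_DIR are assumptions for this example; override
# INSTALL_CMD with a harmless command to dry-run the loop.
INSTALL_CMD=${INSTALL_CMD:-"apt-get install -y"}
LOG_DIR=${LOG_DIR:-/tmp}

# $1: file with one package name per line
# $2: file that receives the names of the packages that failed to install
install_one_by_one() {
    : > "$2"
    while IFS= read -r pkg; do
        # Skip blank lines.
        [ -n "$pkg" ] || continue
        # $INSTALL_CMD is intentionally unquoted so that it splits into words.
        # stdin is redirected so the installer cannot consume the package list.
        if ! $INSTALL_CMD "$pkg" < /dev/null > "$LOG_DIR/install-$pkg.log" 2>&1; then
            echo "$pkg" >> "$2"
        fi
    done < "$1"
}
```

Keeping one log file per package makes it easier to grep for the coreutils program that caused a given post-install script to fail.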

Running this, some classes of issues in Rust/coreutils could be easily identified.

Implementing missing options

A significant number of problems could be easily identified as a lack of support for some options.

Here is a list of most of the fixes I had to implement to make this plan work:

Different behavior

Most of the programs behaved as expected. Here is a list of differences:

  • install doesn't support using /dev/null as source file
    Setting up libreoffice-common (1:6.1.5-3+deb10u6) ...
    install: error: install: cannot install ‘/dev/null’ to ‘/etc/apparmor.d/local/usr.lib.libreoffice.program.oosplash’: the source path is not an existing regular file
    A limitation of rust itself https://github.com/rust-lang/rust/issues/79390

Compile Firefox, Clang and the Linux Kernel

Build systems can vary significantly from one another.

To verify their usage of coreutils, I built these three major projects.

Firefox

As Firefox relies mostly on Python as a build system, it went smoothly. I didn’t encounter any issue.

The only unrelated issue I noticed while working on it was that apt-key was broken because the script relied on a buggy option of mktemp.

Linux Kernel

I identified only two issues compared to GNU Coreutils:

  • The chown command on a non-existing symlink target doesn’t fail on the GNU version, the Rust one was triggering an error.
    https://github.com/uutils/coreutils/pull/1694
  • The ln command did not recognize the -n option used by the Linux kernel build:
    ln -fsn ../../x86/boot/bzImage ./arch/x86_64/boot/bzImage
    ln: error: Unrecognized option: 'n'

LLVM/Clang

The LLVM toolchain relies on CMake. Just like for Firefox, I didn’t face any issue.

Comparing with GNU coreutils using its testsuite

Recently, James Robson added a new test to run the GNU testsuite on the Rust/coreutils.

# TOTAL: 611
# PASS: 144
# SKIP: 86
# XFAIL: 0
# FAIL: 342
# XPASS: 0
# ERROR: 39
compared to 546 tests passing with the GNU version. Even if a bunch of errors are just different outputs, it demonstrates that there is still a long road ahead.
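
To follow this number over time, the pass rate can be computed directly from a summary like the one above. The following is a small sketch; the function name pass_rate is made up for the example, and it assumes the summary lines are fed on standard input in exactly the "# NAME: value" form shown here.

```shell
#!/bin/sh
# Hypothetical helper: print the pass rate from a test summary read on stdin.
# Only the "# TOTAL:" and "# PASS:" lines are used; "# XPASS:" does not match.
pass_rate() {
    awk -F': *' '
        /^# TOTAL:/ { total = $2 }
        /^# PASS:/  { pass = $2 }
        END { if (total > 0) printf "%.1f%%\n", 100 * pass / total }
    '
}
```

With the figures above (144 passing tests out of 611), this gives a pass rate of roughly 24%.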

Next steps & contribute

First, we will need more motivated contributors to work on this project. Many features remain to be implemented, optimizations to be done (e.g. decreasing the memory usage), etc.
I started to create a list of good first bugs for newcomers:

https://github.com/uutils/coreutils/issues?q=is%3Aissue+is%3Aopen+label%3A%22Good+first+bug%22
I will update this list if there is some interest for this project.

Helping improve the support of the GNU coreutils testsuite would be a huge step while being a great way to learn Rust!

Then, once it is in a better state, we will be able to make it a reliable alternative in Debian/Ubuntu to the GNU/Coreutils.

This might also be interesting for other folks who prefer a BSD license over the GPL.



Debian rebuild with clang 10 + some patches

June 2nd, 2020

Because of the lock-down in France and thanks to Lucas, I have been able to make some progress rebuilding Debian with clang instead of gcc.

TLDR

Instead of patching clang itself, I used a different approach this time: patching Debian tools or implementing some workaround to mitigate an issue.
The percentage of packages failing dropped from 4.5% to 3.6% (1400 packages to 1110 - on a total of 31014).

I focused on two classes of issues:

Qmake

As I have no intention to merge the patch upstream, I used a very dirty workaround. I overwrote the g++ qmake file by clang's:
https://salsa.debian.org/lucas/collab-qa-tools/-/blob/master/modes/clang10#L44-47

This dropped the number of these failures to 0, making some packages build flawlessly (example: qtcreator, chessx, fwbuilder, etc).

However, some packages are still failing later and therefore increasing the number of failures in some other categories like link errors. For example, qtads fails because of "ordered comparison between pointer and zero" and oscar fails on a -Werror,-Wdeprecated-copy error.

Breaking the build later also highlighted some new classes of issues which didn't occur with clang < 10.
For example, warnings related to C++ range loop or implicit int float conversion (I fixed a bunch of them in Firefox).

Symbol differences

Historically, symbol management for C++ in Debian has been a pain. Russ Allbery wrote a blog post in 2012 explaining the situation. AFAIK, it hasn't changed much.
Once more, I took the dirty approach: if there are new or missing symbols, don't fail the build.
The rationale is the following: Packages in the Debian archive are supposed to build without any issue. If there are new or missing symbols, it is probably clang generating a different library but this library is very likely working as expected (and usable by a program compiled with g++ or clang). It is purely a different approach taken by the compiler developers.

In order to mitigate this issue, before the build starts, I am modifying dpkg-gensymbols to transform the error into a warning.
So, the typical Debian errors "some new symbols appeared in the symbols file" or "some symbols or patterns disappeared in the symbols file" will NOT fail the build.

Unsurprisingly, all but one package (libktorrent) build.

Even if I am pessimistic, I reported a bug on dpkg-dev to evaluate if we could improve dpkg-gensymbols not to fail on these cases.

Next steps

The next offender is Imake.tmpl:2243:10: fatal error: ' X11 .rules' file not found with more than a hundred occurrences, reported upstream quite some time ago.

Then, the big issues are going to be much harder to fix as they are real issues/warnings (with -Werror) in the code of the packages. Example: -Wc++11-narrowing & -Wreserved-user-defined-literal... The list is long.
I will probably work on that when llvm/clang 11 are in RC phase.

For maintainers & upstream

Maintainer of Debian/Ubuntu packages? I am providing a list of failing packages per maintainer: https://clang.debian.net/maintainers.php
For upstream, it is also easy to test with clang. Usually, apt install clang && CC=clang CXX=clang++ <build step> is good enough.

Conclusion

With these two changes, I have been able to fix about 290 packages. I think I will be able to get the number of failures down a bit more but we will soon reach a plateau as many warnings/issues will have to be fixed in the C/C++ code itself.

Some clang rebuild results (8.0.1, 9.0.1 & 10rc2)

March 22nd, 2020

As part of the LLVM release cycle, I continue to rebuild the Debian archive with clang instead of gcc to evaluate potential regressions.

Processed results are available on the website: https://clang.debian.net/status.php - Now includes some fancy graphs to show the evolution
Raw logs are published on github: https://github.com/opencollab/clang.debian.net/tree/master/logs

Since my last blog post on the subject (August 2017), Clang has become more and more present in the tech ecosystem. It is now the compiler used to build Firefox and Chrome upstream binaries on all the supported architectures/operating systems. More architectures are supported, it has a new linker (lld), a new hybrid IR (MLIR), a lot of checkers in clang-tidy, cross-language linking with Rust, etc.


Results

Now, about Debian results, we rebuilt using 8.0.1, 9.0.1 and 10.0rc2. Results are pretty similar to what we had with previous versions: between 4 and 5% of packages are failing when gcc is replaced by clang.


Even if most software still uses gcc as its compiler, we can see that clang has a positive effect on code quality. With many different kinds of errors and warnings found by clang over the years, we noticed a steady decline in the number of errors. For example, the number of incorrect C/C++ main declarations has been decreasing year after year.


Errors found

The biggest offender is still the qmake change, which doesn't allow the workaround we use (replacing /usr/bin/gcc by /usr/bin/clang) - about 250 errors. Most of these packages would probably compile fine with clang. More on the Qt bug tracker. The workaround proposed in the bug isn't applicable for us as we use clang as a drop-in replacement of the compiler.

The second error is still some differences in symbol generation. Unlike gcc, it seems that clang doesn't generate some symbols (or adds some). As a bunch of Debian packages are checking the list of symbols in the library (for ABI management), the build fails on purpose. For example, with libcec, the symbol _ZN10P8PLATFORM14CConditionImplD1Ev@Base 3.1.0 isn't generated anymore. I am not expecting this to be a big deal: the generated libraries probably work most of the time. More on C++ symbol management in Debian.
I reported this bug upstream a while back: https://bugs.llvm.org/show_bug.cgi?id=30441

Current status

As previously said in a blog post, I don't think there is a strong incentive to move away from gcc for most of the Linux distributions. The big reason for BSD was the license (even if the move to the Apache 2 license wasn't received positively by some of them).
While the LLVM/clang ecosystem clearly won the tooling battle, as a C/C++ compiler, gcc is still an excellent compiler which supports more architectures and more languages.
In terms of new warnings and checks, as the clang community moved its efforts to clang-tidy (which requires more complex tooling), gcc provides a better experience out of the box (as an example, see the Firefox meta bug to build with -Werror with the default warnings using gcc 9, gcc 10 and clang trunk).

Next steps

I see some potential next steps to decrease the number of failures:

  • Work around the Qt/Qmake issue
  • Fix the objective-c header include issues (echo "#include <objc/objc.h>" > foo.m && clang -c foo.m is currently failing)
  • Identify why clang generates more/less symbols than gcc in the library and try to fix that
  • Rebuild the archive with clang-7 - Seems that I have some data problem

Many thanks to Lucas Nussbaum for the rebuilds.

Rebuild of Debian using Clang 3.9, 4.0 and 5.0

August 24th, 2017

tldr: The percentage of failure is decreasing, Clang support is improving but there is a long way to go.

The goal of this initiative is to rebuild Debian using Clang as a compiler instead of gcc. I have been doing this analysis for the last 6 years.

Recently, we rebuilt the Debian archive with Clang 3.9.1 (July 6th), 4.0.1 (July 6th) and 5.0 rc2 (August 20th).

For various reasons, we had not performed a rebuild since June 2016 with version 3.8. Therefore, we took the opportunity to do three over the last month.

Now, the 3.9 & 4.0 results are impacted by a build failure when building all haskell packages (the -no-pie option in Clang doesn't exist - I introduced it in clang 5.0). Fixing this issue with 5.0 removed more than 860 failures.

Also, for the same versions, a Qt compiler detection is considering that Clang is not a C++11 compiler because clang++, by default, defines __cplusplus as 199711L (-std=c++11 has to be added to define a correct __cplusplus). See https://bugreports.qt.io/browse/QTBUG-62535 for more information. Some discussions happened on the upstream mailing list about changing the default C++ dialect.
For example, with 4.0, this is causing 132 errors. With 5.0, probably thanks to a new Qt version, roughly the same number of packages are failing, but for a different reason: gcc just triggers a warning when the "nodiscard" attribute is incorrectly used, whereas clang triggers an error.

In parallel, ignoring the haskell build failures, the numbers slightly increased since last year even if the overall percentage decreased (new packages being uploaded in the archive).

Version   Build failures   Ignoring haskell pkgs
3.8       1367 / 5.6%
3.9       2274 / 8.1%      1618 / 5.8%
4.0       2311 / 8.3%      1655 / 5.9%
5.0       1445 / 5.1%

In parallel, new warnings and errors showed up in Clang.
This is causing a new set of build failures (especially with the usage of -Werror).

A few examples:
* Starting with 4.0, clang triggers an error ordered comparison between pointer and zero ('char *' and 'int').
* Similarly, with this version, -Wmain introduces a new warning which is triggered when a bool literal is returned from main.
* clang also introduced a new warning called -Waddress-of-packed-member causing 5 new errors.
* With the same version, clang can trigger a new error when auto is used in function return type.

Now, as a conclusion, having Debian being built with clang by default is still a long shot.
First, when Clang became usable for a general audience, gcc was lagging in terms of warning and error detection. Now, gcc is in a much better position than it was, decreasing the interest in having clang replace gcc. In parallel, most of the efforts in terms of warning and mistake detection are currently done under the clang-tidy umbrella, making them less intrusive as part of this initiative (but harder to use and to deploy).
As an example, the gcc warning -Wmisleading-indentation has been implemented under a clang-tidy checker.
Second, the very permissive license of clang has been a key factor for some operating systems to switch, like the PS4, Mac OS X or FreeBSD. With Debian, the community is generally happy with the GPL.
Third, the performances are similar enough that it is not worth the work, except for some projects with very special needs.

Last, even though it is much easier to contribute to llvm/clang than to gcc (no copyright assignment and an actual review system, for example), this isn't a big differentiator for most of the projects.

Of course, I will continue to run and analyze these rebuilds as this is a great source of information for clang upstream developers to improve the compatibility with gcc and understand some impacts. However, until there is a big game changer, I will stop pursuing the goal of having Debian switch to clang instead of gcc. I will also stop my efforts on the debile project (which aimed to rebuild packages in the background).