Catégorie: "Développement"

Debian running on Rust coreutils

Mars 9th, 2021

tldr: Rust/coreutils ( https://github.com/uutils/coreutils/ ) is now available in Debian, good enough to boot a Debian with GNOME, install the top 1000 packages, build Firefox, the Linux Kernel and LLVM/Clang. Even if I wrote more than 100 patches to achieve that, it will probably be a bumpy ride for many other use cases.
It is also a terrific project to learn Rust. See the list of good first bugs.

Even if I see Rust code every day at Mozilla, I was looking for an actual personal project (i.e. this isn't a Mozilla project) to learn Rust during the various COVID lockdowns.

I started contributing to the alternative Coreutils developed in Rust. The project aims at proposing a drop-in replacement of the C-based GNU Coreutils, and I wanted to evaluate if this could be used to run a regular Debian. Similar to what I have done with clang.debian.net a few years ago (rebuilding the Debian archive using clang instead of gcc).

I expect that most of the readers know what is the Coreutils. It is a set of programs performing simple operations (copy/move file, change permissions/ownership, etc). Even if some commands are from the 70s, they are at the base of Linux, Unix and macOS. While different implementations can be found, they are trying to remain compatible in terms of arguments, options, etc. This implementation of Coreutils isn’t different!

If you want to learn more about the history of Unix, I recommend this great Corecursive podcast with Brian Kernighan.

While a lot of people contributed to this project, much was left to be done:

  • missing programs to be implemented. See https://github.com/uutils/coreutils#utilities
  • missing options in the various programs
  • code not following latest Rust best practices
  • lack of consistency in the code base (ex: functions with too many arguments)
  • lack of tests/low code coverage
  • Lots of failures when running the GNU Coreutils testsuite (141 tests pass on 613) - Some trivial to fix, some others harder… A good way to start on a Rust project!

To start easy, I defined 4 goals for this work:

  1. Package Coreutils in Debian/Ubuntu
  2. Boot a Debian system with a Rust-based coreutils
  3. Install the top 1000 packages in Debian - including GNOME
  4. Build Firefox, the Linux Kernel and LLV/Clang

Packaging of Coreutils in Debian

Packaging in Debian isn't a trivial or even simple task. It requires uploading independently all the dependencies in the archive. Rust, with its new ecosystem and small crates, is making this task significantly harder.

The package is called rust-coreutils - https://tracker.debian.org/pkg/rust-coreutils

For Debian/Ubuntu users, to have an idea of the complexity of packaging such applications, just run
debtree --build-dep rust-coreutils | dot -Tsvg > coreutils.svg (should be around 1M).

Since it isn't production ready, the rust-coreutils is installable in parallel with coreutils. This package does NOT replace the GNU/coreutils files (yet?), the new files are installed in /usr/lib/cargo/bin/.

They can be used with:

export PATH=/usr/lib/cargo/bin/:$PATH

Or, uglier, overriding the files with the new ones.

Booting Debian with rust-coreutils

To achieve this, because I knew I would likely break the image a few times, I created a new project to quickly install a full Debian with PXE and preseed.

The project is available here:

https://github.com/opencollab/qemu-debian-install-pxe-preseed/

A script to create the full qemu image: build_qemu_debian_image.sh

A second script to boot on the newly created image: boot.sh

Then, building and installing coreutils on the system (yeah, it is ugly - don’t do that at home):

apt install rust-coreutils
cd /usr/lib/cargo/bin/
for f in *; do
cp -f $f /usr/bin/
done

First surprise, unlike the old init.d init system, as systemd is not relying on a series of scripts (it is mostly written in C), replacing the coreutils did not have an impact. Therefore, I didn't experience any issue during the boot process

Debian packages rely a lot on post-install scripts (stored in /var/lib/dpkg/info/*) to finalize and configure packages. They are (almost?) all using /bin/sh (or /bin/bash) to perform these actions. They intensively call coreutils applications.

For example, in /var/lib/dpkg/info/exim4-base.postinst, we can find:

install -d -oDebian-exim -gadm -m2750 /var/log/exim4

With an ugly script, we can test the installations of the 1000 most popular packages one by one.

Running this, some classes of issues in Rust/coreutils could be easily identified.

Implementing missing options

A significant number of problems could be easily identified as a lack of support for some options.

Here is a list of most of the fixes I had to implement to make this plan work:

Different behavior

Most of the programs behaved as expected. Here is a list of differences:

  • install doesn't support using /dev/null as source file
    Setting up libreoffice-common (1:6.1.5-3+deb10u6) ...
    install: error: install: cannot install ‘/dev/null’ to ‘/etc/apparmor.d/local/usr.lib.libreoffice.program.oosplash’: the source path is not an existing regular file
    A limitation of rust itself https://github.com/rust-lang/rust/issues/79390

Compile Firefox, Clang and the Linux Kernel

Build systems can vary significantly one from the other.

To verify their usage of coreutils, I built these three major projects

Firefox

As Firefox relies mostly on Python as a build system, it went smoothly. I didn’t encounter any issue.

The only unrelated issue that I noticed working on it was apt-key was broken because the script relied on a buggy option of mktemp.

Linux Kernel

I identified only two issues compared to GNU Coreutils:

  • The chown command on a non-existing symlink target doesn’t fail on the GNU version, the Rust one was triggering an error.
    https://github.com/uutils/coreutils/pull/1694
  • Linux kernel
    ln -fsn ../../x86/boot/bzImage ./arch/x86_64/boot/bzImage
    ln: error: Unrecognized option: 'n'

LLVM/Clang

The llvm toolchain relies on Cmake. Just like for Firefox, I didn’t face any issue.

Comparing with GNU coreutils using its testsuite

Recently, James Robson added a new test to run the GNU testsuite on the Rust/coreutils.

# TOTAL: 611
# PASS: 144
# SKIP: 86
# XFAIL: 0
# FAIL: 342
# XPASS: 0
# ERROR: 39
compared to 546 test passing with the GNU version. Even if a bunch of errors are just different outputs, it demonstrates that there is still a long road ahead.

Next steps & contribute

First, we will need more motivated contributors to work on this project. Many features remain to be implemented, optimizations to be done (e.g. decreasing the memory usage), etc.
I started to create a list of good first bugs for newcomers:

https://github.com/uutils/coreutils/issues?q=is%3Aissue+is%3Aopen+label%3A%22Good+first+bug%22
I will update this list of there is some interest for this project.

Helping improve the support of the GNU coreutils testsuite would be a huge step while being a great way to learn Rust!

Then, once it is in a better state, we will be able to make it a reliable alternative in Debian/Ubuntu to the GNU/Coreutils.

This might be also interesting for other folks who prefer a BSD license over a GPL.



About optimization flags in software

Septembre 6th, 2013

Following a thread on the Clang mailing list, I did a quick analysis of the optimization flags used to build C or C++ code.

I used a rebuild result from last mid July of all Debian native packages (meaning that I don't have the log of full Java, Python or other packages without native code) and wrote a small script to count all their occurrences.
By default, deb packages uses the command "dpkg-buildflags --get CFLAGS" which will set the optimization level to -O2. However, many packages and upstream build systems do not respect such flags.

On 10320 binary-only packages (in total, Debian has around 19000), the results are the following:

Optimization levelNumber of occurrencesNumber of packages
-O35597269
-Os30849145
-O013993294
-O125494809
-O29958037685
-O3106048531
-O4140216
-O5
-O6693549

Since a graphic talks more than numbers (note that the Y axis is logarithmic):

Disclaimer:
Some packages are using several optimization flags during the same build.
Some packages are not showing the full command line (see jessie release goal: verbose build logs)
The results might be approximate since I am basically grepping "-OX " on the log files.

Clang 3.3 and Debian

Août 19th, 2013

The LLVM toolchain version 3.3 has been released a couple months ago.
Here are now the result of the rebuild of Debian archive using this version of the compiler.
Like the previous releases, we are at a bit less than 12% of packages failing (2188 packages on a total of 18854).
More warnings / errors detections have been added to the software (for example: like this defect of the C++ standard or the detection of unused linker option) causing more build failures, but, in the mean time, we fixed some issues in the Debian packages...

As usual, the following image shows clearly the evolution of the build failures over time.

As stated in my blog post for the release 3.2, this rebuilds prove that Clang is ready for production in term of support of the C, C++ and Objective C languages.

With the performance results showed by Chandler Carruth from Google at the last Euro LLVM summit (see this video from 5:40), I believe that it is now time to report and fix the bugs in the upstream packages.

I also presented this work (video) at the Debconf 13 last week in Vaumarcus (Switzerland) and I will be also presenting this work at the Linux Plumbers Conference, New Orleans.

With Léo Cavaille (as part of his GSOC) and Paul Tagliamonte, we are also working on providing a better automatic rebuild infrastructure for clang-built packages (and other static analyzers). More in the next few weeks.

Finally, I would like to thank folks at AWS for the Debian credit and David Suarez for helping on with the Ruby segfaults.

Rebuild of Debian using clang 3.2

Février 6th, 2013

After the studies of the rebuild of the Debian archive using clang 2.9 & 3.0 and 3.1, the results of the 3.2 rebuild have been published on http://clang.debian.net.

The percentage of failure is the same as clang 3.1: 12.1% of failure. That means that on 18264 packages, 2204 failed to be built with clang (it was 17710/2137 with the version 3.1).

The fact that the percentage is the same can be explained by at least two reasons:
First, some upstreams packages have been fixed to be built correctly with clang.
Second, the new warnings + -Werror (example: -Wsometimes-uninitialized) introduced in clang 3.2 triggered some new build failures.

To conclude, I think that the 3.2 release confirms the turning point of the 3.1. In term of support of the C and C++ standard, clang has now reached an equivalent quality to gcc (with better detection of errors and an extended set of warnings in -Wall).
Now, from the perspective of the Debian rebuilds, the decrease of number of failures will come from the upstream developers improving their codes (exemple of the error non-void function should return a value) or the Debian Developer/maintainer fixing the programming errors.

libc++: New C++ standard library in Debian

Août 15th, 2012

Thanks to the work of Andrej Belym, Debian has now a new C++ standard library. This work has been done in the context of the Google Summer of Code 2012.
Available in Debian Experimental, this new packages provides both the runtime libraries (libc++abi1) and the C++ headers (libc++-dev).

With this library, it is possible to build a C++-based program without any dependency on libstdc++.

For example, as detailed in README.Debian, with the (amazing) C++ code:

// foo.cpp
#include <iostream>

int main() {
std::cout < < "plop" << std ::endl;
return 0;
}

with clang++ (or g++) will give:

$ clang++ -o plop foo.cpp
$ ldd plop |grep c++
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f15791cb000)

Using libc++, it will drop the dependency on libstdc++ to use lib++

$ clang++ -stdlib=libc++ -o plop foo.cpp
$ ldd plop |grep c++
libc++.so.1 => /usr/lib/x86_64-linux-gnu/libc++.so.1 (0x00007f87464df000)

For the record, it is not that trivial to do with g++. The command being:

g++ -nostdlib -lc++ -lc++abi -std=c++11 -o plop /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o foo.cpp /usr/lib/x86_64-linux-gnu/crtn.o -isystem/usr/include/c++/v1 -lc -lgcc_s

Besides this change, libc++ provides a support of C++ 11, considered as "correct" against the C++11 standard by upstream.

// bar.cpp
#include <chrono>
int main() {
return 0;
}

clang++ -stdlib=libc++ -o bar bar.cpp

More information: