Wednesday, September 7, 2022

nix-serve-ng: A faster, more reliable, drop-in replacement for nix-serve

Our team at Arista Networks is happy to announce nix-serve-ng, a backwards-compatible Haskell rewrite of nix-serve (a service for hosting a /nix/store as a binary cache). It provides better reliability and performance than nix-serve (ranging from ≈ 1.5× to 32× faster). We wrote nix-serve-ng to fix scaling bottlenecks in our cache and we expect other large-scale deployments might be interested in this project, too.

This post will focus more on the background behind the development process and comparisons to other Nix cache implementations. If you don’t care about any of that then you can get started by following the instructions in the repository’s README.

Background

Before we began this project there were at least two other open source rewrites of nix-serve that we could have adopted instead of nix-serve:

  • eris - A Perl rewrite of nix-serve

    Note: the original nix-serve is implemented in Perl, and eris is also implemented in Perl using a different framework.

  • harmonia - A Rust rewrite of nix-serve

The main reason we did not go with these two alternatives is that they are not drop-in replacements for the original nix-serve. We could have fixed that, but given how simple nix-serve is I figured that it would be simpler to just create our own. nix-serve-ng only took a couple of days for the initial version and maybe a week of follow-up fixes and performance tuning.

We did not evaluate the performance or reliability of eris or harmonia before embarking on our own nix-serve replacement. However, after nix-serve-ng was done we learned that it was significantly faster than the alternatives (see the Performance section below). Some of those performance differences are probably fixable, especially for harmonia. That said, we are very happy with the quality of our solution.

Backwards compatibility

One important design goal for this project is to be significantly backwards compatible with nix-serve. We went to great lengths to preserve compatibility, including:

  • Naming the built executable nix-serve

    Yes, even though the project name is nix-serve-ng, the executable built by the project is named nix-serve.

  • Preserving most of the original command-line options, including legacy options

    … even though some are unused.

In most cases you can literally replace pkgs.nix-serve with pkgs.nix-serve-ng and it will “just work”. You can even continue to use the existing services.nix-serve NixOS options.

The biggest compatibility regression is that nix-serve-ng cannot be built on MacOS. It is extremely close to supporting MacOS save for this one bug in Haskell’s hsc2hs tool: haskell/hsc2hs - #26. We left in all of the MacOS shims so that if that bug is ever fixed then we can get MacOS support easily.

For more details on the exact differences compared to nix-serve, see the Result / Backwards-compatibility section of the README.

Performance

nix-serve-ng is faster than all of the alternatives according to both our formal benchmarks and our informal testing. The “Benchmarks” section of our README has the complete breakdown, but the relevant part is this table:

Speedups (compared to nix-serve):

Benchmark                     nix-serve   eris   harmonia   nix-serve-ng
Fetch present NAR info ×10    1.0         0.05   1.33        1.58
Fetch absent NAR info ×1      1.0         0.06   1.53        1.84
Fetch empty NAR ×10           1.0         0.67   0.59       31.80
Fetch 10 MB NAR ×10           1.0         0.64   0.60        3.35

… which I can summarize like this:

  • nix-serve-ng is faster than all of the alternatives across all use cases
  • eris is slower than the original nix-serve across all use cases
  • harmonia is faster than the original nix-serve for NAR info lookups, but slower for fetching NARs
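
Because the table reports speedups relative to nix-serve, any value below 1.0 is a slowdown, and the slowdown factor is simply the reciprocal. The following is a small Python sketch of that arithmetic, using only the figures from the table; the names speedups and slowdown_factor are ours, introduced purely for illustration:

```python
# Speedups relative to nix-serve, taken from the table above
# (the original nix-serve is 1.0 by definition, so it is omitted).
speedups = {
    "Fetch present NAR info x10": {"eris": 0.05, "harmonia": 1.33, "nix-serve-ng": 1.58},
    "Fetch absent NAR info x1": {"eris": 0.06, "harmonia": 1.53, "nix-serve-ng": 1.84},
    "Fetch empty NAR x10": {"eris": 0.67, "harmonia": 0.59, "nix-serve-ng": 31.80},
    "Fetch 10 MB NAR x10": {"eris": 0.64, "harmonia": 0.60, "nix-serve-ng": 3.35},
}


def slowdown_factor(speedup: float) -> float:
    """Convert a speedup below 1.0 into an 'N times slower' factor."""
    return 1.0 / speedup


for benchmark, row in speedups.items():
    # Pick the rewrite with the largest speedup for this benchmark.
    fastest = max(row, key=row.get)
    print(f"{benchmark}: fastest rewrite is {fastest} ({row[fastest]:.2f}x)")

# eris on present NAR info lookups: 1 / 0.05, i.e. about 20 times slower.
eris_slowdown = slowdown_factor(speedups["Fetch present NAR info x10"]["eris"])
print(f"eris NAR info slowdown: about {eris_slowdown:.0f}x")
```

Reading the table this way, a speedup of 0.05 corresponds to roughly a 20× slowdown, and 0.06 to roughly 17×.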

These performance results were surprising for a few reasons:

  • I was not expecting eris to be slower than the original nix-serve implementation

    … especially not NAR info lookups to be ≈ 20× slower. This is significant because NAR info lookups typically dominate a Nix cache’s performance. In my (informal) experience, the majority of a Nix cache’s time is spent addressing failed cache lookups.

  • I was not expecting harmonia (the Rust rewrite) to be slower than the original nix-serve for fetching NARs

    This seems like something that should be fixable. harmonia will probably eventually match our performance because Rust has a high performance ceiling.

  • I was not expecting a ≈ 30× speedup for nix-serve-ng fetching small NARs

    I had to triple-check that neither nix-serve-ng nor the benchmark was broken when I saw this speedup.

So I investigated these performance differences to help inform other implementations what to be mindful of.

Performance insights

We didn’t get these kinds of speed-ups by being completely oblivious to performance. Here are the things that we paid special attention to in order to keep things efficient, ordered from lowest-hanging to highest-hanging fruit:

  • Don’t read the secret key file on every NAR fetch

    This is a silly thing that the original nix-serve does that is the easiest thing to fix.

    eris and harmonia also fix this, so this optimization is not unique to our rewrite.

  • We bind directly to the Nix C++ API for fetching NARs

    nix-serve, eris, and harmonia all shell out to a subprocess to fetch NARs, by invoking either nix dump-path or nix-store --dump to do the heavy lifting. In contrast, nix-serve-ng binds to the Nix C++ API for this purpose.

    This would definitely explain some of the performance difference when fetching NARs. Creating a subprocess has a fixed overhead regardless of the size of the NAR, which explains why we see the largest performance difference when fetching tiny NARs since the overhead of creating a subprocess would dominate the response time.

    This may also affect throughput for serving large NAR files, by adding unnecessary memory copies/buffering as part of streaming the subprocess output.

  • We minimize memory copies when fetching NARs

    We go to great lengths to minimize the number of intermediate buffers and copies when streaming the contents of a NAR to a client. To do this, we exploit the fact that Haskell’s foreign function interface works in both directions: Haskell code can call C++ code but also C++ code can call Haskell code. This means that we can create a Nix C++ streaming sink from a Haskell callback function and this eliminates the need for intermediate buffers.

    This likely also improves the throughput for serving NAR files. Only nix-serve-ng performs this optimization (since nix-serve-ng is the only one that uses the C++ API for streaming NAR contents).

  • Hand-write the API routing logic

    We hand-write all of the API routing logic to prioritize and optimize the hot path (fetching NAR info).

    For example, a really simple thing that the original nix-serve does inefficiently is to check if the path matches /nix-cache-info first, even though that is an extremely infrequently used path. In our API routing logic we move that check straight to the very end.

    These optimizations likely improve the performance of NAR info requests. As far as I can tell, only nix-serve-ng performs these optimizations.
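
The route-ordering idea can be sketched roughly as follows. This is an illustrative Python sketch and not the actual nix-serve-ng code (which is written in Haskell): the function name route and the labels it returns are hypothetical, and we assume that NAR info requests end in .narinfo and that NARs are served under /nar/.

```python
def route(path: str) -> str:
    """Classify a request path, testing the most frequent route first."""
    # Hot path: NAR info lookups dominate traffic, so check them first.
    if path.endswith(".narinfo"):
        return "narinfo"
    # NAR downloads (assumed here to live under /nar/).
    if path.startswith("/nar/"):
        return "nar"
    # Very rarely requested, so this check goes last.
    if path == "/nix-cache-info":
        return "cache-info"
    return "not-found"


# Hypothetical request paths, for illustration only.
for example in ("/abc123.narinfo", "/nar/abc123.nar", "/nix-cache-info"):
    print(example, "->", route(example))
```

With this ordering, the common case is settled by the first comparison, and only the rare /nix-cache-info request pays for every check.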

I have not benchmarked the performance impact of each of these changes in isolation, though. These observations are purely based on my intuition.
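
To make the fixed-overhead argument about subprocesses more concrete, here is a toy model in Python. Every number in it is invented for illustration and is not a measurement of any of the servers discussed here: a request is assumed to cost a fixed overhead plus a transfer time proportional to the NAR size, with the same throughput for both approaches.

```python
# Hypothetical parameters, chosen only to show the shape of the effect.
SUBPROCESS_OVERHEAD_S = 5e-3  # assumed fixed cost of spawning a subprocess
IN_PROCESS_OVERHEAD_S = 1e-4  # assumed fixed cost of an in-process call
THROUGHPUT_BYTES_PER_S = 1e9  # assumed identical for both approaches


def response_time(size_bytes: int, fixed_overhead_s: float) -> float:
    """Fixed overhead plus time proportional to the NAR size."""
    return fixed_overhead_s + size_bytes / THROUGHPUT_BYTES_PER_S


def speedup(size_bytes: int) -> float:
    """How much faster the in-process approach is in this toy model."""
    return (response_time(size_bytes, SUBPROCESS_OVERHEAD_S)
            / response_time(size_bytes, IN_PROCESS_OVERHEAD_S))


for size in (0, 10 * 1024 * 1024):
    print(f"{size} bytes: modelled speedup {speedup(size):.1f}x")
```

In this model the speedup is largest for an empty NAR, where the fixed overhead is the entire response time, and shrinks towards 1× as the NAR grows. The model deliberately ignores the extra copies and buffering mentioned above, so it says nothing about throughput differences for large NARs.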

Features

nix-serve-ng is not all upsides. In particular, nix-serve-ng is missing features that some of the other rewrites provide, such as:

  • Greater configurability
  • Improved authentication support
  • Monitoring/diagnostics/status APIs

Our focus was entirely on scalability, so the primary reason to use nix-serve-ng is if you prioritize performance and uptime.

Conclusion

We’ve been using nix-serve-ng long enough internally that we feel confident endorsing its use outside our company. We run a particularly large Nix deployment internally (which is why we needed this in the first place), so we have stress tested nix-serve-ng considerably under heavy and realistic usage patterns.

You can get started by following these instructions, and let us know if you run into any issues or difficulties.

Also, I want to thank Arista Networks for graciously sponsoring our team to work on and open source this project.