Our team at Arista Networks
is happy to announce nix-serve-ng, a backwards-compatible Haskell
rewrite of nix-serve
(a service for hosting a
/nix/store as a binary cache). It provides better reliability and performance
than nix-serve (ranging from ≈ 1.5× to 32× faster). We wrote
nix-serve-ng to fix scaling bottlenecks in our cache and we expect other
large-scale deployments might be interested in this project, too.
This post will focus more on the background behind the development
process and comparisons to other Nix cache implementations. If you don't
care about any of that then you can get started by following the
instructions in the README.
Before we began this project there were at least two other open
source rewrites of
nix-serve that we could have adopted:
- eris - a Perl rewrite of nix-serve. Note: the original nix-serve is implemented in Perl, and eris is also implemented in Perl using a different framework.
- harmonia - a Rust rewrite of nix-serve
The main reason we did not go with these two alternatives is because
they are not drop-in replacements for the original
nix-serve. We could have fixed that, but given how
small nix-serve is, I figured it would be simpler
to just create our own.
nix-serve-ng only took a couple of
days for the initial version and maybe a week of follow-up fixes and
improvements.
We did not evaluate the performance or reliability of
eris or harmonia before embarking on our own
nix-serve replacement. However, after
nix-serve-ng was done we learned that it was significantly
faster than the alternatives (see the Performance section below). Some of those
performance differences are probably fixable, especially for
harmonia. That said, we are very happy with the quality of our own solution.
One important design goal for this project is to be as backwards
compatible with
nix-serve as possible. We went to great
lengths to preserve compatibility, including:
- Naming the built executable nix-serve. Yes, even though the project name is nix-serve-ng, the executable built by the project is named nix-serve.
- Preserving most of the original command-line options, including legacy options, even though some are unused.
In most cases you can literally replace
pkgs.nix-serve with pkgs.nix-serve-ng and it will "just work". You can
even continue to use the existing nix-serve NixOS module.
The biggest compatibility regression is that
nix-serve-ng cannot be built on MacOS. It is extremely
close to supporting MacOS save for one bug (#26) in one of its Haskell
dependencies. We left in all of the MacOS shims so that if that bug is ever
fixed then we can get MacOS support easily.
For more details on the exact differences compared to
nix-serve, see the Backwards-compatibility section of the
README.
nix-serve-ng is faster than all of the alternatives
according to both our formal benchmarks and informal testing. Our
README has the complete breakdown, but
the relevant part is this table:
Speedups (compared to nix-serve):

| Benchmark | nix-serve | eris | harmonia | nix-serve-ng |
|---|---|---|---|---|
| Fetch present NAR info ×10 | 1.0 | 0.05 | 1.33 | 1.58 |
| Fetch absent NAR info ×1 | 1.0 | 0.06 | 1.53 | 1.84 |
| Fetch empty NAR ×10 | 1.0 | 0.67 | 0.59 | 31.80 |
| Fetch 10 MB NAR ×10 | 1.0 | 0.64 | 0.60 | 3.35 |
… which I can summarize like this:

- nix-serve-ng is faster than all of the alternatives across all use cases
- eris is slower than the original nix-serve across all use cases
- harmonia is faster than the original nix-serve for NAR info lookups, but slower for fetching NARs
These performance results were surprising for a few reasons:
- I was not expecting eris to be slower than the original nix-serve, especially not NAR info lookups to be ≈ 20× slower. This is significant because NAR info lookups typically dominate a Nix cache's performance. In my (informal) experience, the majority of a Nix cache's time is spent addressing failed cache lookups.
- I was not expecting harmonia (the Rust rewrite) to be slower than the original nix-serve for fetching NARs. This seems like something that should be fixable; harmonia will probably eventually match our performance because Rust has a high performance ceiling.
- I was not expecting a ≈ 30× speedup for nix-serve-ng fetching small NARs. I had to triple-check that neither nix-serve-ng nor the benchmark was broken when I saw this speedup.
So I investigated these performance differences to help other implementations know what to be mindful of.
We didn’t get these kinds of speed-ups by being completely oblivious to performance. Here are the things that we paid special attention to in order to keep things efficient, from lowest-hanging to highest-hanging fruit:
Don’t read the secret key file on every NAR fetch
This is a silly thing that the original
nix-serve does, and it is the easiest thing to fix.
harmonia also fixes this, so this optimization is not unique to our rewrite.
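The fix can be sketched in a few lines of Haskell. This is an illustrative toy, not the actual nix-serve-ng code; `signRequest` and the key value below are made up for the example:

```haskell
-- Illustrative sketch: load the signing key once at startup and close over
-- it in the request handler, instead of re-reading the key file on every
-- NAR fetch. `signRequest` and the key value are hypothetical.

-- Stand-in for signing a response with an already-loaded key.
signRequest :: String -> String -> String
signRequest secretKey request = request ++ " signed with " ++ secretKey

main :: IO ()
main = do
    -- In a real service this would be a single `readFile keyPath` at startup.
    let secretKey = "cache.example-1:deadbeef"

    -- Every request reuses the in-memory key; no per-request file I/O.
    mapM_ (putStrLn . signRequest secretKey) ["/a.narinfo", "/nar/b.nar"]
```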
We bind directly to the Nix C++ API for fetching NARs
The original nix-serve, eris, and harmonia all shell out to a subprocess to fetch NARs, invoking
nix-store --dump to do the heavy lifting. In contrast,
nix-serve-ng binds to the Nix C++ API for this purpose.
This would definitely explain some of the performance difference when fetching NARs. Creating a subprocess has a fixed overhead regardless of the size of the NAR, which explains why we see the largest performance difference when fetching tiny NARs since the overhead of creating a subprocess would dominate the response time.
This may also affect throughput for serving large NAR files by adding unnecessary memory copies and buffering as part of streaming the subprocess output.
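To see why a fixed subprocess cost shows up most for tiny NARs, here is a back-of-the-envelope model. All numbers are assumed for illustration, not measurements:

```haskell
-- Toy cost model with assumed numbers (not measurements): total response
-- time = fixed per-request overhead + size / throughput.

totalMs :: Double -> Double -> Double
totalMs overheadMs sizeMB = overheadMs + sizeMB / throughputMBperMs
  where
    throughputMBperMs = 1.0  -- assumed streaming throughput

main :: IO ()
main = do
    -- Assumed: 5 ms to spawn a subprocess vs 0.2 ms for a direct API call.
    print (totalMs 5 0  / totalMs 0.2 0)   -- empty NAR: 25.0, overhead dominates
    print (totalMs 5 10 / totalMs 0.2 10)  -- 10 MB NAR: ~1.47, overhead is noise
```

Under these assumed numbers the fixed overhead is a 25× slowdown for an empty NAR but only a modest factor for a 10 MB one, which is the shape the benchmark table above shows.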
We minimize memory copies when fetching NARs
We go to great lengths to minimize the number of intermediate buffers and copies when streaming the contents of a NAR to a client. To do this, we exploit the fact that Haskell’s foreign function interface works in both directions: Haskell code can call C++ code but also C++ code can call Haskell code. This means that we can create a Nix C++ streaming sink from a Haskell callback function and this eliminates the need for intermediate buffers.
This likely also improves the throughput for serving NAR files. Only
nix-serve-ng performs this optimization (since
nix-serve-ng is the only implementation that uses the C++ API for streaming NAR contents).
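The both-directions trick can be demonstrated with GHC's FFI alone: a "wrapper" import turns a Haskell callback into a C function pointer, and a "dynamic" import invokes such a pointer, standing in here for the C++ side feeding chunks to our sink. This is a self-contained illustration, not the real Nix bindings:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

-- Self-contained illustration (not the real Nix bindings) of Haskell's
-- FFI working in both directions.

import Data.IORef (modifyIORef', newIORef, readIORef)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Turn a Haskell callback into a C-callable function pointer.
foreign import ccall "wrapper"
    mkSink :: (Int -> IO ()) -> IO (FunPtr (Int -> IO ()))

-- Invoke a C function pointer from Haskell (stands in for the C++ side
-- calling our sink with each chunk of NAR data).
foreign import ccall "dynamic"
    callSink :: FunPtr (Int -> IO ()) -> Int -> IO ()

main :: IO ()
main = do
    total <- newIORef (0 :: Int)

    -- The sink consumes each chunk directly; no intermediate buffers.
    sink <- mkSink (\chunkSize -> modifyIORef' total (+ chunkSize))

    -- Pretend the C++ side streams three chunks through the callback.
    mapM_ (callSink sink) [4096, 4096, 1024]

    freeHaskellFunPtr sink
    readIORef total >>= print  -- prints 9216
```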
Hand-write the API routing logic
We hand-write all of the API routing logic to prioritize and optimize the hot path (fetching NAR info).
For example, a really simple thing that the original
nix-serve does inefficiently is to check if the path matches
/nix-cache-info first, even though that is an extremely infrequently used path. In our API routing logic we move that check straight to the very end.
These optimizations likely improve the performance of NAR info requests. As far as I can tell, only
nix-serve-ng performs these optimizations.
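As an illustration of ordering route checks by frequency (a sketch with made-up route names, not the actual routing code):

```haskell
-- Sketch of frequency-ordered routing: the hot path (NAR info lookups)
-- is matched first, and the rare /nix-cache-info path is checked last.

import Data.List (isPrefixOf, isSuffixOf)

data Route = NarInfo String | Nar String | CacheInfo | NotFound
    deriving (Eq, Show)

route :: String -> Route
route path
    | ".narinfo" `isSuffixOf` path = NarInfo path  -- hot path: checked first
    | "/nar/" `isPrefixOf` path    = Nar path
    | path == "/nix-cache-info"    = CacheInfo     -- rare: checked last
    | otherwise                    = NotFound

main :: IO ()
main = mapM_ (print . route)
    ["/abcd.narinfo", "/nar/abcd.nar", "/nix-cache-info"]
```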
I have not benchmarked the performance impact of each of these changes in isolation, though. These observations are purely based on my intuition.
nix-serve-ng is not all upsides. In particular,
nix-serve-ng is missing features that some of the other
rewrites provide, such as:
- Greater configurability
- Improved authentication support
- Monitoring/diagnostics/status APIs
Our focus was entirely on scalability, so the primary reason to use
nix-serve-ng is if you prioritize performance and
reliability.
We’ve been using
nix-serve-ng long enough internally
that we feel confident endorsing its use outside our company. We run a
particularly large Nix deployment internally (which is why we needed
this in the first place), so we have stress tested
nix-serve-ng considerably under heavy and realistic usage.
You can get started by following these instructions, and let us know if you run into any issues or difficulties.
Also, I want to thank Arista Networks for graciously sponsoring our team to work on and open source this project.