How to use NixOS for lightweight integration tests
If you use Nix in some capacity then you should check out the NixOS integration test system, which provides an easy way to test services that run inside one or more QEMU virtual machines.
NixOS tests are (in my opinion) way ahead of other integration test systems, but the only way to properly illustrate their value is to walk through a real-world example to highlight their salient features.
The motivating example
This post will motivate NixOS tests by using them to detect an error in the official postgrest tutorial.
You can skim the above tutorial to get a sense of the steps involved, but I’ll also summarize them here:
Run postgres in a docker container
Download postgrest
Set up the database by running these commands:
create schema api;
create table api.todos (
  id serial primary key,
  done boolean not null default false,
  task text not null,
  due timestamptz
);
insert into api.todos (task) values
  ('finish tutorial 0'), ('pat self on back');
create role web_anon nologin;
grant usage on schema api to web_anon;
grant select on api.todos to web_anon;
create role authenticator noinherit login password 'mysecretpassword';
grant web_anon to authenticator;
Save the following configuration to tutorial.conf:
db-uri = "postgres://authenticator:mysecretpassword@localhost:5433/postgres"
db-schema = "api"
db-anon-role = "web_anon"
Run ./postgrest tutorial.conf
Check that it’s working using:
$ curl http://localhost:3000/todos
… which should return:
[
  {
    "id": 1,
    "done": false,
    "task": "finish tutorial 0",
    "due": null
  },
  {
    "id": 2,
    "done": false,
    "task": "pat self on back",
    "due": null
  }
]
These are quite a few manual steps, and if I were a postgrest maintainer then it would be a pain to check that they still work for every new software release. In practice, most maintainers write and check a tutorial once and then never check again unless users report errors. This is a shame, because one of the most important functions of a tutorial is to inspire confidence:
Make sure that your tutorial works
One of your jobs as a tutor is to inspire the beginner’s confidence: in the software, in the tutorial, in the tutor and, of course, in their own ability to achieve what’s being asked of them.
There are many things that contribute to this. A friendly tone helps, as does consistent use of language, and a logical progression through the material. But the single most important thing is that what you ask the beginner to do must work. The learner needs to see that the actions you ask them to take have the effect you say they will have.
If the learner’s actions produce an error or unexpected results, your tutorial has failed - even if it’s not your fault. When your students are there with you, you can rescue them; if they’re reading your documentation on their own you can’t - so you have to prevent that from happening in advance. This is without doubt easier said than done.
Fortunately, we can codify the manual steps from the tutorial into a NixOS configuration for a virtual machine, which is a declarative specification of our system’s desired state:
# ./postgrest-tutorial.nix
let
  # For extra determinism
  nixpkgs =
    builtins.fetchTarball {
      url = "https://github.com/NixOS/nixpkgs/archive/58f9c4c7d3a42c912362ca68577162e38ea8edfb.tar.gz";
      sha256 = "1517dy07jf4zhzknqbgm617lgjxsn7a6k1vgq61c67f6h55qs5ij";
    };
  # Single source of truth for all tutorial constants
  database = "postgres";
  schema = "api";
  table = "todos";
  username = "authenticator";
  password = "mysecretpassword";
  webRole = "web_anon";
  nixos =
    import "${nixpkgs}/nixos" {
      system = "x86_64-linux";
      configuration = { config, pkgs, ... }: {
        # Open the default port for `postgrest` in the firewall
        networking.firewall.allowedTCPPorts = [ 3000 ];
        services.postgresql = {
          enable = true;
          initialScript = pkgs.writeText "initialScript.sql" ''
            create schema ${schema};
            create table ${schema}.${table} (
              id serial primary key,
              done boolean not null default false,
              task text not null,
              due timestamptz
            );
            insert into ${schema}.${table} (task) values
              ('finish tutorial 0'), ('pat self on back');
            create role ${webRole} nologin;
            grant usage on schema ${schema} to ${webRole};
            grant select on ${schema}.${table} to ${webRole};
            create role ${username} noinherit login password '${password}';
            grant ${webRole} to ${username};
          '';
        };
        users = {
          mutableUsers = false;
          users = {
            # For ease of debugging the VM as the `root` user
            root.password = "";
            # Create a system user that matches the database user so that we
            # can use peer authentication. The tutorial defines a password,
            # but it's not necessary.
            "${username}".isSystemUser = true;
          };
        };
        systemd.services.postgrest = {
          wantedBy = [ "multi-user.target" ];
          after = [ "postgresql.service" ];
          script =
            let
              configuration = pkgs.writeText "tutorial.conf" ''
                db-uri = "postgres://${username}:${password}@localhost:${toString config.services.postgresql.port}/${database}"
                db-schema = "${schema}"
                db-anon-role = "${username}"
              '';
            in
              ''
                ${pkgs.haskellPackages.postgrest}/bin/postgrest ${configuration}
              '';
          serviceConfig.User = username;
        };
        # Uncomment the next line for running QEMU on a non-graphical system
        # virtualisation.graphics = false;
      };
    };
in
  nixos.vm
We can then build and run this tutorial virtual machine by running the following commands:
$ nix build --file ./postgrest-tutorial.nix
$ QEMU_NET_OPTS='hostfwd=tcp::3000-:3000' result/bin/run-nixos-vm
That spins up a VM and prompts us to log in when the VM is ready:
<<< Welcome to NixOS 20.09pre-git (x86_64) - ttyS0 >>>
Run 'nixos-help' for the NixOS manual.
nixos login:
However, before we log in, we can test if postgrest is working using the same curl command from the tutorial:
$ curl http://localhost:3000/todos
{"hint":null,"details":null,"code":"42501","message":"permission denied for schema api"}
Wait, what? We were supposed to get:
[
  {
    "id": 1,
    "done": false,
    "task": "finish tutorial 0",
    "due": null
  },
  {
    "id": 2,
    "done": false,
    "task": "pat self on back",
    "due": null
  }
]
… but apparently something is wrong with the database’s permissions.
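The error body is itself JSON, so the two outcomes are easy to tell apart programmatically: a success is a list of rows, while a failure is an object with a code field. Here is a minimal Python sketch of that distinction. It is not part of the tutorial, the describe_response helper is a hypothetical name, and it relies only on 42501 being PostgreSQL's SQLSTATE for insufficient_privilege:

```python
import json

# SQLSTATE 42501 is PostgreSQL's "insufficient_privilege" error code.
INSUFFICIENT_PRIVILEGE = "42501"


def describe_response(body: str) -> str:
    """Classify a response body as rows, a permission error, or another error.

    Hypothetical helper for illustration; it only inspects the JSON shape.
    """
    parsed = json.loads(body)
    if isinstance(parsed, list):
        return f"ok: {len(parsed)} rows"
    if parsed.get("code") == INSUFFICIENT_PRIVILEGE:
        return f"permission error: {parsed['message']}"
    return f"error: {parsed.get('message')}"


# The exact body returned by the curl command above
error_body = '{"hint":null,"details":null,"code":"42501","message":"permission denied for schema api"}'
print(describe_response(error_body))
```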
Fortunately, we can log into the VM as the root user with an empty password to test the database permissions. Once we log into the system we can further log into the database as the authenticator user:
<<< Welcome to NixOS 20.09pre-git (x86_64) - ttyS0 >>>
Run 'nixos-help' for the NixOS manual.
nixos login: root<Enter>
Password: <Enter>
[root@nixos:~]# sudo --user authenticator psql postgres
psql (11.9)
Type "help" for help.
postgres=>
Now we can test to see if the authenticator user is able to access the api.todos table:
postgres=> SELECT * FROM api.todos;
ERROR: permission denied for schema api
LINE 1: SELECT * FROM api.todos;
Good: we can reproduce the problem, but what might be the cause?
As it turns out, the tutorial instructions appear to not configure the authenticator role correctly. Specifically, the noinherit in the following commands is the reason we can’t directly access the api schema:
create role authenticator noinherit login password 'mysecretpassword';
grant web_anon to authenticator;
The noinherit setting prevents the authenticator user from automatically assuming all permissions associated with the web_anon role. Instead, the authenticator user has to explicitly use the SET ROLE command to assume such permissions, and we can verify that at the database prompt:
postgres=> SET ROLE web_anon;
SET
postgres=> SELECT * FROM api.todos;
 id | done |       task        | due
----+------+-------------------+-----
  1 | f    | finish tutorial 0 |
  2 | f    | pat self on back  |
(2 rows)
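To make the difference between noinherit and inherit concrete, here is a toy Python model of the behavior we just observed. It is a deliberately simplified sketch, not how PostgreSQL implements role membership. The Role class and both functions are hypothetical names, and privileges are modeled as plain strings:

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    inherit: bool = True
    privileges: set = field(default_factory=set)
    member_of: list = field(default_factory=list)


def effective_privileges(role: Role) -> set:
    """Privileges usable without SET ROLE.

    A role always has its own privileges. It only picks up the privileges
    of the roles it is a member of when it has the inherit attribute.
    """
    result = set(role.privileges)
    if role.inherit:
        for parent in role.member_of:
            result |= effective_privileges(parent)
    return result


def can_set_role(role: Role, target: Role) -> bool:
    """Simplified rule: SET ROLE works for roles that `role` is a direct member of."""
    return target in role.member_of


web_anon = Role(
    "web_anon",
    privileges={"usage on schema api", "select on api.todos"},
)
noinherit_auth = Role("authenticator", inherit=False, member_of=[web_anon])
inherit_auth = Role("authenticator", inherit=True, member_of=[web_anon])

# With noinherit there are no usable privileges until SET ROLE web_anon
print(effective_privileges(noinherit_auth))
print(can_set_role(noinherit_auth, web_anon))
# With inherit the privileges of web_anon are available directly
print(effective_privileges(inherit_auth))
```

In this model the noinherit role has an empty privilege set but is still allowed to switch to web_anon, which matches the psql session above.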
Mystery solved! We can test our hypothesis by changing that noinherit to inherit:
create role authenticator inherit login password 'mysecretpassword';
grant web_anon to authenticator;
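… then we can delete the nixos.qcow2 disk image left behind by the previous run (so that the database is initialized from a blank slate), rebuild and restart the VM using the same two commands as before, and now the curl example from the tutorial works: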
$ curl http://localhost:3000/todos
[{"id":1,"done":false,"task":"finish tutorial 0","due":null},
{"id":2,"done":false,"task":"pat self on back","due":null}]
But wait, there’s more!
Automated testing
We don’t have to manually set up and tear down VMs and run curl commands. We can automate the entire process from end to end by using NixOS’s support for automated integration tests.
If we follow the instructions from the NixOS manual, then the automated integration test looks like this:
# ./postgrest-tutorial.nix
let
  # For extra determinism
  nixpkgs =
    builtins.fetchTarball {
      url = "https://github.com/NixOS/nixpkgs/archive/58f9c4c7d3a42c912362ca68577162e38ea8edfb.tar.gz";
      sha256 = "1517dy07jf4zhzknqbgm617lgjxsn7a6k1vgq61c67f6h55qs5ij";
    };
  # Single source of truth for all tutorial constants
  database = "postgres";
  schema = "api";
  table = "todos";
  username = "authenticator";
  password = "mysecretpassword";
  webRole = "web_anon";
  postgrestPort = 3000;
in
  import "${nixpkgs}/nixos/tests/make-test-python.nix" ({ pkgs, ...}: {
    system = "x86_64-linux";
    nodes = {
      server = { config, pkgs, ... }: {
        # Open the default port for `postgrest` in the firewall
        networking.firewall.allowedTCPPorts = [ postgrestPort ];
        services.postgresql = {
          enable = true;
          initialScript = pkgs.writeText "initialScript.sql" ''
            create schema ${schema};
            create table ${schema}.${table} (
              id serial primary key,
              done boolean not null default false,
              task text not null,
              due timestamptz
            );
            insert into ${schema}.${table} (task) values
              ('finish tutorial 0'), ('pat self on back');
            create role ${webRole} nologin;
            grant usage on schema ${schema} to ${webRole};
            grant select on ${schema}.${table} to ${webRole};
            create role ${username} inherit login password '${password}';
            grant ${webRole} to ${username};
          '';
        };
        users = {
          mutableUsers = false;
          users = {
            # For ease of debugging the VM as the `root` user
            root.password = "";
            # Create a system user that matches the database user so that we
            # can use peer authentication. The tutorial defines a password,
            # but it's not necessary.
            "${username}".isSystemUser = true;
          };
        };
        systemd.services.postgrest = {
          wantedBy = [ "multi-user.target" ];
          after = [ "postgresql.service" ];
          script =
            let
              configuration = pkgs.writeText "tutorial.conf" ''
                db-uri = "postgres://${username}:${password}@localhost:${toString config.services.postgresql.port}/${database}"
                db-schema = "${schema}"
                db-anon-role = "${username}"
              '';
            in
              ''
                ${pkgs.haskellPackages.postgrest}/bin/postgrest ${configuration}
              '';
          serviceConfig.User = username;
        };
        # Uncomment the next line for running QEMU on a non-graphical system
        # virtualisation.graphics = false;
      };
      client = { };
    };
    testScript =
      ''
        import json
        import sys
        start_all()
        server.wait_for_open_port(${toString postgrestPort})
        expected = [
            {"id": 1, "done": False, "task": "finish tutorial 0", "due": None},
            {"id": 2, "done": False, "task": "pat self on back", "due": None},
        ]
        actual = json.loads(
            client.succeed(
                "${pkgs.curl}/bin/curl http://server:${toString postgrestPort}/${table}"
            )
        )
        if expected != actual:
            sys.exit(1)
      '';
  })
… and you can run the test with the following command:
$ nix build --file ./postgrest-tutorial.nix
… which will silently succeed with a 0 exit code if the test passes, or fail with an error message otherwise.
The above example highlights a few neat aspects of the NixOS test framework:
You can test more than one VM at a time
The above test creates two VMs: a server VM that runs postgres and postgrest, and a client VM from which we issue the curl request, so that we can verify that everything works even when curl is run from a separate machine. For example, this comes in handy for testing firewall rules.
You can write the test and orchestration logic in Python
This means that we can use Python not only to run the curl subprocess, but also to compare the result against a golden JSON output.
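Because the comparison is ordinary Python, it can also report what differed instead of only exiting with a non-zero status. The following is a minimal sketch of one way to do that using only the standard library. The golden_diff helper is a hypothetical name and is not part of the NixOS test driver:

```python
import difflib
import json


def golden_diff(expected, actual) -> str:
    """Return a unified diff between two JSON values, or "" if they are equal."""
    if expected == actual:
        return ""

    def render(value):
        # Pretty-print with sorted keys so the diff is stable and line-based
        return json.dumps(value, indent=2, sort_keys=True).splitlines()

    return "\n".join(
        difflib.unified_diff(
            render(expected),
            render(actual),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


expected = [
    {"id": 1, "done": False, "task": "finish tutorial 0", "due": None},
    {"id": 2, "done": False, "task": "pat self on back", "due": None},
]
# The error body from earlier in this post, standing in for a failing run
actual = json.loads(
    '{"hint":null,"details":null,"code":"42501","message":"permission denied for schema api"}'
)

diff = golden_diff(expected, actual)
if diff:
    print(diff)
```

Inside a test script, one could print this diff before failing so that the build log shows the mismatch.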
Conclusion
This NixOS test framework is streets ahead of other integration test frameworks that I’ve worked with:
The test is deterministic
The above example will continue to work a decade from now because all transitive dependencies are fully pinned by the NixOS specification.
The test is reproducible
We don’t need to specify out-of-band instructions for how to obtain or install test dependencies. The only thing users globally install is Nix.
The test is compact
The whole thing fits in a single 120-line file with generous whitespace and formatting (although you have the option of splitting it into more files if you prefer).
The test is fully isolated
The test does not mutate any shared resources or files and the test runs within an isolated network, so we can run multiple integration tests in parallel on the same machine for building a test matrix.
The test is fast
You might think that a VM-based test is slow compared to a container-based one, but the entire test run, including VM setup and teardown, only takes about 10 seconds.
The test is written in a fully-featured language
We can use Nix’s support for programming language features to reduce repetition. For example, this is why we can consolidate all test constants to be defined in one place so that there is a single source of truth for everything.
So if you’re already trying out Nix, I highly encourage you to give the NixOS integration test framework a try for the above reasons.