Saturday, January 28, 2017

Typed Nix programming using Dhall

I recently released a typed and total configuration language named Dhall that you can use to configure Haskell programs. However, Dhall would be more useful if you could configure other programming languages, like Nix.

Nix powers the Nix package manager and the Nix operating system and if you've never used Nix before then please give Nix a try. We use Nix heavily at Awake Networks for managing our deployments but one of our biggest complaints about Nix is the type system:

  • Nix expressions cannot be annotated with types to guide the user
  • You can't use sum types to make invalid states unrepresentable
  • Type errors are not very helpful and poorly located
  • Many Nix builtins and idioms are inherently difficult to statically analyze

To mitigate this problem I contributed a new Dhall-to-Nix compiler that lets you carve out a restricted subset of Nix with a more user-friendly type system.

This post covers:

  • How to use this in your own Nix projects
  • Details of the translation process from Dhall to Nix
  • Benefits and drawbacks of using Dhall to program Nix

User interface

I began by creating a Dhall to Nix compiler that can translate arbitrary Dhall expressions to equivalent Nix expressions. This compiler is not limited to simple values like records or literals: you can even compile Dhall functions to Nix functions.

The compiler takes a Dhall expression on standard input and emits the corresponding Nix expression on standard output:

$ dhall-nix <<< "{ foo = 1, bar = True }"
{ bar = true; foo = 1; }
$ dhall-nix <<< "λ(x : Bool) → x == False"
x: x == false

However, you do not need to install or manually run the compiler to use Dhall within your Nix project. I went a step further and added a dhallToNix utility to nixpkgs which automatically converts Dhall expressions to Nix expressions for you. This utility automatically bootstraps and caches the dhall-to-nix compiler as part of the evaluation process.

Here's an example of how you can use the dhallToNix function to embed arbitrary Dhall expressions inside of Nix:

let
  pkgs = import <nixpkgs> { };

  inherit (pkgs) dhallToNix;

in
  { example0 = dhallToNix "λ(x : Bool) → x == False" false;
    example1 = (dhallToNix "{ foo = 1, bar = True }").foo;
    example2 = dhallToNix "List/reverse Integer" [1 2 3];
  }

... and that will evaluate to:

{ example0 = true;
  example1 = 1;
  example2 = [3 2 1];
}

You can find a larger test suite here exercising all the features of the Dhall language by compiling them to Nix.

Compilation

The Dhall language is heavily inspired by Nix, so many Dhall language features translate mechanically to Nix language features. For example:

  • Dhall's records translate to Nix's records
  • Primitive Dhall values (like integers) translate to primitive Nix values
  • Dhall's anonymous functions translate to Nix's anonymous functions

... and so on

However, Dhall does some things differently from Nix, and these differences fell into four categories:

  • Dhall supports anonymous sum types, while Nix does not
  • Dhall provides a different set of built-in functions from Nix
  • Dhall is explicitly typed and uses type abstraction and type application
  • Dhall supports floating point numbers, while Nix does not

Compiling sum types was pretty simple: just Church-encode them. For example, a sum type like this:

< Left = 2 | Right : Bool >

... translates into the following Nix expression:

{ Left, Right }: Left 2

In other words, you can encode a sum type as a "pre-formed pattern match", which you consume in Nix by providing one function per alternative to handle:

dhallToNix "< Left = | Right : Bool >" {
  Left  = n : n == 0;  # Handler for the `Left`  case
  Right = b : b;       # Handler for the `Right` case
}

The dhallToNix invocation evaluates to:

({ Left, Right }: Left 2) {
  Left  = n : n == 0;
  Right = b : b;
}

... which in turn reduces to:

(n : n == 0) 2

... which in turn reduces to:

false

Built-in functions which were missing in Nix required the most effort. I had to translate them to efficient implementations based on other Nix-builtins. For example, Dhall's List/reverse primitive uses Nix's builtins.genList and builtins.elemAt under the hood:

dhall-to-nix <<< "List/reverse"
t: xs: let
      n = builtins.length xs;
      in builtins.genList (i:
      builtins.elemAt xs (n - i - 1)) n

The most complex builtin to translate was Dhall's (∧) operation for merging records, which is similar to Nix's (//) operator except that (∧) also merges children recursively:

$ dhall-to-nix <<< "λ(x : { foo : Bool }) → λ(y : { bar : Text }) → x ∧ y"
x: y: let
      combine = kvsL: kvsR: let
            ks = builtins.attrNames kvsL ++ builtins.attrNames kvsR;
            toKeyVals = k:
            if builtins.hasAttr k kvsL
                then if builtins.hasAttr k kvsR
                  then if builtins.isAttrs (builtins.getAttr k kvsL) && builtins.isAttrs (builtins.getAttr k kvsR)
                    then [
                      {
                        name = k;
                        value = combine (builtins.getAttr k kvsL) (builtins.getAttr k kvsR);
                      }
                    ]
                    else [
                      {
                        name = k;
                        value = builtins.getAttr k kvsL;
                      }
                    ]
                  else [
                    {
                      name = k;
                      value = builtins.getAttr k kvsL;
                    }
                  ]
                else if builtins.hasAttr k kvsR
                  then [
                    {
                      name = k;
                      value = builtins.getAttr k kvsR;
                    }
                  ]
                  else [];
            in builtins.listToAttrs (builtins.concatLists (map toKeyVals ks));
      in combine x y

The last tricky part was translating Dhall's explicit type abstraction and type application. "Explicit type abstraction" means that polymorphic (or "generic") functions in Dhall are encoded as ordinary functions that take a type as a function argument. For example, this is how you encode the polymorphic identity function in Dhall:

λ(a : Type) → λ(x : a) → x

"Explicit type application" means that you specialize polymorphic functions by applying them to a type argument specifying which type to use. For example, this is how you use the polymorphic identity function:

$ echo "λ(a : Type) → λ(x : a) → x" > id
$ dhall <<< "./id Integer 4"
Integer

4
$ dhall <<< "./id Integer"  # Just specialize the function(x : Integer)Integer

λ(x : Integer)x

dhall-to-nix translates polymorphic functions to functions that just ignore their type argument. For example, the polymorphic identity function becomes:

$ dhall-to-nix <<< "./id"
a : x : x

The first argument named a is the type and the corresponding Nix function still expects the argument but never uses it.

For type application, dhall-to-nix translate all types to an empty Nix record:

$ dhall-to-nix <<< "Integer"
{}

... which is then ignored by any polymorphic function:

$ dhall-to-nix <<< "./id"
a : x : x
$ dhall-to-nix <<< "./id Integer"
x : x
$ dhall-to-nix <<< "./id Integer 4"
4

Some Dhall built-in functions are also polymorphic, and we treat them the same way. For example, the List/reverse function is polymorphic, which is why the first argument in the corresponding Nix expression is an unused type argument named t:

$ dhall-to-nix <<< "List/reverse"
t: xs: let
      n = builtins.length xs;
      in builtins.genList (i:
      builtins.elemAt xs (n - i - 1)) n
$ dhall-to-nix <<< "List/reverse Integer"
(t: xs: let
      n = builtins.length xs;
      in builtins.genList (i:
      builtins.elemAt xs (n - i - 1)) n) {}
$ dhall-to-nix <<< "List/reverse Integer ([1, 2, 3] : List Integer)"
[3 2 1]

Finally, floating point numbers are not supported in Nix at all, so the dhall-to-nix compiler must reject Double values:

$ dhall-to-nix <<< "1.0"

Error: No doubles

Explanation: Dhall values of type ❰Double❱ cannot be converted to Nix
expressions because the Nix language provides no way to represent floating point
values

You provided the following value:

↳ 1.0

... which cannot be translated to Nix

Benefits

When first learning Nix, particularly NixOS, you'll frequently run into the issue where you're not sure what values you're expected to provide due to the lack of a type system. Dhall can fix that in several ways:

  • You can request the inferred types of functions so that you know what type of function argument you need to supply
  • You can also provide users with a "schema" for an expected value, which in Dhall is just an ordinary type annotation pointing to a remote path
  • You can replaced weakly typed values (like strings) with more strongly typed representations

The following example will illustrate all of the above points

For example a derivation in Nix can be minimally represented in Dhall as the following type:

{ name : Text, builder : Text, system : Text }

... which you can save to a file named derivation and use to check if other expressions match the expected type:

$ echo "{ name : Text, builder : Text, system : Text }" > derivation
$ dhall <<EOF
{ name = "empty"

  -- Dhall supports Nix-style multi-line string literals, too
, builder = ''
    touch $out
  ''

, system = "x86_64-linux"
} : ./derivation
EOF
{ builder : Text, name : Text, system : Text }

{ builder = "\ntouch $out\n\n", name = "empty", system = "x86_64-linux" }

If we mistype a field name the type annotation will flag the error:

$ dhall <<EOF
{ name = "empty", builder = "touch $out", sytem = "x86_64-linux" }
: ./derivation
EOF
Use "dhall --explain" for detailed errors

Error: Expression doesn't match annotation

... and we can ask for more detailed error messages:

$ dhall --explain <<EOF
{ name = "empty", builder = "touch $out", sytem = "x86_64-linux" }
: ./derivation
EOF

Error: Expression doesn't match annotation

Explanation: You can annotate an expression with its type or kind using the
❰:❱ symbol, like this:


    ┌───────┐
    │ x : t │  ❰x❱ is an expression and ❰t❱ is the annotated type or kind of ❰x❱
    └───────┘

The type checker verifies that the expression's type or kind matches the
provided annotation

For example, all of the following are valid annotations that the type checker
accepts:


    ┌─────────────┐
    │ 1 : Integer │  ❰1❱ is an expression that has type ❰Integer❱, so the type
    └─────────────┘  checker accepts the annotation


    ┌────────────────────────┐
    │ Natural/even +2 : Bool │  ❰Natural/even +2❱ has type ❰Bool❱, so the type
    └────────────────────────┘  checker accepts the annotation


    ┌────────────────────┐
    │ List : TypeType │  ❰List❱ is an expression that has kind ❰Type → Type❱,
    └────────────────────┘  so the type checker accepts the annotation


    ┌──────────────────┐
    │ List Text : Type │  ❰List Text❱ is an expression that has kind ❰Type❱, so
    └──────────────────┘  the type checker accepts the annotation


However, the following annotations are not valid and the type checker will
reject them:


    ┌──────────┐
    │ 1 : TextThe type checker rejects this because ❰1❱ does not have type
    └──────────┘  ❰Text❱


    ┌─────────────┐
    │ List : Type │  ❰List❱ does not have kind ❰Type❱
    └─────────────┘


You or the interpreter annotated this expression:

↳ { builder = "touch $out", name = "empty", sytem = "x86_64-linux" }

... with this type or kind:

↳ { builder : Text, name : Text, system : Text }

... but the inferred type or kind of the expression is actually:

↳ { builder : Text, name : Text, sytem : Text }

Some common reasons why you might get this error:

● The Haskell Dhall interpreter implicitly inserts a top-level annotation
  matching the expected type

  For example, if you run the following Haskell code:


    ┌───────────────────────────────┐
    │ >>> input auto "1" :: IO Text │
    └───────────────────────────────┘


  ... then the interpreter will actually type check the following annotated
  expression:


    ┌──────────┐
    │ 1 : Text │
    └──────────┘


  ... and then type-checking will fail

────────────────────────────────────────────────────────────────────────────────

We can also take advantage of the fact that Dhall supports sum types so that we can make invalid states unrepresentable. For example, the system field really shouldn't be Text because not all strings are valid systems.

We can fix this by first creating a type more accurately representing all supported platforms. First, we just need to create a sum type for all supported operating systems:

$ dhall > OperatingSystem <<EOF
< cygwin  : {}
| darwin  : {}
| linux   : {}
| netbsd  : {}
| openbsd : {}
| solaris : {}
>
EOF

... along with helper utilities to build each operating system:

$ dhall > linux <<EOF
< linux   = {=}
| cygwin  : {}
| darwin  : {}
| netbsd  : {}
| openbsd : {}
| solaris : {}
>

Then we can create a sum type for each supported architecture:

$ dhall > Architecture <<EOF
< aarch64  : {}
| armv5tel : {}
| armv6l   : {}
| armv7l   : {}
| i686     : {}
| mips64el : {}
| x86_64   : {}
>
EOF

... as well as helper constructors for each architecture type:

$ dhall > x86_64 <<EOF
< x86_64   = {=}
| aarch64  : {}
| armv5tel : {}
| armv6l   : {}
| armv7l   : {}
| i686     : {}
| mips64el : {}
>
EOF

... and then we can create a type for supported Nix platforms:

$ dhall > Platform <<EOF
{ operatingSystem : ./OperatingSystem
, architecture    : ./Architecture
}
EOF

... and helper utilities for each platform:

$ dhall > x86_64-linux <<EOF
{ architecture    = ./x86_64
, operatingSystem = ./linux
}
EOF

... and verify that they type-check against our Platform type:

$ dhall <<< "./x86_64-linux : ./Platform"
{ architecture : < aarch64 : {} | armv5tel : {} | armv6l : {} | armv7l : {} | i686 : {} | mips64el : {} | x86_64 : {} >, operatingSystem : < cygwin : {} | darwin : {} | linux : {} | netbsd : {} | openbsd : {} | solaris : {} > }

{ architecture = < x86_64 = {=} | aarch64 : {} | armv5tel : {} | armv6l : {} | armv7l : {} | i686 : {} | mips64el : {} >, operatingSystem = < linux = {=} | cygwin : {} | darwin : {} | netbsd : {} | openbsd : {} | solaris : {} > }

Then we can always add a Nix translation layer that converts the strongly typed Dhall version to the weakly typed Nix string:

# platform.nix

let
  pkgs = import <nixpkgs> { };

  inherit (pkgs) dhallToNix;

  architectureToText =
    architecture:
      architecture {
        aarch64  = _ : "aarch64" ;
        armv5tel = _ : "armv5tel";
        armv6l   = _ : "armv6l"  ;
        armv7l   = _ : "armv7l"  ;
        i686     = _ : "i686"    ;
        mips64el = _ : "mips64el";
        x86_64   = _ : "x86_64"  ;
      };

  operatingSystemToText =
    operatingSystem:
      operatingSystem {
        cygwin  = _ : "cygwin" ;
        darwin  = _ : "darwin" ;
        linux   = _ : "linux"  ;
        netbsd  = _ : "netbsd" ;
        openbsd = _ : "openbsd";
        solaris = _ : "solaris";
      };

  platformToText =
    {architecture, operatingSystem}:
      let
        arch = architectureToText    architecture   ;
        os   = operatingSystemToText operatingSystem;
      in
        "${arch}-${os}";
in
  platformToText (dhallToNix "${./x86_64-linux} : ${./Platform}")

... which would type-check our ./x86_64-linux file against the ./Platform and return the following result:

"x86_64-linux"

We can even go a step further and implement the intermediate functions in Dhall, too:

let
  pkgs = import <nixpkgs> { };

  inherit (pkgs) dhallToNix;

  platformToText =
    dhallToNix ''
        let architectureToText
            =   λ(x : ${./Architecture} )merge
                { aarch64  = λ(_ : {}) → "aarch64"
                , armv5tel = λ(_ : {}) → "armv5tel"
                , armv6l   = λ(_ : {}) → "armv6l"
                , armv7l   = λ(_ : {}) → "armv7l"
                , i686     = λ(_ : {}) → "i686"
                , mips64el = λ(_ : {}) → "mips64el"
                , x86_64   = λ(_ : {}) → "x86_64"
                }
                x : Text
    in  let operatingSystemToText
            =   λ(x : ${./OperatingSystem} )merge
                { cygwin  = λ(_ : {}) → "cygwin"
                , darwin  = λ(_ : {}) → "darwin"
                , linux   = λ(_ : {}) → "linux"
                , netbsd  = λ(_ : {}) → "netbsd"
                , openbsd = λ(_ : {}) → "openbsd"
                , solaris = λ(_ : {}) → "solaris"
                }
                x : Text
    in  let platformToText
            =   λ(x : ${./Platform} )architectureToText    x.architecture
            ++  "-"
            ++  operatingSystemToText x.operatingSystem

    in  platformToText
    '';
in
  platformToText (dhallToNix "${./x86_64-linux} : ${./Platform}")

However, in practice you'd like to keep the platform expression as the original strongly typed record instead of converting the platform to a string. The original record lets you more easily extract the architecture and operating system fields and make decisions on their values using exhaustive pattern matching.

Drawbacks

The largest drawback of using Dhall to program Nix is that Dhall cannot encode many common idioms used in nixpkgs. Some examples of idioms that do not translate well to Nix are:

  • The callPackage idiom that nixpkgs uses very heavily for easily updating dependencies. This relies on reflection and recursive fixpoints, neither of which Dhall supports
  • Anything which uses builtins.listToAttrs or does reflection on record field names

I don't expect Dhall to be used at all in nixpkgs, but I do believe Dhall can benefit end users or companies for their own internal Nix projects.

Conclusion

The dhallToNix utility is available right now in the nixpkgs-unstable channel if you would like to try this out in your own project:

Also, if you would like to use the dhall-to-nix compiler for other purposes you can find the compiler on Hackage or Github:

If you're new to Dhall you can learn more about the configuration language by reading the Dhall tutorial.