Monday, November 27, 2017

Compare Nix derivations using nix-diff

I'm announcing a small nix-diff utility I wrote for comparing Nix derivations. This post will walk through two use cases for how you might use this utility.


This section provides some required background for understanding this post if you're new to Nix.

There are three stages to a Nix build:

  • Nix source code (i.e. *.nix files)
    • This corresponds to a source distribution in a typical package manager
  • Nix derivations (i.e. /nix/store/*.drv files)
    • This is the stage that caching works at
  • Nix build products (i.e. /nix/store/* files that are not derivations)
    • This corresponds to a binary distribution in a typical package manager

You can convert between these stages using the following command-line tools:

  • nix-instantiate converts Nix source code to Nix derivations
    • i.e. *.nix → /nix/store/*.drv
  • nix-store --realise converts Nix derivations to Nix build products
    • i.e. /nix/store/*.drv → /nix/store/*
  • nix-build is a convenience utility which combines the two preceding steps to go straight from source code to build products
    • i.e. *.nix → /nix/store/*

Nix supports caching binary build products so if you try to build the same derivation twice then the second build will reuse the result of the first build (i.e. a "cache hit"). If the derivation changes in any way, you get a "cache miss" and you need to build the derivation.

Carefully note that caching works at the level of Nix derivations and not at the level of Nix source code. For example, the following two Nix files differ at the source code level:

$ cat example0.nix 
  pkgs = import <nixpkgs> { };


$ cat example1.nix 
(import <nixpkgs> { }).hello

... but they produce the exact same derivation file:

$ nix-instantiate example0.nix 

$ nix-instantiate example1.nix 

... which means that if you try to build both example0.nix and example1.nix the build will only occur once since they share the same derivation.

You can think of the derivation file as a language-independent description of how to build something:

$ fold /nix/store/ajypjz54a8rn1qxsnhyr8m87w6hd7ghp-hello-2.10.drv 
orks/CoreFoundation.framework/CoreFoundation /dev/zero /dev/random /dev/urandom 
__sandboxProfile","(allow file-read* (literal \"/usr/lib/libncurses.5.4.dylib\")
)\n(import \"/nix/store/\")\

These *.drv files use the ATerm file format and are Nix-independent. Conceptually, Nix is just a domain-specific language for generating these ATerm files. That means, for example, that you could replace Nix with any front-end language or tool that can generate these ATerm files. In fact, this is how Guix works, by replacing Nix with Guile Scheme as the front-end language.

Understanding how Nix derivations work is fundamental to understanding the Nix ecosystem. nix-diff is one tool that aids this learning process as the following sections will illustrate.

Cache misses

nix-diff is a tool that I wish I had back when Awake Security first adopted Nix. We frequently ran into cache misses when using Nix because of subtle differences in Nix derivations in different development environments.

We can understand why we got cache misses by referring back to the three stages of a Nix build:

  • Nix source code (i.e. *.nix files)
  • Nix derivations (i.e. /nix/store/*.drv files)
  • Nix build products (i.e. /nix/store/* files that are not derivations)

For production we prefer to distribute Nix build products (i.e. binary distributions), but internally for development we distribute Nix source code. We prefer Nix code internally because this gives developers complete control over all of their transitive dependencies. For example, a developer can easily patch the systemd executable used on the virtual machine that runs their integration tests.

However, this flexibility comes at a price: if you don't know what you are doing you can easily accidentally change the derivation. This is because Nix and Nixpkgs are customizable to a fault and they have all sorts of "impure" defaults that change depending on the development environment. If you trip over one of these pitfalls you end up with a cache miss, which is a poor user experience.

The most common pitfalls we ran into early on in our Nix adoption were:

  • Not pinning nixpkgs
    • Note: We publicly shared our recipe for pinning nixpkgs here
  • Not pinning the system field for a derivation
    • This field defaults to the impure builtins.currentSystem in many cases
  • Impure surprises in nixpkgs

Let's motivate this with a real example. Suppose that I have the following derivation to build the Glasgow Haskell compiler (ghc):

$ cat example0.nix
  pkgs = import <nixpkgs> { };


This Nix expression is "impure" because the expression depends on the ambient nixpkgs channel that the user has installed. Compare this to the following expression which pins nixpkgs to a specific revision protected by a hash:

$ cat example1.nix
  fetchNixpkgs = import ./fetchNixpkgs.nix;

  nixpkgs = fetchNixpkgs {
     rev = "76d649b59484607901f0c1b8f737d8376a904019";
     sha256 = "01c2f4mj4ahir0sxk9kxbymg2pki1pc9a3y6r9x6ridry75fzb8h";

  pkgs = import nixpkgs { };


Let's instantiate the two expressions to compute their derivations:

$ nix-instantiate example0.nix 
$ nix-instantiate example1.nix 

Note that you may get a different result for the first derivation depending on what version of the nixpkgs channel you have installed.

Visually comparing the two derivation files is tedious and time-consuming:

$ fold /nix/store/9shbgc70h32f99nasdd6f8fd7cf9c645-ghc-8.0.2.drv | head

$ fold /nix/store/fx0xn9djgvvw3h5jdmwybg0ga5qk844d-ghc-8.0.2.drv | head

If we use nix-diff, then we can pull out the differences immediately:

$ nix-diff /nix/store/fx0xn9djgvvw3h5jdmwybg0ga5qk844d-ghc-8.0.2.drv /nix/store/9shbgc70h32f99nasdd6f8fd7cf9c645-ghc-8.0.2.drv 
- /nix/store/fx0xn9djgvvw3h5jdmwybg0ga5qk844d-ghc-8.0.2.drv:{out}
+ /nix/store/9shbgc70h32f99nasdd6f8fd7cf9c645-ghc-8.0.2.drv:{out}The set of outputs do not match:
    + {man}The builders do not match
    - /nix/store/hsk82g493i7r496ghs0y61m6yvknxcml-bash-4.4-p5/bin/bash
    + /nix/store/axikcsz4wh2qpi5zmlfsmm4jx8wm8s1g-bash-4.4-p12/bin/bash
• The set of input names do not match:
    - bash-4.4-p5
    - clang-wrapper-3.7.1
    - coreutils-8.26
    - gmp-6.1.1
    - perl-5.22.3
    - python2.7-Sphinx-1.5.2
    + bash-4.4-p12
    + clang-wrapper-4.0.1
    + coreutils-8.28
    + gmp-6.1.2
    + perl-5.24.3
    + python2.7-Sphinx-1.6.5

Now we can see at a glance that the versions of several dependencies changed and GHC has split out its man pages into a new man output for better granularity of the build graph.

Note that these are not the only differences between the two derivations. However, all of the other differences are downstream of the above differences. For example, the two derivations have different out paths, but we expect them to differ for any two derivations that are not identical so there's no point including that in the diff. nix-diff makes an effort to highlight the root cause of the difference.

Understanding differences

Nix is more than just a package manager. You can use Nix to build and deploy an entire machine, which is how NixOS (the Nix operating system) works. The machine configuration is a Nix expression that you can instantiate and build like any other Nix expression.

This means that we can also use nix-diff to compare two machine configurations and understand how they differ. For example, when we change our production systems at Awake Security we sometimes run the change through nix-diff during code review to ensure that reviewers understand every change being made to the system.

We can illustrate this with a small example comparing two NixOS system specifications. The first system specification is a mostly blank system:

$ cat example0.nix
  nixos = import <nixpkgs/nixos> {
    system = "x86_64-linux";

    configuration = {
      boot.loader.grub.devices = [ "/dev/sda" ];

      fileSystems."/" = {
        device = "/dev/sda";


... and the second specification enables Kafka on the system:

$ cat example1.nix
  nixos = import <nixpkgs/nixos> {
    system = "x86_64-linux";

    configuration = {
      boot.loader.grub.devices = [ "/dev/sda" ];

      fileSystems."/" = {
        device = "/dev/sda";

      services.apache-kafka.enable = true;


We can differentiate the two derivations in one step like this:

$ nix-diff $(nix-instantiate example0.nix) $(nix-instantiate example1.nix)
- /nix/store/6z9nr5pzs4j1v9mld517dmlcz61zy78z-nixos-system-nixos-18.03pre119245.
+ /nix/store/k05ibijg0kknvwrgfyb7dxwjrs8qrlbj-nixos-system-nixos-18.03pre119245.
5cfd049a03.drv:{out}The input named `etc` differs
  - /nix/store/05c0v10pla0v8rfl44rs744m6wr729jy-etc.drv:{out}
  + /nix/store/8waqvzjg7bazzfzr49m89q299kz972wv-etc.drv:{out}The input named `dbus-1` differs
    - /nix/store/a16j2snzz25dhh96jriv3p6cgkc0vhxr-dbus-1.drv:{out}
    + /nix/store/mliabzdkqaayya67xiwfhwkg4gs9k0cg-dbus-1.drv:{out}The input named `system-path` differs
      - /nix/store/jcf6q7na01j8k9xcmqxykl62k4x6zwiv-system-path.drv:{out}
      + /nix/store/kh4kgsms24d02bxlrxb062pgsbs3riws-system-path.drv:{out}The set of input names do not match:
          + apache-kafka-2.12-
  • The input named `system-path` differs
    • These two derivations have already been compared
  • The input named `system-units` differs
    - /nix/store/yqnqdajd4664rvycrnwxwaj0mxp7602c-system-units.drv:{out}
    + /nix/store/2p5c4arwqphdz5wsvz6dbrgv0vhgf5qh-system-units.drv:{out}The set of input names do not match:
        + unit-apache-kafka.service
  • The input named `user-units` differs
    - /nix/store/x34dqw5y34dq6fj5brj2b5qf0nvglql9-user-units.drv:{out}
    + /nix/store/4iplnk260q2dpr8b8ajrjkrn44yk06aq-user-units.drv:{out}The input named `unit-dbus.service` differs
      - /nix/store/fd6j972zn1hfvqslxc8c64xxaf1wg475-unit-dbus.service.drv:{out}
      + /nix/store/s7rpgwbald9qx8rwlw4v276wj2x3ld8r-unit-dbus.service.drv:{out}The input named `dbus-1` differs
        • These two derivations have already been compared
• The input named `system-path` differs
  • These two derivations have already been compared
• The input named `users-groups.json` differs
  - /nix/store/x6c7pqx40wfdzwf96jfi1l0hzxjgypri-users-groups.json.drv:{out}
  + /nix/store/gk5yyjw579hgyxgwbrh1kzb3hbdbzgbq-users-groups.json.drv:{out}The environments do not match:
"mutableUsers":true,"users":[{"createHome":false,"description":"→Apache Kafka 
daemon user","group":"nogroup","hashedPassword":null,"home":"/tmp/kafka-logs","i
/bin/nologin","uid":169},{"createHome":false,"description":"→D-Bus system mess

However, this doesn't do the diff justice because the output is actually colorized, like this:

From the diff we can see that:

  • This change adds Kafka executables to the system PATH
  • This change adds a new apache-kafka systemd service
  • This change adds a new apache-kafka user to the system

Note how nix-diff does more than diffing the two root derivations. If the two derivations differ on a shared input then nix-diff will descend into that input and diff that and repeat the process until the root cause of the change is found. This works because Nix's dependency graph is complete and reachable from the root derivation.


You can find the nix-diff utility on Hackage or GitHub if you would like to use this in your own development workflow. Hopefully nix-diff will help you better understand how Nix works under the hood and also help you pin Nix derivations more robustly.

Friday, November 3, 2017

Semantic integrity checks are the next generation of semantic versioning

The Dhall configuration language just added support for "semantic integrity checks". This post explains what "semantic integrity check" means, motivates the new feature, and compares to semantic versioning.

The problem

I added this feature in response to user concerns about code injection in Dhall configuration files.

We'll illustrate the problem using the following example.dhall configuration file which derives a summary of student information from a list of students:

    -- Example of an expression imported by URL
    let map =

    -- Example of an expression imported by path
in  let students = ./students.dhall

in  let getName = λ(student : { name : Text, age : Natural })

in  { classSize = List/length { name : Text, age : Natural } students
    , names     = map { name : Text, age : Natural } Text getName students

This configuration imports a helper function named map from the Dhall Prelude by URL:

    let map =

in  ...

... and that URL currently hosts a text file encoding the following Dhall function:

$ curl -L
Tranform a list by applying a function to each element


./map Natural Bool Natural/even ([+2, +3, +5] : List Natural)
= [True, False, False] : List Bool

./map Natural Bool Natural/even ([] : List Natural)
= [] : List Bool
let map : ∀(a : Type) → ∀(b : Type) → (a → b) → List a → List b
    =   λ(a : Type)
    →   λ(b : Type)
    →   λ(f : a → b)
    →   λ(xs : List a)
    →   List/build
        (   λ(list : Type)
        →   λ(cons : b → list → list)
        →   List/fold a xs list (λ(x : a) → cons (f x))

in  map

Similarly, our example configuration imports student data from another configuration file by path:


in  let students = ./students.dhall


... and we'll assume that file contains the following list of student records:

[ { name = "Jane Doe"    , age = +19 }
, { name = "John Rivera" , age = +18 }
, { name = "Alice O'Hare", age = +19 }

Values, functions, and types are all Dhall expressions, so we can inject all of them in our code via URLs or paths. When we interpret a Dhall configuration file these imports get substituted with their contents and then we evaluate the fully resolved configuration file as an expression in a functional language:

$ dhall <<< './example.dhall'  | dhall-format
{ classSize : Natural, names : List Text }

{ classSize = +3
, names     = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text

Users were concerned that these imports could be compromised, resulting in malicious code injection

The solution

The latest release of Dhall added support for import integrity checks to address user concerns about malicious tampering. We can use these integrity checks to "freeze" our imports by adding a SHA-256 hash after each import.

First, we ask the dhall-hash utility to compute the current hash for our imports:

$ dhall-hash <<< ''
$ dhall-hash <<< './students.dhall' 

... and then we append the hash after each import to freeze the import:

    let map =

in  let students =
          ./students.dhall sha256:6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06

in  let getName = λ(student : { name : Text, age : Natural })

in  { classSize = length { name : Text, age : Natural } students
    , names     = map { name : Text, age : Natural } Text getName students

Once you add these integrity checks the Dhall interpreter will enforce them when resolving imports. In this case, the example configuration still successfully evaluates to the same result after adding the integrity checks:

$ dhall <<< './example.dhall'  | dhall-format
{ classSize : Natural, names : List Text }

{ classSize = +3
, names     = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text

The integrity check passes because we haven't yet modified any of our imports.

Semantic integrity

Once you freeze an import with a hash, Dhall guarantees that the meaning of the import never changes. These are semantic hashes, not textual hashes.

For example, suppose that we modify ./students.dhall to add a comment, reorder record fields, and modify the formatting, like this:

-- Class of 2017

[ { age = +19, name = "Jane Doe" },
  { name = "John Rivera" , age = +18 },
  { name = "Alice O'Hare", age = +19 } ]

These changes do not affect the computed hash of the file and the interpreter still accepts the ./students.dhall import that we protected with an integrity check:

$ dhall <<< './example.dhall'  | dhall-format  # Still succeeds
{ classSize : Natural, names : List Text }

{ classSize = +3
, names     = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text

The Dhall interpreter accepted the import of ./students.dhall because the semantic hash never changed:

$ dhall-hash <<< './students.dhall' 

However, now suppose we try to change the substance of the file by modifying John's age:

-- Class of 2017

[ { age = +19, name = "Jane Doe" },
  { name = "John Rivera" , age = +20 },
  { name = "Alice O'Hare", age = +19 } ]

Now the semantic integrity check fails:

$ dhall <<< './example.dhall'

Error: Import integrity check failed

Expected hash:

↳ 6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06

Actual hash:

↳ 808d921914de5349f50ac656bed93c2894dfe35401991e1ca0c89861834023fb

Dhall recognizes that this is no longer the same expression and rejects the import. Only an import that represents the same value can pass the check.

This means, for example, that malicious users cannot tamper with our imports, even if we were to distribute the imported code over an insecure channel. The worst that an attacker can do is cause our configuration to reject the import, but they cannot trick the configuration into silently accepting the wrong expression.


We can use these integrity checks to do more than just secure code. We can also repurpose these checks to assert that our code refactors are safe and behavior-preserving.

For example, suppose that we change the student list to:

-- Class of 2017

    let double = λ(x : Natural)  x * +2

in  [ { name = "Jane Doe"    , age = +19       }
    , { name = "John Rivera" , age = double +9 }
    , { name = "Alice O'Hare", age = +19       }

This will still pass the integrity check because the student list still evaluates to the same expected result.

We can also refactor our project layout, too. For example, we could modify the student list to import the double function from another file:

-- Class of 2017

[ { name = "Jane Doe"    , age = +19               }
, { name = "John Rivera" , age = ./double.dhall +9 }
, { name = "Alice O'Hare", age = +19               }

... where ./double.dhall has the following contents:

λ(x : Natural)  x * +2

... and the integrity check would still pass.

I originally introduced semantic integrity checks to protect against malicious code modification then later realized that they can also be used to protect against non-malicious modifications (such as a refactor gone wrong).

Textual hashes

The semantic hash provides a more information than a textual hash of the import. For example, suppose we changed our ./double.dhall function to triple the argument:

λ(x : Natural)  x * +3

A textual hash of the ./students.dhall import would not detect this change because the real change took place in the text of another file that ./students.dhall imported. However, A semantic hash can follow these imports to detect transitive changes to dependencies.

The semantic hash is also more flexible than a textual hash because the semantic hash does not change when we make cosmetic changes like refactoring, reformatting, or commenting code.


Dhall's semantic versioning can reject some behavior-preserving changes to functions. Dhall only attempts to detect if two functions are β-equivalent (i.e. the same if fully β-reduced).

For example, the following two functions are equivalent, but will not produce the same hash:

λ(x : Bool)  x
λ(x : Bool)  if x then True else False

Similarly, Dhall's semantic hash cannot detect that these two functions are the same:

λ(x : Natural)  x * +2
λ(x : Natural)  x + x

On the other hand, Dhall will (almost) never give two semantically distinct expressions the same hash. Only an astronomically improbable hash collision can cause this and at the time of this writing there is no known vulnerability in the SHA-256 hash algorithm.

Dhall will support other hash algorithms should SHA-256 ever be broken. This is why Dhall prefixes the hash with the algorithm to leave the door open for new hash algorithms.

Semantic versioning

You might wonder how semantic integrity checks compare to semantic versioning. I like to think of semantic integrity checks and semantic versions as two special cases of the following abstract interface:

  • a package publishes a version string for each official release
  • you can compare two version strings to detect a breaking change to the package

Semantic versioning is one special case of that abstract interface where:

  • the version string has a major number and minor number
  • a difference in major version numbers signals a breaking change

Some variations on semantic versioning propose independently versioning each exported function/value/type instead of versioning the package as a whole. Also, some languages (like Elm) mechanically enforce semantic versioning by detecting API changes programmatically and forcing a major version bump if there is a breaking change.

A semantic integrity check is another special case of that abstract interface where:

  • the version string is a SHA-256 hash
  • if two hashes are different then that signals a breaking change

The key difference between semantic versioning and semantic integrity checks is how we define "a breaking change". Semantic version numbers (usually) treat changes to types as breaking changes whereas semantic integrity checks treat changes to values as breaking changes. (To be totally pedantic: semantic integrity checks treat changes to expressions as breaking changes, and in a language like Dhall everything is an expression, including types).

This does not imply that semantic integrity checks are better than semantic version numbers. Sometimes you want to automatically pick up small changes or improvements from your dependencies without adjusting a hash. In cases like those you want the expected type to be the contract with your dependency and you don't want to pin the exact value.

For example, we could "simulate" semantic versioning in Dhall by attaching a type annotation to our ./students.dhall import like this:

    let map =

in  let students =
          ./students.dhall : List { name : Text, age : Natural }

in  let getName = λ(student : { name : Text, age : Natural })

in  { classSize = List/length { name : Text, age : Natural } students
    , names     = map { name : Text, age : Natural } Text getName students

... and now we can add or remove students from our imported list without breaking anything. We've used the type system as a coarser integrity check to state that certain changes to our configuration file's meaning are okay.


You can think of a semantic integrity check as a "value annotation" (i.e. the term-level equivalent of a type annotation). Instead of declaring an expected type we declare an expected value summarized as a hash.

This is why the title of this post declares that "semantic integrity checks are the next generation of semantic versioning". If you think of a semantic version as a concise summary of an imported package's type, then a semantic integrity check is a concise summary of an imported package's value.

Monday, October 16, 2017

Advice for Haskell beginners

This post summarizes advice that I frequently give to Haskell beginners asking how to start out learning the language

First, in general I recommend reading the Haskell Programming from first principles book, mainly because the book teaches Haskell without leaving out details and also provides plenty of exercises to test your understanding. This is usually good enough if you are learning Haskell as your first language.

However, I would like to give a few additional tips for programmers who are approaching Haskell from other programming languages.

Learn Haskell for the right reasons

Some people learn Haskell with the expectation that they will achieve some sort of programming enlightenment or nirvana. You will be disappointed if you bring these unrealistic expectations to the language. Haskell is not an achievement to unlock or a trophy to be won because learning is a never-ending process and not a finish line.

I think a realistic expectation is to treat Haskell as a pleasant language to use that lets you focus on solving real problems (as opposed to wasting your time fixing silly self-induced problems like null pointers and "undefined is not a function").

Avoid big-design-up-front

Haskell beginners commonly make the mistake of trying to learn as much of the language as possible before writing their first program and overengineering the first draft. This will quickly burn you out.

You might come to Haskell from a dynamically typed background like JavaScript, Python, or Ruby where you learned to avoid refactoring large code bases due to the lack of static types. This aversion to refactoring promotes a culture of "big-design-up-front" where you try to get your project as close to correct on the first try so that you don't have to refactor your code later.

This is a terrible way to learn Haskell, for two reasons. First, Haskell has a much higher ceiling than most other programming languages, so if you wait until you hit that ceiling before building something you will wait a looooooong time. Second, refactoring is cheap in Haskell so you don't need to get things right the first time.

You will accelerate your learning process if you get dirty and make mistakes. Write really ugly and embarrassing code and then iteratively refine your implementation. There is no royal road to learning Haskell.

Avoid typeclass abuse

Specifically, avoid creating new typeclasses until you are more comfortable with the language.

Functional programming languages excel because many language features are "first class". For example, functions and effects are first class in Haskell, meaning that you can stick them in a list, add them, nest them, or pass them as arguments, which you can't (easily) do in imperative languages.

However, typeclasses are not first-class, which means that if you use them excessively you will quickly depend on advanced language features to do even simple things. Programming functionally at the term-level is much simpler and more enjoyable than the type-level Prolog that type-classes encourage.

Begin by learning how to solve problems with ordinary functions and ordinary data structures. Once you feel like you understand how to solve most useful problems with these simple tools then you can graduate to more powerful tools like typeclasses. Typeclasses can reduce a lot of boilerplate in proficient hands, but I like to think of them as more of a convenience than a necessity.

You can also take this approach with you to other functional languages (like Elm, Clojure, Elixir, or Nix). You can think of "functions + data structures" as a simple and portable programming style that will improve all the code that you write, Haskell or not.

Build something useful

Necessity is the mother of invention, and you will learn more quickly if you try to build something that you actually need. You will quickly convince yourself that Haskell is useless if you only use the language to solve Project Euler exercises or Sudoku puzzles.

You are also much more likely to get a Haskell job if you have a portfolio of one or two useful projects to show for your time. These sorts of projects demonstrate that you learned Haskell in order to build something instead of learning Haskell for its own sake.


Hopefully these tips will help provide some guard rails for learning the language for the first time. That's not to say that Haskell is perfect, but I think you will enjoy the language if you avoid these common beginner pitfalls.

Saturday, October 7, 2017

Why do our programs need to read input and write output?

I wrote this post to challenge basic assumptions that people make about software architecture, which is why I chose a deliberately provocative title. You might not agree with all the points that I am about to make, but I do hope that this post changes the way that you think about programming

This post is an attempt to restate in my own words what Conal Elliot (and others before him) have been saying for a while: modern programming is a Rube-Goldberg machine that could be much simpler if we change the way we compose code.

Most programmers already intuitively understand this at some level. They will tell you that the programming ecosystem is deeply flawed, fragmented, and overly complex. I know that I felt that way myself for a long time, but in retrospect I believe that my objections at the time were superficial. There were deeper issues with programming that I was blind to because they are so ingrained in our collective psyche.

Disclaimer: This post uses my pet configuration language Dhall to illustrate several points, mainly because Dhall is a constructive proof-of-concept of these ideas. The purpose of this post is not so much to convince you to use Dhall but rather to convince you to think about software in a new way

Input and output

Consider the title of this post for example:

"Why do our programs need to read input and write output?"

Most people will answer the title with something like:

  • "Our programs need a way to communicate with the outside world"
  • "Programs need to do something other than heat up CPUs"

Now suppose I phrased the question in a different way:

"What if only the compiler could read input and write output?"

"What's the difference?", you might ask. "Surely you mean that the language provides some library function that I can call to read or write input to some handles, whether they be file handles or standard input/output."

No, I mean that only the compiler implementation is allowed to read input or write output, but programs written within the compiled language cannot read input or write output. You can only compute pure functions within the language proper.

Again, this probably seems ridiculous. How could you communicate at all with the program?


Most languages have some way to import code, typically bundled in the form of packages. An imported function or value is something that our compiler reads as input statically at compile time as opposed to a value read by our program dynamically at run time.

Suppose I told you that our hypothetical programming language could only read input values by importing them

"Ridiculous!" you exclaim as you spit out your coffee. "Nobody would ever use such a language." You probably wouldn't even know where to begin since so many things seem wrong with that proposal

Perhaps you would object to the heavyweight process for publishing and subscribing to new values. You would recite the package management process for your favorite programming language:

  • Create a source module containing your value
  • Create a standard project layout
  • Create a package description file
  • Check your project into version control
  • Publish your package to a package repository

Perhaps you would object to the heavyweight process for configuring programs via imports? Your favorite programming language would typically require you to:

  • Retrieve the relevant program from version control
  • Modify the project description file to reference the newly published dependency
  • Modify project code to import your newly published value
  • Compile the program
  • Run the program

Why would a non-technical end user do any of that just to read and write values?

This is exactly the Rube-Goldberg machine I'm referring to. We have come to expect a heavyweight process for source code to depend on other source code

Importing paths

Distributing code doesn't have to be heavyweight, though. Consider Dhall's import system which lets you reference expressions directly by their paths. For example, suppose we saved the value True to a file named example.dhall:

$ echo 'True' > example.dhall

Another Dhall program can reference the above file anywhere the program expects a boolean value:

$ dhall <<< './example.dhall || False'


This is the exact same as if we had just replaced the path with the file's contents:

$ dhall <<< 'True || False'


Dhall doesn't need to support an explicit operation to read input because Dhall can read values by just importing them

Similarly, Dhall doesn't need to support an explicit write operation either. Just save a Dhall expression to a file using your favorite text editor.

"What if I need a way to automate the generation of files?"

You don't need to automate the process of saving a file because one file is always sufficiently rich to store as much information as you need. Dhall is a programmable configuration language which supports lists and records so any one file can store or programmatically generate any number of values. Files are human-generated artifacts which exist purely for our convenience but Dhall code does not behave any differently whether or not the program spans multiple files or a single file.

Most of the time people need to automate reads and writes because they are using non-programmable configuration file formats or data storage formats

Programmable configuration

You might object: "Configuration files shouldn't be Turing-complete!"

However, Dhall is not Turing-complete. Dhall is a total programming language, meaning that evaluation eventually halts. In practice, we don't actually care that much if Dhall halts, since we cannot guarantee that the program halts on reasonable human timescales. However, we can statically analyze Dhall and most of the Halting Problem objections to static analysis don't apply.

For example, Dhall can statically guarantee that programs never fail or throw exceptions (because obviously a configuration file should never crash). Dhall also lets you simplify confusing files by eliminating all indirection because Dhall can reduce every program to a canonical normal form.

In fact, most objections to programmable configuration files are actually objections to Turing-completeness

Importing URLs

Dhall also lets you import code by URL instead of path. Dhall hosts the Prelude of standard utilities online using IPFS (a distributed hashtable for the web),and you can browse the Prelude using this link, which redirects to the latest version of the Prelude:

For example, you can browse the above Prelude to find the not function hosted here:

... which has the following contents:

$ curl
Flip the value of a `Bool`


./not True = False

./not False = True
let not : Bool → Bool
    =   λ(b : Bool) → b == False

in  not

... and we can "apply" our URL to our file as if they were both just ordinary functions and values:

$ dhall <<< ' ./example.dhall'


... or we could use let-expressions to improve readability without changing the result:

    let not =

in  let example = ./example.dhall

in  not example

"That's a security vulnerability!", you protest. "You are ... literally ... injecting remote code into your program."

Playing the devil's advocate, I ask you what is wrong with remote code injection

"Well, for starters, if the URL is compromised the attacker can run arbitrary code like ..."

... reading and writing files? Dhall is totally pure and doesn't support any effects at all (besides heating up CPUs ☺).

This brings us full circle back to our original question:

"Why do our programs need to read input and write output?"


Software doesn't have to be so complex if we challenge our assumptions that got us into this mess

These ideas were heavily inspired the following post by Conal Elliot:

... and this post is not about Dhall so much as Conal's vision of an effect-free purely functional future for programming. I believe explicitly reading input and writing output will eventually become low-level implementation details of higher-level languages, analogous to allocating stack registers or managing memory.

Tuesday, September 26, 2017

Type-driven strictness

I was recently trying to optimize Dhall's performance because the interpreter was performing poorly on some simple examples.

For example, consider this tiny expression that prints 3000 exclamation marks:

Natural/fold +3000 Text (λ(x : Text)  x ++ "!") ""

The above Dhall expression takes over 14 seconds to evaluate to normal form, which is not acceptable:

$ bench 'dhall <<< './exclaim'
benchmarking dhall <<< ./exclaim
time                 14.42 s    (14.23 s .. 14.57 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 14.62 s    (14.54 s .. 14.66 s)
std dev              70.22 ms   (0.0 s .. 77.84 ms)
variance introduced by outliers: 19% (moderately inflated)


The performance suffers because Dhall is lazy in the accumulator, meaning that the accumulator builds up a gigantic expression like this:

(λ(x : Text)  x ++ "!")
( (λ(x : Text)  x ++ "!")
  ( (λ(x : Text)  x ++ "!")
    ( (λ(x : Text)  x ++ "!")
      ( (λ(x : Text)  x ++ "!")
        {Repeat this nesting 3000 times}

... and then attempts to normalize the entire expression, which takes a long time and wastes a lot of memory.

The reason for this lazy behavior is the following code for evaluating Natural/fold in the Dhall interpreter:

normalize (App (App (App (App NaturalFold (NaturalLit n0)) _) succ') zero)
    = normalize (go n0)
    go !0 = zero
    go !n = App succ' (go (n - 1))

You can read that as saying:

  • Given an expression of the form Natural/Fold n succ zero
  • Wrap the value zero in n calls of the function succ
    • i.e. succ (succ (succ (... {n times} ... (succ zero) ...)))
  • Then normalize that

A smarter approach would be to keep the accumulator strict, which means that we evaluate as we go instead of deferring all evaluation to the end. For example, the accumulator starts off as just the empty string:


... then after one iteration of the loop we get the following accumulator:

(λ(x : Text)  x ++ "!") ""

... and if we evaluate that accumulator immediately we get:


Then the next iteration of the loop produces the following accumulator:

(λ(x : Text)  x ++ "!") "!"

... which we can again immediately evaluate to get:


This is significantly more efficient than leaving the expression unevaluated.

We can easily implement such a strict loop by making the following change to the interpreter:

normalize (App (App (App (App NaturalFold (NaturalLit n0)) _) succ') zero)
    = go n0
    go !0 = normalize zero
    go !n = normalize (App succ' (go (n - 1)))

The difference here is that we still build up a chain of n calls to succ but now we normalize our expression in between each call to the succ function instead of waiting until the end to normalize.

Once we do this runtime improves dramatically, going down from 15 seconds to 90 milliseconds:

$ bench 'dhall <<< './example'
benchmarking dhall <<< ./example
time                 88.92 ms   (87.14 ms .. 90.74 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 86.06 ms   (84.98 ms .. 87.15 ms)
std dev              1.734 ms   (1.086 ms .. 2.504 ms)

... or in other words about 30 microseconds per element. We could still do more to optimize this but at least we're now in the right ballpark for an interpreter. For reference, Python is 4x faster on my machine for the following equivalent program:

print ("!" * 3000)
$ bench 'python'
benchmarking python
time                 24.55 ms   (24.09 ms .. 25.02 ms)
                     0.998 R²   (0.996 R² .. 1.000 R²)
mean                 24.53 ms   (24.16 ms .. 24.88 ms)
std dev              798.4 μs   (559.8 μs .. 1.087 ms)

However, these results don't necessarily imply that a strict accumulator is always better.


Sometimes laziness is more efficient, though. Consider this program:

(   λ(list : Type)
   λ(cons : Integer  list  list)
   Natural/fold +6000 list (cons 1)

The above example uses Natural/fold to build a list of 6000 1s.

In this case the accumulator of the fold is a list that grows by one element after each step of the fold. We don't want to normalize the list on each iteration because that would lead to quadratic time complexity. Instead we prefer to defer normalization to the end of the loop so that we get linear time complexity.

We can measure the difference pretty easily. A strict loop takes over 6 seconds to complete:

bench 'dhall <<< ./ones'
benchmarking dhall <<< ./ones
time                 6.625 s    (6.175 s .. 7.067 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 6.656 s    (6.551 s .. 6.719 s)
std dev              95.98 ms   (0.0 s .. 108.3 ms)
variance introduced by outliers: 19% (moderately inflated)

... whereas a lazy loop completes in about 180 milliseconds:

$ bench 'dhall <<< ./g'
benchmarking dhall <<< ./g
time                 182.5 ms   (175.1 ms .. 191.3 ms)
                     0.998 R²   (0.995 R² .. 1.000 R²)
mean                 177.5 ms   (172.1 ms .. 180.8 ms)
std dev              5.589 ms   (2.674 ms .. 8.273 ms)
variance introduced by outliers: 12% (moderately inflated)

Moreover, the difference in performance will only worsen with larger list sizes due to the difference in time complexity.

Also, in case you were wondering, Python is about 7x faster:

print ([1] * 6000)
$ bench 'python'
benchmarking python
time                 25.36 ms   (24.75 ms .. 25.92 ms)
                     0.998 R²   (0.996 R² .. 0.999 R²)
mean                 25.64 ms   (25.16 ms .. 26.03 ms)
std dev              917.8 μs   (685.7 μs .. 1.348 ms)

Why not both?

This poses a conundrum because we'd like to efficiently support both of these use cases. How can we know when to be lazy or strict?

We can use Dhall's type system to guide whether or not we keep the accumulator strict. We already have access to the type of the accumulator for our loop, so we can define a function that tells us if our accumulator type is compact or not:

compact :: Expr s a -> Bool
compact Bool             = True
compact Natural          = True
compact Integer          = True
compact Double           = True
compact Text             = True
compact (App List _)     = False
compact (App Optional t) = compact t
compact (Record kvs)     = all compact kvs
compact (Union kvs)      = all compact kvs
compact _                = False

You can read this function as saying:

  • primitive types are compact
  • lists are not compact
  • optional types are compact if the corresponding non-optional type is compact
  • a record type is compact if all the field types are compact
  • a union type is compact if all the alternative types are compact

Now, all we need to do is modify our fold logic to use this compact function to decide whether or not we use a strict or lazy loop:

normalize (App (App (App (App NaturalFold (NaturalLit n0)) t) succ') zero) =
    if compact (normalize t) then strict else lazy
    strict =            strictLoop n0
    lazy   = normalize (  lazyLoop n0)

    strictLoop !0 = normalize zero
    strictLoop !n = normalize (App succ' (strictLoop (n - 1)))

    lazyLoop !0 = zero
    lazyLoop !n = App succ' (lazyLoop (n - 1))

Now we get the best of both worlds and our interpreter gives excellent performance in both of the above examples.


Here's a bonus example for people who got this far!

The original Dhall expression that motivated this post was an attempt to implement FizzBuzz in Dhall. The program I ended up writing was:

    let pred =
            λ(n : Natural)
               let start = { next = +0, prev = +0 }
            in  let step =
                        λ(state : { next : Natural, prev : Natural })
                       { next = + +1, prev = }
            in  let result =
                      { next : Natural, prev : Natural }
            in  result.prev

in  let not = λ(b : Bool)  b == False

in  let succ =
            λ ( state
              : { buzz : Natural, fizz : Natural, index : Natural, text : Text }
               let fizzy = Natural/isZero state.fizz
            in  let buzzy = Natural/isZero
            in  let line =
                            if fizzy && buzzy then "FizzBuzz"
                      else  if fizzy && not buzzy then "Fizz"
                      else  if not fizzy && buzzy then "Buzz"
                      else  Integer/show (Natural/toInteger state.index)
            in  { buzz  = pred (if buzzy then +5 else
                , fizz  = pred (if fizzy then +3 else state.fizz)
                , index = state.index + +1
                , text  = state.text ++ line ++ "\n"

in  let zero = { buzz = +5, fizz = +3, index = +0, text = "" }

in  let fizzBuzz =
            λ(n : Natural)
               let result =
                      { buzz  : Natural
                      , fizz  : Natural
                      , index : Natural
                      , text  : Text
            in  result.text

in  fizzBuzz

However, this program runs incredibly slowly, taking over 7 seconds just to compute 20 elements:

bench 'dhall <<< "./fizzbuzz +20"'
benchmarking dhall <<< "./fizzbuzz +20"
time                 7.450 s    (7.194 s .. 7.962 s)
                     0.999 R²   (NaN R² .. 1.000 R²)
mean                 7.643 s    (7.512 s .. 7.739 s)
std dev              145.0 ms   (0.0 s .. 165.6 ms)
variance introduced by outliers: 19% (moderately inflated)

However, if you use a strict fold then the program takes half a second to go through 10,000 elements:

$ bench 'dhall <<< "./fizzbuzz +10000"'
benchmarking dhall <<< "./fizzbuzz +10000"
time                 591.5 ms   (567.3 ms .. NaN s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 583.4 ms   (574.0 ms .. 588.8 ms)
std dev              8.418 ms   (0.0 s .. 9.301 ms)
variance introduced by outliers: 19% (moderately inflated)


Many people associate dynamic languages with interpreters, but Dhall is an example of a statically typed interpreter. Dhall's evaluator is not sophisticated at all but can still take advantage of static type information to achieve comparable performance with Python (which is a significantly more mature interpreter). This makes me wonder if the next generation of interpreters will be statically typed in order to enable better optimizations.