Monday, February 20, 2017

The Curry-Howard correspondence between programs and proofs

This post will explain the connection between programming languages and logical proofs, known as the Curry-Howard correspondence. I will provide several examples of this correspondence to help you build a working intuition for how these two fields relate to one another.

The Curry-Howard correspondence states that:

  • Logical propositions correspond to programming types
  • Logical proofs correspond to programming values
  • Proving a proposition corresponds to creating a value of a given type

I'll use my Dhall programming language to illustrate the above connections since Dhall is an explicitly typed total programming language without any escape hatches. If you're not familiar with Dhall, you can read the Dhall tutorial and I will also make analogies to Haskell along the way, too.

Propositions

Let's begin with the following logical proposition:

∀(a ∈ Prop): a ⇒ a

You can read that as "for any proposition (which we will denote using the letter a), if the proposition a is true then the proposition a is true". This is true no matter what the proposition a is. For example, suppose that a were the following proposition:

a = {The sky is blue}

Then we could conclude that "if the sky is blue then the sky is blue":

{The sky is blue} ⇒ {The sky is blue}

Here the right double arrow () denotes logical implication. Anywhere you see (A ⇒ B) you should read that as "if the proposition A is true then the proposition B is true". You can also say "the proposition A implies the proposition B" or just "A implies B" for short.

However, what if a were another proposition like:

a = {The sky is green}

giving us:

{The sky is green} ⇒ {The sky is green}

This is true, even though the sky might not be green. We only state that "if the sky is green then the sky is green", which is definitely a true statement, whether or not the sky is green.

The upside down A (i.e. ) in our original proposition is mathematical shorthand that means "for all" (although I sometimes describe this as "for any"). The symbol is shorthand for "in" (i.e. set membership). So whenever you see this:

∀(a ∈ S): p

... you can read that as "for any element a in the set S the proposition p is true". Usually the set S is Prop (i.e. the set of all possible propositions) but we will see some examples below where S can be another set.

Types

This logical proposition:

∀(a ∈ Prop): a ⇒ a

... corresponds to the following Dhall type:

(a : Type)  a  a

... which you can read as "for any type (which we will denote using the letter a), this is the type of a function that transforms an input of type a to an output of type a". Here's the corresponding function that has this type:

λ(a : Type)  λ(x : a)  x

This is an anonymous function of two arguments:

  • the first argument is named a and a is a Type
  • the second argument is named x and x has type a

The result is the function's second argument (i.e. x) which still has type a.

The equivalent Haskell function is id, the polymorphic identity function:

id :: a -> a
id x = x

-- or using the `ExplicitForAll` language extension
id :: forall a . a -> a
id x = x

The main difference is that Haskell does not require you to explicitly bind the polymorphic type a as an additional argument. Dhall does require this because Dhall is an explicitly typed language.

Correspondence

The Curry-Howard correspondence says that we can use the type checker of a programming language as a proof checker. Any time we want to prove a logical proposition, we:

  • translate the logical proposition to the corresponding type in our programming language
  • create a value in our programming language that has the given type
  • use the type-checker to verify that our value has the specified type

If we find a value of the given type then that completes the proof of our original proposition.

For example, if we want to prove this proposition:

∀(a ∈ Prop): a ⇒ a

... then we translate the proposition to the corresponding type in the Dhall programming language:

(a : Type)  a  a

... then define a value in the Dhall language that has this type:

λ(a : Type)  λ(x : a)  x

... and then use Dhall's type checker to verify that this value has the given type:

$ dhall <<< 'λ(a : Type) → λ(x : a) → x'(a : Type) → ∀(x : a)a

λ(a : Type) → λ(x : a)x

The first line is the inferred type of our value and the second line is the value's normal form (which in this case is the same as the original value).

Note that the inferred type slightly differs from the original type we expected:

-- What we expected:
(a : Type)  a  a

-- What the compiler inferred:
(a : Type)  (x : a)  a

These are both still the same type. The difference is that the compiler's inferred type also includes the name of the second argument: x. If we were to translate this back to logical proposition notation, we might write:

∀(a ∈ Prop): ∀(x ∈ a): a

... which you could read as: "for any proposition named a, given a proof of a named x, the proposition a is true". This is equivalent to our original proposition but now we've given a name (i.e. x) to the proof of a.

The reason this works is that you can think of a proposition as a set, too, where the elements of the set are the proofs of that proposition. Some propositions are false and have no elements (i.e. no valid proofs), while other propositions can have multiple elements (i.e. multiple valid proofs).

Similarly, you can think of a type as a set, where the elements of the set are the values that have that type. Some types are empty and have no elements (i.e. no values of that type), while other types can have multiple elements (i.e. multiple values of that type).

Here's an example of a proposition that has multiple valid proofs:

∀(a ∈ Prop): a ⇒ a ⇒ a

The corresponding type is:

(a : Type)  a  a  a

... and there are two values that have the above type. The first value is:

λ(a : Type)  λ(x : a)  λ(y : a)  x

... and the second value is:

λ(a : Type)  λ(x : a)  λ(y : a)  y

We can even translate these two values back into informal logical proofs. For example, this value:

λ(a : Type)  λ(x : a)  λ(y : a)  x

... corresponds to this informal proof:

For each "a" in the set of all propositions:

Given a proof that "a" is true [Let's call this proof "x"]

Given a proof that "a" is true [Let's call this proof "y"]

We can conclude that "a" is true, according to the proof "x"

QED

Similarly, this value:

λ(a : Type)  λ(x : a)  λ(y : a)  y

... corresponds to this informal proof:

For each "a" in the set of all propositions:

Given a proof that "a" is true [Let's call this proof "x"]

Given a proof that "a" is true [Let's call this proof "y"]

We can conclude that "a" is true, according to the proof "y"

QED

We can actually give these proofs a formal structure that parallels our code but that is outside of the scope of this post.

Function composition

Now consider this more complex proposition:

∀(a ∈ Prop):
∀(b ∈ Prop):
∀(c ∈ Prop):
(b ⇒ c) ⇒ (a ⇒ b) ⇒ (a ⇒ c)

You can read that as saying: "for any three propositions named a, b, and c, if b implies c, then if a implies b, then a implies c".

The corresponding type is:

    (a : Type)
   (b : Type)
   (c : Type)
   (b  c)  (a  b)  (a  c)

... and we can define a value that has this type in order to prove that the proposition is true:

    λ(a : Type)
   λ(b : Type)
   λ(c : Type)
   λ(f : b  c)
   λ(g : a  b)
   λ(x : a)
   f (g x)

... and the compiler will infer that our value has the correct type:

$ dhall
    λ(a : Type)
→   λ(b : Type)
→   λ(c : Type)
→   λ(f : b → c)
→   λ(g : a → b)
→   λ(x : a)f (g x)
<Ctrl-D>(a : Type) → ∀(b : Type) → ∀(c : Type) → ∀(f : b → c) → ∀(g : a → b) → ∀(x : a)c

λ(a : Type) → λ(b : Type) → λ(c : Type) → λ(f : b → c) → λ(g : a → b) → λ(x : a)f (g x)

Note that this function is equivalent to Haskell's function composition operator (ie. (.)):

(.) :: (b -> c) -> (a -> b) -> (a -> c)
(f . g) x = f (g x)

This is because proofs and programs are equivalent to one another: whenever we prove a logical proposition we get a potentially useful function for free.

Circular reasoning

There are some propositions that we cannot prove, like this one:

∀(a ∈ Prop): a

... which you can read as saying "all propositions are true". This proposition is false, because if we translate the proposition to the corresponding type:

(a : Type)  a

... then there is no value in the Dhall language which has that type. If there were such a value then we could automatically create any value of any type.

However, Haskell does have values which inhabit the above type, such as this one:

example :: a
example = let x = x in x

-- or using the `ExplicitForAll` language extension:
example :: forall a . a
example = let x = x in x

This value just cheats and endlessly loops, which satisfies Haskell's type checker but doesn't produce a useful program. Unrestricted recursion like this is the programming equivalent of "circular reasoning" (i.e. trying to claim that a is true because a is true).

We can't cheat like this in Dhall and if we try we just get this error:

$ dhall <<< 'let x = x in x'

Use "dhall --explain" for detailed errors

Error: Unbound variable

x

(stdin):1:9

You cannot define x in terms of itself because Dhall does not permit any recursion and therefore does not permit circular reasoning when used as a proof checker.

And

We will use the symbol to denote logical "and", so you can read the following proposition:

∀(a ∈ Prop): ∀(b ∈ Prop): (a ∧ b) ⇒ a

... as saying "for any two propositions a and b, if a and b are both true then a is true"

The type-level equivalent of logical "and" is a record with two fields:

(a : Type)  (b : Type)  { fst : a, snd : b }  a

... which is equivalent to this Haskell type:

(a, b) -> a

-- or using `ExplicitForAll`:
forall a b . (a, b) -> a

The programming value that has this type is:

λ(a : Type)  λ(b : Type)  λ(r : { fst : a, snd : b })  r.fst

... which is equivalent to this Haskell value:

fst

Similarly, we can conclude that:

∀(a ∈ Prop): ∀(b ∈ Prop): (a ∧ b) ⇒ b

... which corresponds to this type:

(a : Type)  (b : Type)  { fst : a, snd : b }  b

... and this value of that type as the proof:

λ(a : Type)  λ(b : Type)  λ(r : { fst : a, snd : b })  r.snd

Now let's try a slightly more complex proposition:

∀(a ∈ Prop): ∀(b ∈ Prop): (a ∧ (a ⇒ b)) ⇒ b

You can read this as saying "for any propositions a and b, if the proposition a is true, and a implies b, then the proposition b is true".

That corresponds to this type:

(a : Type)  (b : Type)  { fst : a, snd : a  b }  b

... and the following value of that type which proves that the proposition is true:

    λ(a : Type)
   λ(b : Type)
   λ(r : { fst : a, snd : a  b })
   r.snd r.fst

Here's a slightly more complex example:

∀(a ∈ Prop): ∀(b ∈ Prop): ∀(c ∈ Prop): ((a ∧ b) ⇒ c) ⇒ (a ⇒ b ⇒ c)

You can read that as saying: "for any three propositions named a, b, and c, if a and b imply c, then a implies that b implies c".

Here's the corresponding type:

    (a : Type)
   (b : Type)
   (c : Type)
   ({ fst : a, snd : b }  c)  (a  b  c)

... and the corresponding value of that type:

    λ(a : Type)
   λ(b : Type)
   λ(c : Type)
   λ(f : { fst : a, snd : b }  c)
   λ(x : a)
   λ(y : b)
   f { fst = x, snd = y }

Note that this is equivalent to Haskell's curry function:

curry :: ((a, b) -> c) -> (a -> b -> c)
curry f x y = f (x, y)

Or

We will use the symbol to denote logical "or", so you can read the following proposition:

∀(a ∈ Prop): ∀(b ∈ Prop): a ⇒ a ∨ b

... as saying "for any propositions a and b, if a is true, then either a is true or b is true".

The type-level equivalent of logical "or" is a sum type:

(a : Type)  (b : Type)  a  < Left : a | Right : b >

... which is equivalent to this Haskell type:

a -> Either a b

-- or using `ExplicitForAll`
forall a b . a -> Either a b

... and the value that has this type is:

λ(a : Type)  λ(b : Type)  λ(x : a)  < Left = x | Right : b >

... which is equivalent to this Haskell value:

Left

Similarly, for this proposition:

∀(a ∈ Prop): ∀(b ∈ Prop): b ⇒ a ∨ b

... the equivalent type is:

(a : Type)  (b : Type)  b  < Left : a | Right : b >

... and the value that has this type is:

λ(a : Type)  λ(b : Type)  λ(x : b)  < Right = x | Left : a >

Let's consider a more complex example:

∀(a ∈ Prop): a ∨ a ⇒ a

... which says: "for any proposition a, if a is true or a is true then a is true".

The corresponding type is:

(a : Type)  < Left : a | Right : a >  a

... which is equivalent to this Haskell type:

Either a a -> a

-- or using `ExplicitForAll`
forall a . Either a a -> a

... and we can create a value of this type by pattern matching on the sum type:

    λ(a : Type)
   λ(s : < Left : a | Right : a >)
   merge
    { Left  = λ(x : a)  x
    , Right = λ(x : a)  x
    }
    s
    : a

... which is equivalent to this Haskell code:

example :: Either a a -> a
example (Left  x) = x
example (Right x) = x

You can read this "proof" as saying: "There are two possibilities. The first possibility (named "Left") is that a is true, therefore a must be true in that case. The second possibility (named "Right") is that a is true, therefore a must be true in that case. Since a is true in both cases we can conclude that a is true, period."

True

We'll use True to denote a logical proposition that is always true:

True

... and the corresponding type is an empty record with no fields:

{}

... which is equivalent to Haskell's "unit" type:

()

An empty record literal is the only value that has the type of an empty record:

{=}

... which is equivalent to Haskell's "unit" value:

()

False

We'll use False to denote a logical proposition that is never true:

False

... and the corresponding type is an empty sum type (i.e. a type with no constructors/alternatives):

<>

... which is equivalent to Haskell's Void type.

There is no value that has the above type, so you cannot prove a False proposition.

The logical "principle of explosion" states that if you assume a False proposition then you can prove any other proposition. In other words:

False ⇒ ∀(a ∈ Prop): a

... which you can read as saying: "if you assume a false proposition, then for any proposition named a you can prove that a is true".

The corresponding type is:

<>  (a : Type)  a

... and the value that inhabits the above type is:

λ(s : <>)  λ(a : Type)  merge {=} s : a

You can read that "proof" as saying: "There are zero possibilities. Since we handled all possibilities then the proposition a must be true." This line of reasoning is called a "vacuous truth", meaning that if there are no cases to consider then any statement you make is technically true for all cases.

The equivalent Haskell type would be:

Void -> a

... and the equivalent Haskell value of that type would be:

absurd

Not

We'll use not to negate a logical proposition, meaning that:

not(True) = False

not(False) = True

We can encode logical not in terms of logical implication and logical False, like this:

not(a) = a ⇒ False

not(True) = True ⇒ False = False

not(False) = False ⇒ False = True

If logical not is a function from a proposition to another proposition then type-level not must be a function from a type to a type:

λ(a : Type)  (a  <>)

We can save the above type-level function to a file to create a convenient ./not utility we will reuse later on:

$ echo 'λ(a : Type) → a → <>' > ./not

Now let's try to prove a proposition like:

not(True) ⇒ False

The corresponding type is:

./not {}  <>

... which expands out to:

({}  <>)  <>

... and the value that has this type is:

λ(f : {}  <>)  f {=}

Double negation

However, suppose we try to prove "double negation":

∀(a ∈ Prop): not(not(a)) ⇒ a

... which says that "if a is not false, then a must be true".

The corresponding type would be:

(a : Type)  ./not (./not a)  a

... which expands out to:

(a : Type)  ((a  <>)  <>)  a

... but there is no value in the Dhall language that has this type!

The reason why is that every type checker corresponds to a specific logic system and not all logic systems are the same. Each logic system has different rules and axioms about what you can prove within the system.

Dhall's type checker is based on System Fω which corresponds to an intuitionistic (or constructive) logic. This sort of logic system does not assume the law of excluded middle.

The law of excluded middle says that every proposition must be true or false, which we can write like this:

∀(a ∈ Prop): a ∨ not(a)

You can read that as saying: "any proposition a is either true or false". This seems pretty reasonable until you try to translate the proposition to the corresponding type:

(a : Type)  < Left : a | Right : a  <> >

... which in Haskell would be:

example :: Either a (a -> Void)

We cannot create such a value because if we could then that would imply that for any type we could either:

  • produce a value of that type, or:
  • produce a "counter-example" by creating Void from any value of that type

While we can do this for some types, our type checker does not let us do this automatically for every type.

This is the same reason that we can't prove double negation, which implicitly assumes that there are only two choices (i.e. "true or false"). However, if we don't assume the law of the excluded middle then there might be other choices like: "I don't know".

Not all hope is lost, though! Even though our type checker doesn't support the axiom of choice, we can still add new axioms freely to our logic system. All we have to do is assume them the same way we assume other propositions.

For example, we can modify our original proposition to now say:

(∀(b ∈ Prop): b ∨ not(b)) ⇒ (∀(a ∈ Prop): not(not(a)) ⇒ a)

... which you can read as saying: "if we assume that all propositions are either true or false, then if a proposition is not false it must be true".

That corresponds to this type:

  ((b : Type)  < Left : b | Right : b  <> >)
 ((a : Type)  ((a  <>)  <>)  a)

... and we can create a value of that type:

    λ(noMiddle : (b : Type)  < Left : b | Right : b  <> >)
   λ(a : Type)
   λ(doubleNegation : (a  <>)  <>)
   merge
    { Left  = λ(x : a)  x
    , Right = λ(x : a  <>)  merge {=} (doubleNegation x) : a
    }
    (noMiddle a)
    : a

... which the type checker confirms has the correct type:

$ dhall
    λ(noMiddle : ∀(b : Type)< Left : b | Right : b → <> >)
→   λ(a : Type)
→   λ(doubleNegation : (a → <>)<>)
→   merge
    { Left  = λ(x : a) → x
    , Right = λ(x : a → <>) → merge {=} (doubleNegation x) : a
    }
    (noMiddle a)
    : a
<Ctrl-D>(noMiddle : ∀(b : Type)< Left : b | Right : b → <> >) → ∀(a : Type) → ∀(doubleNegation : (a → <>)<>) → a

λ(noMiddle : ∀(b : Type)< Left : b | Right : b → <> >) → λ(a : Type) → λ(doubleNegation : (a → <>)<>) → merge { Left = λ(x : a) → x, Right = λ(x : a → <>) → merge {=} (doubleNegation x) : a } (noMiddle a) : a

You can read that proof as saying:

  • The law of the excluded middle says that there are two possibilities: either a is true or a is false
    • If a is true then we're done: we trivially conclude a is true
    • If a is false then our assumption of not(not(a)) is also false
      • we can conclude anything from a false assumption, therefore we conclude that a is true

What's neat about this is that the compiler mechanically checks this reasoning process. You don't have to understand or trust my explanation of how the proof works because you can delegate your trust to the compiler, which does all the work for you.

Conclusion

This post gives a brief overview of how you can concretely translate logical propositions and proofs to types and programs. This can let you leverage your logical intuitions to reason about types or, vice versa, leverage your programming intuitions to reason about propositions. For example, if you can prove a false proposition in a logic system, then that's typically an escape hatch in the corresponding type system. Similarly, if you can create a value of the empty type in a programming language, that implies that the corresponding logic is not sound.

There are many kinds of type systems just like there are many kinds of logic systems. For every new logic system (like linear logic) you get a type system for free (like linear types). For example, Rust's type checker is an example of an affine type system which corresponds to an affine logic system.

To my knowledge, there are more logic systems in academic literature than there are type systems that have been implemented for programming languages. This in turn suggests that there are many awesome type systems waiting to be implemented.

Sunday, February 5, 2017

Program JSON and YAML with Dhall

This is short post to announce a new dhall-json library which lets you compile the Dhall configuration language to both JSON and YAML. This in turn means that you can now program JSON and YAML using a non-Turing-complete programming language.

Here's a quick example of the library in action:

$ dhall-to-json <<< "{ foo = 1, bar = [1, 2, 3] }"
{"foo":1,"bar":[1,2,3]}
$ dhall-to-yaml <<< "{ foo = 1, bar = [1, 2, 3] }"
foo: 1
bar:
- 1
- 2
- 3
$ cat example
    let indexed
        = https://ipfs.io/ipfs/QmcTbCdS21pCxXysTzEiucDuwwLWbLUWNSKwkJVfwpy2zK/Prelude/List/indexed
in  let map
        = https://ipfs.io/ipfs/QmcTbCdS21pCxXysTzEiucDuwwLWbLUWNSKwkJVfwpy2zK/Prelude/List/map
in  let words
        =   [ "Lorem"
            , "ipsum"
            , "dolor"
            , "sit"
            , "amet"
            , "consectetur"
            , "adipiscing"
            , "elit"
            ]
in  let makeRow
        =   λ(r : { index : Natural, value : Text }){ index      = r.index
            , background = if Natural/even r.index then "Green" else "Red"
            , contents   = r.value
            }
in map
   { index : Natural, value : Text }
   { index : Natural, background : Text, contents : Text }
   makeRow
   (indexed Text words)
$ dhall-to-json <<< "./example"
[{"contents":"Lorem","index":0,"background":"Green"},{"contents":"ipsum","index":1,"background":"Red"},{"contents":"dolor","index":2,"background":"Green"},{"contents":"sit","index":3,"background":"Red"},{"contents":"amet","index":4,"background":"Green"},{"contents":"consectetur","index":5,"background":"Red"},{"contents":"adipiscing","index":6,"background":"Green"},{"contents":"elit","index":7,"background":"Red"}]
$ dhall-to-yaml <<< "./example"
- contents: Lorem
  index: 0
  background: Green
- contents: ipsum
  index: 1
  background: Red
- contents: dolor
  index: 2
  background: Green
- contents: sit
  index: 3
  background: Red
- contents: amet
  index: 4
  background: Green
- contents: consectetur
  index: 5
  background: Red
- contents: adipiscing
  index: 6
  background: Green
- contents: elit
  index: 7
  background: Red

This library bundles both JSON and YAML functionality together because Haskell's yaml library reuses the exact same data type that Haskell's aeson library uses to represent JSON (i.e. the Value type). This means that if you can compile a Dhall expression to a Value then you can render that Value as both JSON and YAML.

Unlike the Dhall bindings to Nix, you can't compile most Dhall features to JSON or YAML, since they aren't real programming languages. After all, the whole point of this binding is to make JSON and YAML programmable! So if you try to compile anything other than primitive types or records you will get a compile error:

$ dhall-to-json <<< "λ(x : Bool) → x"

Error: Cannot translate to JSON

Explanation: Only primitive values, records, ❰List❱s, and ❰Optional❱ values can
be translated from Dhall to JSON

The following Dhall expression could not be translated to JSON:

↳ λ(x : Bool)x

Right now I'm slowly adding Dhall integrations with new languages and configuration file formats so that people can use Dhall as a universal configuration language. So far I've targeted integrations that can reuse the initial Haskell implementation of Dhall (i.e. Nix, JSON, and YAML bindings). However, my next integration will probably reimplement Dhall in another language so that Dhall can be used to natively configure a wider range of languages without having to invoke a command-line dhall executable.

I'm most likely to build the second implementation of Dhall in Rust so that I can also reuse the same implementation to expose a C and C++ API. However, I've never done this sort of thing before, so if this is a dumb idea, let me know!

You can find the dhall-json package on Github or Hackage and if you would like to contribute to Dhall in general you can open an issue here.

Saturday, January 28, 2017

Typed Nix programming using Dhall

I recently released a typed and total configuration language named Dhall that you can use to configure Haskell programs. However, Dhall would be more useful if you could configure other programming languages, like Nix.

Nix powers the Nix package manager and the Nix operating system and if you've never used Nix before then please give Nix a try. We use Nix heavily at Awake Networks for managing our deployments but one of our biggest complaints about Nix is the type system:

  • Nix expressions cannot be annotated with types to guide the user
  • You can't use sum types to make invalid states unrepresentable
  • Type errors are not very helpful and poorly located
  • Many Nix builtins and idioms are inherently difficult to statically analyze

To mitigate this problem I contributed a new Dhall-to-Nix compiler that lets you carve out a restricted subset of Nix with a more user-friendly type system.

This post covers:

  • How to use this in your own Nix projects
  • Details of the translation process from Dhall to Nix
  • Benefits and drawbacks of using Dhall to program Nix

User interface

I began by creating a Dhall to Nix compiler that can translate arbitrary Dhall expressions to equivalent Nix expressions. This compiler is not limited to simple values like records or literals: you can even compile Dhall functions to Nix functions.

The compiler takes a Dhall expression on standard input and emits the corresponding Nix expression on standard output:

$ dhall-nix <<< "{ foo = 1, bar = True }"
{ bar = true; foo = 1; }
$ dhall-nix <<< "λ(x : Bool) → x == False"
x: x == false

However, you do not need to install or manually run the compiler to use Dhall within your Nix project. I went a step further and added a dhallToNix utility to nixpkgs which automatically converts Dhall expressions to Nix expressions for you. This utility automatically bootstraps and caches the dhall-to-nix compiler as part of the evaluation process.

Here's an example of how you can use the dhallToNix function to embed arbitrary Dhall expressions inside of Nix:

let
  pkgs = import <nixpkgs> { };

  inherit (pkgs) dhallToNix;

in
  { example0 = dhallToNix "λ(x : Bool) → x == False" false;
    example1 = (dhallToNix "{ foo = 1, bar = True }").foo;
    example2 = dhallToNix "List/reverse Integer" [1 2 3];
  }

... and that will evaluate to:

{ example0 = true;
  example1 = 1;
  example2 = [3 2 1];
}

You can find a larger test suite here exercising all the features of the Dhall language by compiling them to Nix.

Compilation

The Dhall language is heavily inspired by Nix, so many Dhall language features translate mechanically to Nix language features. For example:

  • Dhall's records translate to Nix's records
  • Primitive Dhall values (like integers) translate to primitive Nix values
  • Dhall's anonymous functions translate to Nix's anonymous functions

... and so on

However, Dhall does some things differently from Nix, and these differences fell into four categories:

  • Dhall supports anonymous sum types, while Nix does not
  • Dhall provides a different set of built-in functions from Nix
  • Dhall is explicitly typed and uses type abstraction and type application
  • Dhall supports floating point numbers, while Nix does not

Compiling sum types was pretty simple: just Church-encode them. For example, a sum type like this:

< Left = 2 | Right : Bool >

... translates into the following Nix expression:

{ Left, Right }: Left 2

In other words, you can encode a sum type as a "pre-formed pattern match", which you consume in Nix by providing one function per alternative to handle:

dhallToNix "< Left = | Right : Bool >" {
  Left  = n : n == 0;  # Handler for the `Left`  case
  Right = b : b;       # Handler for the `Right` case
}

The dhallToNix invocation evaluates to:

({ Left, Right }: Left 2) {
  Left  = n : n == 0;
  Right = b : b;
}

... which in turn reduces to:

(n : n == 0) 2

... which in turn reduces to:

false

Built-in functions which were missing in Nix required the most effort. I had to translate them to efficient implementations based on other Nix-builtins. For example, Dhall's List/reverse primitive uses Nix's builtins.genList and builtins.elemAt under the hood:

dhall-to-nix <<< "List/reverse"
t: xs: let
      n = builtins.length xs;
      in builtins.genList (i:
      builtins.elemAt xs (n - i - 1)) n

The most complex builtin to translate was Dhall's (∧) operation for merging records, which is similar to Nix's (//) operator except that (∧) also merges children recursively:

$ dhall-to-nix <<< "λ(x : { foo : Bool }) → λ(y : { bar : Text }) → x ∧ y"
x: y: let
      combine = kvsL: kvsR: let
            ks = builtins.attrNames kvsL ++ builtins.attrNames kvsR;
            toKeyVals = k:
            if builtins.hasAttr k kvsL
                then if builtins.hasAttr k kvsR
                  then if builtins.isAttrs (builtins.getAttr k kvsL) && builtins.isAttrs (builtins.getAttr k kvsR)
                    then [
                      {
                        name = k;
                        value = combine (builtins.getAttr k kvsL) (builtins.getAttr k kvsR);
                      }
                    ]
                    else [
                      {
                        name = k;
                        value = builtins.getAttr k kvsL;
                      }
                    ]
                  else [
                    {
                      name = k;
                      value = builtins.getAttr k kvsL;
                    }
                  ]
                else if builtins.hasAttr k kvsR
                  then [
                    {
                      name = k;
                      value = builtins.getAttr k kvsR;
                    }
                  ]
                  else [];
            in builtins.listToAttrs (builtins.concatLists (map toKeyVals ks));
      in combine x y

The last tricky part was translating Dhall's explicit type abstraction and type application. "Explicit type abstraction" means that polymorphic (or "generic") functions in Dhall are encoded as ordinary functions that take a type as a function argument. For example, this is how you encode the polymorphic identity function in Dhall:

λ(a : Type) → λ(x : a) → x

"Explicit type application" means that you specialize polymorphic functions by applying them to a type argument specifying which type to use. For example, this is how you use the polymorphic identity function:

$ echo "λ(a : Type) → λ(x : a) → x" > id
$ dhall <<< "./id Integer 4"
Integer

4
$ dhall <<< "./id Integer"  # Just specialize the function(x : Integer)Integer

λ(x : Integer)x

dhall-to-nix translates polymorphic functions to functions that just ignore their type argument. For example, the polymorphic identity function becomes:

$ dhall-to-nix <<< "./id"
a : x : x

The first argument named a is the type and the corresponding Nix function still expects the argument but never uses it.

For type application, dhall-to-nix translate all types to an empty Nix record:

$ dhall-to-nix <<< "Integer"
{}

... which is then ignored by any polymorphic function:

$ dhall-to-nix <<< "./id"
a : x : x
$ dhall-to-nix <<< "./id Integer"
x : x
$ dhall-to-nix <<< "./id Integer 4"
4

Some Dhall built-in functions are also polymorphic, and we treat them the same way. For example, the List/reverse function is polymorphic, which is why the first argument in the corresponding Nix expression is an unused type argument named t:

$ dhall-to-nix <<< "List/reverse"
t: xs: let
      n = builtins.length xs;
      in builtins.genList (i:
      builtins.elemAt xs (n - i - 1)) n
$ dhall-to-nix <<< "List/reverse Integer"
(t: xs: let
      n = builtins.length xs;
      in builtins.genList (i:
      builtins.elemAt xs (n - i - 1)) n) {}
$ dhall-to-nix <<< "List/reverse Integer ([1, 2, 3] : List Integer)"
[3 2 1]

Finally, floating point numbers are not supported in Nix at all, so the dhall-to-nix compiler must reject Double values:

$ dhall-to-nix <<< "1.0"

Error: No doubles

Explanation: Dhall values of type ❰Double❱ cannot be converted to Nix
expressions because the Nix language provides no way to represent floating point
values

You provided the following value:

↳ 1.0

... which cannot be translated to Nix

Benefits

When first learning Nix, particularly NixOS, you'll frequently run into the issue where you're not sure what values you're expected to provide due to the lack of a type system. Dhall can fix that in several ways:

  • You can request the inferred types of functions so that you know what type of function argument you need to supply
  • You can also provide users with a "schema" for an expected value, which in Dhall is just an ordinary type annotation pointing to a remote path
  • You can replaced weakly typed values (like strings) with more strongly typed representations

The following example will illustrate all of the above points

For example a derivation in Nix can be minimally represented in Dhall as the following type:

{ name : Text, builder : Text, system : Text }

... which you can save to a file named derivation and use to check if other expressions match the expected type:

$ echo "{ name : Text, builder : Text, system : Text }" > derivation
$ dhall <<EOF
{ name = "empty"

  -- Dhall supports Nix-style multi-line string literals, too
, builder = ''
    touch $out
  ''

, system = "x86_64-linux"
} : ./derivation
EOF
{ builder : Text, name : Text, system : Text }

{ builder = "\ntouch $out\n\n", name = "empty", system = "x86_64-linux" }

If we mistype a field name the type annotation will flag the error:

$ dhall <<EOF
{ name = "empty", builder = "touch $out", sytem = "x86_64-linux" }
: ./derivation
EOF
Use "dhall --explain" for detailed errors

Error: Expression doesn't match annotation

... and we can ask for more detailed error messages:

$ dhall --explain <<EOF
{ name = "empty", builder = "touch $out", sytem = "x86_64-linux" }
: ./derivation
EOF

Error: Expression doesn't match annotation

Explanation: You can annotate an expression with its type or kind using the
❰:❱ symbol, like this:


    ┌───────┐
    │ x : t │  ❰x❱ is an expression and ❰t❱ is the annotated type or kind of ❰x❱
    └───────┘

The type checker verifies that the expression's type or kind matches the
provided annotation

For example, all of the following are valid annotations that the type checker
accepts:


    ┌─────────────┐
    │ 1 : Integer │  ❰1❱ is an expression that has type ❰Integer❱, so the type
    └─────────────┘  checker accepts the annotation


    ┌────────────────────────┐
    │ Natural/even +2 : Bool │  ❰Natural/even +2❱ has type ❰Bool❱, so the type
    └────────────────────────┘  checker accepts the annotation


    ┌────────────────────┐
    │ List : TypeType │  ❰List❱ is an expression that has kind ❰Type → Type❱,
    └────────────────────┘  so the type checker accepts the annotation


    ┌──────────────────┐
    │ List Text : Type │  ❰List Text❱ is an expression that has kind ❰Type❱, so
    └──────────────────┘  the type checker accepts the annotation


However, the following annotations are not valid and the type checker will
reject them:


    ┌──────────┐
    │ 1 : TextThe type checker rejects this because ❰1❱ does not have type
    └──────────┘  ❰Text❱


    ┌─────────────┐
    │ List : Type │  ❰List❱ does not have kind ❰Type❱
    └─────────────┘


You or the interpreter annotated this expression:

↳ { builder = "touch $out", name = "empty", sytem = "x86_64-linux" }

... with this type or kind:

↳ { builder : Text, name : Text, system : Text }

... but the inferred type or kind of the expression is actually:

↳ { builder : Text, name : Text, sytem : Text }

Some common reasons why you might get this error:

● The Haskell Dhall interpreter implicitly inserts a top-level annotation
  matching the expected type

  For example, if you run the following Haskell code:


    ┌───────────────────────────────┐
    │ >>> input auto "1" :: IO Text │
    └───────────────────────────────┘


  ... then the interpreter will actually type check the following annotated
  expression:


    ┌──────────┐
    │ 1 : Text │
    └──────────┘


  ... and then type-checking will fail

────────────────────────────────────────────────────────────────────────────────

We can also take advantage of the fact that Dhall supports sum types so that we can make invalid states unrepresentable. For example, the system field really shouldn't be Text because not all strings are valid systems.

We can fix this by first creating a type more accurately representing all supported platforms. First, we just need to create a sum type for all supported operating systems:

$ dhall > OperatingSystem <<EOF
< cygwin  : {}
| darwin  : {}
| linux   : {}
| netbsd  : {}
| openbsd : {}
| solaris : {}
>
EOF

... along with helper utilities to build each operating system:

$ dhall > linux <<EOF
< linux   = {=}
| cygwin  : {}
| darwin  : {}
| netbsd  : {}
| openbsd : {}
| solaris : {}
>

Then we can create a sum type for each supported architecture:

$ dhall > Architecture <<EOF
< aarch64  : {}
| armv5tel : {}
| armv6l   : {}
| armv7l   : {}
| i686     : {}
| mips64el : {}
| x86_64   : {}
>
EOF

... as well as helper constructors for each architecture type:

$ dhall > x86_64 <<EOF
< x86_64   = {=}
| aarch64  : {}
| armv5tel : {}
| armv6l   : {}
| armv7l   : {}
| i686     : {}
| mips64el : {}
>
EOF

... and then we can create a type for supported Nix platforms:

$ dhall > Platform <<EOF
{ operatingSystem : ./OperatingSystem
, architecture    : ./Architecture
}
EOF

... and helper utilities for each platform:

$ dhall > x86_64-linux <<EOF
{ architecture    = ./x86_64
, operatingSystem = ./linux
}
EOF

... and verify that they type-check against our Platform type:

$ dhall <<< "./x86_64-linux : ./Platform"
{ architecture : < aarch64 : {} | armv5tel : {} | armv6l : {} | armv7l : {} | i686 : {} | mips64el : {} | x86_64 : {} >, operatingSystem : < cygwin : {} | darwin : {} | linux : {} | netbsd : {} | openbsd : {} | solaris : {} > }

{ architecture = < x86_64 = {=} | aarch64 : {} | armv5tel : {} | armv6l : {} | armv7l : {} | i686 : {} | mips64el : {} >, operatingSystem = < linux = {=} | cygwin : {} | darwin : {} | netbsd : {} | openbsd : {} | solaris : {} > }

Then we can always add a Nix translation layer that converts the strongly typed Dhall version to the weakly typed Nix string:

# platform.nix

let
  pkgs = import <nixpkgs> { };

  inherit (pkgs) dhallToNix;

  architectureToText =
    architecture:
      architecture {
        aarch64  = _ : "aarch64" ;
        armv5tel = _ : "armv5tel";
        armv6l   = _ : "armv6l"  ;
        armv7l   = _ : "armv7l"  ;
        i686     = _ : "i686"    ;
        mips64el = _ : "mips64el";
        x86_64   = _ : "x86_64"  ;
      };

  operatingSystemToText =
    operatingSystem:
      operatingSystem {
        cygwin  = _ : "cygwin" ;
        darwin  = _ : "darwin" ;
        linux   = _ : "linux"  ;
        netbsd  = _ : "netbsd" ;
        openbsd = _ : "openbsd";
        solaris = _ : "solaris";
      };

  platformToText =
    {architecture, operatingSystem}:
      let
        arch = architectureToText    architecture   ;
        os   = operatingSystemToText operatingSystem;
      in
        "${arch}-${os}";
in
  platformToText (dhallToNix "${./x86_64-linux} : ${./Platform}")

... which would type-check our ./x86_64-linux file against the ./Platform and return the following result:

"x86_64-linux"

We can even go a step further and implement the intermediate functions in Dhall, too:

let
  pkgs = import <nixpkgs> { };

  inherit (pkgs) dhallToNix;

  platformToText =
    dhallToNix ''
        let architectureToText
            =   λ(x : ${./Architecture} )merge
                { aarch64  = λ(_ : {}) → "aarch64"
                , armv5tel = λ(_ : {}) → "armv5tel"
                , armv6l   = λ(_ : {}) → "armv6l"
                , armv7l   = λ(_ : {}) → "armv7l"
                , i686     = λ(_ : {}) → "i686"
                , mips64el = λ(_ : {}) → "mips64el"
                , x86_64   = λ(_ : {}) → "x86_64"
                }
                x : Text
    in  let operatingSystemToText
            =   λ(x : ${./OperatingSystem} )merge
                { cygwin  = λ(_ : {}) → "cygwin"
                , darwin  = λ(_ : {}) → "darwin"
                , linux   = λ(_ : {}) → "linux"
                , netbsd  = λ(_ : {}) → "netbsd"
                , openbsd = λ(_ : {}) → "openbsd"
                , solaris = λ(_ : {}) → "solaris"
                }
                x : Text
    in  let platformToText
            =   λ(x : ${./Platform} )architectureToText    x.architecture
            ++  "-"
            ++  operatingSystemToText x.operatingSystem

    in  platformToText
    '';
in
  platformToText (dhallToNix "${./x86_64-linux} : ${./Platform}")

However, in practice you'd like to keep the platform expression as the original strongly typed record instead of converting the platform to a string. The original record lets you more easily extract the architecture and operating system fields and make decisions on their values using exhaustive pattern matching.

Drawbacks

The largest drawback of using Dhall to program Nix is that Dhall cannot encode many common idioms used in nixpkgs. Some examples of idioms that do not translate well to Nix are:

  • The callPackage idiom that nixpkgs uses very heavily for easily updating dependencies. This relies on reflection and recursive fixpoints, neither of which Dhall supports
  • Anything which uses builtins.listToAttrs or does reflection on record field names

I don't expect Dhall to be used at all in nixpkgs, but I do believe Dhall can benefit end users or companies for their own internal Nix projects.

Conclusion

The dhallToNix utility is available right now in the nixpkgs-unstable channel if you would like to try this out in your own project:

Also, if you would like to use the dhall-to-nix compiler for other purposes you can find the compiler on Hackage or Github:

If you're new to Dhall you can learn more about the configuration language by reading the Dhall tutorial.

Monday, December 5, 2016

Dhall - A non-Turing-complete configuration language

I'm releasing a new configuration language named Dhall with Haskell bindings. Even if you don't use Haskell you might still find this language interesting.

This language started out as an experiment to answer common objections to programmable configuration files. Almost all of these objections are, at their root, criticisms of Turing-completeness.

For example, people commonly object that configuration files should be easy to read, but they descend into unreadable spaghetti if you make them programmable. However, Dhall doesn't have this problem because Dhall is a strongly normalizing language, which means that we can reduce every expression to a standard normal form in a finite amount of time by just evaluating everything.

For example, consider this deliberately obfuscated configuration file:

$ cat config
    let zombieNames =
            [ "Rachel", "Gary", "Liz" ] : List Text

in  let isAZombie =
            \(name : Text) -> { name = name, occupation = "Zombie" }

in  let map =
            https://ipfs.io/ipfs/QmcTbCdS21pCxXysTzEiucDuwwLWbLUWNSKwkJVfwpy2zK/Prelude/List/map

in  let tag =
            map Text { name : Text, occupation : Text }

in  let zombies =
            tag isAZombie zombieNames

in  let policeNames =
            [ "Leon", "Claire" ] : List Text

in  let worksForPolice =
            \(name : Text) -> { name = name, occupation = "Police officer" }

in  let policeOfficers =
            tag worksForPolice policeNames

in  let concat =
            https://ipfs.io/ipfs/QmcTbCdS21pCxXysTzEiucDuwwLWbLUWNSKwkJVfwpy2zK/Prelude/List/concat

in  let characters =
            concat
            { name : Text, occupation : Text }
            (   [   zombies
                ,   policeOfficers
                ]   : List (List { name : Text, occupation : Text })
            )

in  {   protagonist =
            List/head { name : Text, occupation : Text } policeOfficers
    ,   numberOfCharacters =
            List/length { name : Text, occupation : Text } characters
    }

We can use the dhall compiler to cut through the indirection and reduce the above configuration file to the following fully evaluated normal form:

$ stack install dhall
$ dhall < config
{ numberOfCharacters : Natural, protagonist : Optional { name : Text, occupation : Text } }

{ numberOfCharacters = +5, protagonist = [{ name = "Leon", occupation = "Police officer" }] : Optional { name : Text, occupation : Text } }

The first line is the inferred type of the file, which we can format as:

{ numberOfCharacters : Natural
, protagonist        : Optional { name : Text, occupation : Text }
}

This says that our configuration file is a record with two fields:

  • a field named numberOfCharacters that stores a Natural number (i.e. a non-negative number)
  • a field named protagonist that stores an Optional record with a name and occupation

From this type alone, we know that no matter how complex our configuration file gets the program will always evaluate to a simple record. This type places an upper bound on the complexity of the program's normal form.

The second line is the actual normal form of our configuration file:

{ numberOfCharacters =
    +5
, protagonist =
    [ { name = "Leon", occupation = "Police officer" }
    ] : Optional { name : Text, occupation : Text }
}

In other words, our compiler cut through all the noise and gave us an abstraction-free representation of our configuration.

Total programming

You can also evaluate configuration files written in other languages, too, but Dhall differentiates itself from other languages by offering several stronger guarantees about evaluation:

  • Dhall is not Turing complete because evaluation always halts

    You can never write a configuration file that accidentally hangs or loops indefinitely when evaluated

    Note that you can still write a configuration file that takes longer than the age of the universe to compute, but you are much less likely to do so by accident

  • Dhall is safe, meaning that functions must be defined for all inputs and can never crash, panic, or throw exceptions

  • Dhall is sandboxed, meaning that the only permitted side effect is retrieving other Dhall expressions by their filesystem path or URL

    There are examples of this in the above program where Dhall retrieves two functions from the Prelude by their URL

  • Dhall's type system has no escape hatches

    This means that we can make hard guarantees about an expression purely from the expression's type

  • Dhall can normalize functions

    For example, Dhall's Prelude provides a replicate function which builds a list by creating N copies of an element. Check out how we can normalize this replicate function before the function is even saturated:

    $ dhall
    let replicate = https://ipfs.io/ipfs/QmcTbCdS21pCxXysTzEiucDuwwLWbLUWNSKwkJVfwpy2zK/Prelude/List/replicate
    in  replicate +10
    <Ctrl-D>
    ∀(a : Type) → ∀(x : a) → List a
    
    λ(a : Type) → λ(x : a) → [x, x, x, x, x, x, x, x, x, x] : List a

    The compiler knows that no matter what element we provide for the final argument to replicate the result must be 10 copies of that element in a list

Types

Dhall is also a typed language, so every configuration file can be checked ahead of time against an expected schema. The schema can even live in a separate file, like this:

$ cat schema
{ numberOfCharacters : Natural
, protagonist        : Optional { name : Text, occupation : Text }
}

... and then checking our configuration against a schema is as simple as giving the configuration file a type annotation:

$ dhall
./config : ./schema
<Ctrl-D>
{ numberOfCharacters : Natural, protagonist : Optional { name : Text, occupation : Text } }

{ numberOfCharacters = +5, protagonist = [{ name = "Leon", occupation = "Police officer" }] : Optional { name : Text, occupation : Text } }

If the compiler doesn't complain then that means that the configuration file type checks against our schema.

Haskell bindings

Dhall configuration files can be marshalled into Haskell data types. For example, the following Haskell program:

{-# LANGUAGE DeriveAnyClass    #-}
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE OverloadedStrings #-}

import Dhall

data Summary = Summary
    { numberOfCharacters :: Natural
    , protagonist :: Maybe Person
    } deriving (Generic, Interpret, Show)

data Person = Person
    { name :: Text
    , occupation :: Text
    } deriving (Generic, Interpret, Show)

main :: IO ()
main = do
    x <- input auto "./config"
    print (x :: Summary)

... will marshal our config file into Haskell and print the corresponding Haskell representation of our configuration file:

$ stack runghc example.hs
Summary {numberOfCharacters = 5, protagonist = Just (Person {name = "Leon", occupation = "Police officer"})}

The Haskell program automatically checks that the configuration file's schema automatically matches the data structures that we marshal into. The entire pipeline is type safe from end to end.

Imports

Dhall expressions can reference other expression, either by their filesystem paths or URLs. Anything can be imported, such as fields of records:

$ cat record
{ foo = 1.0
, bar = ./bar
}

$ cat bar
[1, 2, 3] : List Integer

$ dhall < record
{ bar = [1, 2, 3] : List Integer, foo = 1.0 }

... or types:

$ cat function
\(f : ./type ) -> f False

$ cat type
Bool -> Integer

$ dhall < function
∀(f : Bool → Integer) → Integer

λ(f : Bool → Integer) → f False

... or functions:

$ cat switch
\(b : Bool) -> if b then 2 else 3

$ dhall <<< "./function ./switch"
Integer

3

You can also import URLs, too. The Dhall Prelude is hosted using IPFS (a distributed and immutable filesystem), and you can browse the Prelude here:

https://ipfs.io/ipfs/QmcTbCdS21pCxXysTzEiucDuwwLWbLUWNSKwkJVfwpy2zK/Prelude/

Anything from the Prelude can be used by just pasting the URL into your program:

$ dhall
https://ipfs.io/ipfs/QmcTbCdS21pCxXysTzEiucDuwwLWbLUWNSKwkJVfwpy2zK/Prelude/Natural/sum
([+2, +3, +5] : List Natural)
<Ctrl-D>
Natural

+10

... although usually you want to assign the URL to a shorter name for readability:

let sum = https://ipfs.io/ipfs/QmcTbCdS21pCxXysTzEiucDuwwLWbLUWNSKwkJVfwpy2zK/Prelude/Natural/sum 
in  sum ([+2, +3, +5] : List Natural)

You're not limited to IPFS for hosting Dhall expressions. Any pastebin, web server, or Github repository that can serve raw UTF8 text can host a Dhall expression for others to use.

Error messages

Dhall outputs helpful error messages when things go wrong. For example, suppose that we change our type file to something that's not a type:

$ echo "1" > type
$ dhall <<< "./function ./switch"
Use "dhall --explain" for detailed errors

↳ ./function 

f : 1

Error: Not a function

f False

function:1:19

By default Dhall gives a concise summary of what broke. The error message begins with a trail of breadcrumbs pointing to which file in your import graph is broken:

↳ ./function 

In this case, the error is located in the ./function file that we imported.

Then the next part of the error message is a context that prints the types of all values that are in scope:

f : 1

... which says that only value named f is in scope and f has type 1 (Uh oh!)

The next part is a brief summary of what went wrong:

Error: Not a function

... which says that we are using something that's not a function

The compiler then prints the code fragment so we can see at a glance what is wrong with our code before we even open the file:

f False

The above fragment is wrong because f is not a function, but we tried to apply f to an argument.

Finally, the compiler prints out the file, column, and line number so that we can jump to the broken code fragment and fix the problem:

function:1:19

This says that the problem is located in the file named function at row 1 and column 19.

Detailed error messages

But wait, there's more! You might have noticed this line at the beginning of the error message:

Use "dhall --explain" for detailed errors

Let's add the --explain flag to see what happens:

$ dhall --explain <<< "./function ./switch"

↳ ./function 

f : 1

Error: Not a function

Explanation: Expressions separated by whitespace denote function application,
like this:


    ┌─────┐
    │ f x │  This denotes the function ❰f❱ applied to an argument named ❰x❱ 
    └─────┘


A function is a term that has type ❰a → b❱ for some ❰a❱ or ❰b❱.  For example,
the following expressions are all functions because they have a function type:


                        The function's input type is ❰Bool❱
                        ⇩
    ┌───────────────────────────────┐
    │ λ(x : Bool) → x : Bool → Bool │  User-defined anonymous function
    └───────────────────────────────┘
                               ⇧
                               The function's output type is ❰Bool❱


                     The function's input type is ❰Natural❱
                     ⇩
    ┌───────────────────────────────┐
    │ Natural/even : Natural → Bool │  Built-in function
    └───────────────────────────────┘
                               ⇧
                               The function's output type is ❰Bool❱


                        The function's input kind is ❰Type❱
                        ⇩
    ┌───────────────────────────────┐
    │ λ(a : Type) → a : Type → Type │  Type-level functions are still functions
    └───────────────────────────────┘
                               ⇧
                               The function's output kind is ❰Type❱


             The function's input kind is ❰Type❱
             ⇩
    ┌────────────────────┐
    │ List : Type → Type │  Built-in type-level function
    └────────────────────┘
                    ⇧
                    The function's output kind is ❰Type❱


                        Function's input has kind ❰Type❱
                        ⇩
    ┌─────────────────────────────────────────────────┐
    │ List/head : ∀(a : Type) → (List a → Optional a) │  A function can return
    └─────────────────────────────────────────────────┘  another function
                                ⇧
                                Function's output has type ❰List a → Optional a❱


                       The function's input type is ❰List Text❱
                       ⇩
    ┌────────────────────────────────────────────┐
    │ List/head Text : List Text → Optional Text │  A function applied to an
    └────────────────────────────────────────────┘  argument can be a function
                                   ⇧
                                   The function's output type is ❰Optional Text❱


An expression is not a function if the expression's type is not of the form
❰a → b❱.  For example, these are not functions:


    ┌─────────────┐
    │ 1 : Integer │  ❰1❱ is not a function because ❰Integer❱ is not the type of
    └─────────────┘  a function


    ┌────────────────────────┐
    │ Natural/even +2 : Bool │  ❰Natural/even +2❱ is not a function because
    └────────────────────────┘  ❰Bool❱ is not the type of a function


    ┌──────────────────┐
    │ List Text : Type │  ❰List Text❱ is not a function because ❰Type❱ is not
    └──────────────────┘  the type of a function


You tried to use the following expression as a function:

↳ f

... but this expression's type is:

↳ 1

... which is not a function type

────────────────────────────────────────────────────────────────────────────────

f False

function:1:19

We get a brief language tutorial explaining the error message in excruciating detail. These mini-tutorials target beginners who are still learning the language and want to better understand what error messages mean.

Every type error has a detailed explanation like this and these error messages add up to ~2000 lines of text, which is ~25% of the compiler's code base.

Tutorial

The compiler also comes with an extended tutorial, which you can find here:

This tutorial is also ~2000 lines long or ~25% of the code base. That means that half the project is just the tutorial and error messages and that's not even including comments.

Design goals

Programming languages are all about design tradeoffs and the Dhall language uses the following guiding principles (in order of descending priority) that help navigate those tradeoffs:

  • Polish

    The language should delight users. Error messages should be fantastic, execution should be snappy, documentation should be excellent, and everything should "just work".

  • Simplicity

    When in doubt, cut it out. Every configuration language needs bindings to multiple programming languages, and the more complex the configuration language the more difficult to create new bindings. Let the host language that you bind to compensate for any missing features from Dhall.

  • Beginner-friendliness

    Dhall needs to be a language that anybody can learn in a day and debug with little to no assistance from others. Otherwise people can't recommend Dhall to their team with confidence.

  • Robustness

    A configuration language needs to be rock solid. The last thing a person wants to debug is their configuration file. The language should never hang or crash. Ever.

  • Consistency

    There should only be one way to do something. Users should be able to instantly discern whether or not something is possible within the Dhall language or not.

The dhall configuration language is also designed to negate many of the common objections to programmable configuration files, such as:

"Config files shouldn't be Turing complete"

Dhall is not Turing-complete. Evaluation always terminates, no exceptions

"Configuration languages become unreadable due to abstraction and indirection"

Every Dhall configuration file can be reduced to a normal form which eliminates all abstraction and indirection

"Users will go crazy with syntax and user-defined constructs"

Dhall is a very minimal programming language. For example: you cannot even compare strings for equality (yes, really). The language also forbids many other common operations in order to force users to keep things simple.

Conclusion

You should read the tutorial if you would like to learn more about the language or use Dhall to configure your own projects:

You can also contribute, file issues, or ask questions by visiting the project repository on Github:

And the Haskell library is hosted on Hackage here:

If you would like to contribute, you can try porting Dhall to bind to languages other than Haskell, so that Dhall configuration files can be used across multiple languages. I keep the compiler simple (less than ~4000 lines of code if you don't count error messages) so that people can port the language more easily.

Also, for people who are wondering, the language is named after a Dustman from the game Planescape: Torment who belongs to a faction obsessed with death (termination).