Friday, August 27, 2021

Naming function arguments in Dhall

doc-args

This post showcases some neat Dhall language features for improving the readability of types that I think other languages should steal.

To motivate this post, consider the following Haskell type for Data.Text.replace:

replace :: Text -> Text -> Text -> Text

This function replaces all occurrences of a substring with another substring, but you wouldn’t be able to easily guess which argument is which from the type alone.

Fortunately, the function does have Haddock-level documentation in the form of comments for each function argument:

replace
    :: Text
    -- ^ @needle@ to search for.  If this string is empty, an
    -- error will occur.
    -> Text
    -- ^ @replacement@ to replace @needle@ with.
    -> Text
    -- ^ @haystack@ in which to search.
    -> Text

… but what if you could provide a hint to what each argument does within the type itself?

Naming function arguments using / forall

Well, in Dhall you can, and here is the equivalent Dhall type:

$ dhall repl
:type Text/replace

∀(needle : Text) → ∀(replacement : Text) → ∀(haystack : Text) → Text

… or if you prefer ASCII syntax then that is the same as this type:

forall (needle : Text) ->
forall (replacement : Text) ->
forall (haystack : Text) ->
  Text

Here we’ve tagged each function argument with the argument’s name. In fact, you don’t need to do this explicitly. Dhall will automatically infer named function argument types when you create user-defined functions:

:type λ(name : Text) → "Hello, ${name}"

∀(name : Text) → Text

In many cases these names are “cosmetic”, meaning that they have no effect on type-checking. For example, as far as the type-checker is concerned the following function type:

∀(name : Text) → Text

… is the exact same as the following simpler function type (they are α-equivalent):

TextText

However, sometimes these names are not cosmetic! The simplest example is a polymorphic function (a.k.a. a “generic” function), like the following polymorphic identity function:

let identity
      : ∀(a : Type) → a → a
      = λ(a : Type) → λ(x : a) → x

in  identity

Here we’ve used the to name the first function argument a, which lets us reference that name downstream within the same type.

let identity
       -- We're naming our first function argument `a` …
       -- ↓
      : ∀(a : Type) → a → a
                   -- ↑   ↑
                   -- … so that we can reference the value of the first function
                   -- argument downstream within the same type
      = λ(a : Type) → λ(x : a) → x

in  identity

However, unlike other languages, Dhall also lets us name the second function argument using the exact same syntax, even though our second function argument is not a Type:

let identity
      : ∀(a : Type) → ∀(x : a) → a
      = λ(a : Type) → λ(x : a) → x

In fact, this is actually the type that the interpreter would have inferred if we had omitted the type annotation. The name of this second function argument is cosmetic, though.

Another example where the argument name is not cosmetic is the following hypothetical Dhall type for a safe division function that rejects 0 denominators at type-checking time:

divide
    : ∀(denominator : Natural) →
      ∀(proof : Natural/isZero denominatorFalse) →
      ∀(numerator : Natural) →
        Natural

This division function takes three arguments instead of the usual two:

  • The first argument is the denominator

    … as you may have guessed from the helpful type-level name

  • The second function argument requires a proof that denominator is non-zero

    Side note: This proof obligation is satisfied by assert : False ≡ False so long as denominator is not 0

  • The third argument is the numerator

By naming the first argument denominator, we can refer to the value of the first argument within the type of the second argument:

divide
    --  We're referring to the value of the first argument …
    --  ↓
    : ∀(denominator : Natural) →
    --                         … within the type of the second argument
    --                         ↓
      ∀(proof : Natural/isZero denominatorFalse) →
    --  ↑
    --  These names are still cosmetic, though
    --  ↓
      ∀(numerator : Natural) →
        Natural

You might wonder why Dhall uses the / forall keyword for this purpose, since typically most languages only use forall to create polymorphic (a.k.a. “generic”) functions. This is because Dhall is implemented as a pure type system, meaning that Dhall uniformly handles term-level function arguments and type-level function arguments in the exact same way. They’re both special cases of Π types.

The following paper does a really good job explaining pure type systems and Π types:

… and that’s actually where I stole this trick from.

Really the correct symbol to use is Π and not , but there isn’t a great way to input Π as ASCII. The Henk paper suggests using |~| as the ASCII counterpart to Π, but that didn’t seem right to me. I went with / forall because there’s already prior art for using those for type arguments.

Anonymous record types

There is another approach to naming function arguments that doesn’t require a pure type system at all: anonymous record types.

For example, we could have changed the Text/replace function to have this type:

{ needle : Text, replacement : Text, haystack : Text } → Text

That’s pretty clear, too, and can be done in any language that supports anonymous record types, including Dhall, F#, PureScript, Elm, and OCaml.

Naming type parameters

You can also use the above two tricks to name type arguments for type constructors, too.

For example, we can define a Map type constructor to be a list of key-value pairs:

let Map =
      λ(key : Type) →
      λ(value : Type) →
        List { mapKey : key, mapValue : value }

… which we would use like this:

Map Text Natural

List { mapKey : Text, mapValue : Natural }

… and the inferred type names the arguments to our Map type constructor:

:type Map

∀(key : Type) → ∀(value : Type) → Type

We can also use anonymous records to name type arguments, too! For example, we could have instead defined Map as:

let Map =
      λ(args : { key : Type, value : Type }) →
        List { mapKey : args.key, mapValue : args.value }

… which we would use like this:

Map { key = Text, value = Natural }

List { mapKey : Text, mapValue : Natural }

… and this latter Map would have an inferred type of:

:type Map

∀(args : { key : Type, value : Type }) → Type

Pretty neat! Not many languages can pass type constructor arguments as named fields of a record.

Conclusion

Hopefully this gives language designers some ideas for how they can add language support for naming function arguments.

Unfortunately, my favorite language (Haskell) does not exactly support these features, so sometimes people work around this by using newtypes to name function arguments. I’ve never been of fan of this approach, especially if the newtype is not opaque and this post does a good job of explaining why:

However, you can simulate this trick in Haskell using something like what the vulkan package does, which is to use DataKinds and TypeOperators to create a type-level operator that lets you associate arbitrary name data with types:

type (name :: k) ::: a = a

… which the package uses like this:

cmdDraw
    :: forall io . (MonadIO io)
    => -- | @commandBuffer@ is the command buffer into which the command is
       -- recorded.
       CommandBuffer
    -> -- | @vertexCount@ is the number of vertices to draw.
       ("vertexCount" ::: Word32)
    -> -- | @instanceCount@ is the number of instances to draw.
       ("instanceCount" ::: Word32)
    -> -- | @firstVertex@ is the index of the first vertex to draw.
       ("firstVertex" ::: Word32)
    -> -- | @firstInstance@ is the instance ID of the first instance to draw.
       ("firstInstance" ::: Word32)
    -> io ()

These names are ignored by the type-checker, just like the equivalent cosmetic names in Dhall.

Friday, August 13, 2021

Namespaced De Bruijn indices

Namespaced De Bruijn indices

In this post I share a trick I use for dealing with bound variables in Dhall that I thought might be useful for other interpreted programming languages. I have no idea if this trick has been introduced before but if it has then just let me know and I’ll acknowledge any prior art here.

Edit: Todd Wilson points out that Mark-Oliver Stehr’s CINNI originally introduced this idea.

The brief explanation of the trick is: instead of choosing between a named or a nameless representation for bound variables you can get the best of both worlds by namespacing De Bruijn indices by variable names. This simplifies the implementation and in some cases improves the end user’s experience.

The rest of this post is a longer explanation of the above summary, starting with an explanation of the trick and followed by a review of the benefits of this approach.

Background

I’d like to first explain what I mean by “named” and “nameless” representations before I explain the trick.

A named representation of the lambda calculus syntax tree typically looks something like this:

data Syntax
    = Variable String
    | Lambda String Syntax
    | Apply Syntax Syntax

For example, if the user wrote the following Haskell-like code:

\f -> \x -> f x

… then that would correspond to this syntax tree:

example :: Syntax
example = Lambda "f" (Lambda "x" (Apply (Variable "f") (Variable "x")))

The named representation has the nice property that it preserves the original variable names … well, sort of. This representation definitely preserves the variable names when you initially parse the code into the syntax tree, but if you β-reduce an expression you can potentially run into problems.

For example, consider this expression:

\x -> (\y -> \x -> y) x

… which corresponds to this syntax tree:

Lambda "x" (Apply (Lambda "y" (Lambda "x" (Variable "y"))) (Variable "x"))

If you try to β-reduce (\y -> \x -> y) x without renaming any variables then you get the following incorrect result:

\x -> \x -> x

This bug is known as “name capture” and capture-avoiding substitution requires renaming one of the variables named x so that the inner x does not shadow the outer x. For example, we could fix the problem by renaming the outer x to x1 like this:

\x1 -> \x -> x1

A nameless representation tries to work around these name capture issues by replacing the variable names with numeric indices (known as De Bruijn indices):

data Syntax
    = Variable Int
    | Lambda Syntax
    | Apply Syntax Syntax

For example, code like this:

\f -> \x -> f x

… corresponds to this nameless representation:

example :: Syntax
example = Lambda (Lambda (Apply (Variable 1) (Variable 0)))

Carefully note that the Lambda constructor now has no field for the bound variable name, so it’s as if the user had instead written:

\ -> \ -> @1 @0

… using @n to represent the variable whose De Bruijn index is n.

The numeric De Bruijn indices refer to bound variables. Specifically, the numeric index 0 refers to the “closest” or “innermost” variable bound by a lambda:

--                This 0 index …
--                ↓
    \ -> \ -> @1 @0
--       ↑ … refers to the variable bound by this lambda

… and incrementing the index moves to the next outermost lambda:

--             This 1 index …
--             ↓
    \ -> \ -> @1 @0
--  ↑ … refers to the variable bound by this lambda

De Bruijn indices avoid name collisions between bound variables, but they require you to do additional work if you wish to preserve the original variable names. There are several ways to do so, and I’ll present my preferred approach.

The trick - Part 1

We can get the best of both worlds by combining the named and nameless representations into a hybrid representation like this:

data Syntax
    = Variable String Int
    | Lambda String Syntax
    | Apply Syntax Syntax

I call this representation “namespaced De Bruijn indices”.

This is almost the exact same as our named representation, except that we have now added an Int field to the Variable constructor. This Int field is morally the same as the De Bruijn index in the nameless representation, except that this time the De Bruijn index is “namespaced” to a specific variable name.

The easiest way to explain this is with a few examples.

The following expression:

\x -> \y -> \x -> x@0

… corresponds to this syntax tree:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "x" 0)))

… and this curried function returns the third function argument:

--                    This …
--                    ↓
    \x -> \y -> \x -> x@0
--               ↑ … refers to this bound variable

… because that is the innermost bound variable named x.

Similarly, the following expression:

\x -> \y -> \x -> y@0

… corresponds to this syntax tree:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "y" 0)))

… which returns the second function argument:

--                    This …
--                    ↓
    \x -> \y -> \x -> y@0
--         ↑ … refers to this bound variable

… because that is the innermost bound variable named y.

Carefully note that our variable still has a De Bruijn index of 0, but we ignore the innermost bound variable named x because we also pair our De Bruijn index with name of the variable we are referring to (y) so we only count bound variables named y when resolving the De Bruijn index.

Finally, the following expression:

\x -> \y -> \x -> x@1

… corresponds to this syntax tree:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "x" 1)))

… which returns the first function argument:

--                    This …
--                    ↓
    \x -> \y -> \x -> x@1
--   ↑ … refers to this bound variable

The De Bruijn index is 1, which means that it refers to the second innermost (0-indexed) bound variable named x.

Notice how this representation lets us refer to shadowed variables by their index. These De Bruijn indices are not an internal implementation detail, but are actually available to the user as part of the surface syntax of the language.

However, we want to avoid littering the code with these De Bruijn indices, which brings us to the second part of the trick.

The trick - Part 2

The next step is to add syntactic sugar to the language by allowing users to omit the index in the source code, which defaults the index to 0. This means that an expression that never references shadowed variables never needs to specify a De Bruijn index.

For example, instead of writing this:

\x -> \y -> \x -> x@0

… we can elide the index to simplify the code to:

\x -> \y -> \x -> x

… which will still parse as:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "x" 0)))

Similarly, we can simplify this:

\x -> \y -> \x -> y@0

… to this:

\x -> \y -> \x -> y

… which will still parse as:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "y" 0)))

However, we cannot use this syntactic sugar to simplify the final example:

\x -> \y -> \x -> x@1

… since the index is non-zero. Any code that references a shadowed variable still needs to use an explicit De Bruijn index to do so.

Vice versa, we also omit zero indices when pretty-printing code. When we pretty-print this syntax tree:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "x" 0)))

… we don’t include the index:

\x -> \y -> \x -> x

This syntactic sugar ensures that most users do not need to be aware that indices exist at all when writing code. The user only encounters the indices in two scenarios:

  • The user wishes to explicitly reference a shadowed variable

    For example, in the following expression:

    \x -> \y -> \x -> x@1

    … the user might prefer to use the built-in language support for disambiguating variables of the same name rather than renaming one of the two variables named x.

  • The indices appear in a β-reduced result

    For example, this expression has no user-visible De Bruijn indices:

    \x -> (\y -> \x -> y) x

    … but if you β-reduce the expression (I’ll cover how in the Appendix) and pretty-print the β-reduced expression then the result will introduce a non-zero De Bruijn index to disambiguate the two variables named x:

    \x -> \x -> x@1

In fact, the latter scenario is the reason I originally adopted this trick: I wanted to be able to display β-reduced functions to the end user while preserving the original variable names as much as possible.

Note that De Bruijn indices don’t appear when a β-reduced expression does not reference any shadowed variables. For example, if you β-reduce this expression:

(\f -> f f) (\x -> x)

… the result has no De Bruijn index (because the index is 0 and is therefore elided by the pretty-printer):

\x -> x

The trick - Part 3

One of the benefits of the traditional nameless representation using (non-namespaced) De Bruijn indices is that you get α-equivalence for free. Two nameless expressions are α-equivalent if they are syntactically identical. We can build upon this useful property to derive a compact algorithm for α-equivalence of “namespaced De Bruijn indices”.

The trick is to recognize that namespaced De Bruijn indices reduce to ordinary De Bruijn indices in the degenerate case when you rename all variables to the same name. I’ll call this renaming process “α-reduction”.

For example, if we α-reduce the following expression by renaming all of the: variables to _:

\x -> \y -> \x -> x@1

… then we get this result:

\_ -> \_ -> \_ -> _@2

See the Appendix for the α-reduction algorithm.

Equipped with α-reduction, then we can derive α-equivalence: two expressions are α-equivalent if their α-reduced forms are syntactically identical.

For example, this expression:

\x -> x

… and this expression:

\y -> y

… both α-reduce to:

\_ -> _

… so they are α-equivalent.

Benefits

There are a few benefits of using this trick that motivate me to use this in all of my interpreted languages:

  • This trick improves the readability of β-reduced functions

    β-reduced functions preserve the original variable names and this trick doesn’t suffer from the rename-related name pollution that plagues other capture-avoiding substitution algorithms. In particular, β-reduced expressions only display De Bruijn indices when absolutely necessary (if they reference a shadowed variable) and they otherwise use the original pristine variable names.

  • This trick simplifies the internal implementation

    You don’t need to maintain two separate syntax trees for a named and nameless representation. You can use the same syntax tree for both since any named syntax tree can be α-reduced to give the equivalent nameless syntax tree.

  • This trick enables userland support for referencing shadowed variables

    I know some people think that referencing shadowed variable names is a misfeature. However, I personally feel that resolving name collisions by adding ' or _ characters to the end of variable names is less principled than having language support for resolving name collisions using optional De Bruijn indices.

  • (Not shown) This trick can sometimes improve type errors

    To be precise, this trick improves the inferred types displayed in error messages when using explicit universal quantification.

    Type variables also have to avoid name collisions, so if you use the same namespaced De Bruijn representation for your types then you avoid polluting your inferred types and error messages with junk type variables like a14.

    This post doesn’t cover the equivalent type-level trick, but you can refer to the Dhall standard if you need an example of a language that uses this trick.

Conclusion

I believe that namespaced De Bruijn indices are most appropriate for languages that are (A) strongly normalizing (like Dhall) and (B) interpreted, because such languages tend to support pretty-printing β-reduced functions.

I think this trick is also useful to a lesser extent for all interpreted languages, if only because the implementation is (in my opinion) simpler and more elegant than other algorithms for capture-avoiding substitution (See the Appendix below).

On the other hand, compiled languages will likely not benefit much from this trick since they typically have no need to preserve the original variable names and they also will use an intermediate representation that is very different from the concrete syntax tree.

Appendix - Implementation

This section provides Haskell code specifying how to α-reduce and β-reduce a syntax tree that uses namespaced De Bruijn indices.

This reference implementation is not the most efficient implementation, but it’s the simplest one which I use for pedagogical purposes. If you’re interested in efficiency then check out my Grace project, which mixes this trick with the more efficient normalization-by-evaluation algorithm.

I also don’t include code for the parser or pretty-printer, because the only interesting part is the syntactic sugar for handling variables with a De Bruijn index of 0. Again, check out Grace if you want to refer to a more complete implementation.

-- | Syntax tree
data Syntax
    = Variable String Int
    | Lambda String Syntax
    | Apply Syntax Syntax
    deriving (Eq, Show)

{-| Increase the index of all bound variables matching the given variable name

    This is modified from the Shifting definition in Pierce's \"Types and
    Programming Languages\" by adding an additional argument for the namespace
    to shift
-}
shift
    :: Int
    -- ^ The amount to shift by
    -> String
    -- ^ The variable name to match (a.k.a. the namespace)
    -> Int
    -- ^ The minimum bound for which indices to shift
    -> Syntax
    -- ^ The expression to shift
    -> Syntax
shift offset namespace minIndex syntax =
    case syntax of
        Variable name index -> Variable name index'
          where
            index'
                | name == namespace && minIndex <= index = index + offset
                | otherwise                              = index

        Lambda name body -> Lambda name body'
          where
            minIndex'
                | name == namespace = minIndex + 1
                | otherwise         = minIndex

            body' = shift offset namespace minIndex' body

        Apply function argument -> Apply function' argument'
          where
            function' = shift offset namespace minIndex function

            argument' = shift offset namespace minIndex argument

{-| Substitute the given variable name and index with an expression

    This is modified from the Substitution definition in Pierce's \"Types and
    Programming Languages\" by adding an additional argument for the variable
    index
-}
substitute
    :: Syntax
    -- ^ The expression to substitute into
    -> String
    -- ^ The name of the variable to replace
    -> Int
    -- ^ The index of the variable to replace
    -> Syntax
    -- ^ The expression to substitute in place of the given variable
    -> Syntax
substitute expression name index replacement =
    case expression of
        Variable name' index'
            | name == name' && index == index' -> replacement
            | otherwise                        -> Variable name' index'

        Lambda name' body -> Lambda name' body'
          where
            index'
                | name == name' = index + 1
                | otherwise     = index

            shiftedBody = shift 1 name' 0 replacement

            body' = substitute body name index' shiftedBody

        Apply function argument -> Apply function' argument'
          where
            function' = substitute function name index replacement

            argument' = substitute argument name index replacement

-- | β-reduce an expression
betaReduce :: Syntax -> Syntax
betaReduce syntax =
    case syntax of
        Variable name index -> Variable name index

        Lambda name body -> Lambda name body'
          where
            body' = betaReduce body

        Apply function argument ->
            case function' of
                Lambda name body -> body'
                  where
                    shiftedArgument = shift 1 name 0 argument

                    substitutedBody = substitute body name 0 shiftedArgument

                    unshiftedBody = shift (-1) name 0 substitutedBody

                    body' = betaReduce unshiftedBody

                _ -> Apply function' argument'
          where
            function' = betaReduce function

            argument' = betaReduce argument

-- | α-reduce an expression
alphaReduce :: Syntax -> Syntax
alphaReduce syntax =
    case syntax of
        Variable name index -> Variable name index

        Lambda name body -> Lambda "_" body'
          where
            shiftedBody = shift 1 "_" 0 body

            substitutedBody = substitute shiftedBody name 0 (Variable "_" 0)

            unshiftedBody = shift (-1) name 0 substitutedBody

            body' = alphaReduce unshiftedBody

        Apply function argument -> Apply function' argument'
          where
            function' = alphaReduce function

            argument' = alphaReduce argument

-- | Returns `True` if the two input expressions are α-equivalent
alphaEquivalent :: Syntax -> Syntax -> Bool
alphaEquivalent left right = alphaReduce left == alphaReduce right

Appendix - History

I actually first introduced this feature in Morte, not Dhall. The idea originated from the discussion on this issue.