Friday, January 17, 2020

Why Dhall advertises the absence of Turing-completeness

total2

Several people have asked why I make a big deal out of the Dhall configuration language being “total” (i.e. not Turing-complete) and this post will summarize the two main reasons:

  1. If Dhall is total, that implies that the language got several other things correct

  2. “Not Turing-complete” is a signaling mechanism that appeals to Dhall’s target audience

“Because of the Implication”

The absence of Turing completeness per se does not provide many safety guarantees. Many people have correctly noted that you can craft compact Dhall functions that can take longer than the age of the universe to evaluate. I even provide a convenient implementation of the Ackermann function in Dhall to make it as easy as possible for people to foil the interpreter:

However, a total language like Dhall needs to get several other things correct in order to be able to guarantee that the language is not Turing complete. There are multiple ways you can eliminate Turing-completeness from a language, but nearly all of them improve the language in some way.

For example, the way Dhall eliminates Turing-completeness is:

  • Eliminating general recursion

    … which protects against common mistakes that introduce infinite loops

  • Having a strong type system

    … especially one with no escape hatches for reintroducing general recursion

  • Forbidding arbitrary side effects

    … which can also be another way to backdoor general recursion into a language

These three features are widely viewed as good things in their own right by people who care about language security, regardless of whether they are employed in service of eliminating Turing-completeness.

In other words, Turing-completeness functions as a convenient “umbrella” or “shorthand” for other safety features that LangSec advocates promote.

Shibboleth

According to Wikipedia a shibboleth is:

… a custom or tradition, usually a choice of phrasing or even a single word that distinguishes one group of people from another. Shibboleths have been used throughout history in many societies as passwords, simple ways of self-identification, signaling loyalty and affinity, maintaining traditional segregation, or protecting from real or perceived threats.

The phrase “not Turing-complete” is one such shibboleth. People who oppose the use of general-purpose programming languages for configuration files use this phrase as a signaling mechanism. This choice of words communicates to like-minded people that they share the same values as the Dhall community and agree on the right balance between configuration files being data vs. being programs.

If you follow online arguments about programmable configuration files, the discussion almost invariably follows this pattern:

  • Person A: “Configuration files should be inert so that they are easier to understand and manipulate”
  • Person B: “Software enginering practices like types and DRY can prevent config-induced production outages. Configs should be written in a general-purpose programming language.”
  • Person A: “But configuration files should not be Turing-complete!”

Usually, what “Person A” actually meant to say was something like:

  • configuration files should not permit arbitrary side effects
  • configuration files should not enable excessive indirection or obfuscation
  • configuration files should not crash or throw exceptions

… and none of those desires necessarily imply the absence of Turing-completeness!

However, for historical reasons all of the “Person A”s of the world rallied behind the absence of Turing-completeness as their banner. When I advertise that Dhall is not Turing-complete I’m signaling to them that they “belong here” with the rest of the Dhall community.

Conclusion

In my view, those two points make the strongest case for not being Turing complete. However, if you think I missed an important point just let me know.

Sunday, January 5, 2020

Dhall - Year in review (2019-2020)

dhall-2020

The Dhall configuration language is now three years old and this post reviews progress in 2019 and the future direction of the language in 2020.

If you’re not familiar with Dhall, you might want to visit the official website for the language. This post assumes familiarity and interest in the language.

I would like to use this post to advertise a short survey you can take if you would like to provide feedback that informs the direction of the language:

Language bindings

Last year’s survey indicated that many respondents were keen on additional language bindings, which this section covers.

This year there is a new officially supported language binding! 🎉

  • dhall-rust - Rust bindings to Dhall by Nadrieril

    I’m excited about this binding, both because Rust is an awesome language and also because I believe this paves the way for C/C++ bindings (and transitively any language that can interop with C)

There is also one new language binding close to completion:

  • dhall-golang - Go bindings to Dhall by Philip Potter

    This binding is not yet official, but I’m mentioning here in case interested parties might want to contribute. If you are interested in contributing then this thread is a good starting point.

    This is a binding that I believe would improve the user experience for one of Dhall’s largest audiences (Ops / CI / CD), since many tools in this domain (such as Kubernetes) are written in Go.

If there is a language binding that you would most like to see the survey includes a question to let you advertise your wish list for language bindings.

As I mentioned last year, I have no plans to implement any new language bindings myself. However, there are always things I can do to improve the likelihood of new language bindings popping up:

  • Error messages

    I’ve noticed that a major barrier for a new implementation is adding quality error messages.

    One solution to this problem may be taking the error messages for the Haskell implementation and upstreaming them into shared templates that all implementations can reuse (and improve upon)

  • Reference implementation

    One idea I’ve floated a few times recently is having a simplified reference implementation in some programming language instead of using natural deduction as the notation for specifying language semantics. This might help ease the life of people who are not as familiar with programming language theory and its notation.

    For example, the current Haskell implementation is not suitable as a reference implementation because it operates under a lot of constraints that are not relevant to the standard (such as customization, formatting, and performance).

  • Simplify the standard

    This year we removed several stale features (such as old-style Optional literals and old-style union literals) in the interest of decreasing the cost for language binding maintainers.

    There is also one feature that in my eyes is “on the chopping block”, which is the language’s using keyword for custom headers. This feature is one of the more complex ones to implement correctly and doesn’t appear to be carrying its own weight. There also may be preferable alternatives to this feature that don’t require language support (such as .netrc files).

Integrations

There are other integrations that are not language bindings, which this section covers.

PureScript package sets

Dhall is now the officially supported way of specifying PureScript package sets:

… and there is a new PureScript build tool named spago that provides the command-line interface to using these package sets:

There are several contributors to both of these repositories, so I can’t acknowledge them all, but I would like to give special mention to Justin Woo and Fabrizio Ferrai for bootstrapping these projects.

This integration is the largest case I’m aware of where Dhall is not being used for its own sake but rather as a required configuration format for another tool.

YAML/JSON

Last year we added support for converting Dhall to JSON/YAML and this year antislava and Robbie McMichael also added support for converting JSON/YAML to Dhall. Specifically, there are two new json-to-dhall and yaml-to-dhall executables that you can use.

This addressed a common point of feedback from users that migrating existing YAML configuration files to Dhall was tedious and error-prone. Now the process can be automated.

This year we also added Prelude support for JSON and YAML. Specifically:

  • There is a new Prelude.JSON.Type that can model arbitrary schema-free JSON or YAML

  • There is a new Prelude.JSON.render utility that can render expressions of the above type as JSON or YAML Text that is guaranteed to be well-formed

Here is an example of how it works:

In other words, there is a now a “pure Dhall” implementation of JSON/YAML support, although it is not as featureful as the dhall-to-{json,yaml} executables.

Special thanks to Philipp Krüger for contributing Prelude.JSON.renderYAML!

On top of that, {yaml,json}-to-dhall and dhall-to-{yaml,json} both natively support the schema-free JSON type from the Prelude, which means that you can now incrementally migrate YAML/JSON configuration files. You can learn more about this from the following chapter in the Dhall Configuration Language Manual:

XML

Thanks to Stephen Weber Dhall now supports XML:

The above package provides dhall-to-xml and xml-to-dhall utilities for converting between Dhall and XML. This package also provides a Ruby API to this functionality as well.

This fills one of the big omissions in supported configuration formats that we had last year, so I’m very thankful for this contribution.

Rails

The same Stephen Weber also contributed Rails support for Dhall:

… so that you can use Dhall as the configuration file format for a Rails app instead of YAML.

C package management

Vanessa McHale built a C package manager named with an emphasis on cross compilation:

The current package set already supports a surprisingly large number of C packages!

I also find this project fascinating because I’ve seen a few people discuss what Nixpkgs (the Nix package repository) might look like if it were redone from the ground up in terms of Dhall. cpkg most closely resembles how I imagined it would be organized.

Language improvements

Last year some survey respondents were interested more in improvements to the language ergonomics rather than specific integrations, so this section covers new language enhancements.

Consuming packages

One thing we improved this year was the experience for people consuming Dhall packages.

Probably the biggest improvement was changing to “stable hashes”, where we stopped using the standard version as an input to semantic hashes. Users complained about each new version of the standard breaking their integrity checks, and now that is a thing of the past. This means that expressions authored for older versions of the language are now far more likely to work for newer language versions when protected by an integrity check.

Oliver Charles also contributed another large improvement by standardizing support for mixed records of types and terms. This means that package authors can now serve both types and terms from the same top-level package.dhall file instead of having to author separate types.dhall and terms.dhall files.

For example, the Prelude now serves both terms and types from a single package:

The above example also illustrates how field names no longer need to be escaped if they conflict with reserved names (like Type). This improves the ergonomics of using the Prelude which had several field names that conflicted with built-in language types and previously had to be escaped with backticks.

Authoring packages

We also improved the experience for users authoring new packages. Dhall now has language support for tests so that package authors do not need to implement testing infrastructure out of band.

You can find several examples of this in the Prelude, such as the tests for the Prelude.Natural.greaterThan utility:

The above example shows how you can not only write unit tests but in limited cases you can also write property tests. For example, the above property0 test verifies that greaterThan n n is False for all possible values of n.

Dependent types

Language support for tests is a subset of a larger change: dependent types. Dhall is now a technically dependently-typed language, meaning that you can take advantage of some basic features of dependent types, such as:

  • Type-level assertions (i.e. the tests we just covered)
  • Type-level literals (such as Natural and Text)

… but you cannot do more sophisticated things like length-indexed Lists.

toMap keyword

This year Mario Blažević added a new toMap keyword for converting Dhall records to homogeneous lists of key-value pairs (a.k.a. Maps):

Dhall users frequently requested this feature for supporting JSON/YAML-based formats. These formats commonly use dictionaries with a variable set of fields, but this led to an impedance mismatch when interoperating with a typed language like Dhall because Dhall records are not homogeneous maps and the type of a Dhall record changes when you add or remove fields.

Normally the idiomatic way to model a homogeneous Map in Dhall would be a List of key-value pairs (since you can add or remove key-value pairs without changing the type of a List), but that’s less ergonomic than using a record. The toMap keyword gives users the best of both worlds: they can use Dhall’s record notation to ergonomically author values that they can convert to homogeneous Maps using toMap.

The :: operator for record completion

Several users complained about the language’s support for records with defaultable fields, so we added a new operator to make this more ergonomic.

This example illustrates how the operator works:

In other words, given a “schema” record (such as Person) containing a record type and a record of default values, you can use that schema to instantiate a record, defaulting all fields that are not specified.

The operator is “syntactic sugar”. When you write:

… that “desugars” to:

Also, dhall format will recognize this operator and format the operator compactly for large nested records authored using this operator.

The easiest way to motivate this change is to compare the dhall-kubernetes simple deployment example before and after using this operator. Before, using the // operator and old formatting rules, the example looked like this:

Now the most recent iteration of the example looks like this:

New built-ins

One change with a high power-to-weight ratio was adding new built-ins. Without listing all of them, the key changes were:

  • Integers are no longer opaque. You can convert back and forth between Integer and Natural and therefore implement arbitrary arithmetic on Integers.

  • Some new built-ins enabled new efficient Prelude utilities that would have been prohibitively slow otherwise

  • Some things that used to require external command-line tools can now be implemented entirely within the language (such as modeling and rendering JSON/YAML in “pure Dhall” as mentioned above)

Note that Text is still opaque, although I predict that is the most likely thing that will change over the next year if we continue to add new built-ins.

Enums

Enums are now much more ergonomic in Dhall, as the following example illustrates:

More generally, union alternatives can now be empty, like this:

… and enums are the special case where all alternatives are empty.

Before this change users would have to use an alternative type of {}, like this:

… which made things more verbose both for authors and consumers of Dhall packages.

Tooling improvements

This section covers improvements to the the tooling in order to provide a more complete development experience.

Language server

Last year I stated that one of our goals was to create a Dhall language server for broader better editor support and I’m happy to announce that we accomplished that goal!

Credit goes to both PanAeon (who authored the initial implementation) and Folkmar Ramcke (who greatly expanded upon the initial implementation as part of a Google Summer of Code project).

You can read the final report at the end of Folkmar’s work here:

… and this GIF gives a sample of what the language server can do:

The language server was tested to work with VSCode but in principle should work with any editor that supports the language server protocol with a small amount of work. I’ve personally tested that the language server works fine with Vim/Neovim.

If you have any issues getting the language server working with your editor of choice just let us know as we plan to polish and document the setup process for a wide variety of editors.

Also, we continue to add new features to language server based on user feedback. If you have cool ideas for how to make the editor experience more amazing please share them with us!

Pre-built executables for all platforms

Several users contributed continuous delivery support so that we could automatically generate pre-built executables for the shared command-line tools, including the dhall command and dhall-lsp-server (the language server).

That means that for each new release you can download pre-built executables for:

  • Windows
  • OS X
  • Linux

… from this page:

Docker support

You can also obtain docker containers for many of the command-line tools for ease of integration with your company’s container-based infrastructure:

Performance

The Haskell implementation (which powers the dhall tool and the language server) has undergone some dramatic performance improvements over the last year.

Most of these performance improvements were in response to the following two pressures on the language:

  • The language server requires a snappy feedback loop for productive editing

  • People are commonly using Dhall on very large program configurations (like dhall-kubernetes)

There is still room for improvement, but it is markedly better for all Dhall configuration files and orders of magnitude faster in many cases compared to a year ago.

Formatting improvements

The standard formatter is probably one of the things I get the most feedback about (mostly criticism 🙂), so I’ve spent some time this year on improving the formatter in the following ways:

  • let-related comments are now preserved

    … and I plan to expand support for preserving more comments

  • Expressions are now much more compact than before

    … such as in the dhall-kubernetes sample code above

Additionally, I recently started a discussion about potentially switching to ASCII as the default for formatting code, which you can follow here:

The outcome of that discussion was to add several new survey questions to assess whether people prefer to read and write Dhall code using ASCII or Unicode. So if you have strong opinions about this then please take the survey!

Dhall packages

Another significant component of the Dhall ecosystem is packages written within the language.

Dhall differentiates itself from other programmable file formats (e.g. Jsonnet) by having hundreds of open source packages built around the language that support for a variety of tools and formats (especially in the Ops / CI / CD domain).

In fact, Dhall’s open source footprint large enough this year that GitHub now recognizes Dhall as a supported file format. This means that files with a .dhall extension now enjoy syntax highlighting on GitHub and show up in language statistics for projects.

Here are some example Dhall bindings to various formats added last year:

I’m highly grateful for every person who improves the ecosystem. In fact, I randomly stalk Dhall packages on GitHub to inform language design by seeing how people use Dhall “in the wild”.

Shared infrastructure

We made two main improvements to shared infrastructure for the Dhall community this year:

Documentation

The Dhall wiki has been moved to docs.dhall-lang.org thanks to work by Tristan de Cacqueray. This means that:

  • The documentation is now generated using Sphinx

  • The documentation is now much easier to contribute to as it is under version control here:

    dhall-lang/docs

So if you would like to improve the documentation you can now open a pull request to do so!

Discourse

We also have a new Discourse forum hosted at discourse.dhall-lang.org that you can use to discuss anything Dhall-related.

We’ve been using the forum so far for announcing projects / releases and also as a sounding board for ideas.

Funding

Last year I solicited ideas for funding improvements to the Dhall ecosystem and this year we followed through on three different funding mechanisms:

Google Summer of Code

The most successful funding source by far was Google’s Summer of Code grant that funded Folkmar Ramcke to develop the language server. I plan to try this again for the upcoming summer and I will also recommend that other Dhall projects and language bindings try this out, too. Besides providing a generous source of funding (thank you, Google 🙇‍♂️) this program is an excellent opportunity to bring in new contributors to the ecosystem.

Open Collective

Another thing we set up this year is an Open Collective for Dhall so that we can accept donations from companies and individuals. We started this only a few months ago and thanks to people’s generosity we’ve accumulated over $500 in donations.

I would like to give special thanks to our first backer:

… and our largest backer:

We plan to use these donations to fund projects that (A) benefit the entire Dhall community and (B) bring in new contributors, so your donations help promote a vibrant and growing developer community around the language.

Our first such project was to implement “pure Dhall” support for rendering YAML:

… and we have another proposal in progress to fund documenting the setup process for the language server for various editors:

If you would like to donate, you can do so here:

Book

I’m also working on the following book:

My plan is to make the book freely available using LeanPub but to give users the option to pay for the book, with all proceeds going to the above Open Collective for Dhall.

One of the strongest pieces of survey feedback we got was that users were willing to pay for Dhall-related merchandise (especially books) and they were highly eager for documentation regarding best practices for the language. This book intends to address both of those points of feedback.

I expect that at the current rate of progress the book will likely be done by the end of this year, but you can already begin reading the chapters that I’ve completed so far:

Future directions

Marketing

One of the things that’s slowly changing about the language is how we market ourselves. People following the language know that we’ve recently revamped the website:

… and changed our “slogan” to “Maintainable configuration files”.

One difference from last year is that we’re no longer trying to replace all uses for YAML. User feedback indicated that some uses of YAML were better served by TOML rather than Dhall. Specifically, small (~10 line) configuration files for simple command-line tools were cases where TOML was a better default choice than Dhall.

On the other hand, we find that users that deal with very large and fragmented program configurations tend to prefer Dhall due the language’s support for features that promote maintainability and reduce total cost of ownership.

I continue to prioritize Ops / CI / CD use cases, but I no longer try to displace YAML for use cases where TOML would be a more appropriate choice.

Completing the Dhall manual

One of my personal goals is to complete the Dhall manual to help people confidently recommend Dhall to others by providing high-quality material to help onboard their coworkers. I expect this will help accelerate language adoption quite a bit.

Polish the language server

The language server is another area of development that I see as highly promising. Although we currently provide common features like type-on-hover and intelligent auto-completion I still think there is a lot of untapped potential here to really “wow” new users by showcasing Dhall’s strengths.

People currently have really low expectations for programmable file formats, so I view the quality of the language server implementation as being a way that Dhall can rapidly differentiate itself from competing programmable file formats. In particular, Dhall is one of the few typed configuration formats and quality editor support is one of the easiest ways to convey the importance of using a typed language.

Packaging for various distributions

One thing that the Dhall ecosystem would benefit from is packaging, not just for Linux distributions but other platforms as well. We made progress this year by adding support for Brew and Docker, but there are still important omissions for other platforms, such as:

  • Windows (e.g. Nuget)
  • Linux (e.g. Debian/Fedora/Arch)

This one of the areas where I have the greatest difficulty making progress because each package repository tends to have a pretty high barrier to entry in my experience. If you are somebody who has experience with any of the above package repositories and could help me get started I would appreciate it!

Package discovery

I mentioned earlier that Dhall is growing quite a large open source package footprint, but these packages are not easy to discover.

One effort to address this is:

… which is working to create a “mono-repo” of Dhall packages to promote discoverability.

In addition to that, the language probably also needs a standard documentation generator. There have been a few nascent efforts along these lines this year, but at some point we need to take this idea “all the way”.

Library of Kubernetes utilities

Last year I mentioned that I would spend some time on a new Ops-related Dhall integration and I quickly gravitated towards improving the existing dhall-kubernetes integration. After familarizing myself with Kubernetes I realized that this is a use case that is served well by Dhall since Kubernetes configurations are highly unmaintainable.

Programmable Kubernetes configuration files are a bit of a crowded field (a cottage industry, really), with a steady stream of new entrants (like Pulumi, Cue, and Tanka). That said, I’m fairly confident that with some attention Dhall can become the best-in-class solution in this space.

Conclusion

I would like thank everybody who contributed last year, and I apologize if I forgot to acknowledge your contribution.

This post is not an exhaustive list of what happened over the last year. If you would like to learn more, the best places to start are:

Please don’t forget to take our yearly survey to provide feedback on the language or to inform the future direction:

In a month I will follow up with another post reviewing the survey feedback.