Monday, May 20, 2024

Prefer do notation over Applicative operators when assembling records

Prefer do notation over Applicative operators when assembling records

This is a short post explaining why you should prefer do notation when assembling a record, instead of using Applicative operators (i.e. (<$>)/(<*>)). This advice applies both for type constructors that implement Monad (e.g. IO) and also for type constructors that implement Applicative but not Monad (e.g. the Parser type constructor from the optparse-applicative package). The only difference is that in the latter case you would need to enable the ApplicativeDo language extension.

The guidance is pretty simple. Instead of doing this:

data Person = Person
    { firstName :: String
    , lastName :: String

getPerson :: IO Person
getPerson = Person <$> getLine <*> getLine

… you should do this:

{-# LANGUAGE RecordWildCards #-}

{-# OPTIONS_GHC -Werror=missing-fields #-}

data Person = Person
    { firstName :: String
    , lastName :: String

getPerson :: IO Person
getPerson = do
    firstName <- getLine
    lastName <- getLine
    return Person{..}

Why is the latter version better? There are a few reasons.


It’s more ergonomic to assemble a record using do notation because you’re less pressured to try to cram all the logic into a single expression.

For example, suppose we wanted to explicitly prompt the user to enter their first and last name. The typical way people would do extend the former example using Applicative operators would be something like this:

getPerson :: IO Person
getPerson =
    <$> (putStrLn "Enter your first name:" *> getLine)
    <*> (putStrLn "Enter your last name:"  *> getLine)

The expression gets so large that you end up having to split it over multiple lines, but if we’re already splitting it over multiple lines then why not use do notation?

getPerson :: IO Person
getPerson = do
    putStrLn "Enter your first name:"
    firstName <- getLine

    putStrLn "Enter your last name:"
    lastName <- getLine

    return Person{..}

Wow, much clearer! Also, the version using do notation doesn’t require that the reader is familiar with all of the Applicative operators, so it’s more approachable to Haskell beginners.

Order insensitivity

Suppose we take that last example and then change the Person type to reorder the two fields:

data Person = Person
    { lastName :: String
    , firstName :: String

… then the former version using Applicative operators would silently break: the first name and last name would now be read in the wrong order. The latter version (using do notation) is unaffected.

More generally, the approach using do notation never breaks or changes its behavior if you reorder the fields in the datatype definition. It’s completely order-insensitive.

Better error messages

If you add a new argument to the Person constructor, like this:

data Person = Person
    { alive :: Bool
    , firstName :: String
    , lastName :: String

… and you don’t make any other changes to the code then the former version will produce two error messages, neither of which is great:

    • Couldn't match type ‘String -> Person’ with ‘Person’
      Expected: Bool -> String -> Person
        Actual: Bool -> String -> String -> Person
    • Probable cause: ‘Person’ is applied to too few arguments
      In the first argument of ‘(<$>)’, namely ‘Person’
      In the first argument of ‘(<*>)’, namely ‘Person <$> getLine’
      In the expression: Person <$> getLine <*> getLine
  | getPerson = Person <$> getLine <*> getLine
  |             ^^^^^^

    • Couldn't match type ‘[Char]’ with ‘Bool’
      Expected: IO Bool
        Actual: IO String
    • In the second argument of ‘(<$>)’, namely ‘getLine’
      In the first argument of ‘(<*>)’, namely ‘Person <$> getLine’
      In the expression: Person <$> getLine <*> getLine
  | getPerson = Person <$> getLine <*> getLine
  |                        ^^^^^^^

… whereas the latter version produces a much more direct error message:

    • Fields of ‘Person’ not initialised:
        alive :: Bool
    • In the first argument of ‘return’, namely ‘Person {..}’
      In a stmt of a 'do' block: return Person {..}
      In the expression:
        do putStrLn "Enter your first name: "
           firstName <- getLine
           putStrLn "Enter your last name: "
           lastName <- getLine
   |     return Person{..}
   |            ^^^^^^^^^^

… and that error message more clearly suggests to the developer what needs to be fixed: the alive field needs to be initialized. The developer doesn’t have to understand or reason about curried function types to fix things.


This advice obviously only applies for datatypes that are defined using record syntax. The approach I’m advocating here doesn’t work at all for datatypes with positional arguments (or arbitrary functions).

However, this advice does still apply for type constructors that are Applicatives and not Monads; you just need to enable the ApplicativeDo language extension. For example, this means that you can use this same trick for defining command-line Parsers from the optparse-applicative package:

{-# LANGUAGE ApplicativeDo #-}
{-# LANGUAGE RecordWildCards #-}

{-# OPTIONS_GHC -Werror=missing-fields #-}

import Options.Applicative (Parser, ParserInfo)

import qualified Options.Applicative as Options

data Person = Person
    { firstName :: String
    , lastName :: String
    } deriving (Show)

parsePerson :: Parser Person
parsePerson = do
    firstName <- Options.strOption
        (   Options.long "first-name"
        <> "Your first name"
        <>  Options.metavar "NAME"

    lastName <- Options.strOption
        (   Options.long "last-name"
        <> "Your last name"
        <>  Options.metavar "NAME"

    return Person{..}

parserInfo :: ParserInfo Person
parserInfo = parsePerson
        (Options.progDesc "Parse and display a person's first and last name")

main :: IO ()
main = do
    person <- Options.execParser parserInfo

    print person

Wednesday, May 8, 2024

All error messages are necessarily bad to some degree

All error messages are necessarily bad to some degree

This is something I feel like enough people don’t appreciate. One of the ways I like to explain this is by this old tweet of mine:

The evolution of an error message:

  • No error message
  • A one-line message
  • “Expected: … / Actual: …”
  • “Here’s what went wrong: …”
  • “Here’s what you should do: …”
  • I automated away what you should do
  • The invalid state is no longer representable

One of the common gripes I will hear about error messages is that they don’t tell the user what to do, but if you stop to think about it: if the error message knew exactly what you were supposed to do instead then your tool could just fix it for you (by automatically doing the right thing instead).

But wait!”, you might say, “sometimes an error message can’t automatically fix the problem for you because there’s not necessarily a right or obvious way to fix the problem or the user’s intent is not clear.” Yes, exactly, which brings us back to the original point:

Error messages are necessarily bad because they cannot anticipate what you should have done instead. If an error message could read your mind then they’d eventually evolve into something better than an error message. This creates a selection bias where the only remaining error messages are the ones that can’t read your mind.

Thursday, February 29, 2024

The siren song of domain-specific languages

The siren song of domain-specific languages

I’ve seen a lot of engineering teams mistakenly believe that they can author domain-specific languages for less technical users on a budget. In particular they seem to believe that if they create this domain-specific language then the less technical users will be able to thoughtlessly churn out a bunch of code in that language and there won’t be any problem and they can then move onto the next project. This rarely works out in the way that people hope it will.

In the best case scenario, your less technical users will churn out a large amount of code using your domain-specific language (which is exactly the outcome you hoped for!) and that corpus of code will push the boundaries of what your language is capable of (like performance, compilation speed, features, or supporting integrations). The larger your userbase the greater the demand will be to improve your language in a myriad of ways.

In the worst case scenario your users will find increasingly inane ways to do things wrong with your language despite your best efforts and you will be expected to clean up their mess because you sold the project on the premise of “our users are not going to have to think”.

… and in either case this process will never end; the project will never be in a “done state” and require permanent staffing. Hell, even if you staff an entire team to support this language it’s still often a struggle to keep up with the needs of less technical users.

This tradeoff can still tempt businesses because it’s appealing to replace skilled labor with unskilled labor. The reasoning goes that a small investment of more skilled labor (the authors of the domain-specific language) can enable a larger pool of less skilled labor (the less technical users) to do most of the work. However, what you will often find in practice is that this larger group of less technical users is frequently blocked without continuous assistance from the engineers who created the language.

So in practice you’re not actually replacing skilled labor with unskilled labor. Rather, you’re merely “laundering” skilled labor as unskilled labor and creating more work for your engineers to make them seem more replaceable than they actually are.

I do think there are situations where domain-specific languages make sense, but typically not on the scale of a software engineering organization or even a small product. I personally think this sort of division of labor tends to only work on the scale of an open source ecosystem where you get a large enough economy of scale.

Thursday, February 22, 2024

Unification-free ("keyword") type checking

Unification-free ("keyword") type checking

From my perspective, one of the biggest open problems in implementing programming languages is how to add a type system to the language without significantly complicating the implementation.

For example, in my tutorial Fall-from-Grace implementation the type checker logic accounts for over half of the code. In the following lines of code report I’ve highlighted the modules responsible for type-checking with a :

$ cloc --by-file src/Grace/*.hs       

File                                    blank        comment           code
src/Grace/Infer.hs        ‡               499            334           1696
src/Grace/Type.hs         ‡                96             91            633
src/Grace/Syntax.hs                        61            163            543
src/Grace/Parser.hs                       166             15            477
src/Grace/Lexer.hs                         69             25            412
src/Grace/Normalize.hs                     47             48            409
src/Grace/Context.hs      ‡                72            165            249
src/Grace/Import.hs                        38              5            161
src/Grace/REPL.hs                          56              4            148
src/Grace/Interpret.hs                     30             28            114
src/Grace/Pretty.hs                        25             25            108
src/Grace/Monotype.hs     ‡                11             48             61
src/Grace/Location.hs                      16             15             60
src/Grace/TH.hs                            23             32             53
src/Grace/Value.hs                         12             53             53
src/Grace/Input.hs                         10              8             43
src/Grace/Compat.hs                         9              2             32
src/Grace/Existential.hs  ‡                12             23             25
src/Grace/Domain.hs       ‡                 4              7             20
SUM:                                     1256           1091           5297

That’s 2684 lines of code (≈51%) just for type-checking (and believe me: I tried very hard to simplify the type-checking code).

This is the reason why programming language implementers will be pretty keen to just not implement a type-checker for their language, and that’s how we end up with a proliferation of untyped programming languages (e.g. Godot or Nix), or ones that end up with a type system bolted on long after the fact (e.g. TypeScript or Python). You can see why someone would be pretty tempted to skip implementing a type system for their language (especially given that it’s an optional language feature) if it’s going to balloon the size of their codebase.

So I’m extremely keen on implementing a “lean” type checker that has a high power-to-weight ratio. I also believe that a compact type checker is an important foundational step for functional programming to “go viral” and displace imperative programming. This post outlines one approach to this problem that I’ve been experimenting with1.


The thing that bloats the size of most type-checking implementations is the need to track unification variables. These variables are placeholders for storing as-yet-unknown information about something’s type.

For example, when a functional programming language infers the type of something like this Grace expression:

(λx → x) true

… the way it typically works is that it will infer the type of the function (λx → x) which will be:

λx → x : α → α

… where α is a unification variable (an unsolved type). So you can read the above type annotation as saying “the type of λx → x is a function from some unknown input type (α) to the same output type (α).

Then the type checker will infer the type of the function’s input argument (true) which will be:

true : Bool

… and finally the type checker will combine those two pieces of information and reason about the final type like this:

  • the input to the function (true) is a Bool
  • therefore the function’s input type (α) must also be Bool
  • therefore the function’s output type (α) must also be Bool
  • therefore the entire expression’s type is Bool

… which gives the following conclusion of type inference:

(λx → x) true : Bool

However, managing unification variables like α is a lot trickier than it sounds. There are multiple unification algorithms/frameworks in the wild but the problem with all of them is that you have to essentially implement a bespoke logic programming language (with all of the complexity that entails). Like, geez, I’m already implementing a programming language and I don’t want to have to implement a logic programming language on top of that just to power my type-checker.

So there are a couple of ways I’ve been brainstorming how to address this problem and one idea I had was: what if we could get rid of unification variables altogether?

Deleting unification

Alright, so this is the part of the post that requires some familiarity/experience with implementing a type-checker. If you’re somebody new to programming language theory then you can still keep reading but this is where I have to assume some prior knowledge otherwise this post will get way too long.

The basic idea is that you start from the “Complete and Easy” bidirectional type checking algorithm which is a type checking algorithm that does use unification variables2 but is simpler than most type checking algorithms. The type checking rules look like this (you can just gloss over them):

Now, delete all the rules involving unification variables. Yes, all of them. That means that all of the type-checking judgments from Figures 9 and 10 are gone and also quite a few rules from Figure 11 disappear, too.

Surprisingly, you can still type check a lot of code with what’s left, but you lose two important type inference features if you do this:

  • you can no longer infer the types of lambda arguments

  • you can no longer automatically instantiate polymorphic code

… and I’ll dig into those two issues in more detail.

Inferring lambda argument types

You lose the ability to infer the type of a function like this one when you drop support for unification variables:

λx → x == False

Normally, a type checker that supports unification can infer that the above function has type Bool → Bool, but (in general) a type checker can no longer infer that when you drop unification variables from the implementation.

This loss is not too bad (in fact, it’s a pretty common trade-off proposed in the bidirectional type checking literature) because you can make up for it in a few ways (all of which are easy and efficient to implement in a type checker):

  • You can allow the input type to be inferred if the lambda is given an explicit type annotation, like this:

    λx → x == False : BoolBool

    More generally, you can allow the input type to be inferred if the lambda is checked against an expected type (and a type annotation is one case, but not the only case, where a lambda is checked against an expected type).

    We’re going to lean on this pretty heavily because it’s pretty reasonable to ask users to provide type annotations for function definitions and also because there are many situations where we can infer the expected type of a lambda expression from its immediate context.

  • You can allow the user to explicitly supply the type of the argument

    … like this:

    λ(x : Bool) → x == False

    This is how Dhall works, although it’s not as ergonomic.

  • You can allow the input type to be inferred if the lambda is applied to an argument

    This is not that interesting, but I’m mentioning it for completeness. The reason it’s not interesting is because you won’t often see expressions of the form (λx → e) y in the wild, because they can more idiomatically be rewritten as let x = y in e.

Instantiating polymorphic code

The bigger issue with dropping support for unification variables is: all user-defined polymorphic functions now require explicit type abstraction and explicit type application, which is a major regression in the type system’s user experience.

For example, in a language with unification variables you can write the polymorphic identity function as:

λx → x

… and use it like this3:

let id = λx → x
in  (id true, id 1)

… but when you drop support for unification variables then you have to do something like this:

let id = λ(a : Type) → λ(x : a) → x
in  (id Bool true, id Natural 1)

Most programmers do NOT want to program in a language where they have to explicitly manipulate type variables in this way. In particular, they really hate explicit type application. For example, nobody wants to write:

map { x : Bool, … large record … } Bool (λr → r.x) rs

So we need to figure out some way to work around this limitation.

The trick

However, there is a solution that I believe gives a high power-to-weight ratio, which I will refer to as “keyword” type checking:

  • add a bunch of built-in functions

    Specifically, add enough built-in functions to cover most use cases where users would need a polymorphic function.

  • add special type-checking rules for those built-in functions when they’re fully saturated with all of their arguments

    These special-cased type-checking rules would not require unification variables.

  • still require explicit type abstraction when these built-in functions are not fully saturated

    Alternatively, you can require that built-in polymorphic functions are fully saturated with their arguments and make it a parsing error if they’re not.

  • still require explicit type abstraction and explicit type application for all user-defined (i.e. non-builtin) polymorphic functions

  • optionally, turn these built-in functions into keywords or language constructs

I’ll give a concrete example: the map function for lists. In many functional programming languages this map function is not a built-in function; rather it’s defined within the host language as a function of the following type:

map : ∀(a b : Type) → (a → b) → List a → List b

What I’m proposing is that the map function would now become a built-in function within the language and you would now apply a special type-checking rule when the map function is fully saturated:

Γ ⊢ xs ⇒ List a   Γ ⊢ f ⇐ a → b
Γ ⊢ map f xs ⇐ List b

In other words, we’re essentially treating the map built-in function like a “keyword” in our language (when it’s fully saturated). Just like a keyword, it’s a built-in language feature that has special type-checking rules. Hell, you could even make it an actual keyword or language construct (e.g. a list comprehension) instead of a function call.

I would even argue that you should make each of these special-cased builtin-functions a keyword or a language construct instead of a function call (which is why I call this “keyword type checking” in the first place). When viewed through this lens the restrictions that these polymorphic built-in functions (A) are saturated with their arguments and (B) have a special type checking judgment are no different than the restrictions for ordinary keywords or language constructs (which also must be saturated with their arguments and also require special type checking judgments).

To make an analogy, in many functional programming languages the if/then/else construct has this same “keyword” status. You typically don’t implement it as a user-space function of this type:

ifThenElse : ∀(a : Type) → Bool → a → a → a

Rather, you define if as a language construct and you also add a special type-checking rule for if:

Γ ⊢ b ⇐ Bool   Γ ⊢ x ⇒ a   Γ ⊢ y ⇐ a
Γ ⊢ if b then x else y ⇒ a

… and what I’m proposing is essentially greatly exploding the number of “keywords” in the implementation of the language by turning a whole bunch of commonly-used polymorphic functions into built-in functions (or keywords, or language constructs) that are given special type-checking treatment.

For example, suppose the user were to create a polymorphic function like this one:

let twice = λ(a : Type) → λ(x : a) → [ x, x ]

in  twice (List Bool) (twice Bool true)

That’s not very ergonomic to define and use, but we also can’t reasonably expect our programming language to provide a twice built-in function. However, our language could provide a generally useful replicate builtin function (like Haskell’s replicate function):

replicate : ∀(a : Type) → Natural → a → List a

… with the following type-checking judgment:

Γ ⊢ n ⇐ Natural   Γ ⊢ x ⇒ a
Γ ⊢ replicate n x ⇒ List a

… and then you would tell the user to use replicate directly instead of defining their own twice function:

replicate 2 (replicate 2 true)

… and if the user were to ask you “How do I define a twice synonym for replicate 2” you would just tell them “Don’t do that. Use replicate 2 directly.”


This approach has the major upside that it’s much easier to implement a large number of keywords than it is to implement a unification algorithm, but there are other benefits to doing this, too!

  • It discourages complexity and fragmentation in user-space code

    Built-in polymorphic functions have an ergonomic advantage over user-defined polymorphic functions because under this framework type inference works better for built-in functions. This creates an ergonomic incentive to stick to the “standard library” of built-in polymorphic functions, which in turn promotes an opinionated coding style across all code written in that language.

    You might notice that this approach is somewhat similar in spirit to how Go handles polymorphism which is to say: it doesn’t handle user-defined polymorphic code well. For example, Go provides a few built-in language features that support polymorphism (e.g. the map data structure and for loops) but if users ask for any sort of user-defined polymorphism then the maintainers tell them they’re wrong for wanting that. The main difference here is that (unlike Go) we do actually support user-defined polymorphism; it’s not forbidden, but it is less ergonomic than sticking to the built-in utilities that support polymorphism..

  • It improves error messages

    When you special-case the type-checking logic you can also special-case the error messages, too! With general-purpose unification the error message can often be a bit divorced from the user’s intent, but with “keyword type checking” the error message is not only more local to the problem but it can also suggest highly-specific tips or fixes appropriate for that built-in function (or keyword or language construct).

  • It can in some cases more closely match the expectations of imperative programmers

    What I mean is: most programmers coming from an imperative and typed background are used to languages where (most of the time) polymorphism is “supported” via built-in language constructs and keywords and user-defined polymorphism might be supported but considered “fancy”. Leaning on polymorphism via keywords and language constructs would actually make them more comfortable using polymorphism instead of trying to teach them how to produce and consume user-defined polymorphic functions.

    For example, in a lot of imperative languages the idiomatic solution for how to do anything with a list is “use a for loop” where you can think of a for loop as a built-in keyword that supports polymorphic code. The functional programming equivalent of “just use a for loop” would be something like “just use a list comprehension” (where a list comprehension is a “keyword” that supports polymorphic code that we can give special type checking treatment).

That said, this approach is still more brittle than unification and will require more type annotations in general. The goal here isn’t to completely recover the full power of unification but rather to get something that’s not too bad but significantly easier to implement.

I think this “keyword type checking” can potentially occupy a “low tech” point in the type checking design space for functional programming languages that need to have efficient and compact implementations (e.g. for ease of embedding). Also, this can potentially provide a stop-gap solution for novice language implementers that want some sort of a type system but they’re not willing to commit to implementing a unification-based type system.

There’s also variation on this idea which Verity Scheel has been exploring, which is to provide userland support for defining new functions with special type-checking rules and there’s a post from her outlining how to do that:

User Operators with Implicits & Overloads

  1. The other approach is to create essentially an “ABNF for type checkers” that would let you write type-checking judgments in a standard format that could generate the corresponding type-checking code in multiple languages. That’s still a work-in-progress, though.↩︎

  2. I believe some people might take issue with calling these unification variables because they consider bidirectional type checking as a distinct framework from unification. Moreover, in the original bidirectional type checking paper they’re called “unsolved” variables rather than unification variables. However, I feel that for the purpose of this post it’s still morally correct to refer to these unsolved variables as unification variables since their usage and complexity tradeoffs are essentially identical to unification variables in traditional unification algorithms.↩︎

  3. … assuming let expressions are generalized.↩︎

Wednesday, October 4, 2023

A GHC plugin for OpenTelemetry build metrics

A GHC plugin for OpenTelemetry build metrics

This post is about a new OpenTelemetry plugin for GHC that I’ve been building for work that we’re open sourcing because I think it might be broadly useful to others. If all you want to do is use the plugin then you can find it on Hackage, which includes more detailed usage instructions. This post will focus more on the motivation and background behind the plugin’s development.


The context behind this work was that we use Honeycomb at work for collecting metrics related to production and our team1 has begun to apply those same metrics to our builds. In particular, we wanted to collect detailed (module-level) build metrics so that we could begin to hunt down and fix expensive modules within our codebase. For context: our codebase currently has almost 7000 modules, so these expensive modules can easily fly under the radar.

When we enable the plugin and export the results to Honeycomb we can begin to see which modules are the most expensive to build:

Sample module build times

… and none of the modules are individually very expensive to build (the worst offender is only about 5 seconds), so they’d easily get lost within a sea of thousands of other modules.

However, these sorts of insights have already proven useful. For example:

  • one expensive modules was completely unused in our codebase

    The above list brought it to our attention so that we could delete it.

  • other expensive modules were representative examples of larger issues to fix

    For example, one expensive module consisted of 2000 invocations of an internal function which is expensive to type-check and fixing this function will improve compile speeds across our codebase and not just that module.

  • other expensive modules are indicative of architectural anti-patterns

    Frequently “horizontally-organized” modules top the chart, and I view them as anti-patterns for a few reasons (see: my post on Module organization guidelines). These modules are not expensive per se (the code inside them has to be compiled somewhere), but they tend to be build chokepoints because they have a large number of dependencies and reverse dependencies. Highlighting expensive modules has a tendency to highlight these sorts of build chokepoints as a side bonus.

In principle you can also browse a given build’s trace interactively, like this:

However, for our codebase Honeycomb chokes on our giant build traces and we can only produce visualizations like the above image if we filter down the spans to a randomly sampled subset of modules. Honeycomb doesn’t do a good job of handling traces with a few thousand spans or more.


This plugin was surprisingly difficult for me to implement because GHC’s Plugin interface is so constrained.

For example, the hs-opentelemetry-sdk package asks you to finalize any TracerProvider that you acquire, but there’s no good way (that I know of2) to run finalization logic at the end of a ghc build using the Plugin interface. The purpose of this finalization logic is to flush metrics that haven’t yet been exported.

So what I did was to hack around this by detecting all modules that are root modules of the build graph and flushing metrics after each of those root modules is built (since one of them will be the last module built). I tried a bunch of other alternative approaches (like installing a phase hook), but this was the only approach I was able to get to work.

And the OpenTelemetry plugin is full of workarounds like this. We have vetted internally that the plugin works for normal builds, ghcid and haskell-language-server, but generally I expect there to be some trailing bugs that we’ll have to fix as more people use it due to these various unsafe implementation details.

In fact, one limitation of the plugin is that the top-level span has a duration of 0 (instead of reporting the duration of the build). This is related to the same issue of the Plugin interface apparently not having a good way to run code exactly once after the build completes (even using hacks). If somebody knows of a way to do this that I missed I’d definitely welcome the tip!


What we do know from internal usage is that:

  • the plugin definitely scales to very large codebases (thousands of modules)

    … although honeycomb doesn’t scale to thousands of spans, but that’s not our fault.

  • the plugin’s overhead is negligible (so it’s safe to always enable)

  • the plugin works with cabal commands, ghcid, and haskell-language-server

So it should be fine for most use cases, but please report any issues that you run into.

Monday, October 2, 2023

My views on NeoHaskell

My views on NeoHaskell

Recently Nick Seagull has announced a NeoHaskell project which (I believe) has generated some controversy. My first run-in with NeoHaskell was this post on cohost criticizing the NeoHaskell project and a few of my friends within the Haskell community have expressed concern about the NeoHaskell project. My gut reaction is also critical, but I wanted to do a more thorough investigation before speaking publicly against NeoHaskell so I figured I would dig into the project more first. Who knows, maybe my gut reaction is wrong? 🤷🏻‍♀️

Another reason NeoHaskell is relevant to me is that I think a lot about marketing and product management for the Haskell community, and even presented a talk on How to market Haskell mainstream programmers so I’m particularly keen to study NeoHaskell through that lens to see if he is trying to approach things in a similar way or not.

I also have credentials to burnish in this regard. I have a lot of experience with product management and technical product management for open source projects via my work on Dhall. Not only did I author the original implementation of Dhall but I singlehandedly built most of the language ecosystem (including the language standard, documentation, numerous language bindings, and the command-line tooling) and mentored others to do the same.

Anyway, with that out of the way, on to NeoHaskell:

What is NeoHaskell?

I feel like this is probably the most important question to answer because unless there is a clear statement of purpose for a project there’s nothing to judge; it’s “not even wrong” because there’s no yardstick by which to measure it and nothing to challenge.

So what is NeoHaskell?

I’ll break this into two parts: what NeoHaskell is right now and what NeoHaskell aspires to be.

Based on what I’ve gathered, right now NeoHaskell is:

However, it’s not clear what NeoHaskell aspires to be from studying the website, the issue tracker, or announcement:

  • Is this going to be a new programming language inspired by Haskell?

    In other words, will this be a “clean room” implementation of a language which is Haskell-like?

  • … or this going to be a fork of Haskell (more specifically: ghc) to add the desired features?

    In other words, will the relationship of NeoHaskell to Haskell be similar to the relationship between NeoVim and Vim? (The name seems to suggest as much)

  • … or this going to be changes to the command-line Haskell tooling?

    In other words, will this be kind of like stack and promote a new suite of tools for doing Haskell development?

  • … or this going to be improvements to the Haskell package ecosystem?

    In other words, will this shore up and/or revive some existing packages within the Haskell ecosystem?

Here’s what I think NeoHaskell aspires to be based on carefully reading through the website and all of the issues in the issue tracker and drawing (I believe) reasonable inferences:

NeoHaskell is not going to be a fork of ghc and is instead proposing to implement the following things:

  • A new command-line tool (neo) similar in spirit to stack
    • It is proposing some new features not present in stack but it reads to me as similar to stack.
  • A GHC plugin that would add:
    • new language features (none proposed so far, but it aims to be a Haskell dialect)
    • improved error messages
    • some improvements to the UX (e.g. automatic hole filling)
  • An attempt to revive the work on a mobile (ARM) backend for Haskell
  • An overhaul of Haskell’s standard libraries similar in spirit to foundation
  • TemplateHaskell support for the cpython package for more ergonomic Python interop
  • A set of documentation for the language and some parts of the ecosystem
  • An event sourcing framework
    • … and a set of template applications based on that framework

And in addition to that concrete roadmap Nick Seagull is essentially proposing the following governance model for the NeoHaskell project (and possibly the broader Haskell ecosystem if NeoHaskell gains traction):

  • Centralizing product management in himself as a benevolent dictator

    I don’t believe I’m exaggerating this. Here is the relevant excerpt from the announcement post, which explicitly references the BDFL model:

    I believe that in order for a product to be successful, the design process must be centralized in a single person. This person must listen to the users, the other designers, and in general must have an open mind to always cherry-pick all possible ideas in order to improve the product. I don’t believe that a product should be guided by democracy, and neither it should implement all suggestions by every user. In other words, I’ll be the one in charge of generating and listening to discussions, and prioritizing the features of the project.

    I understand that this comes with some risk, but at the same time I believe that all programming tools like Python and Ruby that are very loved by their communities are like that because of the BDFL model

  • Organizing work via the NeoHaskell discord and NeoHaskell GitHub issue tracker

I feel like it should have been easier to gather this concrete information about NeoHaskell’s aspirational goals, if only so that the project is less about vibes and more a discussion on a concrete roadmap.

Alright, so now I’ll explain my general impression of this project. I’ll start with the positive feedback followed by the negative feedback and I’ll be a bit less reserved and more emotionally honest in my feedback.

Positive feedback

Welcome contributions

I’m not the kind of person who will turn down someone willing to do work to make things better as long as they don’t make things worse. A new mobile backend for Haskell sounds great! Python interop using TemplateHaskell sounds nice! Documentation? Love it!

A GHC plugin is a good approach

I think the approach of implementing this as a GHC plugin is a much better idea than forking ghc. This sidesteps the ludicrous amount of work that would be required to maintain a fork of ghc.

Moreover, implementing any Haskell dialect as a GHC plugin actually minimizes ecosystem fragmentation because (similar to an alternate Prelude) it doesn’t “leak”. If one of your dependencies uses a GHC plugin for the NeoHaskell dialect then your package doesn’t have to use that same dialect (you can still build that dependency and code your package in non-Neo Haskell). cabal can handle that sort of thing transparently.

Haskell does need better product management

I think the Haskell foundation was supposed to be this (I could be wrong) but that didn’t really seem to pan out.

Either way, I think a lot of us know what good product management is and it is strikingly absent from the ecosystem.

Negative feedback

Benevolent dictator

I think it’s ridiculous that someone who hasn’t made significant contributions to the Haskell ecosystem wants to become a benevolent dictator for a project aspiring to make an outsized impact on the Haskell ecosystem. I know that this is harsh and a personal attack on Nick and I’m also mindful that there’s a real person behind the avatar. HOWEVER, when you propose to be a benevolent dictator you are inherently making things personal. A proposal to become a benevolent dictator is essentially a referendum on you as a person.1

And it’s not just a matter of fairness or whatever. Nick’s lack of Haskell credentials directly impact his ability to actually meaningfully improve upon prior art if he doesn’t understand the current state of the art. Like, when Michael Snoyman created stack it did lead to a lot of fragmentation in the Haskell tooling but at least I felt like he was justified in his attempt because he had an impressive track record and a deep understanding of the Haskell ecosystem and toolchain.

I do not get anything remotely resembling that impression from Nick Seagull. He strikes me as a dilettante in this area and not just due to his lack of Haskell credentials but also due to some of his questionable proposed changes. This brings me to:

Unwelcome contributions

Not all contributions benefit the ecosystem2. I think proposing a new neo build tool is likely to fragment the tooling in a way similar to stack. I have worked pretty extensively with all three of cabal, stack and Nix throughout my career and my intuition based on that experience is that the only improvement to the Haskell command-line experience that is viable and that will “win” in the long run is one that is directly upstreamed into cabal. It’s just that nobody wants to do that because it’s not as glamorous as writing your own build tool.

Similarly, I think his proposed vision of “event source all the Haskell applications” (including command-line scripts) is poorly thought out. I firmly subscribe to the principle of least power which says that you should use the simplest type or abstraction available that gets the job done instead of trying to shoehorn everything into the same “god type” or “god abstraction”. I learned this the hard way when I tried to shoehorn everything into my pipes package and realized that it was a huge mistake, so it’s not like I’m innocent in this regard. Don’t make the same mistake I did.

And it matters that some of these proposed changes are counterproductive because if he indeed plays a role as a benevolent dictator you’re not going to get to pick and choose which changes to keep and which changes to ignore. You’re getting the whole package, like it or not.

Not good product management

I don’t believe NeoHaskell is the good product management we’re all looking for. “Haskell dialect + python interop + event sourcing + mobile backend” is not a product. It’s an odd bundle of features that don’t have a clear market or vertical or use case to constrain the design and navigate tradeoffs. The NeoHaskell roadmap comes across to me as a grab bag of unrelated features which individually sound good but that is not necessarily good product management.

To make this concrete: what is the purpose of bundling both python interop and a mobile backend into NeoHaskell’s roadmap? As far as I know there is no product vertical that requires both of those things.

The overall vibe is bad

My initial impression of NeoHaskell was that it struck me as bullshit. Carefully note that I’m not saying that Nick is a bullshitter, but if he wants to be taken seriously then he needs to rethink how he presents his ideas. Everything from the tone of the announcement post (including the irrelevant AI-generated images), the complete absence of any supporting code or mockups, and the wishy washy statement of purpose all contributed to the non-serious vibes.


Anyway, I don’t hate Nick and I’m pretty sure I’d get along with him great in person in other contexts. He also seems like a decently accomplished guy in other respects. However, I think nominating himself as a benevolent dictator for an ambitious ecosystem is a bit irresponsible. However, we all make mistakes and can learn from them.

And I don’t endorse NeoHaskell. I don’t think it’s any more likely to succeed than Haskell absent some better product management. “I like simple Haskell tailored to blue collar engineers” is a nice vibe but it’s not a product.