Wednesday, October 4, 2023

A GHC plugin for OpenTelemetry build metrics

A GHC plugin for OpenTelemetry build metrics

This post is about a new OpenTelemetry plugin for GHC that I’ve been building for work that we’re open sourcing because I think it might be broadly useful to others. If all you want to do is use the plugin then you can find it on Hackage, which includes more detailed usage instructions. This post will focus more on the motivation and background behind the plugin’s development.

Motivation

The context behind this work was that we use Honeycomb at work for collecting metrics related to production and our team1 has begun to apply those same metrics to our builds. In particular, we wanted to collect detailed (module-level) build metrics so that we could begin to hunt down and fix expensive modules within our codebase. For context: our codebase currently has almost 7000 modules, so these expensive modules can easily fly under the radar.

When we enable the plugin and export the results to Honeycomb we can begin to see which modules are the most expensive to build:

Sample module build times

… and none of the modules are individually very expensive to build (the worst offender is only about 5 seconds), so they’d easily get lost within a sea of thousands of other modules.

However, these sorts of insights have already proven useful. For example:

  • one expensive modules was completely unused in our codebase

    The above list brought it to our attention so that we could delete it.

  • other expensive modules were representative examples of larger issues to fix

    For example, one expensive module consisted of 2000 invocations of an internal function which is expensive to type-check and fixing this function will improve compile speeds across our codebase and not just that module.

  • other expensive modules are indicative of architectural anti-patterns

    Frequently “horizontally-organized” modules top the chart, and I view them as anti-patterns for a few reasons (see: my post on Module organization guidelines). These modules are not expensive per se (the code inside them has to be compiled somewhere), but they tend to be build chokepoints because they have a large number of dependencies and reverse dependencies. Highlighting expensive modules has a tendency to highlight these sorts of build chokepoints as a side bonus.

In principle you can also browse a given build’s trace interactively, like this:

However, for our codebase Honeycomb chokes on our giant build traces and we can only produce visualizations like the above image if we filter down the spans to a randomly sampled subset of modules. Honeycomb doesn’t do a good job of handling traces with a few thousand spans or more.

Workarounds

This plugin was surprisingly difficult for me to implement because GHC’s Plugin interface is so constrained.

For example, the hs-opentelemetry-sdk package asks you to finalize any TracerProvider that you acquire, but there’s no good way (that I know of2) to run finalization logic at the end of a ghc build using the Plugin interface. The purpose of this finalization logic is to flush metrics that haven’t yet been exported.

So what I did was to hack around this by detecting all modules that are root modules of the build graph and flushing metrics after each of those root modules is built (since one of them will be the last module built). I tried a bunch of other alternative approaches (like installing a phase hook), but this was the only approach I was able to get to work.

And the OpenTelemetry plugin is full of workarounds like this. We have vetted internally that the plugin works for normal builds, ghcid and haskell-language-server, but generally I expect there to be some trailing bugs that we’ll have to fix as more people use it due to these various unsafe implementation details.

In fact, one limitation of the plugin is that the top-level span has a duration of 0 (instead of reporting the duration of the build). This is related to the same issue of the Plugin interface apparently not having a good way to run code exactly once after the build completes (even using hacks). If somebody knows of a way to do this that I missed I’d definitely welcome the tip!

Conclusion

What we do know from internal usage is that:

  • the plugin definitely scales to very large codebases (thousands of modules)

    … although honeycomb doesn’t scale to thousands of spans, but that’s not our fault.

  • the plugin’s overhead is negligible (so it’s safe to always enable)

  • the plugin works with cabal commands, ghcid, and haskell-language-server

So it should be fine for most use cases, but please report any issues that you run into.

Monday, October 2, 2023

My views on NeoHaskell

My views on NeoHaskell

Recently Nick Seagull has announced a NeoHaskell project which (I believe) has generated some controversy. My first run-in with NeoHaskell was this post on cohost criticizing the NeoHaskell project and a few of my friends within the Haskell community have expressed concern about the NeoHaskell project. My gut reaction is also critical, but I wanted to do a more thorough investigation before speaking publicly against NeoHaskell so I figured I would dig into the project more first. Who knows, maybe my gut reaction is wrong? 🤷🏻‍♀️

Another reason NeoHaskell is relevant to me is that I think a lot about marketing and product management for the Haskell community, and even presented a talk on How to market Haskell mainstream programmers so I’m particularly keen to study NeoHaskell through that lens to see if he is trying to approach things in a similar way or not.

I also have credentials to burnish in this regard. I have a lot of experience with product management and technical product management for open source projects via my work on Dhall. Not only did I author the original implementation of Dhall but I singlehandedly built most of the language ecosystem (including the language standard, documentation, numerous language bindings, and the command-line tooling) and mentored others to do the same.

Anyway, with that out of the way, on to NeoHaskell:

What is NeoHaskell?

I feel like this is probably the most important question to answer because unless there is a clear statement of purpose for a project there’s nothing to judge; it’s “not even wrong” because there’s no yardstick by which to measure it and nothing to challenge.

So what is NeoHaskell?

I’ll break this into two parts: what NeoHaskell is right now and what NeoHaskell aspires to be.

Based on what I’ve gathered, right now NeoHaskell is:

However, it’s not clear what NeoHaskell aspires to be from studying the website, the issue tracker, or announcement:

  • Is this going to be a new programming language inspired by Haskell?

    In other words, will this be a “clean room” implementation of a language which is Haskell-like?

  • … or this going to be a fork of Haskell (more specifically: ghc) to add the desired features?

    In other words, will the relationship of NeoHaskell to Haskell be similar to the relationship between NeoVim and Vim? (The name seems to suggest as much)

  • … or this going to be changes to the command-line Haskell tooling?

    In other words, will this be kind of like stack and promote a new suite of tools for doing Haskell development?

  • … or this going to be improvements to the Haskell package ecosystem?

    In other words, will this shore up and/or revive some existing packages within the Haskell ecosystem?

Here’s what I think NeoHaskell aspires to be based on carefully reading through the website and all of the issues in the issue tracker and drawing (I believe) reasonable inferences:

NeoHaskell is not going to be a fork of ghc and is instead proposing to implement the following things:

  • A new command-line tool (neo) similar in spirit to stack
    • It is proposing some new features not present in stack but it reads to me as similar to stack.
  • A GHC plugin that would add:
    • new language features (none proposed so far, but it aims to be a Haskell dialect)
    • improved error messages
    • some improvements to the UX (e.g. automatic hole filling)
  • An attempt to revive the work on a mobile (ARM) backend for Haskell
  • An overhaul of Haskell’s standard libraries similar in spirit to foundation
  • TemplateHaskell support for the cpython package for more ergonomic Python interop
  • A set of documentation for the language and some parts of the ecosystem
  • An event sourcing framework
    • … and a set of template applications based on that framework

And in addition to that concrete roadmap Nick Seagull is essentially proposing the following governance model for the NeoHaskell project (and possibly the broader Haskell ecosystem if NeoHaskell gains traction):

  • Centralizing product management in himself as a benevolent dictator

    I don’t believe I’m exaggerating this. Here is the relevant excerpt from the announcement post, which explicitly references the BDFL model:

    I believe that in order for a product to be successful, the design process must be centralized in a single person. This person must listen to the users, the other designers, and in general must have an open mind to always cherry-pick all possible ideas in order to improve the product. I don’t believe that a product should be guided by democracy, and neither it should implement all suggestions by every user. In other words, I’ll be the one in charge of generating and listening to discussions, and prioritizing the features of the project.

    I understand that this comes with some risk, but at the same time I believe that all programming tools like Python and Ruby that are very loved by their communities are like that because of the BDFL model

  • Organizing work via the NeoHaskell discord and NeoHaskell GitHub issue tracker

I feel like it should have been easier to gather this concrete information about NeoHaskell’s aspirational goals, if only so that the project is less about vibes and more a discussion on a concrete roadmap.

Alright, so now I’ll explain my general impression of this project. I’ll start with the positive feedback followed by the negative feedback and I’ll be a bit less reserved and more emotionally honest in my feedback.

Positive feedback

Welcome contributions

I’m not the kind of person who will turn down someone willing to do work to make things better as long as they don’t make things worse. A new mobile backend for Haskell sounds great! Python interop using TemplateHaskell sounds nice! Documentation? Love it!

A GHC plugin is a good approach

I think the approach of implementing this as a GHC plugin is a much better idea than forking ghc. This sidesteps the ludicrous amount of work that would be required to maintain a fork of ghc.

Moreover, implementing any Haskell dialect as a GHC plugin actually minimizes ecosystem fragmentation because (similar to an alternate Prelude) it doesn’t “leak”. If one of your dependencies uses a GHC plugin for the NeoHaskell dialect then your package doesn’t have to use that same dialect (you can still build that dependency and code your package in non-Neo Haskell). cabal can handle that sort of thing transparently.

Haskell does need better product management

I think the Haskell foundation was supposed to be this (I could be wrong) but that didn’t really seem to pan out.

Either way, I think a lot of us know what good product management is and it is strikingly absent from the ecosystem.

Negative feedback

Benevolent dictator

I think it’s ridiculous that someone who hasn’t made significant contributions to the Haskell ecosystem wants to become a benevolent dictator for a project aspiring to make an outsized impact on the Haskell ecosystem. I know that this is harsh and a personal attack on Nick and I’m also mindful that there’s a real person behind the avatar. HOWEVER, when you propose to be a benevolent dictator you are inherently making things personal. A proposal to become a benevolent dictator is essentially a referendum on you as a person.1

And it’s not just a matter of fairness or whatever. Nick’s lack of Haskell credentials directly impact his ability to actually meaningfully improve upon prior art if he doesn’t understand the current state of the art. Like, when Michael Snoyman created stack it did lead to a lot of fragmentation in the Haskell tooling but at least I felt like he was justified in his attempt because he had an impressive track record and a deep understanding of the Haskell ecosystem and toolchain.

I do not get anything remotely resembling that impression from Nick Seagull. He strikes me as a dilettante in this area and not just due to his lack of Haskell credentials but also due to some of his questionable proposed changes. This brings me to:

Unwelcome contributions

Not all contributions benefit the ecosystem2. I think proposing a new neo build tool is likely to fragment the tooling in a way similar to stack. I have worked pretty extensively with all three of cabal, stack and Nix throughout my career and my intuition based on that experience is that the only improvement to the Haskell command-line experience that is viable and that will “win” in the long run is one that is directly upstreamed into cabal. It’s just that nobody wants to do that because it’s not as glamorous as writing your own build tool.

Similarly, I think his proposed vision of “event source all the Haskell applications” (including command-line scripts) is poorly thought out. I firmly subscribe to the principle of least power which says that you should use the simplest type or abstraction available that gets the job done instead of trying to shoehorn everything into the same “god type” or “god abstraction”. I learned this the hard way when I tried to shoehorn everything into my pipes package and realized that it was a huge mistake, so it’s not like I’m innocent in this regard. Don’t make the same mistake I did.

And it matters that some of these proposed changes are counterproductive because if he indeed plays a role as a benevolent dictator you’re not going to get to pick and choose which changes to keep and which changes to ignore. You’re getting the whole package, like it or not.

Not good product management

I don’t believe NeoHaskell is the good product management we’re all looking for. “Haskell dialect + python interop + event sourcing + mobile backend” is not a product. It’s an odd bundle of features that don’t have a clear market or vertical or use case to constrain the design and navigate tradeoffs. The NeoHaskell roadmap comes across to me as a grab bag of unrelated features which individually sound good but that is not necessarily good product management.

To make this concrete: what is the purpose of bundling both python interop and a mobile backend into NeoHaskell’s roadmap? As far as I know there is no product vertical that requires both of those things.

The overall vibe is bad

My initial impression of NeoHaskell was that it struck me as bullshit. Carefully note that I’m not saying that Nick is a bullshitter, but if he wants to be taken seriously then he needs to rethink how he presents his ideas. Everything from the tone of the announcement post (including the irrelevant AI-generated images), the complete absence of any supporting code or mockups, and the wishy washy statement of purpose all contributed to the non-serious vibes.

Conclusion

Anyway, I don’t hate Nick and I’m pretty sure I’d get along with him great in person in other contexts. He also seems like a decently accomplished guy in other respects. However, I think nominating himself as a benevolent dictator for an ambitious ecosystem is a bit irresponsible. However, we all make mistakes and can learn from them.

And I don’t endorse NeoHaskell. I don’t think it’s any more likely to succeed than Haskell absent some better product management. “I like simple Haskell tailored to blue collar engineers” is a nice vibe but it’s not a product.

Friday, September 8, 2023

GHC plugin for HLint

GHC plugin for HLint

At work I was recently experimenting with running hlint (the widely used Haskell linting program) as a GHC plugin. One reason why I was interested in this is because we have a large (6000+ module) Haskell codebase at work, and I wanted to see if this would make it cheaper to run hlint on our codebase. Ultimately it did not work out but I built something that we could open source so I polished it up and released it in case other people find it useful. You can find the plugin (named hlint-plugin) on Hackage and on GitHub.

This post will explain the background and motivation behind this work to explain why such a plugin might be potentially useful to other Haskell users.

Introduction to hlint

If you’ve never heard of hlint before, it’s a Haskell source code linting tool that is pretty widely used in the Haskell ecosystem. For example, if you run hlint on the following Haskell file:

main :: IO ()
main = (mempty)

… then you’ll get the following hlint error message:

Main.hs:2:8-15: Warning: Redundant bracket
Found:
  (mempty)
Perhaps:
  mempty
  
1 hint

… telling the user to remove the parentheses1 from around the mempty.

Integrating hlint

However, hlint is a tool that is not integrated into the compiler, meaning that you have to run it out of band from compilation for it to catch errors. There are a few ways that one can fix this, though:

  • Create a script that builds your program and then runs hlint

    This is the simplest possible thing that one can do, but it works and some people do this. It’s the “low-tech” solution.

  • Use haskell-language-server or some IDE that plugin that auto-runs hlint

    This is a bit nicer for developers because now they can get rapid feedback (in their editor) as they are authoring the code. For example, haskell-language-server supports an hlint plugin2 for this purpose.

  • A GHC plugin (what this post is about)

    If you turn hlint into a GHC plugin, then ALL GHC-based Haskell tools automatically incorporate hlint suggestions. For example, ghcid would automatically include hlint suggestions in its output, something that doesn’t work with other approaches to integrate hlint. Similarly, all cabal commands (including cabal build and cabal repl) and all stack commands benefit from a GHC plugin.

Alternatives

I’m not the first person who had this idea of turning hlint into a GHC plugin. The first attempt to do this was hlint-source-plugin, but that was a pretty low-tech solution; it basically ran hlint as an executable on the Haskell source file being processed even though the GHC plugin already has access to the parsed syntax tree.

The second attempt was the splint package. This GHC plugin was really well done (it’s basically exactly how I envisioned this was supposed to work) and the corresponding announcement post does a great job of motivating why hlint benefits from being run as a GHC plugin.

However, the problem is that the splint package was recently abandoned and the last version of GHC it supports is GHC 9.2. Since we use GHC 9.6 at work I decided to essentially revive the splint package so I created the hlint-plugin package which is essentially the successor to splint.

Improvements

hlint-plugin is not too different from what splint did, but the main improvements that hlint-plugin brings are:

  • Support for newer versions of GHC

    splint supports GHC versions 8.10, 9.0, and 9.2 whereas hlint-plugin supports GHC versions 9.0, 9.2, 9.4, and 9.6.

  • Known-good cabal/stack/nix builds for the plugin

    … see the next section for more details.

  • A test suite to verify that the plugin works

    hlint-plugin’s CI actually checks that the plugin works for all supported versions of GHC.

  • A simpler work-around to GHC issue #18261

    Basically, I independently stumbled upon the exact same problem that splint encountered, but worked around it in a simpler way. I won’t go into too much detail here other than to point out that you can compare how splint works around this bug with how hlint-plugin works around the bug.

Also, when stress testing hlint-plugin on our internal codebase I discovered an hlint bug which affected some of our modules, and fixed that, so the fix will be in the next release of hlint.

Tricky build stuff

Unfortunately, both splint and hlint-plugin are tricky to correctly install. Why? Because, by default hlint (and ghc-lib-parser-ex) use the ghc-lib and ghc-lib-parser packages by default instead of the ghc API. This is actually a pain in the ass because a GHC plugin needs to be created using the ghc API (i.e. it needs to be a value of type ghc:GHC.Plugins.Plugin). Like, you can use hlint to create a ghc-lib:GHC.Plugins.Plugin and everything will type-check and build, but then when you try to actually run the plugin it will fail.

There is a way to get hlint and ghc-lib-parser-ex to use the ghc API, though! However, you have to build them with non-default cabal configure flags. Specifically, you have to configure hlint with the -f-ghc-lib option and configure ghc-lib-parser-ex with the -fno-ghc-lib option.

To ease things for users I provided a cabal.project file and a flake.nix file4 with working builds for hlint-plugin that set all the correct configuration options.

Performance

I mentioned in the introduction that I was hoping for some performance improvements from switching to a plugin but those improvements didn’t materialize. I’ll talk a bit about what I thought would work and why it didn’t pan out for us (even though it still might help for you).

So there are up to three ways that hlint could potentially be faster as a GHC plugin:

  • Not having to re-lint modules that haven’t changed

    This is nice (especially when your codebase has 6000+ modules like ours). When you turn hlint into a GHC plugin you only run it whenever GHC recompiles a module and you don’t have to run hlint over your entire codebase after every change.

    However, this was actually not a significant benefit to our company because we already have some scripts which take care of only running hlint on the modules that have changed (according to git). However, it’s still a “nice to have” because it’s architecturally simpler (no need to write that clever script if GHC can take care of detecting changes for us).

  • Not having to parse the Haskell code twice

    This is likely a minor performance improvement since parsing is (in my experience) typically not the bottleneck for compiling Haskell code.

  • Running hlint while GHC is compiling modules

    What I mean by this is that if hlint is a GHC plugin then it can begin running while the GHC build is ongoing! In large builds (like ours) there are often a large number of cores that go unused and the hlint plugin could potentially exploit those idle cores to do work before the build is done.

    However, in practice this benefit did not pan out and our build didn't really get faster when we enabled hlint-plugin. The time it took to build our codebase with the plugin was essentially the same amount of time as running hlint in a separate step.

Future directions

The hlint-source-plugin repository notes that if hlint were implemented as a GHC plugin (which it now is) then it would fix some of the hacks that hlint has to use:

Currently this plugin simply hooks into the parse stage and calls HLint with a file path. This means HLint will re-parse all source code. The next logical step is to use the actual parse tree, as given to us by GHC, and HLint that. This means that HLint can lose the special logic to run CPP, along with the hacky handling of fixity resolution (we get that done correctly by GHC’s renaming phase).

… because of this I sort of feel that hlint really should be a GHC plugin. It’s understandable why hlint was not initially implemented in this way (since I believe the GHC plugin system didn’t exist back then), but now it sort of feels like a GHC plugin is a much more natural way of integrating hlint.


  1. I refuse to call parentheses “brackets”.↩︎

  2. Note that this is a plugin for haskell-language-server, which is a different type of plugin than a GHC plugin. A haskell-language-server plugin only works with haskell-language-server whereas a GHC plugin works with anything that uses GHC. The two types of plugins are also installed and set up in different ways.↩︎

  3. Note that this is a plugin for haskell-language-server, which is a different type of plugin than a GHC plugin. A haskell-language-server plugin only works with haskell-language-server whereas a GHC plugin works with anything that uses GHC. The two types of plugins are also installed and set up in different ways.↩︎

  4. I tried to create a working stack.yaml and failed to get it working, but I’d accept a pull request adding a working stack build if someone else has better luck than I did.↩︎

Monday, April 3, 2023

Ergonomic newtypes for Haskell strings and numbers

Ergonomic newtypes for Haskell strings and numbers

This blog post summarizes a very brief trick I commonly recommend whenever I see something like this:

{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }
    deriving (Show)

newtype Age = Age { getAge :: Natural }
    deriving (Show)

data Person = Person { name :: Name, age :: Age }
    deriving (Show)

example :: Person
example = Person{ name = Name "John Doe", age = Age 42 }

… where the newtypes are not opaque (i.e. the newtype constructors are exported), so the newtypes are more for documentation purposes rather than type safety.

The issue with the above code is that the newtypes add extra boilerplate for both creating and displaying those types. For example, in order to create the Name and Age newtypes you need to explicitly specify the Name and Age constructors (like in the definition for example above) and they also show up when displaying values for debugging purposes (e.g. in the REPL):

>>> example
Person {name = Name {getName = "John Doe"}, age = Age {getAge = 42}}

Fortunately, you can easily elide these noisy constructors if you follow these rules of thumb:

  • Derive IsString for newtypes around string-like types

  • Derive Num for newtypes around numeric types

  • Change the Show instances to use the underlying Show for the wrapped type

For example, I would suggest amending the original code like this:

{-# LANGUAGE DerivingStrategies         #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE OverloadedStrings          #-}

module Example1 where

import Data.Text (Text)
import Data.String (IsString)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }
    deriving newtype (IsString, Show)

newtype Age = Age { getAge :: Natural }
    deriving newtype (Num, Show)

data Person = Person { name :: Name, age :: Age }
    deriving stock (Show)

example :: Person
example = Person{ name = "John Doe", age = 42 }

… and now the Age and Name constructors are invisible, even when displaying these types (using their Show instances):

>>> example
Person {name = "John Doe", age = 42}

That is the entirety of the trick, but if you still don’t follow, I’ll expand upon that below.

Explanation

Revisiting the starting code:

{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }
    deriving (Show)

newtype Age = Age { getAge :: Natural }
    deriving (Show)

data Person = Person { name :: Name, age :: Age }
    deriving (Show)

example :: Person
example = Person{ name = Name "John Doe", age = Age 42 }

… the first thing we’re going to do is to enable the DerivingStrategies language extension because I’m going to lean pretty heavily on Haskell’s support for deriving typeclass instances in this post and I want to be more explicit about how these instances are being derived:

{-# LANGUAGE DerivingStrategies #-}

newtype Name = Name { getName :: Text }
    deriving stock (Show)

newtype Age = Age { getAge :: Natural }
    deriving stock (Show)

I’ve changed the code to explicitly specify that we’re deriving Show using the “stock” deriving strategy, meaning that Haskell has built-in language support for deriving Show and we’re going to use that.

The next step is that we’re going to add an IsString instance for Name because it wraps a string-like type (Text). However, at first we’ll write out the instance by hand:

import Data.String (IsString(..))

instance IsString Name where
    fromString string = Name (fromString string)

This IsString instance works in conjunction with Haskell’s OverloadedStrings so that we can directly use a string literal in place of a Name, like this:

example :: Person
example = Person{ name = "John Doe", age = Age 42 }
                      -- ↑
                      -- No more Name constructor required here

… and the reason that works is because the compiler implicitly inserts fromString around all string literals when you enable OverloadedStrings, as if we had written this:

example :: Person
example = Person{ name = fromString "John Doe", age = Age 42 }

The IsString instance for Name:

instance IsString Name where
    fromString string = Name (fromString string)

… essentially defers to the IsString instance for the underlying wrapped type (Text). In fact, this pattern of deferring to the underlying instance is common enough that Haskell provides a language extension for this purpose: GeneralizedNewtypeDeriving. If we enable that language extension, then we can simplify the IsString instance to this:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype Name = Name { getName :: Text }
    deriving stock (Show)
    deriving newtype (IsString)

The deriving newtype indicates that we’re explicitly using the GeneralizedNewtypeDeriving extension to derive the implementation for the IsString instance.

In this particular case we don’t have to specify the deriving strategy; we could have just said deriving (IsString) and it still would have worked because it wasn’t ambiguous; no other deriving strategy would have worked in this case. However, as we’re about to see there are cases where you want to explicitly disambiguate between multiple possible deriving strategies.

The next step is that we implement Num for our Age type since it wraps a numeric type (Natural):

instance Num Age where
    Age x + Age y = Age (x + y)

    Age x - Age y = Age (x - y)

    Age x * Age y = Age (x * y)

    negate (Age x) = Age (negate x)

    abs (Age x) = Age (abs x)

    signum (Age x) = Age (signum x)

    fromInteger integer = Age (fromInteger integer)

Bleh! That’s a lot of work to do when really we were most interested in the fromInteger method (so that we could use numeric literals directly to create an Age).

The reason we care about the fromInteger method is because Haskell lets you use integer literals for any type that implements Num (without any language extension; this is part of the base language). So, for example, we can further simplify our example Person to:

example :: Person
example = Person{ name = "John Doe", age = 42 }
                                        -- ↑
                                        -- No more Age constructor required here

… and the reason that works is because the compiler implicitly inserts fromInteger around all integer literals, as if we had written this:

example :: Person
example = Person{ name = "John Doe", age = fromInteger 42 }

It would be nice if Haskell had a dedicated class for just the fromInteger method (e.g. IsInteger), but alas if we want ergonomic support for numeric literals then we have to add support for other numeric operations, too, even if they might not necessarily make sense for our newtype.

Like before, though, we can use the GeneralizedNewtypeDeriving extension to derive Num instead:

newtype Age = Age { getAge :: Natural }
    deriving stock (Show)
    deriving newtype (Num)

Much better!

However, we’re not done, yet, because at the moment these Name and Age constructors still appear in the debug output:

>>> example
Person {name = Name {getName = "John Doe"}, age = Age {getAge = 42}}

Yuck!

Okay, so the final step is to change the Show instances for Name and Age to defer to the Show instances for their underlying types:

instance Show Name where
    show (Name string) = show string

instance Show Age where
    show (Age natural) = show natural

These are still valid Show instances! The Show class requires that the displayed representation should be valid Haskell code for creating a value of that type, and in both cases that’s what we get.

For example, if you show a value like Name "John Doe" you will get "John Doe", and that’s valid Haskell code for creating a Name if you enable OverloadedStrings.

Note: You might argue that this is not a valid Show instance because it requires the use of a language extension (e.g. OverloadedStrings) in order to be valid code. However, this is no different than the Show instance for Text (which is also only valid if you enable OverloadedStrings), and most people do not take issue with that Show instance for Text either.

Similarly, if you show a value like Age 42 you will get 42, and that’s valid Haskell code for creating an Age.

So with those two new Show instances our Person type now renders much more compactly:

>>> example
Person {name = "John Doe", age = 42}

… but we’re not done! The last part of the trick is to use GeneralizedNewtypeDeriving to derive the Show instances, like this:

newtype Name = Name { getName :: Text }
    deriving newtype (IsString, Show)

newtype Age = Age { getAge :: Natural }
    deriving newtype (Num, Show)

… and this is where the DerivingStrategies language extension really matters! Without that extension there would be no way to tell the compiler to derive Show by deferring to the underlying type. By default, if you don’t specify the deriving strategy then the compiler assumes that derived Show instances use the stock deriving strategy.

Conclusion

There’s one last bonus to doing things in this way: you might now be able to hide the newtype constructor by not exporting it! I think this is actually the most important benefit of all because a newtype with an exposed constructor doesn’t really improve upon the type safety of the underlying type.

When a newtype like Name or Age exposes the newtype constructor then the newtype serves primarily as documentation and I’m not a big fan of this “newtypes as documentation” design pattern. However, I’m not that strongly opposed to it either; I wouldn’t use it in own code, but I also wouldn’t insist that others don’t use it. Another post which takes a stronger stance on this is Names are not type safety, especially the section on “Newtypes as tokens”.

I’m personally okay with other people using newtypes in this way, but if you do use “newtypes as documentation” then please add IsString / Num / Show instances as described in this post so that they’re more ergonomic for others to use.

Monday, March 6, 2023

The "open source native" principle for software design

The "open source native" principle for software design

This post summarizes a software design principle I call the “open source native” principle which I’ve invoked a few times as a technical lead. I wanted to write this down so that I could easily reference this post in the future.

The “open source native” principle is simple to state:

Design proprietary software as if you intended to open source that software, regardless of whether you will open source that software

I call this the “open source native” principle because you design your software as if it were a “native” member of the open source ecosystem. In other words, your software is spiritually “born” open source, aspirationally written from the beginning to be a good open source citizen, even if you never actually end up open sourcing that software.

You can’t always adhere to this principle, but I still use this as a general design guideline.

Example

It’s hard to give a detailed example of this principle since most of the examples I’d like to use are … well … proprietary and wouldn’t make sense outside of their respective organizations. However, I’ll try to outline a hypothetical example (inspired by a true story) that hopefully enough can people can relate to.

Suppose that your organization provides a product with a domain-specific programming language for customizing their product’s behavior. Furthermore, suppose that you’re asked to design and implement a package manager for this programming language.

There are multiple data stores you could use for storing packages, but to simplify this example suppose there are only two options:

  • Store packages in a product-specific database

    Perhaps your product already uses a database for other reasons, so you figure that you can reuse that existing database for storing packages. That way you don’t need to set up any new infrastructure to get going since the database team will handle that for you. Plus you get the full powerful of a relational database so now you have powerful tools for querying and/or modifying packages.

  • Store packages in git

    You might instead store your packages as flat files inside of a git repository.

These represent two extremes of the spectrum and in reality there might be other options in between (like a standalone sqlite database), but this is a contrived example.

According to the open source principle, you’d prefer to store packages in git because git is a foundational building block of the open source ecosystem that is already battle-tested for this purpose. You’d be sacrificing some features (you’d no longer have access to the full power of a relational database), but your package manager would now be more “open-source native”.

You might wonder: why would one deliberately constrain themselves like that? What’s the benefit of designing things in this way if they might never be open sourced?

Motivation

There are several reasons I espouse this design principle:

  • better testability

    If you design your component so that it’s easy to use outside of the context of your product then it’s also easier to test in isolation. This means that you don’t need to rely on heavyweight integration tests or end-to-end tests to verify that your component works correctly.

    For example, a package manager based on git is easier to test than a package manager based on a database because a git repository is easier to set up.

  • faster release cadence

    If your component can be tested in isolation then you don’t even need to share continuous integration (CI) with the rest of your organization. Your component can have its own CI and release on whatever frequency is appropriate for that component instead of coupling its release cadence to the rest of your product.

    That in turn typically means that you can release earlier and more often, which is a virtue in its own right.

    Continuing the package manager example, you wouldn’t need to couple releases of your package manager to the release cadence of the rest of your product, so you’d be able to push out improvements or fixes more quickly.

  • simpler documentation

    It’s much easier to write a tutorial for software that delivers value in isolation since there’s less supporting infrastructure necessary to follow along with the tutorial.

  • well-chosen interfaces

    You have to carefully think through the correct logical boundaries for your software when you design for a broader audience of users. It’s also easier to enforce stronger boundaries and narrower scope for the same reasons.

    For example, our hypothetical package manager is less likely to have package metadata polluted with product-specific details if it is designed to operate independently of the product.

  • improved stability

    Open source software doesn’t just target a broader audience, but also targets a broader time horizon. An open source mindset promotes thinking beyond the needs of this financial quarter.

  • you can open source your component! (duh)

    Needless to say, if you design your component to be open-source native, it’s also easier to open source. Hooray! 🎉

Conclusion

You can think of this design principle as being similar to the rule of least power, where you’re making your software less powerful (by adding the additional constraint that it can be open sourced), but in turn improving ease of comprehension, maintainability, and distribution.

Also, if you have any examples along these lines that you care to share, feel free to drop them in the comments.

Monday, January 30, 2023

terraform-nixos-ng: Modern terraform support for NixOS

terraform-nixos-ng: Modern terraform support for NixOS

Recently I’ve been working on writing a “NixOS in Production” book and one of the chapters I’m writing is on deploying NixOS using terraform. However, one of the issues I ran across was the poor NixOS support for terraform. I’ve already gone through the nix.dev post explaining how to use the terraform-nixos project but I ran into several issues trying to follow those instructions (which I’ll explain below). That plus the fact that terraform-nixos seems to be unmaintained pushed me over the edge to rewrite the project to simplify and improve upon it.

So this post is announcing my terraform-nixos-ng project:

… which is a rewrite of terraform-nixos and I’ll use this post to compare and contrast the two projects. If you’re only interested in trying out the terraform-nixos-ng project then go straight to the README

Using nixos-rebuild

One of the first things I noticed when kicking the tires on terraform-nixos was that it was essentially reinventing what the nixos-rebuild tool already does. In fact, I was so surprised by this that I wrote a standalone post explaining how to use nixos-rebuild as a deployment tool:

Simplifying that code using nixos-rebuild fixed lots of tiny papercuts I had with terraform-nixos, like:

  • The deploy failing if you don’t have a new enough version of bash installed

  • The inability to turn off the use of the --use-substitutes flag

    That flag causes issues if you want to deploy to a machine that disables outbound connections.

  • The dearth of useful options (compared to nixos-rebuild)

    … including the inability to fully customize ssh options

  • The poor interop with flakes

    For example, terraform-nixos doesn’t respect the standard nixosConfigurations flake output hierarchy.

    Also, terraform-nixos doesn’t use flakes natively (it uses flake-compat), which breaks handling of the config.nix.binary{Caches,CachePublicKeys} flakes settings. The Nix UX for flakes is supposed to ask the user to consent to those settings (because they are potentially insecure to auto-enable for a flake), but their workaround breaks that UX by automatically enabling those settings without the user’s consent.

I wanted to upstream this rewrite to use nixos-rebuild into terraform-nixos, but I gave up on that idea when I saw that no pull request since 2021 had been merged, including conservative pull requests like this one to just use the script included within the repository to update the list of available AMIs.

That brings me to the next improvement, which is:

Auto-generating available AMIs

The terraform-nixos repository requires the AMI list to be manually updated. The way you do this is to periodically run a script to fetch the available AMIs from Nixpkgs and then create a PR to vendor those changes. However, this shouldn’t be necessary because we could easily program terraform to generate the list of AMIs on the fly.

This is what the terraform-nixos-ng project does, where the ami module creates a data source that runs an equivalent script to fetch the AMIs at provisioning time.

In the course of rewriting the AMI module, I made another small improvement, which was:

Support for aarch64 AMIs

Another gripe I had with terraform-nixos-ng is that its AMI module doesn’t support aarch64-linux NixOS AMIs even though these AMIs exist and Nixpkgs supports them. That was a small and easy fix, too.

Functionality regressions

terraform-nixos-ng is not a strict improvement over terraform-nixos, though. Specifically, the most notable feature omissions are:

  • Support for non-flake workflows

    terraform-nixos-ng requires the use of flakes and doesn’t provide support for non-flake-based workflows. I’m very much on team “Nix flakes are good and shouldn’t be treated as experimental any longer” so I made an opinionated choice to require users to use flakes rather than support their absence.

    This choice also isn’t completely aesthetic, the use of flakes improves interop with nixos-rebuild, where flakes are the most ergonomic way for nixos-rebuild to select from one of many deployments.

  • Support for secrets management

    I felt that this should be handled by something like sops-nix rather than rolling yet another secrets management system that was idiosyncratic to this deploy tool. In general, I wanted these terraform modules to be as lightweight as possible by making more idiomatic use of the modern NixOS ecosystem.

  • Support for Google Compute Engine images

    terraform-nixos supports GCE images and the only reason I didn’t add the same support is because I’ve never used Google Compute Engine so I didn’t have enough context to do a good rewrite, nor did I have the inclination to set up a GCE account just to test the rewrite. However, I’d accept a pull request adding this support from someone interested in this feature.

Conclusion

There’s one last improvement over the terraform-nixos project, which is that I don’t leave projects in an abandoned state. Anybody who has contributed to my open source projects knows that I’m generous about handing out the commit bit and I’m also good about relinquishing control if I don’t have time to maintain the project myself.

However, I don’t expect this to be a difficult project to maintain anyway because I designed terraform-nixos-ng to outsource the work to existing tools as much as possible instead of reinventing the wheel. This is why the implementation of terraform-nixos-ng is significantly smaller than terraform-nixos.