Wednesday, October 4, 2023

A GHC plugin for OpenTelemetry build metrics

This post is about a new OpenTelemetry plugin for GHC that I’ve been building for work and that we’re open sourcing because I think it might be broadly useful to others. If all you want to do is use the plugin, you can find it on Hackage, which includes more detailed usage instructions. This post will focus more on the motivation and background behind the plugin’s development.

Motivation

The context behind this work was that we use Honeycomb at work for collecting metrics related to production and our team1 has begun to apply those same metrics to our builds. In particular, we wanted to collect detailed (module-level) build metrics so that we could begin to hunt down and fix expensive modules within our codebase. For context: our codebase currently has almost 7000 modules, so these expensive modules can easily fly under the radar.

When we enable the plugin and export the results to Honeycomb we can begin to see which modules are the most expensive to build:

Sample module build times

… and none of the modules are individually very expensive to build (the worst offender is only about 5 seconds), so they’d easily get lost within a sea of thousands of other modules.

However, these sorts of insights have already proven useful. For example:

  • one expensive module was completely unused in our codebase

    The above list brought it to our attention so that we could delete it.

  • other expensive modules were representative examples of larger issues to fix

    For example, one expensive module consisted of 2000 invocations of an internal function which is expensive to type-check and fixing this function will improve compile speeds across our codebase and not just that module.

  • other expensive modules are indicative of architectural anti-patterns

    Frequently “horizontally-organized” modules top the chart, and I view them as anti-patterns for a few reasons (see: my post on Module organization guidelines). These modules are not expensive per se (the code inside them has to be compiled somewhere), but they tend to be build chokepoints because they have a large number of dependencies and reverse dependencies. Highlighting expensive modules has a tendency to highlight these sorts of build chokepoints as a side bonus.

In principle you can also browse a given build’s trace interactively, like this:

However, for our codebase Honeycomb chokes on our giant build traces and we can only produce visualizations like the above image if we filter down the spans to a randomly sampled subset of modules. Honeycomb doesn’t do a good job of handling traces with a few thousand spans or more.

Workarounds

This plugin was surprisingly difficult for me to implement because GHC’s Plugin interface is so constrained.

For example, the hs-opentelemetry-sdk package asks you to finalize any TracerProvider that you acquire, but there’s no good way (that I know of2) to run finalization logic at the end of a ghc build using the Plugin interface. The purpose of this finalization logic is to flush metrics that haven’t yet been exported.

So what I did was to hack around this by detecting all modules that are root modules of the build graph and flushing metrics after each of those root modules is built (since one of them will be the last module built). I tried a bunch of other alternative approaches (like installing a phase hook), but this was the only approach I was able to get to work.
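
To make the root-module idea concrete, here is a minimal sketch of the detection step, assuming the build graph is available as a plain list of modules paired with their direct imports. The `BuildGraph` type, `rootModules`, `shouldFlushAfter`, and the example module names are all hypothetical and are not taken from the plugin's source; the real plugin works with GHC's own module graph.

```haskell
module Main where

import Data.List (nub, sort, (\\))

-- | Hypothetical representation of a build graph: each module paired with
-- the modules it imports directly. (Assumption: not the plugin's real type.)
type BuildGraph = [(String, [String])]

-- | Root modules are the modules that no other module in the graph imports.
rootModules :: BuildGraph -> [String]
rootModules graph = sort (allModules \\ imported)
  where
    allModules = nub (map fst graph)
    imported   = nub (concatMap snd graph)

-- | Flush after a module finishes building only if it is a root module,
-- since one of the roots must be the last module built.
shouldFlushAfter :: BuildGraph -> String -> Bool
shouldFlushAfter graph name = name `elem` rootModules graph

-- | A small made-up graph with two roots: "Main" and "Test".
exampleGraph :: BuildGraph
exampleGraph =
  [ ("Main",   ["App", "Config"])
  , ("Test",   ["App"])
  , ("App",    ["Config"])
  , ("Config", [])
  ]

main :: IO ()
main = mapM_ putStrLn (rootModules exampleGraph)
```

The trade-off in this approach is that metrics get flushed once per root module rather than exactly once per build, which is redundant but harmless.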

And the OpenTelemetry plugin is full of workarounds like this. We have vetted internally that the plugin works for normal builds, ghcid and haskell-language-server, but generally I expect there to be some trailing bugs that we’ll have to fix as more people use it due to these various unsafe implementation details.

In fact, one limitation of the plugin is that the top-level span has a duration of 0 (instead of reporting the duration of the build). This is related to the same issue of the Plugin interface apparently not having a good way to run code exactly once after the build completes (even using hacks). If somebody knows of a way to do this that I missed I’d definitely welcome the tip!

Conclusion

What we do know from internal usage is that:

  • the plugin definitely scales to very large codebases (thousands of modules)

    … although Honeycomb doesn’t scale to thousands of spans, but that’s not our fault.

  • the plugin’s overhead is negligible (so it’s safe to always enable)

  • the plugin works with cabal commands, ghcid, and haskell-language-server

So it should be fine for most use cases, but please report any issues that you run into.

Monday, October 2, 2023

My views on NeoHaskell

Recently Nick Seagull has announced a NeoHaskell project which (I believe) has generated some controversy. My first run-in with NeoHaskell was this post on cohost criticizing the NeoHaskell project and a few of my friends within the Haskell community have expressed concern about the NeoHaskell project. My gut reaction is also critical, but I wanted to do a more thorough investigation before speaking publicly against NeoHaskell so I figured I would dig into the project more first. Who knows, maybe my gut reaction is wrong? 🤷🏻‍♀️

Another reason NeoHaskell is relevant to me is that I think a lot about marketing and product management for the Haskell community, and even presented a talk on How to market Haskell to mainstream programmers, so I’m particularly keen to study NeoHaskell through that lens to see if he is trying to approach things in a similar way or not.

I also have credentials to burnish in this regard. I have a lot of experience with product management and technical product management for open source projects via my work on Dhall. Not only did I author the original implementation of Dhall but I singlehandedly built most of the language ecosystem (including the language standard, documentation, numerous language bindings, and the command-line tooling) and mentored others to do the same.

Anyway, with that out of the way, on to NeoHaskell:

What is NeoHaskell?

I feel like this is probably the most important question to answer because unless there is a clear statement of purpose for a project there’s nothing to judge; it’s “not even wrong” because there’s no yardstick by which to measure it and nothing to challenge.

So what is NeoHaskell?

I’ll break this into two parts: what NeoHaskell is right now and what NeoHaskell aspires to be.

Based on what I’ve gathered, right now NeoHaskell is:

However, it’s not clear what NeoHaskell aspires to be from studying the website, the issue tracker, or the announcement:

  • Is this going to be a new programming language inspired by Haskell?

    In other words, will this be a “clean room” implementation of a language which is Haskell-like?

  • … or is this going to be a fork of Haskell (more specifically: ghc) to add the desired features?

    In other words, will the relationship of NeoHaskell to Haskell be similar to the relationship between NeoVim and Vim? (The name seems to suggest as much)

  • … or is this going to be changes to the command-line Haskell tooling?

    In other words, will this be kind of like stack and promote a new suite of tools for doing Haskell development?

  • … or is this going to be improvements to the Haskell package ecosystem?

    In other words, will this shore up and/or revive some existing packages within the Haskell ecosystem?

Here’s what I think NeoHaskell aspires to be based on carefully reading through the website and all of the issues in the issue tracker and drawing (I believe) reasonable inferences:

NeoHaskell is not going to be a fork of ghc and is instead proposing to implement the following things:

  • A new command-line tool (neo) similar in spirit to stack
    • It is proposing some new features not present in stack but it reads to me as similar to stack.
  • A GHC plugin that would add:
    • new language features (none proposed so far, but it aims to be a Haskell dialect)
    • improved error messages
    • some improvements to the UX (e.g. automatic hole filling)
  • An attempt to revive the work on a mobile (ARM) backend for Haskell
  • An overhaul of Haskell’s standard libraries similar in spirit to foundation
  • TemplateHaskell support for the cpython package for more ergonomic Python interop
  • A set of documentation for the language and some parts of the ecosystem
  • An event sourcing framework
    • … and a set of template applications based on that framework

And in addition to that concrete roadmap Nick Seagull is essentially proposing the following governance model for the NeoHaskell project (and possibly the broader Haskell ecosystem if NeoHaskell gains traction):

  • Centralizing product management in himself as a benevolent dictator

    I don’t believe I’m exaggerating this. Here is the relevant excerpt from the announcement post, which explicitly references the BDFL model:

    I believe that in order for a product to be successful, the design process must be centralized in a single person. This person must listen to the users, the other designers, and in general must have an open mind to always cherry-pick all possible ideas in order to improve the product. I don’t believe that a product should be guided by democracy, and neither it should implement all suggestions by every user. In other words, I’ll be the one in charge of generating and listening to discussions, and prioritizing the features of the project.

    I understand that this comes with some risk, but at the same time I believe that all programming tools like Python and Ruby that are very loved by their communities are like that because of the BDFL model

  • Organizing work via the NeoHaskell discord and NeoHaskell GitHub issue tracker

I feel like it should have been easier to gather this concrete information about NeoHaskell’s aspirational goals, if only so that the project is less about vibes and more a discussion on a concrete roadmap.

Alright, so now I’ll explain my general impression of this project. I’ll start with the positive feedback followed by the negative feedback and I’ll be a bit less reserved and more emotionally honest in my feedback.

Positive feedback

Welcome contributions

I’m not the kind of person who will turn down someone willing to do work to make things better as long as they don’t make things worse. A new mobile backend for Haskell sounds great! Python interop using TemplateHaskell sounds nice! Documentation? Love it!

A GHC plugin is a good approach

I think the approach of implementing this as a GHC plugin is a much better idea than forking ghc. This sidesteps the ludicrous amount of work that would be required to maintain a fork of ghc.

Moreover, implementing any Haskell dialect as a GHC plugin actually minimizes ecosystem fragmentation because (similar to an alternate Prelude) it doesn’t “leak”. If one of your dependencies uses a GHC plugin for the NeoHaskell dialect then your package doesn’t have to use that same dialect (you can still build that dependency and code your package in non-Neo Haskell). cabal can handle that sort of thing transparently.

Haskell does need better product management

I think the Haskell Foundation was supposed to be this (I could be wrong) but that didn’t really seem to pan out.

Either way, I think a lot of us know what good product management is and it is strikingly absent from the ecosystem.

Negative feedback

Benevolent dictator

I think it’s ridiculous that someone who hasn’t made significant contributions to the Haskell ecosystem wants to become a benevolent dictator for a project aspiring to make an outsized impact on the Haskell ecosystem. I know that this is harsh and a personal attack on Nick and I’m also mindful that there’s a real person behind the avatar. HOWEVER, when you propose to be a benevolent dictator you are inherently making things personal. A proposal to become a benevolent dictator is essentially a referendum on you as a person.1

And it’s not just a matter of fairness or whatever. Nick’s lack of Haskell credentials directly impacts his ability to actually meaningfully improve upon prior art if he doesn’t understand the current state of the art. Like, when Michael Snoyman created stack it did lead to a lot of fragmentation in the Haskell tooling but at least I felt like he was justified in his attempt because he had an impressive track record and a deep understanding of the Haskell ecosystem and toolchain.

I do not get anything remotely resembling that impression from Nick Seagull. He strikes me as a dilettante in this area and not just due to his lack of Haskell credentials but also due to some of his questionable proposed changes. This brings me to:

Unwelcome contributions

Not all contributions benefit the ecosystem2. I think proposing a new neo build tool is likely to fragment the tooling in a way similar to stack. I have worked pretty extensively with all three of cabal, stack and Nix throughout my career and my intuition based on that experience is that the only improvement to the Haskell command-line experience that is viable and that will “win” in the long run is one that is directly upstreamed into cabal. It’s just that nobody wants to do that because it’s not as glamorous as writing your own build tool.

Similarly, I think his proposed vision of “event source all the Haskell applications” (including command-line scripts) is poorly thought out. I firmly subscribe to the principle of least power which says that you should use the simplest type or abstraction available that gets the job done instead of trying to shoehorn everything into the same “god type” or “god abstraction”. I learned this the hard way when I tried to shoehorn everything into my pipes package and realized that it was a huge mistake, so it’s not like I’m innocent in this regard. Don’t make the same mistake I did.

And it matters that some of these proposed changes are counterproductive because if he indeed plays a role as a benevolent dictator you’re not going to get to pick and choose which changes to keep and which changes to ignore. You’re getting the whole package, like it or not.

Not good product management

I don’t believe NeoHaskell is the good product management we’re all looking for. “Haskell dialect + python interop + event sourcing + mobile backend” is not a product. It’s an odd bundle of features that don’t have a clear market or vertical or use case to constrain the design and navigate tradeoffs. The NeoHaskell roadmap comes across to me as a grab bag of unrelated features which individually sound good but that is not necessarily good product management.

To make this concrete: what is the purpose of bundling both python interop and a mobile backend into NeoHaskell’s roadmap? As far as I know there is no product vertical that requires both of those things.

The overall vibe is bad

My initial impression of NeoHaskell was that it struck me as bullshit. Carefully note that I’m not saying that Nick is a bullshitter, but if he wants to be taken seriously then he needs to rethink how he presents his ideas. Everything from the tone of the announcement post (including the irrelevant AI-generated images), the complete absence of any supporting code or mockups, and the wishy-washy statement of purpose all contributed to the non-serious vibes.

Conclusion

Anyway, I don’t hate Nick and I’m pretty sure I’d get along with him great in person in other contexts. He also seems like a decently accomplished guy in other respects. However, I think nominating himself as a benevolent dictator for an ambitious ecosystem is a bit irresponsible. Still, we all make mistakes and can learn from them.

And I don’t endorse NeoHaskell. I don’t think it’s any more likely to succeed than Haskell absent some better product management. “I like simple Haskell tailored to blue collar engineers” is a nice vibe but it’s not a product.