This post is about a new OpenTelemetry plugin for GHC that I’ve been building for work that we’re open sourcing because I think it might be broadly useful to others. If all you want to do is use the plugin then you can find it on Hackage, which includes more detailed usage instructions. This post will focus more on the motivation and background behind the plugin’s development.
The context behind this work was that we use Honeycomb at work for collecting metrics related to production and our team1 has begun to apply those same metrics to our builds. In particular, we wanted to collect detailed (module-level) build metrics so that we could begin to hunt down and fix expensive modules within our codebase. For context: our codebase currently has almost 7000 modules, so these expensive modules can easily fly under the radar.
When we enable the plugin and export the results to Honeycomb we can begin to see which modules are the most expensive to build:
… and none of the modules are individually very expensive to build (the worst offender is only about 5 seconds), so they’d easily get lost within a sea of thousands of other modules.
However, these sorts of insights have already proven useful. For example:
one expensive modules was completely unused in our codebase
The above list brought it to our attention so that we could delete it.
other expensive modules were representative examples of larger issues to fix
For example, one expensive module consisted of 2000 invocations of an internal function which is expensive to type-check and fixing this function will improve compile speeds across our codebase and not just that module.
other expensive modules are indicative of architectural anti-patterns
Frequently “horizontally-organized” modules top the chart, and I view them as anti-patterns for a few reasons (see: my post on Module organization guidelines). These modules are not expensive per se (the code inside them has to be compiled somewhere), but they tend to be build chokepoints because they have a large number of dependencies and reverse dependencies. Highlighting expensive modules has a tendency to highlight these sorts of build chokepoints as a side bonus.
In principle you can also browse a given build’s trace interactively, like this:
However, for our codebase Honeycomb chokes on our giant build traces and we can only produce visualizations like the above image if we filter down the spans to a randomly sampled subset of modules. Honeycomb doesn’t do a good job of handling traces with a few thousand spans or more.
This plugin was surprisingly difficult for me to implement because
Plugin interface is so constrained.
For example, the
hs-opentelemetry-sdk package asks you
to finalize any
TracerProvider that you acquire, but
there’s no good way (that I know of2) to run finalization
logic at the end of a
ghc build using the
Plugin interface. The purpose of this finalization logic is
to flush metrics that haven’t yet been exported.
So what I did was to hack around this by detecting all modules that are root modules of the build graph and flushing metrics after each of those root modules is built (since one of them will be the last module built). I tried a bunch of other alternative approaches (like installing a phase hook), but this was the only approach I was able to get to work.
And the OpenTelemetry plugin is full of workarounds like this. We
have vetted internally that the plugin works for normal builds,
generally I expect there to be some trailing bugs that we’ll have to fix
as more people use it due to these various unsafe implementation
In fact, one limitation of the plugin is that the top-level span has
a duration of 0 (instead of reporting the duration of the build). This
is related to the same issue of the
apparently not having a good way to run code exactly once after the
build completes (even using hacks). If somebody knows of a way to do
this that I missed I’d definitely welcome the tip!
What we do know from internal usage is that:
the plugin definitely scales to very large codebases (thousands of modules)
… although honeycomb doesn’t scale to thousands of spans, but that’s not our fault.
the plugin’s overhead is negligible (so it’s safe to always enable)
the plugin works with
So it should be fine for most use cases, but please report any issues that you run into.