Haskell for all: March 2023

The "open source native" principle for software design

This post summarizes a software design principle I call the “open source native” principle which I’ve invoked a few times as a technical lead. I wanted to write this down so that I could easily reference this post in the future.

The “open source native” principle is simple to state:

Design proprietary software as if you intended to open source that software, regardless of whether you will open source that software

I call this the “open source native” principle because you design your software as if it were a “native” member of the open source ecosystem. In other words, your software is spiritually “born” open source, aspirationally written from the beginning to be a good open source citizen, even if you never actually end up open sourcing that software.

You can’t always adhere to this principle, but I still use this as a general design guideline.

Example

It’s hard to give a detailed example of this principle since most of the examples I’d like to use are … well … proprietary and wouldn’t make sense outside of their respective organizations. However, I’ll try to outline a hypothetical example (inspired by a true story) that hopefully enough can people can relate to.

Suppose that your organization provides a product with a domain-specific programming language for customizing their product’s behavior. Furthermore, suppose that you’re asked to design and implement a package manager for this programming language.

There are multiple data stores you could use for storing packages, but to simplify this example suppose there are only two options:

Store packages in a product-specific database

Perhaps your product already uses a database for other reasons, so you figure that you can reuse that existing database for storing packages. That way you don’t need to set up any new infrastructure to get going since the database team will handle that for you. Plus you get the full powerful of a relational database so now you have powerful tools for querying and/or modifying packages.
Store packages in git

You might instead store your packages as flat files inside of a git repository.

These represent two extremes of the spectrum and in reality there might be other options in between (like a standalone sqlite database), but this is a contrived example.

According to the open source principle, you’d prefer to store packages in git because git is a foundational building block of the open source ecosystem that is already battle-tested for this purpose. You’d be sacrificing some features (you’d no longer have access to the full power of a relational database), but your package manager would now be more “open-source native”.

You might wonder: why would one deliberately constrain themselves like that? What’s the benefit of designing things in this way if they might never be open sourced?

Motivation

There are several reasons I espouse this design principle:

better testability

If you design your component so that it’s easy to use outside of the context of your product then it’s also easier to test in isolation. This means that you don’t need to rely on heavyweight integration tests or end-to-end tests to verify that your component works correctly.

For example, a package manager based on git is easier to test than a package manager based on a database because a git repository is easier to set up.
faster release cadence

If your component can be tested in isolation then you don’t even need to share continuous integration (CI) with the rest of your organization. Your component can have its own CI and release on whatever frequency is appropriate for that component instead of coupling its release cadence to the rest of your product.

That in turn typically means that you can release earlier and more often, which is a virtue in its own right.

Continuing the package manager example, you wouldn’t need to couple releases of your package manager to the release cadence of the rest of your product, so you’d be able to push out improvements or fixes more quickly.
simpler documentation

It’s much easier to write a tutorial for software that delivers value in isolation since there’s less supporting infrastructure necessary to follow along with the tutorial.
well-chosen interfaces

You have to carefully think through the correct logical boundaries for your software when you design for a broader audience of users. It’s also easier to enforce stronger boundaries and narrower scope for the same reasons.

For example, our hypothetical package manager is less likely to have package metadata polluted with product-specific details if it is designed to operate independently of the product.
improved stability

Open source software doesn’t just target a broader audience, but also targets a broader time horizon. An open source mindset promotes thinking beyond the needs of this financial quarter.
you can open source your component! (duh)

Needless to say, if you design your component to be open-source native, it’s also easier to open source. Hooray! 🎉

Conclusion

You can think of this design principle as being similar to the rule of least power, where you’re making your software less powerful (by adding the additional constraint that it can be open sourced), but in turn improving ease of comprehension, maintainability, and distribution.

Also, if you have any examples along these lines that you care to share, feel free to drop them in the comments.

Monday, March 6, 2023

The "open source native" principle for software design

Example

Motivation

Conclusion