Monday, July 27, 2020

The golden rule of software quality

golden-rule

This post summarizes a rule of thumb that I commonly cite in software quality discussions, so that I can link to my own post to save time. I have taken to calling this the “golden rule of software quality” because the rule is succinct and generalizable.

The golden rule is:

Prefer to push fixes upstream instead of working around problems downstream

… and I’ll explain implications of this rule for a few software engineering tradeoffs (using examples from the Haskell community and ecosystem).

Disclaimer: The golden rule of software quality bears no relationship to the golden rule of treating others as you want to be treated.

Third-party dependencies

Most developers rely on third-party dependencies or tools for their projects, but the same developers rarely give thought to fixing or improving that same third-party code. Instead, they tend to succumb to the bystander effect, meaning that the more widely used a project, the more a person assumes that some other developer will take care of any problems for them. Consequently, these same developers tend to work around problems in widely used tools.

For example, for the longest time Haskell did not support a “dot” syntax for accessing record fields, something that the community worked around downstream through a variety of packages (including lens) to simulate an approximation of dot syntax within the language. This approach had some upsides (accessors were first class), but several downsides such as poor type inference, poor error messages, and lack of editor support for field completions. Only recently did Neil Mitchell and Shayne Fletcher upstream this feature directly into the language via the RecordDotSyntax proposal, solving the root of the problem.

The golden rule of software quality implies that you should prefer to directly improve the tools and packages that you depend on (“push fixes upstream”) instead of hacking around the problem locally (“working around problems downstream”). These sorts of upstream improvements can be made directly to:

  • Your editor / IDE
  • Your command-line shell
  • Programming languages you use
  • Packages that you depend on

Note that this is not always possible (especially if upstream is hostile to outside contributions), but don’t give up before at least trying to do so.

Typed APIs

Function types can also follow this same precept. For example, there are two ways that one can assign a “safe” (total) type to the head function for obtaining the first value in a list.

The first approach pushes error handling downstream:

-- Return the first value wrapped in a `Just` if present, `Nothing` otherwise
head :: [a] -> Maybe a

… and the second approach pushes the requirements upstream:

-- Return the first value of a list, which never fails if the list is `NonEmpty`
head :: NonEmpty a -> a

The golden rule states that you should prefer the latter type signature for head (that requires a NonEmpty input) since this type pushes the fix upstream by not allowing the user to supply an empty list in the first place. More generally, if you take this rule to its logical conclusion you end up making illegal states unrepresentable.

Contrast this with the former type signature for head that works around a potentially empty list by returning a Maybe. This type promotes catching errors later in the process, which reduces quality since we don’t fail as quickly as we should. You can improve quality by failing fast at the true upstream root of the problem instead of debugging indirect downstream symptoms of the problem.

Social divisions

I’m a firm believer in Conway’s Law, which says:

Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.

— Melvin E. Conway

… which I sometimes paraphrase as “social divisions lead to technical divisions”.

If social issues are upstream of technical issues, the golden rule implies that we should prefer fixing root causes (social friction) instead of attempting to mask social disagreements with technical solutions.

The classic example of this within the Haskell community is the cabal vs. stack divide, which originated out of divisions between FPComplete and Cabal contributors (Corrected based on feedback from the Haskell subreddit). The failure to resolve the upstream friction between the paid and open source contributors led to an attempt to work around the problem downstream with a technical solution by creating a parallel install tool. This in turn fragmented the Haskell community, leading to a poor and confusing experience for first-time users.

That’s not to imply that the divide in the community could have been resolved (maybe the differences between paid contributors and open source volunteers were irreconcilable), but the example still illustrates the marked impact on quality of failing to fix issues at the source.

Conclusion

Carefully note that the golden rule of software quality does not mandate that you have to fix problems upstream. The rule advises that you should prefer to upstream fixes, all other things equal. Sometimes other considerations can prevent one from doing so (such as limitations on time or money). However, when quality is paramount then you should strive to observe the rule!

Monday, July 13, 2020

Record constructors

records

This is a short post documenting various record-related idioms in the Haskell ecosystem. First-time package users can use this post to better understand record API idioms they encounter in the wild.

For package authors, I also include a brief recommendation near the end of the post explaining which idiom I personally prefer.

The example

I’ll use the following record type as the running example for this post:

module Example where

data Person = Person{ name :: String , admin :: Bool }

There are a few ways you can create a Person record if the package author exports the record constructors.

The simplest approach requires no extensions. You can initialize the value of every field in a single expression, like this:

example :: Person
example = Person{ name = "John Doe", admin = True }

Some record literals can get quite large, so the language provides two extensions which can help with record assembly.

First, you can use the NamedFieldPuns extension, to author a record like this:

{-# LANGUAGE NamedFieldPuns #-}

example :: Person
example = Person{ name, admin }
  where
    name = "John Doe"

    admin = True

This works because the NamedFieldPuns extension translates Person{ name, admin } to Person{ name = name, admin = admin }.

The RecordWildCards extension goes a step further and allows you to initialize a record literal without naming all of the fields (again), like this:

{-# LANGUAGE RecordWildCards #-}

example :: Person
example = Person{..}
  where
    name = "John Doe"

    admin = True

Vice versa, you can destructure a record literal in a few ways. For example, you can access record fields using accessor functions:

render :: Person -> String
render person = name person ++ suffix
  where
    suffix = if admin person then " - Admin" else ""

… or you can pattern match on a record literal:

render :: Person -> String
render Person{ name = name, admin = admin } = name ++ suffix
  where
    suffix = if admin then " - Admin" else ""

… or you can use the NamedFieldPuns extension (which also works in reverse):

render :: Person -> String
render Person{ name, admin } = name ++ suffix
  where
    suffix = if admin then " - Admin" else ""

… or you can use the RecordWildCards extension (which also works in reverse):

render :: Person -> String
render Person{..} = name ++ suffix
  where
    suffix = if admin then " - Admin" else ""

Also, once the RecordDotSyntax extension is available you can use ordinary dot syntax to access record fields:

render :: Person -> String
render person = person.name ++ suffix
  where
    suffix = if person.admin then " - Admin" else ""

Opaque record types

Some Haskell packages will elect to not export the record constructor. When they do so they will instead provide a function that initializes a record value with all required fields and defaults the remaining fields.

For example, suppose the name field were required for our Person type and the admin field were optional (defaulting to False). The API might look like this:

module Example (
      Person(name, admin)
    , makePerson
    ) where

data Person = Person{ name :: String, admin :: Bool }

makePerson :: String -> Person
makePerson name = Person{ name = name, admin = False }

Carefully note that the module exports the Person type and all of the fields, but not the Person constructor. So the only way that a user can create a Person record is to use the makePerson “smart constructor”. The typical idiom goes like this:

example :: Person
example = (makePerson "John Doe"){ admin = True }

In other words, the user is supposed to initialize required fields using the “smart constructor” and then set the remaining non-required fields using record syntax. This works because you can update a record type using exported fields even if the constructor is not exported.

The wai package is one of the more commonly used packages that observes this idiom. For example, the Request record is opaque but the accessors are still exported, so you can create a defaultRequest and then update that Request using record syntax:

example :: Request
example = defaultRequest{ requestMethod = "GET", isSecure = True }

… and you can still access fields using the exported accessor functions:

requestMethod example

This approach also works in conjunction with NamedFieldPuns for assembly (but not disassembly), so something like this valid:

example :: Request
example = defaultRequest{ requestMethod, isSecure }
  where
    requestMethod = "GET"

    isSecure = True

However, this approach does not work with the RecordWildCards language extension.

Some other packages go a step further and instead of exporting the accessors they export lenses for the accessor fields. For example, the amazonka-* family of packages does this, leading to record construction code like this:

example :: PutObject
example =
    putObject "my-example-bucket" "some-key" "some-body"
    & poContentLength .~ Just 9
    & poStorageClass  .~ ReducedRedundancy

… and you access fields using the lenses:

view poContentLength example

My recommendation

I believe that package authors should prefer to export record constructors instead of using smart constructors. Specifically, the smart constructor idiom requires too much specialized language knowledge to create a record, something that should be an introductory task for a functional programming language.

Package authors typically justify smart constructors to improve API stability since they permit adding new default-valued fields in a backwards compatible way. However, I personally do not weight such stability highly (both as a package author and a package user) because Haskell is a typed language and these changes are easy for reverse dependencies to accommodate with the aid of the type-checker.

I place a higher premium on improving the experience for new contributors so that Haskell projects can more easily take root within a polyglot engineering organization. Management tends to be less reluctant to accept Haskell projects within their organization if they feel that other teams can confidently contribute to the Haskell code.

Future directions

One long-term solution that could provide the best of both worlds is if the language had first-class support for default-valued fields. In other words, perhaps you could author a record type like this:

data Person = Person{ name :: String , admin :: Bool = False }

… and then you could safely omit default-valued fields when initializing a record. Of course, I haven’t fully thought through the implications of such a change.