Wednesday, March 2, 2022

Applicatives should usually implement Semigroup and Monoid


The gist of this post is that any type constructor F that implements Applicative:

instance Applicative F where

… should usually also implement the following Semigroup and Monoid instances:

instance Semigroup a => Semigroup (F a) where
    (<>) = liftA2 (<>)

instance Monoid a => Monoid (F a) where
    mempty = pure mempty

… which one can also derive using the Data.Monoid.Ap type, which was created for this purpose:

deriving (Semigroup, Monoid) via (Ap F a)

Since each type constructor that implements Monad also implements Applicative, this recommendation also applies for all Monads, too.

Why are these instances useful?

The above instances come in handy in conjunction with utilities from Haskell’s standard library that work with Monoids.

For example, a common idiom I see when doing code review is something like this:

instance Monad M where

example :: M [B]
example = do
    let process :: A -> M [B]
        process a = do

            return bs

    let inputs :: [A]
        inputs =

    bss <- mapM process inputs

    return (concat bss)

… but if you implemented the suggested Semigroup and Monoid instances then you could replace this:

    bss <- mapM process inputs

    return (concat bss)

… with this:

    foldMap process inputs

These instances also come in handy when you need to supply an empty action or empty handler for some callback.

For example, the lsp package provides a sendRequest utility which has the following type:

    :: MonadLsp config f
    => SServerMethod m
    -> MessageParams m
    -> (Either ResponseError (ResponseResult m) -> f ())
    -- ^ This is the callback function
    -> f (LspId m)

I won’t go into too much detail about what the type means other than to point out that this function lets a language server send a request to the client and then execute a callback function when the client responds. The callback function you provide has type:

Either ResponseError (ResponseResult m) -> f ()

Sometimes you’re not interested in the client’s response, meaning that you want to supply an empty callback that does nothing. Well, if the type constructor f implements the suggested Monoid instance then the empty callback is: mempty.

mempty :: Either ResponseError (ResponseResult m) -> f ()

… and this works because of the following three Monoid instances that are automatically chained together by the compiler:

instance Monoid ()

-- The suggested Monoid instance that `f` would ideally provide
instance Monoid a => Monoid (f a)

instance Monoid b => Monoid (a -> b)

In fact, certain Applicative/Monad-related utilites become special cases of simpler Monoid-related utilities once you have this instance. For example:

  • You can sometimes replace traverse_ / mapM_ with the simpler foldMap utility

    Specifically, if you specialize the type of traverse_ / mapM_ to:

    traverse_ :: (Foldable t, Applicative f) => (a -> f ()) -> t a -> f ()
    mapM_     :: (Foldable t, Monad       f) => (a -> f ()) -> t a -> f ()

    … then foldMap behaves the same way when the Applicative f implements the suggested instances:

    foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m
  • You can sometimes replace sequenceA_ / sequence_ with the simpler fold utility

    Specifically, if you specialize the type of sequenceA_ / sequence_ to:

    sequenceA_ :: (Foldable t, Applicative f) -> t (f ()) -> f ()
    sequence_  :: (Foldable t, Monad       f) -> t (f ()) -> f ()

    … then fold behaves the same way when the Applicative f implements the’ suggested instances:

    fold :: (Foldable t, Monoid m) => t m -> m
  • You can sometimes replace replicateM_ with mtimesDefault

    Specifically, if you specialize the type of replicateM_ to:

    replicateM_ :: Applicative f => Int -> f () -> f ()

    … then mtimesDefault behaves the same way when the Applicative f implements the suggested instances:

    mtimesDefault :: Monoid m => Int -> m -> m

And you also gain access to new functionality which doesn’t currently exist in Control.Monad. For example, the following specializations hold when f implements the suggested instances:

-- This specialization is similar to the original `foldMap` example
fold :: Applicative f => [f [b]] -> f [b]

-- You can combine two handlers into a single handler
(<>) :: Applicative f => (a -> f ()) -> (a -> f ()) -> (a -> f ())

-- a.k.a. `pass` in the `relude` package
mempty :: Applicative f => f ()

When should one not do this?

You sometimes don’t want to implement the suggested Semigroup and Monoid instances when other law-abiding instances are possible. For example, sometimes the Applicative type constructor permits a different Semigroup and Monoid instance.

The classic example is lists, where the Semigroup / Monoid instances behave like list concatenation. Also, most of the exceptions that fall in this category are list-like, in the sense that they use the Semigroup / Monoid instances to model some sort of element-agnostic concatenation.

I view these “non-lifted” Monoid instances as a missed opportunity, because these same type constructors will typically also implement the exact same behavior for their Alternative instance, too, like this:

instance Alternative SomeListLikeType where
    empty = mempty

    (<|>) = (<>)

… which means that you have two instances doing the exact same thing, when one of those instances could have potentially have been used to support different functionality. I view the Alternative instance as the more natural instance for element-agnostic concatenation since that is the only behavior the Alternative class signature permits. By process of elimination, the Monoid and Semigroup instances should in principle be reserved for the “lifted” implementation suggested by this post.

However, I also understand it would be far too disruptive at this point to change these list-like Semigroup and Monoid instances and expectations around them, so I think the pragmatic approach is to preserve the current Haskell ecosystem conventions, even if they strike me as less elegant.

Why not use Ap exclusively?

The most commonly cited objection to these instances is that you technically don’t need to add these lifted Semigroup and Monoid instances because you can access them “on the fly” by wrapping expressions in the Ap newtype before combining them.

For example, even if we didn’t have a Semigroup and Monoid instance, we could still write our original example using foldMap, albeit with more newtype-coercion boilerplate:

    fmap getAp (foldMap process (map Ap inputs))

… or perhaps using the newtype package on Hackage:

    ala Ap foldMap process inputs

This solution is not convincing to me for a few reasons:

  • It’s unergonomic in general

    There are some places where Ap works just fine (such as in conjunction with deriving via), but typically using Ap directly within term-level code is a solution worse than the original problem; the newtype wrapping and unwrapping boilerplate more than counteracts the ergonomic improvements from using the Semigroup / Monoid instances.

  • In my view, there’s no downside to adding Semigroup and Monoid instances

    … when only one law-abiding implementation of these instances is possible. See the caveat in the previous section.

  • This line of reasoning would eliminate many other useful instances

    For example, one might remove the Applicative instance for list since it’s not the only possible instance and you could in theory always use a newtype to select the desired instance.

Proof of laws

For completeness, I should also mention that the suggested Semigroup and Monoid instances are guaranteed to always be law-abiding instances. You can find the proof in Appendix B of my Equational reasoning at scale post.

1 comment:

  1. > Specifically, if you specialize the type of sequenceA_ / sequence_ to:

    I think you mean to replace the `a`s in the next line with `()`.