Saturday, January 10, 2015

total-1.0.0: Exhaustive pattern matching using traversals, prisms, and lenses

The lens library provides Prisms, which are a powerful way to decouple a type's interface from its internal representation. For example, consider the Either type from the Haskell Prelude:

data Either a b = Left a | Right b

Instead of exposing the Left and Right constructors, you could instead expose the following two Prisms:

_Left  :: Prism (Either a b) (Either a' b ) a a'

_Right :: Prism (Either a b) (Either a  b') b b'

Everything you can do with a constructor, you can do with the equivalent Prism. For example, if you want to build a value using a Prism, you use review:

review _Left  :: a -> Either a b
review _Right :: b -> Either a b

You can also kind-of pattern match on the Prism using preview, returning a Just if the match succeeds or Nothing if the match fails:

preview _Left  :: Either a b -> Maybe a
preview _Right :: Either a b -> Maybe b

However, preview is not quite as good as real honest-to-god pattern matching, because if you wish to handle every branch you can't prove in the types that your pattern match was exhaustive:

nonTypeSafe :: Either Char Int -> String
nonTypeSafe e =
    case preview _Left e of
        Just c  -> replicate 3  c
        Nothing -> case preview _Right e of
            Just n  -> replicate n '!'
            Nothing -> error "Impossible!"  -- Unsatisfying

However, this isn't a flaw with Prisms, but rather a limitation of preview. Prisms actually preserve enough information in the types to support exhaustive pattern matching! The trick for this was described a year ago by Tom Ellis, so I decided to finally implement his idea as the total library.

Example

Here's an example of a total pattern matching using the total library:

import Lens.Family.Total
import Lens.Family.Stock

total :: Either Char Int -> String       -- Same as:
total = _case                            -- total = \case
    & on _Left  (\c -> replicate 3  c )  --     Left  c -> replicate 3 c
    & on _Right (\n -> replicate n '!')  --     Right n -> replicate n '!'

The style resembles LambdaCase. If you want to actually supply a value in the same expression, you can also write:

e & (_case
    & on _Left  ...
    & on _Right ... )

There's nothing unsafe going on under the hood. on is just 3 lines of code:

on p f g x = case p Left x of
    Left  l -> f l
    Right r -> g r

(&) is just an operator for post-fix function application:

x & f = f x

... and _case is just a synonym for a type class method that checks if a type is uninhabited.

class Empty a where
    impossible :: a -> x

_case = impossible

The rest of the library is just code to auto-derive Empty instances using Haskell generics. The entire library is about 40 lines of code.

Generics

Here's an example of exhaustive pattern matching on a user-defined data type:

{-# LANGUAGE DeriveGeneric   #-}
{-# LANGUAGE TemplateHaskell #-}

import Control.Lens.TH (makePrisms)
import GHC.Generics (Generic)
import Lens.Family.Total

data Example a b c = C1 a | C2 b | C3 c deriving (Generic)

makePrisms ''Example

instance (Empty a, Empty b, Empty c) => Empty (Example a b c)

example :: Example String Char Int -> String  -- Same as:
example = _case                               -- example = \case
    & on _C1 (\s -> s              )          --     C1 s -> s
    & on _C2 (\c -> replicate 3  c )          --     C2 c -> replicate 3  c
    & on _C3 (\n -> replicate n '!')          --     C3 n -> replicate n '!'

This uses two features to remove most of the boiler-plate:

  • deriving (Generic) auto-generates the code for the Empty instance
  • makePrisms uses Template Haskell to auto-generate Prisms for our type.

For those who prefer the lens-family-* suite of libraries over lens, I opened up an issue against lens-family-th so that Lens.Family.TH.makeTraversals can be used in place of makePrisms for exhaustive pattern matching.

When we write this instance in our original code:

instance (Empty a, Empty b, Empty c) => Empty (Example a b c)

... that uses Haskell generics to auto-generate the implementation of the impossible function:

instance (Empty a, Empty b, Empty c) => Empty (Example a b c)
    impossible (C1 a) = impossible a
    impossible (C2 b) = impossible b
    impossible (C3 c) = impossible c

That says that the type Example a b c is uninhabited if and only if the types a, b, and c are themselves uninhabited.

When you write makePrisms ''Example, that creates the following three prisms:

_C1 :: Prism (Example a b c) (Example a' b  c ) a a'
_C2 :: Prism (Example a b c) (Example a  b' c ) b b'
_C3 :: Prism (Example a b c) (Example a  b  c') c c'

These Prisms are useful in their own right. You can export them in lieu of your real constructors and they are more powerful in several respects.

Backwards compatibility

There is one important benefit to exporting Prisms and hiding constructors: if you export Prisms you can change your data type's internal representation without changing your API.

For example, let's say that I change my mind and implement Example as two nested Eithers:

type Example a b c = Either a (Either b c)

Had I exposed my constructors, this would be a breaking change for my users. However, if I expose Prisms, then I can preserve the old API by just changing the Prism definition. Instead of auto-generating them using Template Haskell, I can just write:

import Lens.Family.Stock (_Left, _Right)

_C1 = _Left
_C2 = _Right . _Left
_C3 = _Right . _Right

Done! That was easy and the user of my Prisms are completely unaffected by the changes to the internals.

Under the hood

The trick that makes total work is that every branch of the pattern match subtly alters the type. The value that you match on starts out as:

Example String Char Int

... and every time you pattern match against a branch, the on function Voids one of the type parameters. The pattern matching flows from bottom to top:

example = _case
    & on _C1 (\s -> s              )  -- Step 3: Example Void   Void Void
    & on _C2 (\c -> replicate 3  c )  -- Step 2: Example String Void Void
    & on _C3 (\n -> replicate n '!')  -- Step 1: Example String Char Void
                                      -- Step 0: Example String Char Int

Finally, _case just checks that the final type (Example Void Void Void in this case) is uninhabited. All it does is check if the given type is an instance of the Empty type class, which only provides instances for uninhabited types. In this case Example Void Void Void is definitely uninhabited because Void (from Data.Void) is itself uninhabited:

instance Empty Void where
    impossible = absurd

Lenses

What's neat is that this solution works not only for Prisms and Traversals, but for Lenses, too! Check this out:

example :: (Int, Char) -> String     -- Same as:
example = _case                      -- example = \case
    & on _1 (\n -> replicate n '1')  --     (n, _) -> replicate n '1'

... and the identity lens works (which is just id) works exactly the way you'd expect:

example :: (Int, Char) -> String        -- Same as:
example = _case                         -- example = \case
    & on id (\(n, c) -> replicate n c)  --     (n, c) -> replicate n c

I didn't intend that when I wrote the library: it just fell naturally out of the types.

Conclusion

The total library now allows library authors to hide their constructors and export a backwards compatible Prism API all while still preserving exhaustive pattern matching.

Note that I do not recommend switching to a lens-only API for all libraries indiscriminately. Use your judgement when deciding whether perfect backwards compatibility is worth the overhead of a lens-only API, both for the library maintainers and your users.

I put the code under the Lens.Family.Total module, not necessarily because I prefer the Lens.Family.* hierarchy, but rather because I would like to see Edward Kmett add these utilities to lens, leaving my library primarily for users of the complementary lens-family-* ecosystem.

You can find the total library on Hackage or on GitHub.

No comments:

Post a Comment