This post documents a small trick that I use to avoid deeply-nested error-handling code. This trick is a common piece of Haskell folklore that many people either learn from others or figure out on their own, but I’m not sure what the official name of this trick is (so I had difficulty searching for prior art explaining this trick). However, I’ve taught this trick to others enough times that I think it merits a blog post of its own.
This post assumes some familiarity with Haskell’s Either type and do notation, but the Appendix at the end of the post will walk through all of the details using equational reasoning if you’re having trouble following along with how things work.
The motivating example
Let’s begin from the following contrived Either-based example that uses deeply nested error-handling:
{-# LANGUAGE NamedFieldPuns #-}
import Text.Read (readMaybe)
data Person = Person { age :: Int, alive :: Bool } deriving (Show)
example :: String -> String -> Either String Person
example ageString aliveString = do
    case readMaybe ageString of
        Nothing -> do
            Left "Invalid age string"
        Just age -> do
            if age < 0
                then do
                    Left "Negative age"
                else do
                    case readMaybe aliveString of
                        Nothing -> do
                            Left "Invalid alive string"
                        Just alive -> do
                            return Person{ age, alive }… which we can use like this:
>>> example "24" "True"
Right (Person {age = 24, alive = True})
>>> example "24" "true"
Left "Invalid alive string"
>>> example "" "True"
Left "Invalid age string"
>>> example "-5" "True"
Left "Negative age"Notice how the above coding style increases the nesting / indentation level every time we parse or validate the input in some way. We could conserve indentation by using 2-space indents or indenting only once for each level of nesting:
{-# LANGUAGE NamedFieldPuns #-}
import Text.Read (readMaybe)
data Person = Person { age :: Int, alive :: Bool } deriving (Show)
example :: String -> String -> Either String Person
example ageString aliveString = case readMaybe ageString of
  Nothing  -> Left "Invalid age string"
  Just age -> if age < 0
    then Left "Negative age"
    else case readMaybe aliveString of
      Nothing    -> Left "Invalid alive string"
      Just alive -> return Person{ age, alive }However, I think most people writing code like that would prefer to keep the indentation level stable, no matter how many validations the code requires.
The trick
Fortunately, we can avoid nesting the code with the following change:
{-# LANGUAGE NamedFieldPuns #-}
import Text.Read (readMaybe)
data Person = Person { age :: Int, alive :: Bool } deriving (Show)
example :: String -> String -> Either String Person
example ageString aliveString = do
    age <- case readMaybe ageString of
        Nothing  -> Left "Invalid age string"
        Just age -> return age
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe aliveString of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }Here we make use of several useful properties:
- returnin Haskell does not short-circuit or exit from the surrounding code- In fact, some Haskell programmers prefer to use - pure(a synonym for- return) to better convey the absence of short-circuiting behavior:- {-# LANGUAGE NamedFieldPuns #-} import Text.Read (readMaybe) data Person = Person { age :: Int, alive :: Bool } deriving (Show) example :: String -> String -> Either String Person example ageString aliveString = do age <- case readMaybe ageString of Nothing -> Left "Invalid age string" Just age -> pure age if age < 0 then Left "Negative age" else pure () alive <- case readMaybe aliveString of Nothing -> Left "Invalid alive string" Just alive -> pure alive pure Person{ age, alive }
- Leftdoes short-circuit from the surrounding code- In fact, we can define a synonym for - Leftnamed- throwif we want to better convey the presence of short-circuiting behavior::- {-# LANGUAGE NamedFieldPuns #-} import Text.Read (readMaybe) data Person = Person { age :: Int, alive :: Bool } deriving (Show) example :: String -> String -> Either String Person example ageString aliveString = do age <- case readMaybe ageString of Nothing -> throw "Invalid age string" Just age -> pure age if age < 0 then throw "Negative age" else pure () alive <- case readMaybe aliveString of Nothing -> throw "Invalid alive string" Just alive -> pure alive pure Person{ age, alive } throw :: String -> Either String a throw = Left
- Left/- throw“return” any type of value- If you look at the type of - throw, the “return” type is a polymorphic type- abecause- throwshort-circuits (and therefore doesn’t need to return a real value of that type).- This is why the type checker doesn’t complain when we do this: - age <- case readMaybe ageString of Nothing -> throw "Invalid age string" Just age -> pure age- … because both branches of the - caseexpression type-check as an expression that “return”s an- Int. The left branch type-checks as a branch that returns an- Intbecause it cheats and never needs to return anything and the right branch returns a bona-fide- Int(the- age).- We can make this explicit by giving both branches of the - caseexpression an explicit type annotation:- age <- case readMaybe ageString of Nothing -> throw "Invalid age string" :: Either String Int Just age -> pure age :: Either String Int- This means that both branches of a - caseexpression will always share the same return type if at least one branch is a- Left/- throw. Similarly, both branches of an- ifexpression will share the same return type if at least one branch is a- Left/- throw:- if age < 0 then throw "Negative age" :: Either String () else pure () :: Either String ()
- We can return a value from a - caseexpression- New Haskell programmers might not realize that - caseexpressions can return a value (just like any other expression), which means that their result can be stored as a new value within the surrounding scope:- age <- case readMaybe ageString of Nothing -> throw "Invalid age string" :: Either String Int Just age -> pure age :: Either String Int- The type-checker doesn’t mind that only the second branch returns a valid - agebecause it knows that the outer- ageis unreachable if the first branch short-circuits. This understanding is not built into the compiler, but is rather an emergent property of how- donotation works for- Either. See the Appendix for a fully-worked equational reasoning example showing why this works.
Combinators
You can build upon this trick by defining helpful utility functions to simplify things further. For example, I sometimes like to define the following helper function:
orDie :: Maybe a -> String -> Either String a
Just a  `orDie` _      = return a
Nothing `orDie` string = Left string
{- Equivalent, more explicit, implementation:
maybe `orDie` string =
    case maybe of
        Nothing -> Left string
        Just a  -> return a
-}… which you can use like this:
{-# LANGUAGE NamedFieldPuns #-}
import Text.Read (readMaybe)
data Person = Person { age :: Int, alive :: Bool } deriving (Show)
orDie :: Maybe a -> String -> Either String a
Just a  `orDie` _      = Right a
Nothing `orDie` string = Left string
example :: String -> String -> Either String Person
example ageString aliveString = do
    age <- readMaybe ageString `orDie` "Invalid age string"
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- readMaybe aliveString `orDie` "Invalid alive string"
    return Person{ age, alive }… which is even more clear (in my opinion).
Conclusion
That is the entirety of the trick. You can return values from case expressions to avoid deeply-nesting your Either code, or you can define utility functions (such as orDie) which do essentially the same thing.
This trick applies equally well to any other Monad that supports some notion of short-circuiting on failure, such as ExceptT (from transformers / mtl). The only essential ingredient is some throw-like function that short-circuits and therefore has a polymorphic return type.
Appendix
You can build a better intuition for why this works using equational reasoning. We’ll begin from an example usage of our function and at each step of the process we will either desugar the code or substitute one subexpression with another equal subexpression.
In all of the examples, we’ll begin from this definition for the example function:
example :: String -> String -> Either String Person
example ageString aliveString = do
    age <- case readMaybe ageString of
        Nothing  -> Left "Invalid age string"
        Just age -> return age
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe aliveString of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }Let’s first begin with the example that fails the most quickly:
example "" "True"
-- Substitute `example` with its definition:
--
-- … replacing `ageString` with `""`
--
-- … replacing `aliveString` with `"True"`
do  age <- case readMaybe "" of
        Nothing  -> Left "Invalid age string"
        Just age -> return age
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }
-- Definition of `readMaybe` (not shown):
--
-- (readMaybe "" :: Maybe Int) = Nothing
do  age <- case Nothing of
        Nothing  -> Left "Invalid age string"
        Just age -> return age
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }
-- Simplify the first `case` expression
do  age <- Left "Invalid age string"
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }
-- Desugar `do` notation one step
--
-- (do age <- m; n) = (m >>= \age -> n)
Left "Invalid age string" >>= \age ->
    do  if age < 0
            then Left "Negative age"
            else return ()
        alive <- case readMaybe "True" of
            Nothing    -> Left "Invalid alive string"
            Just alive -> return alive
        return Person{ age, alive }
-- Definition of `>>=` for `Either`
--
-- (Left x >>= _) = (Left x)
Left "Invalid age string"… and we’re done! The key step is the last bit where we simplify (Left … >>= _) to (Left …), which is how Either short-circuits on failure. Because this simplification does not use the downstream code everything type-checks just fine even though we never supply a valid age.
For completeness, let’s also walk through the example where everything succeeds:
example "24" "True"
-- Substitute `example` with its definition:
--
-- … replacing `ageString` with `"24"`
--
-- … replacing `aliveString` with `"True"`
do  age <- case readMaybe "24" of
        Nothing  -> Left "Invalid age string"
        Just age -> return age
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }
-- Definition of `readMaybe` (not shown):
--
-- (readMaybe "24" :: Maybe Int) = (Just 24)
do  age <- case Just 24 of
        Nothing  -> Left "Invalid age string"
        Just age -> return age
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }
-- Simplify the first `case` expression
do  age <- return 24
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }
-- return = Right
do  age <- Right 24
    if age < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age, alive }
-- Desugar `do` notation one step
--
-- (do age <- m; n) = (m >>= \age -> n)
Right 24 >>= \age ->
    do  if age < 0
            then Left "Negative age"
            else return ()
        alive <- case readMaybe "True" of
            Nothing    -> Left "Invalid alive string"
            Just alive -> return alive
        return Person{ age, alive }
-- Definition of `>>=` for `Either`
--
-- (Right x >>= f) = (f x)
--
-- This means that we substitute `age` with `24`
do  if 24 < 0
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age = 24, alive }
-- (24 < 0) = False
do  if False
        then Left "Negative age"
        else return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age = 24, alive }
-- (if False then l else r) = r
do  return ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age = 24, alive }
-- return = Right
do  Right ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age = 24, alive }
-- (do m; n) = (do _ <- m; n)
do  _ <- Right ()
    alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age = 24, alive }
-- Desugar `do` notation one step
--
-- (do age <- m; n) = (m >>= \age -> n)
Right () >>= _ -> 
    do  alive <- case readMaybe "True" of
            Nothing    -> Left "Invalid alive string"
            Just alive -> return alive
        return Person{ age = 24, alive }
-- Definition of `>>=` for `Either`
--
-- (Right x >>= f) = (f x)
--
-- This means that we substitute `_` (unused) with `()`
do  alive <- case readMaybe "True" of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age = 24, alive }
-- Definition of `readMaybe` (not shown):
--
-- (readMaybe "True" :: Maybe Bool) = (Just True)
do  alive <- case Just True of
        Nothing    -> Left "Invalid alive string"
        Just alive -> return alive
    return Person{ age = 24, alive }
-- Simplify the `case` expression
do  alive <- return True
    return Person{ age = 24, alive }
-- return = Right
do  alive <- Right True
    return Person{ age = 24, alive }
-- Desugar `do` notation one step
--
-- (do age <- m; n) = (m >>= \age -> n)
Right True >>= \alive ->
    do  return Person{ age = 24, alive }
-- Definition of `>>=` for `Either`
--
-- (Right x >>= f) = (f x)
--
-- This means that we substitute `alive` with `True`
do  return Person{ age = 24, alive = True }
-- Desugar `do` notation
--
-- do m = m
return Person{ age = 24, alive = True }
-- return = Right
Right Person{ age = 24, alive = True }As an exercise, you can try to extrapolate between those two examples to reason through what happens when we evaluate the remaining two examples which fail somewhere in between:
>>> example "24" "true"
>>> example "-5" "True"
It's also nice to use `when` to test age:
ReplyDeletewhen (age < 0) $ Left "Negative age"
Someone on Reddit pointed out that:
Deleteguard (age < 0) `orDie` "Negative age"
… would also work
should the guard arg be (age => 0)?
DeleteYes, you're correct
DeleteRust calls this pattern "Early returns": https://doc.rust-lang.org/rust-by-example/error/result/early_returns.html
ReplyDeleteI'm surprised you didn't mention the note and hush functions in your errors package?
ReplyDeletehttps://hackage.haskell.org/package/errors-2.3.0/docs/Control-Error-Util.html#v:note
I love these functions. note is the same as orDie.