This post documents a small trick that I use to avoid deeply-nested error-handling code. This trick is a common piece of Haskell folklore that many people either learn from others or figure out on their own, but I’m not sure what the official name of this trick is (so I had difficulty searching for prior art explaining this trick). However, I’ve taught this trick to others enough times that I think it merits a blog post of its own.
This post assumes some familiarity with Haskell’s Either
type and do
notation, but the Appendix at the end of the post will walk through all of the details using equational reasoning if you’re having trouble following along with how things work.
The motivating example
Let’s begin from the following contrived Either
-based example that uses deeply nested error-handling:
{-# LANGUAGE NamedFieldPuns #-}
import Text.Read (readMaybe)
data Person = Person { age :: Int, alive :: Bool } deriving (Show)
example :: String -> String -> Either String Person
= do
example ageString aliveString case readMaybe ageString of
Nothing -> do
Left "Invalid age string"
Just age -> do
if age < 0
then do
Left "Negative age"
else do
case readMaybe aliveString of
Nothing -> do
Left "Invalid alive string"
Just alive -> do
return Person{ age, alive }
… which we can use like this:
>>> example "24" "True"
Right (Person {age = 24, alive = True})
>>> example "24" "true"
Left "Invalid alive string"
>>> example "" "True"
Left "Invalid age string"
>>> example "-5" "True"
Left "Negative age"
Notice how the above coding style increases the nesting / indentation level every time we parse or validate the input in some way. We could conserve indentation by using 2-space indents or indenting only once for each level of nesting:
{-# LANGUAGE NamedFieldPuns #-}
import Text.Read (readMaybe)
data Person = Person { age :: Int, alive :: Bool } deriving (Show)
example :: String -> String -> Either String Person
= case readMaybe ageString of
example ageString aliveString Nothing -> Left "Invalid age string"
Just age -> if age < 0
then Left "Negative age"
else case readMaybe aliveString of
Nothing -> Left "Invalid alive string"
Just alive -> return Person{ age, alive }
However, I think most people writing code like that would prefer to keep the indentation level stable, no matter how many validations the code requires.
The trick
Fortunately, we can avoid nesting the code with the following change:
{-# LANGUAGE NamedFieldPuns #-}
import Text.Read (readMaybe)
data Person = Person { age :: Int, alive :: Bool } deriving (Show)
example :: String -> String -> Either String Person
= do
example ageString aliveString <- case readMaybe ageString of
age Nothing -> Left "Invalid age string"
Just age -> return age
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe aliveString of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
Here we make use of several useful properties:
return
in Haskell does not short-circuit or exit from the surrounding codeIn fact, some Haskell programmers prefer to use
pure
(a synonym forreturn
) to better convey the absence of short-circuiting behavior:{-# LANGUAGE NamedFieldPuns #-} import Text.Read (readMaybe) data Person = Person { age :: Int, alive :: Bool } deriving (Show) example :: String -> String -> Either String Person = do example ageString aliveString <- case readMaybe ageString of age Nothing -> Left "Invalid age string" Just age -> pure age if age < 0 then Left "Negative age" else pure () <- case readMaybe aliveString of alive Nothing -> Left "Invalid alive string" Just alive -> pure alive pure Person{ age, alive }
Left
does short-circuit from the surrounding codeIn fact, we can define a synonym for
Left
namedthrow
if we want to better convey the presence of short-circuiting behavior::{-# LANGUAGE NamedFieldPuns #-} import Text.Read (readMaybe) data Person = Person { age :: Int, alive :: Bool } deriving (Show) example :: String -> String -> Either String Person = do example ageString aliveString <- case readMaybe ageString of age Nothing -> throw "Invalid age string" Just age -> pure age if age < 0 then throw "Negative age" else pure () <- case readMaybe aliveString of alive Nothing -> throw "Invalid alive string" Just alive -> pure alive pure Person{ age, alive } throw :: String -> Either String a = Left throw
Left
/throw
“return” any type of valueIf you look at the type of
throw
, the “return” type is a polymorphic typea
becausethrow
short-circuits (and therefore doesn’t need to return a real value of that type).This is why the type checker doesn’t complain when we do this:
<- case readMaybe ageString of age Nothing -> throw "Invalid age string" Just age -> pure age
… because both branches of the
case
expression type-check as an expression that “return”s anInt
. The left branch type-checks as a branch that returns anInt
because it cheats and never needs to return anything and the right branch returns a bona-fideInt
(theage
).We can make this explicit by giving both branches of the
case
expression an explicit type annotation:<- case readMaybe ageString of age Nothing -> throw "Invalid age string" :: Either String Int Just age -> pure age :: Either String Int
This means that both branches of a
case
expression will always share the same return type if at least one branch is aLeft
/throw
. Similarly, both branches of anif
expression will share the same return type if at least one branch is aLeft
/throw
:if age < 0 then throw "Negative age" :: Either String () else pure () :: Either String ()
We can return a value from a
case
expressionNew Haskell programmers might not realize that
case
expressions can return a value (just like any other expression), which means that their result can be stored as a new value within the surrounding scope:<- case readMaybe ageString of age Nothing -> throw "Invalid age string" :: Either String Int Just age -> pure age :: Either String Int
The type-checker doesn’t mind that only the second branch returns a valid
age
because it knows that the outerage
is unreachable if the first branch short-circuits. This understanding is not built into the compiler, but is rather an emergent property of howdo
notation works forEither
. See the Appendix for a fully-worked equational reasoning example showing why this works.
Combinators
You can build upon this trick by defining helpful utility functions to simplify things further. For example, I sometimes like to define the following helper function:
orDie :: Maybe a -> String -> Either String a
Just a `orDie` _ = return a
Nothing `orDie` string = Left string
{- Equivalent, more explicit, implementation:
maybe `orDie` string =
case maybe of
Nothing -> Left string
Just a -> return a
-}
… which you can use like this:
{-# LANGUAGE NamedFieldPuns #-}
import Text.Read (readMaybe)
data Person = Person { age :: Int, alive :: Bool } deriving (Show)
orDie :: Maybe a -> String -> Either String a
Just a `orDie` _ = Right a
Nothing `orDie` string = Left string
example :: String -> String -> Either String Person
= do
example ageString aliveString <- readMaybe ageString `orDie` "Invalid age string"
age
if age < 0
then Left "Negative age"
else return ()
<- readMaybe aliveString `orDie` "Invalid alive string"
alive
return Person{ age, alive }
… which is even more clear (in my opinion).
Conclusion
That is the entirety of the trick. You can return values from case
expressions to avoid deeply-nesting your Either
code, or you can define utility functions (such as orDie
) which do essentially the same thing.
This trick applies equally well to any other Monad
that supports some notion of short-circuiting on failure, such as ExceptT
(from transformers
/ mtl
). The only essential ingredient is some throw
-like function that short-circuits and therefore has a polymorphic return type.
Appendix
You can build a better intuition for why this works using equational reasoning. We’ll begin from an example usage of our function and at each step of the process we will either desugar the code or substitute one subexpression with another equal subexpression.
In all of the examples, we’ll begin from this definition for the example
function:
example :: String -> String -> Either String Person
= do
example ageString aliveString <- case readMaybe ageString of
age Nothing -> Left "Invalid age string"
Just age -> return age
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe aliveString of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
Let’s first begin with the example that fails the most quickly:
"" "True"
example
-- Substitute `example` with its definition:
--
-- … replacing `ageString` with `""`
--
-- … replacing `aliveString` with `"True"`
do age <- case readMaybe "" of
Nothing -> Left "Invalid age string"
Just age -> return age
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- Definition of `readMaybe` (not shown):
--
-- (readMaybe "" :: Maybe Int) = Nothing
do age <- case Nothing of
Nothing -> Left "Invalid age string"
Just age -> return age
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- Simplify the first `case` expression
do age <- Left "Invalid age string"
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- Desugar `do` notation one step
--
-- (do age <- m; n) = (m >>= \age -> n)
Left "Invalid age string" >>= \age ->
do if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- Definition of `>>=` for `Either`
--
-- (Left x >>= _) = (Left x)
Left "Invalid age string"
… and we’re done! The key step is the last bit where we simplify (Left … >>= _)
to (Left …)
, which is how Either
short-circuits on failure. Because this simplification does not use the downstream code everything type-checks just fine even though we never supply a valid age
.
For completeness, let’s also walk through the example where everything succeeds:
"24" "True"
example
-- Substitute `example` with its definition:
--
-- … replacing `ageString` with `"24"`
--
-- … replacing `aliveString` with `"True"`
do age <- case readMaybe "24" of
Nothing -> Left "Invalid age string"
Just age -> return age
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- Definition of `readMaybe` (not shown):
--
-- (readMaybe "24" :: Maybe Int) = (Just 24)
do age <- case Just 24 of
Nothing -> Left "Invalid age string"
Just age -> return age
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- Simplify the first `case` expression
do age <- return 24
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- return = Right
do age <- Right 24
if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- Desugar `do` notation one step
--
-- (do age <- m; n) = (m >>= \age -> n)
Right 24 >>= \age ->
do if age < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age, alive }
-- Definition of `>>=` for `Either`
--
-- (Right x >>= f) = (f x)
--
-- This means that we substitute `age` with `24`
do if 24 < 0
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age = 24, alive }
-- (24 < 0) = False
do if False
then Left "Negative age"
else return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age = 24, alive }
-- (if False then l else r) = r
do return ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age = 24, alive }
-- return = Right
do Right ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age = 24, alive }
-- (do m; n) = (do _ <- m; n)
do _ <- Right ()
<- case readMaybe "True" of
alive Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age = 24, alive }
-- Desugar `do` notation one step
--
-- (do age <- m; n) = (m >>= \age -> n)
Right () >>= _ ->
do alive <- case readMaybe "True" of
Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age = 24, alive }
-- Definition of `>>=` for `Either`
--
-- (Right x >>= f) = (f x)
--
-- This means that we substitute `_` (unused) with `()`
do alive <- case readMaybe "True" of
Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age = 24, alive }
-- Definition of `readMaybe` (not shown):
--
-- (readMaybe "True" :: Maybe Bool) = (Just True)
do alive <- case Just True of
Nothing -> Left "Invalid alive string"
Just alive -> return alive
return Person{ age = 24, alive }
-- Simplify the `case` expression
do alive <- return True
return Person{ age = 24, alive }
-- return = Right
do alive <- Right True
return Person{ age = 24, alive }
-- Desugar `do` notation one step
--
-- (do age <- m; n) = (m >>= \age -> n)
Right True >>= \alive ->
do return Person{ age = 24, alive }
-- Definition of `>>=` for `Either`
--
-- (Right x >>= f) = (f x)
--
-- This means that we substitute `alive` with `True`
do return Person{ age = 24, alive = True }
-- Desugar `do` notation
--
-- do m = m
return Person{ age = 24, alive = True }
-- return = Right
Right Person{ age = 24, alive = True }
As an exercise, you can try to extrapolate between those two examples to reason through what happens when we evaluate the remaining two examples which fail somewhere in between:
>>> example "24" "true"
>>> example "-5" "True"
It's also nice to use `when` to test age:
ReplyDeletewhen (age < 0) $ Left "Negative age"
Someone on Reddit pointed out that:
Deleteguard (age < 0) `orDie` "Negative age"
… would also work
should the guard arg be (age => 0)?
DeleteYes, you're correct
DeleteRust calls this pattern "Early returns": https://doc.rust-lang.org/rust-by-example/error/result/early_returns.html
ReplyDeleteI'm surprised you didn't mention the note and hush functions in your errors package?
ReplyDeletehttps://hackage.haskell.org/package/errors-2.3.0/docs/Control-Error-Util.html#v:note
I love these functions. note is the same as orDie.