This is a short post explaining why you should prefer do notation when assembling a record, instead of using Applicative operators (i.e. (<$>)/(<*>)).
This advice applies both to type constructors that implement Monad (e.g. IO) and to type constructors that implement Applicative but not Monad (e.g. the Parser type constructor from the optparse-applicative package). The only difference is that in the latter case you would need to enable the ApplicativeDo language extension.
The guidance is pretty simple. Instead of doing this:
data Person = Person
    { firstName :: String
    , lastName :: String
    }

getPerson :: IO Person
getPerson = Person <$> getLine <*> getLine
… you should do this:
{-# LANGUAGE RecordWildCards #-}
{-# OPTIONS_GHC -Werror=missing-fields #-}

data Person = Person
    { firstName :: String
    , lastName :: String
    }

getPerson :: IO Person
getPerson = do
    firstName <- getLine
    lastName <- getLine
    return Person{..}
Why is the latter version better? There are a few reasons.
Ergonomics
It’s more ergonomic to assemble a record using do notation because you’re less pressured to try to cram all the logic into a single expression.
For example, suppose we wanted to explicitly prompt the user to enter their first and last name. The typical way people would extend the former example using Applicative operators would be something like this:
getPerson :: IO Person
getPerson =
    Person
        <$> (putStrLn "Enter your first name:" *> getLine)
        <*> (putStrLn "Enter your last name:" *> getLine)
The expression gets so large that you end up having to split it over multiple lines, but if we’re already splitting it over multiple lines then why not use do notation?
getPerson :: IO Person
getPerson = do
    putStrLn "Enter your first name:"
    firstName <- getLine

    putStrLn "Enter your last name:"
    lastName <- getLine

    return Person{..}
Wow, much clearer! Also, the version using do notation doesn’t require that the reader is familiar with all of the Applicative operators, so it’s more approachable to Haskell beginners.
Order insensitivity
Suppose we take that last example and then change the Person type to reorder the two fields:
data Person = Person
    { lastName :: String
    , firstName :: String
    }
… then the former version using Applicative operators would silently break: the first name and last name would now be read in the wrong order. The latter version (using do notation) is unaffected.
More generally, the approach using do notation never breaks or changes its behavior if you reorder the fields in the datatype definition. It’s completely order-insensitive.
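The difference can be observed without any console input. The following is a minimal sketch, not code from the original example: it assumes the reordered Person type from above, and the names viaOperators and viaDo are hypothetical helpers that take the two "read" actions as arguments so that any Applicative or Monad (here Maybe) can stand in for IO.

```haskell
{-# LANGUAGE RecordWildCards #-}

-- Person with the fields already reordered (lastName now comes first).
data Person = Person
    { lastName  :: String
    , firstName :: String
    } deriving (Eq, Show)

-- Hypothetical helper: assembles the record positionally, so the result
-- depends on the order of the fields in the datatype definition.
viaOperators :: Applicative f => f String -> f String -> f Person
viaOperators readFirst readLast = Person <$> readFirst <*> readLast

-- Hypothetical helper: assembles the record by field name, so the result
-- does not depend on the order of the fields.
viaDo :: Monad m => m String -> m String -> m Person
viaDo readFirst readLast = do
    firstName <- readFirst
    lastName <- readLast
    return Person{..}
```

Loading this in GHCi and evaluating viaOperators (Just "Ada") (Just "Lovelace") stores "Ada" in the lastName field, whereas viaDo (Just "Ada") (Just "Lovelace") stores it in firstName as intended. Both versions compile without any warning, which is what makes the positional mistake easy to miss.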
Better error messages
If you add a new argument to the Person constructor, like this:
data Person = Person
    { alive :: Bool
    , firstName :: String
    , lastName :: String
    }
… and you don’t make any other changes to the code then the former version will produce two error messages, neither of which is great:
Example.hs:
• Couldn't match type ‘String -> Person’ with ‘Person’
Expected: Bool -> String -> Person
Actual: Bool -> String -> String -> Person
• Probable cause: ‘Person’ is applied to too few arguments
In the first argument of ‘(<$>)’, namely ‘Person’
In the first argument of ‘(<*>)’, namely ‘Person <$> getLine’
In the expression: Person <$> getLine <*> getLine
|
| getPerson = Person <$> getLine <*> getLine
| ^^^^^^
Example.hs:
• Couldn't match type ‘[Char]’ with ‘Bool’
Expected: IO Bool
Actual: IO String
• In the second argument of ‘(<$>)’, namely ‘getLine’
In the first argument of ‘(<*>)’, namely ‘Person <$> getLine’
In the expression: Person <$> getLine <*> getLine
|
| getPerson = Person <$> getLine <*> getLine
| ^^^^^^^
… whereas the latter version produces a much more direct error message:
Example.hs:…
• Fields of ‘Person’ not initialised:
alive :: Bool
• In the first argument of ‘return’, namely ‘Person {..}’
In a stmt of a 'do' block: return Person {..}
In the expression:
do putStrLn "Enter your first name: "
firstName <- getLine
putStrLn "Enter your last name: "
lastName <- getLine
....
|
| return Person{..}
| ^^^^^^^^^^
… and that error message more clearly suggests to the developer what needs to be fixed: the alive field needs to be initialized. The developer doesn’t have to understand or reason about curried function types to fix things.
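As an illustration of how small the resulting fix is, here is one possible repaired version of the do-notation example. It is only a sketch: the post does not say how alive should be obtained, so the extra prompt and the parseAlive helper are assumptions made up for this example.

```haskell
{-# LANGUAGE RecordWildCards #-}
{-# OPTIONS_GHC -Werror=missing-fields #-}

data Person = Person
    { alive     :: Bool
    , firstName :: String
    , lastName  :: String
    } deriving (Eq, Show)

-- Hypothetical helper (not from the original post): treat "y" or "yes"
-- as True and every other answer as False.
parseAlive :: String -> Bool
parseAlive answer = answer `elem` ["y", "yes"]

getPerson :: IO Person
getPerson = do
    putStrLn "Enter your first name:"
    firstName <- getLine

    putStrLn "Enter your last name:"
    lastName <- getLine

    -- The only change needed to satisfy the compiler: bind the new field.
    putStrLn "Is this person alive? (y/n)"
    alive <- fmap parseAlive getLine

    return Person{..}
```

The existing statements stay untouched; binding a variable named after the missing field is enough for Person{..} to pick it up.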
Caveats
This advice obviously only applies for datatypes that are defined using record syntax. The approach I’m advocating here doesn’t work at all for datatypes with positional arguments (or arbitrary functions).
However, this advice does still apply for type constructors that are Applicatives and not Monads; you just need to enable the ApplicativeDo language extension. For example, this means that you can use this same trick for defining command-line Parsers from the optparse-applicative package:
{-# LANGUAGE ApplicativeDo #-}
{-# LANGUAGE RecordWildCards #-}
{-# OPTIONS_GHC -Werror=missing-fields #-}

import Options.Applicative (Parser, ParserInfo)

import qualified Options.Applicative as Options

data Person = Person
    { firstName :: String
    , lastName :: String
    } deriving (Show)

parsePerson :: Parser Person
parsePerson = do
    firstName <- Options.strOption
        (   Options.long "first-name"
        <>  Options.help "Your first name"
        <>  Options.metavar "NAME"
        )

    lastName <- Options.strOption
        (   Options.long "last-name"
        <>  Options.help "Your last name"
        <>  Options.metavar "NAME"
        )

    return Person{..}

parserInfo :: ParserInfo Person
parserInfo =
    Options.info parsePerson
        (Options.progDesc "Parse and display a person's first and last name")

main :: IO ()
main = do
    person <- Options.execParser parserInfo

    print person
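If you want to try the ApplicativeDo variant without installing optparse-applicative, the same trick can be sketched using only base. Everything below other than Person is made up for illustration: Lookup is a hypothetical Applicative that reads named fields from an association list and collects every missing-field error. It is deliberately given no Monad instance, so the do block only typechecks because ApplicativeDo desugars it to (<$>) and (<*>).

```haskell
{-# LANGUAGE ApplicativeDo #-}
{-# LANGUAGE RecordWildCards #-}

data Person = Person
    { firstName :: String
    , lastName  :: String
    } deriving (Eq, Show)

-- Hypothetical Applicative (no Monad instance is defined): looks up
-- fields in an association list and accumulates all errors.
newtype Lookup a = Lookup
    { runLookup :: [(String, String)] -> Either [String] a }

instance Functor Lookup where
    fmap f (Lookup g) = Lookup (fmap f . g)

instance Applicative Lookup where
    pure x = Lookup (\_ -> Right x)
    Lookup f <*> Lookup x = Lookup $ \env ->
        case (f env, x env) of
            (Left e1, Left e2) -> Left (e1 ++ e2)
            (Left e1, Right _) -> Left e1
            (Right _, Left e2) -> Left e2
            (Right g, Right a) -> Right (g a)

-- Look up a single named field, reporting its name if it is absent.
field :: String -> Lookup String
field key = Lookup $ \env ->
    case lookup key env of
        Nothing -> Left ["missing field: " ++ key]
        Just v  -> Right v

-- The statements are independent and the block ends in pure, which is
-- what lets ApplicativeDo avoid requiring a Monad instance here.
lookupPerson :: Lookup Person
lookupPerson = do
    firstName <- field "first-name"
    lastName <- field "last-name"
    pure Person{..}
```

In GHCi, runLookup lookupPerson [("first-name", "Ada"), ("last-name", "Lovelace")] yields a Right containing the assembled record, while runLookup lookupPerson [] yields a Left listing both missing fields, something a Monad-style short-circuiting lookup would not report.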
Interesting post Gabriella, it's interesting that we have these discussions so long after the emergence of ApplicativeDo.
Regarding "Order insensitivity": If you have a lot of newtypes, you're less likely to pass arguments out of order. You're not mentioning newtypes, so I wonder if you actually have this many raw types in your records? Some people would argue against that. Of course newtypes wouldn't fix the other issues you mention.
Another point is that you're using RecordWildCards. One could argue that this makes it difficult to see what goes into the record and what doesn't. But I suppose you'd still like ApplicativeDo if you had NamedFieldPuns instead of RecordWildCards?
Good one! Readability, easier modification, better error messages, robustness to position change, all good arguments. It is tempting in Haskell to golf the code, and often it can be ok to do it, but here I agree it is often better to go for `do` notation.
Great post, Gabriella! As a newcomer to the world of Haskell programming, the `do` syntax is familiar as it reminds me of how I'd work with analogous constructs in other languages like Rust. The improvement in error messages using this syntax is reason enough to prefer it in situations where both alternatives are equally valid.