Ergonomic newtypes for Haskell strings and numbers
This blog post summarizes a very brief trick I commonly recommend
whenever I see something like this:
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Numeric.Natural (Natural)
newtype Name = Name { getName :: Text }
    deriving (Show)
newtype Age = Age { getAge :: Natural }
    deriving (Show)
data Person = Person { name :: Name, age :: Age }
    deriving (Show)
example :: Person
example = Person{ name = Name "John Doe", age = Age 42 }
… where the newtypes are not opaque (i.e. the
newtype constructors are exported), so the
newtypes serve more as documentation than as a source of
type safety.
The issue with the above code is that the newtypes add
extra boilerplate for both creating and displaying values of those
types. For example, in order to create a Name or an Age
you need to explicitly apply the
Name and Age constructors (like in the
definition for example above) and they also show up when
displaying values for debugging purposes (e.g. in the REPL):
>>> example
Person {name = Name {getName = "John Doe"}, age = Age {getAge = 42}}
Fortunately, you can easily elide these noisy constructors if you
follow these rules of thumb:
Derive IsString for newtypes around string-like types
Derive Num for newtypes around numeric types
Change the Show instances to use the underlying Show for the wrapped type
For example, I would suggest amending the original code like
this:
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE OverloadedStrings #-}
module Example1 where
import Data.Text (Text)
import Data.String (IsString)
import Numeric.Natural (Natural)
newtype Name = Name { getName :: Text }
    deriving newtype (IsString, Show)
newtype Age = Age { getAge :: Natural }
    deriving newtype (Num, Show)
data Person = Person { name :: Name, age :: Age }
    deriving stock (Show)
example :: Person
example = Person{ name = "John Doe", age = 42 }
… and now the Age and Name constructors are
invisible, even when displaying these types (using their
Show instances):
>>> example
Person {name = "John Doe", age = 42}
That is the entirety of the trick, but if you still don’t follow,
I’ll expand upon that below.
Explanation
Revisiting the starting code:
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Numeric.Natural (Natural)
newtype Name = Name { getName :: Text }
    deriving (Show)
newtype Age = Age { getAge :: Natural }
    deriving (Show)
data Person = Person { name :: Name, age :: Age }
    deriving (Show)
example :: Person
example = Person{ name = Name "John Doe", age = Age 42 }
… the first thing we’re going to do is enable the
DerivingStrategies language extension. I’m going to
lean pretty heavily on Haskell’s support for deriving typeclass
instances in this post, and I want to be explicit about how these
instances are being derived:
{-# LANGUAGE DerivingStrategies #-}
newtype Name = Name { getName :: Text }
    deriving stock (Show)
newtype Age = Age { getAge :: Natural }
    deriving stock (Show)
I’ve changed the code to explicitly specify that we’re
deriving Show using the “stock”
deriving strategy, meaning that Haskell has built-in language support
for deriving Show and we’re going to use that.
The next step is that we’re going to add an IsString
instance for Name because it wraps a string-like type
(Text). However, at first we’ll write out the instance by
hand:
import Data.String (IsString(..))
instance IsString Name where
    fromString string = Name (fromString string)
This IsString instance works in conjunction with
Haskell’s OverloadedStrings so that we can directly use a
string literal in place of a Name, like this:
example :: Person
example = Person{ name = "John Doe", age = Age 42 }
--                       ↑
--                       No more Name constructor required here
… and the reason that works is because the compiler implicitly
inserts fromString around all string literals when you
enable OverloadedStrings, as if we had written this:
example :: Person
example = Person{ name = fromString "John Doe", age = Age 42 }
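To make the implicit fromString concrete, here is a minimal sketch (not part of the original code) in which the same literal is used at three different types. The names asString, asText, asName, and viaFromString are made up for illustration, and the hand-written IsString instance is the one introduced above.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))
import Data.Text (Text)

newtype Name = Name { getName :: Text }

-- Same hand-written instance as in the post
instance IsString Name where
    fromString string = Name (fromString string)

-- With OverloadedStrings the literal has type `IsString a => a`,
-- so each signature below picks a different instance.
asString :: String
asString = "John Doe"

asText :: Text
asText = "John Doe"

asName :: Name
asName = "John Doe"

-- Spelling out the call that the compiler otherwise inserts for us
viaFromString :: Name
viaFromString = fromString "John Doe"
```

Both asName and viaFromString wrap the same Text value, which is why the bare literal can stand in for the explicit constructor.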
The IsString instance for Name:
instance IsString Name where
    fromString string = Name (fromString string)
… essentially defers to the IsString instance for the
underlying wrapped type (Text). In fact, this pattern of
deferring to the underlying instance is common enough that Haskell
provides a language extension for this purpose:
GeneralizedNewtypeDeriving. If we enable that language
extension, then we can simplify the IsString instance to
this:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
newtype Name = Name { getName :: Text }
    deriving stock (Show)
    deriving newtype (IsString)
The deriving newtype indicates that we’re explicitly
using the GeneralizedNewtypeDeriving extension to derive
the implementation for the IsString instance.
In this particular case we don’t have to specify the deriving
strategy; we could have just said deriving (IsString) and
it still would have worked because it wasn’t ambiguous: with only
these extensions enabled, no other deriving strategy would have worked
in this case. However, as we’re about to see, there are cases where you
want to explicitly disambiguate between multiple possible deriving
strategies.
The next step is that we implement Num for our
Age type since it wraps a numeric type
(Natural):
instance Num Age where
    Age x + Age y = Age (x + y)
    Age x - Age y = Age (x - y)
    Age x * Age y = Age (x * y)
    negate (Age x) = Age (negate x)
    abs (Age x) = Age (abs x)
    signum (Age x) = Age (signum x)
    fromInteger integer = Age (fromInteger integer)
Bleh! That’s a lot of work to do when really we were most interested
in the fromInteger method (so that we could use numeric
literals directly to create an Age).
The reason we care about the fromInteger method is
because Haskell lets you use integer literals for any type that
implements Num (without any language extension; this is
part of the base language). So, for example, we can further simplify our
example Person to:
example :: Person
example = Person{ name = "John Doe", age = 42 }
--                                         ↑
--                                         No more Age constructor required here
… and the reason that works is because the compiler implicitly
inserts fromInteger around all integer literals, as if we
had written this:
example :: Person
example = Person{ name = "John Doe", age = fromInteger 42 }
It would be nice if Haskell had a dedicated class for just the
fromInteger method (e.g. IsInteger), but alas
if we want ergonomic support for numeric literals then we have to add
support for other numeric operations, too, even if they might not
necessarily make sense for our newtype.
Like before, though, we can use the
GeneralizedNewtypeDeriving extension to derive
Num instead:
newtype Age = Age { getAge :: Natural }
    deriving stock (Show)
    deriving newtype (Num)
Much better!
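As a small sketch of what the derived Num instance buys us (and what comes along for the ride), the following uses two hypothetical definitions, nextBirthday and nonsense, that are not in the original code. Both typecheck purely because Age now has a Num instance.

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Numeric.Natural (Natural)

newtype Age = Age { getAge :: Natural }
    deriving stock (Show)
    deriving newtype (Num)

-- Hypothetical helper: the literal 1 becomes an Age via fromInteger,
-- and (+) comes from the derived instance.
nextBirthday :: Age -> Age
nextBirthday age = age + 1

-- Also accepted by the compiler, even though multiplying two ages
-- has no obvious meaning. This is the cost of using Num for literals.
nonsense :: Age
nonsense = 42 * 42
```

Since the derived instance defers to Natural, it also inherits Natural’s behavior for subtraction: an expression like (1 :: Age) - 2 typechecks but fails at runtime with an arithmetic underflow.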
However, we’re not done yet, because at the moment the
Name and Age constructors still appear in the
debug output:
>>> example
Person {name = Name {getName = "John Doe"}, age = Age {getAge = 42}}
Yuck!
Okay, so the final step is to change the Show instances
for Name and Age to defer to the
Show instances for their underlying types:
instance Show Name where
    show (Name string) = show string
instance Show Age where
    show (Age natural) = show natural
These are still valid Show instances! By convention, the
displayed representation
should be valid Haskell code for creating a value of that type, and in
both cases that’s what we get.
For example, if you show a value like
Name "John Doe" you will get "John Doe", and
that’s valid Haskell code for creating a Name if you enable
OverloadedStrings.
Note: You might argue that this is not a valid Show
instance because it requires a language extension
(OverloadedStrings) in order to be valid code.
However, this is no different from the Show instance for
Text (whose output is also only valid code if you enable
OverloadedStrings), and most people do not take issue with
that Show instance for Text either.
Similarly, if you show a value like Age 42
you will get 42, and that’s valid Haskell code for creating
an Age.
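One way to check this claim is to compare the rendered output of each newtype against the rendered output of the type it wraps. The sketch below reuses the hand-written instances from this section; nameMatchesText and ageMatchesNatural are hypothetical names added only for the comparison.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))
import Data.Text (Text)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }
newtype Age = Age { getAge :: Natural }

instance IsString Name where
    fromString string = Name (fromString string)

-- Hand-written instances that defer to the wrapped type
instance Show Name where
    show (Name string) = show string

instance Show Age where
    show (Age natural) = show natural

-- A Name renders exactly like the Text it wraps
nameMatchesText :: Bool
nameMatchesText = show ("John Doe" :: Name) == show ("John Doe" :: Text)

-- An Age renders exactly like the Natural it wraps
ageMatchesNatural :: Bool
ageMatchesNatural = show (Age 42) == show (42 :: Natural)
```

Both comparisons evaluate to True, so whatever you can paste back in for a Text or a Natural you can also paste back in for a Name or an Age.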
So with those two new Show instances our
Person type now renders much more compactly:
>>> example
Person {name = "John Doe", age = 42}
… but we’re not done! The last part of the trick is to use
GeneralizedNewtypeDeriving to derive the Show
instances, like this:
newtype Name = Name { getName :: Text }
    deriving newtype (IsString, Show)
newtype Age = Age { getAge :: Natural }
    deriving newtype (Num, Show)
… and this is where the DerivingStrategies language
extension really matters! Without that extension there would be no way
to tell the compiler to derive Show by deferring to the
underlying type. By default, if you don’t specify the deriving strategy
then the compiler assumes that derived Show instances use
the stock deriving strategy.
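To see the two strategies side by side, here is a minimal sketch with two otherwise identical newtypes. StockAge and NewtypeAge are hypothetical names used only for this comparison; they differ solely in the strategy used to derive Show.

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Numeric.Natural (Natural)

-- Show generated by the compiler from the declaration itself
newtype StockAge = StockAge { getStockAge :: Natural }
    deriving stock (Show)

-- Show reused from the wrapped Natural
newtype NewtypeAge = NewtypeAge { getNewtypeAge :: Natural }
    deriving newtype (Show)

-- Evaluates to the string: StockAge {getStockAge = 42}
stockRendering :: String
stockRendering = show (StockAge 42)

-- Evaluates to the string: 42
newtypeRendering :: String
newtypeRendering = show (NewtypeAge 42)
```

The only difference between the two declarations is the keyword after deriving, which is exactly the choice that DerivingStrategies lets us state explicitly.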
Conclusion
There’s one last bonus to doing things in this way: you might now be
able to hide the newtype constructor by not exporting it! I
think this is actually the most important benefit of all because a
newtype with an exposed constructor doesn’t really improve
upon the type safety of the underlying type.
When a newtype like Name or
Age exposes the newtype constructor then the
newtype serves primarily as documentation and I’m not a big
fan of this “newtypes as documentation” design pattern.
However, I’m not that strongly opposed to it either; I wouldn’t use it
in my own code, but I also wouldn’t insist that others don’t use it.
Another post which takes a stronger stance on this is “Names
are not type safety”, especially the section on “Newtypes as
tokens”.
I’m personally okay with other people using newtypes in
this way, but if you do use “newtypes as documentation”
then please add IsString / Num /
Show instances as described in this post so that they’re
more ergonomic for others to use.