Sunday, October 18, 2015

Explicit is better than implicit

Many of the limitations associated with Haskell type classes can be solved very cleanly with lenses. This lens-driven programming is more explicit but significantly more general and (in my opinion) easier to use.

All of these examples will work with any lens-like library, but I will begin with the lens-simple library to provide simpler types with better type inference and better type errors and then later transition to the lens library which has a larger set of utilities.

Case study #1 - fmap bias

Let's begin with a simple example - the Functor instance for Either:

fmap (+ 1) (Right 2   ) = Right 3

fmap (+ 1) (Left "Foo") = Left "Foo"

Some people object to this instance because it's biased to Right values. The only way we can use fmap to transform Left values is to wrap Either in a newtype.

These same people would probably like the lens-simple library which provides an over function that generalizes fmap. Instead of using the type to infer what to transform we can explicitly specify what we wish to transform by supplying _Left or _Right:

$ stack install lens-simple --resolver=lts-3.9
$ stack ghci --resolver=lts-3.9
>>> import Lens.Simple
>>> over _Right (+ 1) (Right 2)
Right 3
>>> over _Right (+ 1) (Left "Foo")
Left "Foo"
>>> over _Left (++ "!") (Right 2)
Right 2
>>> over _Left (++ "!") (Left "Foo")
Left "Foo!"

The inferred types are exactly what we would expect:

>>> :type over _Right
over _Right :: (b -> b') -> Either a b -> Either a b'
>>> :type over _Left
over _Left :: (b -> b') -> Either b b1 -> Either b' b1

Same thing for tuples. fmap only lets us transform the second value of a tuple, but over lets us specify which one we want to transform:

>>> over _1 (+ 1)    (2, "Foo")
(3,"Foo")
>>> over _2 (++ "!") (2, "Foo")
(2,"Foo!")

We can even transform both of the values in the tuple if they share the same type:

>>> over both (+ 1) (3, 4)
(4,5)

Again, the inferred types are exactly what we expect:

>>> :type over _2
over _2 :: (b -> b') -> (a, b) -> (a, b')
>>> :type over _1
over _1 :: (b -> b') -> (b, b1) -> (b', b1)
>>> :type over both
over both :: (b -> b') -> (b, b) -> (b', b')

Case study #2 - length confusion

Many people have complained about the tuple instance for Foldable, which gives weird behavior like this in ghc-7.10 or later:

>>> length (3, 4)
1

We could eliminate all confusion by specifying what we intend to count at the term level instead of the type level:

>>> lengthOf _2   (3, 4)
1
>>> lengthOf both (3, 4)
2

This works for Either, too:

>>> lengthOf _Right (Right 1)
1
>>> lengthOf _Right (Left "Foo")
0
>>> lengthOf _Left  (Right 1)
0
>>> lengthOf _Left  (Left "Foo")
1

... and this trick is not limited to length. We can improve any Foldable function by taking a lens instead of a type class constraint:

>>> sumOf both (3, 4)
7
>>> mapMOf_ both print (3, 4)
3
4

Case study #3 - Monomorphic containers

fmap doesn't work on ByteString because ByteString is not a type constructor and has no type parameter that we can map over. Some people use the mono-foldable or mono-traversable packages to solve this problem, but I prefer to use lenses. These examples will require the lens library which has more batteries included.

For example, if I want to transform each character of a Text value I can use the text optic:

$ stack install lens --resolver=lts-3.9  # For `text` optics
$ stack ghci --resolver=lts-3.9
>>> import Control.Lens
>>> import Data.Text.Lens
>>> import qualified Data.Text as Text
>>> let example = Text.pack "Hello, world!"
>>> over text succ example
"Ifmmp-!xpsme\""

I can use the same optic to loop over each character:

>>> mapMOf_ text print example
'H'
'e'
'l'
'l'
'o'
','
' '
'w'
'o'
'r'
'l'
'd'
'!'

There are also optics for ByteStrings, too:

>>> import Data.ByteString.Lens
>>> import qualified Data.ByteString as ByteString
>>> let example2 = ByteString.pack [0, 1, 2]
>>> mapMOf_ bytes print example2
0
1
2

The lens approach has one killer feature over mono-foldable and mono-traversable which is that you can be explicit about what exactly you want to map over. For example, suppose that I want to loop over the bits of a ByteString instead of the bytes. Then I can just provide an optic that points to the bits and everyting "just works":

>>> import Data.Bits.Lens
>>> mapMOf_ (bytes . bits) print example2
False
False
False
False
False
False
False
False
True
False
False
False
False
False
False
False
False
True
False
False
False
False
False
False

The mono-traversable or mono-foldable packages do not let you specify what you want to loop over. Instead, the MonoFoldable and MonoTraversable type classes guess what you want the elements to be, and if they guess wrong then you are out of luck.

Conclusion

Here are some more examples to illustrate how powerful and general the lens approach is over the type class approach.

>>> lengthOf (bytes . bits) example2
24
>>> sumOf (both . _1) ((2, 3), (4, 5))
6
>>> mapMOf_ (_Just . _Left) print (Just (Left 4))
4
>>> over (traverse . _Right) (+ 1) [Left "Foo", Right 4, Right 5]
[Left "Foo",Right 5,Right 6]

Once you get used to this style of programming you begin to prefer specifying things at the term level instead of relying on type inference or wrangling with newtypes.

Wednesday, October 7, 2015

Basic Haskell Examples

The Haskell community self-selects for people interested in unique things that Haskell can do that other languages cannot do. Consequently, a large chunk of Haskell example code in the wild uses advanced idioms (and I'm guilty of that, too).

So I would like to swing the pendulum in the other direction by just writing five small but useful programs without any imports, language extensions, or advanced features. These are programs that you could write in any other language and that's the point: you can use Haskell in the same way that you use other languages.

They won't be the prettiest or most type-safe programs, but that's okay.

Example #1: TODO program

This program is an interactive TODO list:

putTodo :: (Int, String) -> IO ()
putTodo (n, todo) = putStrLn (show n ++ ": " ++ todo)

prompt :: [String] -> IO ()
prompt todos = do
    putStrLn ""
    putStrLn "Current TODO list:"
    mapM_ putTodo (zip [0..] todos)
    command <- getLine
    interpret command todos

interpret :: String -> [String] -> IO ()
interpret ('+':' ':todo) todos = prompt (todo:todos)
interpret ('-':' ':num ) todos =
    case delete (read num) todos of
        Nothing -> do
            putStrLn "No TODO entry matches the given number"
            prompt todos
        Just todos' -> prompt todos'
interpret  "q"           todos = return ()
interpret  command       todos = do
    putStrLn ("Invalid command: `" ++ command ++ "`")
    prompt todos

delete :: Int -> [a] -> Maybe [a]
delete 0 (_:as) = Just as
delete n (a:as) = do
    as' <- delete (n - 1) as
    return (a:as')
delete _  []    = Nothing

main = do
    putStrLn "Commands:"
    putStrLn "+ <String> - Add a TODO entry"
    putStrLn "- <Int>    - Delete the numbered entry"
    putStrLn "q          - Quit"
    prompt []

Example usage:

$ runghc todo.hs
Commands:
+ <String> - Add a TODO entry
- <Int>    - Delete the numbered entry
q          - Quit

Current TODO list:
+ Go to bed

Current TODO list:
0: Go to bed
+ Buy some milk

Current TODO list:
0: Buy some milk
1: Go to bed
+ Shampoo the hamster

Current TODO list:
0: Shampoo the hamster
1: Buy some milk
2: Go to bed
- 1

Current TODO list:
0: Shampoo the hamster
1: Go to bed
q

Example #2 - Rudimentary TSV to CSV

The following program transforms tab-separated standard input into comma-separated standard output. The program does not handle more complex cases like quoting and is not standards-compliant:

tabToComma :: Char -> Char
tabToComma '\t' = ','
tabToComma  c   = c

main = interact (map tabToComma)

Example usage:

$ cat file.tsv
1   2   3
4   5   6
$ cat file.tsv | runghc tsv2csv.hs
1,2,3
4,5,6

Example #3 - Calendar

This program prints a calendar for 2015

data DayOfWeek
    = Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday
    deriving (Eq, Enum, Bounded)

data Month
    = January | February | March     | April   | May      | June
    | July    | August   | September | October | November | December
    deriving (Enum, Bounded, Show)

next :: (Eq a, Enum a, Bounded a) => a -> a
next x | x == maxBound = minBound
       | otherwise     = succ x

pad :: Int -> String
pad day = case show day of
    [c] -> [' ', c]
    cs  -> cs

month :: Month -> DayOfWeek -> Int -> String
month m startDay maxDay = show m ++ " 2015\n" ++ week ++ spaces Sunday
  where
    week = "Su Mo Tu We Th Fr Sa\n"

    spaces currDay | startDay == currDay = days startDay 1
                   | otherwise           = "   " ++ spaces (next currDay)

    days Sunday    n | n > maxDay = "\n"
    days _         n | n > maxDay = "\n\n"
    days Saturday  n              = pad n ++ "\n" ++ days  Sunday    (succ n)
    days day       n              = pad n ++ " "  ++ days (next day) (succ n)

year = month January   Thursday  31
    ++ month February  Sunday    28
    ++ month March     Sunday    31
    ++ month April     Wednesday 30
    ++ month May       Friday    31
    ++ month June      Monday    30
    ++ month July      Wednesday 31
    ++ month August    Saturday  31
    ++ month September Tuesday   30
    ++ month October   Thursday  31
    ++ month November  Sunday    30
    ++ month December  Tuesday   31

main = putStr year

Example usage:

$ runghc calendar.hs
January 2015
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

February 2015
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28

March 2015
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31 

April 2015
Su Mo Tu We Th Fr Sa
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 

May 2015
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31 

June 2015
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 

July 2015
Su Mo Tu We Th Fr Sa
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31 

August 2015
Su Mo Tu We Th Fr Sa
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31 

September 2015
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 

October 2015
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

November 2015
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 

December 2015
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31 

Example #4 - Decode RNA

This program converts an RNA sequence read from standard input into the equivalent sequence of amino acids written to standard output, using the genetic code:

data RNA = A | U | C | G
    deriving (Read)

data AminoAcid
    = Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu
    | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr
    | Stop
    deriving (Show)

decode :: RNA -> RNA -> RNA -> AminoAcid 
decode U U U = Phe
decode U U C = Phe
decode U U A = Leu
decode U U G = Leu
decode U C _ = Ser
decode U A U = Tyr
decode U A C = Tyr
decode U A A = Stop
decode U A G = Stop
decode U G U = Cys
decode U G C = Cys
decode U G A = Stop
decode U G G = Trp
decode C U _ = Leu
decode C C _ = Pro
decode C A U = His
decode C A C = His
decode C A A = Gln
decode C A G = Gln
decode C G _ = Arg
decode A U U = Ile
decode A U C = Ile
decode A U A = Ile
decode A U G = Met
decode A C _ = Thr
decode A A U = Asn
decode A A C = Asn
decode A A A = Lys
decode A A G = Lys
decode A G U = Ser
decode A G C = Ser
decode A G A = Arg
decode A G G = Arg
decode G U _ = Val
decode G C _ = Ala
decode G A U = Asp
decode G A C = Asp
decode G A A = Glu
decode G A G = Glu
decode G G _ = Gly

decodeAll :: [RNA] -> [AminoAcid]
decodeAll (a:b:c:ds) = decode a b c : decodeAll ds
decodeAll  _         = []

main = do
    str <- getContents
    let rna :: [RNA]
        rna = map (\c -> read [c]) str

    let aminoAcids :: [AminoAcid]
        aminoAcids = decodeAll rna

    putStrLn (concatMap show aminoAcids)

Example usage:

$ echo "ACAUGUCAGUACGUAGCUAC" | runghc decode.hs
ThrCysGlnTyrValAlaThr

Example #5 - Bedtime story generator

This generates a "random" bedtime story:

stories :: [String]
stories = do
    let str0 = "There once was "

    str1 <- ["a princess ", "a cat ", "a little boy "]

    let str2 = "who lived in "

    str3 <- ["a shoe.", "a castle.", "an enchanted forest."]

    let str4 = "  They found a "

    str5 <- ["giant ", "frog ", "treasure chest " ]

    let str6 = "while "

    str7 <- ["when they got lost ", "while strolling along "]

    let str8 = "and immediately regretted it.  Then a "

    str9 <- ["lumberjack ", "wolf ", "magical pony "]

    let str10 = "named "

    str11 <- ["Pinkie Pie ", "Little John ", "Boris "]

    let str12 = "found them and "

    str13 <- ["saved the day.", "granted them three wishes."]

    let str14 = "  The end."

    return (  str0
           ++ str1
           ++ str2
           ++ str3
           ++ str4
           ++ str5
           ++ str6
           ++ str7
           ++ str8
           ++ str9
           ++ str10
           ++ str11
           ++ str12
           ++ str13
           ++ str14
           )

main = do
    let len = length stories
    putStrLn ("Enter a number from 0 to " ++ show (len - 1))
    n <- readLn
    putStrLn ""
    putStrLn (stories !! n)

Example usage:

$ runghc story.hs
Enter a number from 0 to 971
238

There once was a princess who lived in an enchanted forest.  They found a giant
while while strolling along and immediately regretted it.  Then a lumberjack
named Boris found them and saved the day.  The end.

Conclusion

If you would like to contribute a simple example of your own, try sharing a paste of your program under the #Haskell #BackToBasics hashtags.

Friday, October 2, 2015

Polymorphism for dummies

This tutorial explains how polymorphism is implemented under the hood in Haskell using the least technical terms possible.

The simplest example of a polymorphic function is the identity function:

id :: a -> a
id x = x

The identity function works on any type of value. For example, I can apply id to an Int or to a String:

$ ghci
Prelude> id 4
4
Prelude> id "Test"
"Test"

Under the hood, the id function actually takes two arguments, not one.

-- Under the hood:

id :: forall a . a -> a
id @a x = x

The first argument of id (the @a) is the same as the a in the type signature of id. The type of the second argument (x) can refer to the value of the first argument (the @a).

If you don't believe me, you can prove this yourself by just taking the following module:

module Id where

import Prelude hiding (id)

id :: a -> a
id x = x

... and ask ghc to output the low-level "core" representation of the above id function:

$ ghc -ddump-simpl id.hs
[1 of 1] Compiling Id               ( id.hs, id.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 4, types: 5, coercions: 0}

Id.id :: forall a_apw. a_apw -> a_apw
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]
Id.id = \ (@ a_aHC) (x_apx :: a_aHC) -> x_apx

The key part is the last line, which if you clean up looks like this:

id = \(@a) (x :: a) -> x

ghc prefixes types with @ when using them as function arguments.

In other words, every time we "generalize" a function (i.e. make it more polymorphic), we add a new hidden argument to that function corresponding to the polymorphic type.

Specialization

We can "specialize" id to a narrower type that is less polymorphic (a.k.a. "monomorphic"):

idString :: String -> String
idString = id

Under the hood, what actually happened was that we applied the id function to the String type, like this:

-- Under the hood:

idString :: String -> String
idString = id @String

We can prove this ourselves by taking this module:

module Id where

import Prelude hiding (id)

id :: a -> a
id x = x

idString :: String -> String
idString = id

... and studying the core that this module generates:

$ ghc -ddump-simpl id.hs
[1 of 1] Compiling Id               ( id.hs, id.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 6, types: 8, coercions: 0}

Id.id :: forall a_apx. a_apx -> a_apx
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]
Id.id = \ (@ a_aHL) (x_apy :: a_aHL) -> x_apy

Id.idString :: GHC.Base.String -> GHC.Base.String
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]
Id.idString = Id.id @ GHC.Base.String

If we clean up the last line, we get:

idString = id @String

In other words, every time we "specialize" a function (i.e. make it less polymorphic), we apply it to a hidden type argument. Again, ghc prefixes types with @ when using them as values.

So back in the REPL, when we ran code like this:

>>> id 4
4
>>> id "Test"
"Test"

... ghc was implicitly inserting hidden type arguments for us, like this:

>>> id @Int 4
4
>>> id @String "Test"

Conclusion

That's it! There's really nothing more to it than that.

The general trick for passing around type parameters as ordinary function arguments was first devised as part of System F.

This is the same way that many other languages encode polymorphism (like ML, Idris, Agda, Coq) except that some of them use a more general mechanism. However, the basic principles are the same:

  • When you make something more polymorphic you are adding additional type arguments to your function
  • When you make something less polymorphic you apply your function to type values

In ghc-8.0, you will be allowed to explicitly provide type arguments yourself if you prefer. For example, consider the read function:

read :: Read a => String -> a

If you wanted to specify what type to read without a type annotation, you could provide the type you desired as an argument to read:

read @Int :: String -> Int

Here are some other examples:

[] @Int :: [Int]

show @Int :: Int -> String

Previously, the only way to pass a type as a value was to use the following Proxy type:

data Proxy a = Proxy

... which reified a type as a value. Now you will be able to specialize functions by providing the type argument directly instead of adding a type annotation.