## Wednesday, November 18, 2015

### Interactive and composable charts

I've added a diagrams backend to my typed-spreadsheet library which you can use to build composable graphical programs that update in response to user input.

Here's an example program that displays a circle that changes in response to various user inputs:

{-# LANGUAGE OverloadedStrings #-}

import Diagrams.Backend.Cairo (Cairo)
import Diagrams.Prelude
import Typed.Spreadsheet

data AColor = Red | Orange | Yellow | Green | Blue | Purple
    deriving (Enum, Bounded, Show)

toColor :: AColor -> Colour Double
toColor Red    = red
toColor Orange = orange
toColor Yellow = yellow
toColor Green  = green
toColor Blue   = blue
toColor Purple = purple

main :: IO ()
main = graphicalUI "Example program" logic
  where
    logic = combine <\$> radioButton      "Color"        Red [Orange .. Purple]
                    <*> spinButtonAt 100 "Radius"       1
                    <*> spinButton       "X Coordinate" 1
                    <*> spinButton       "Y Coordinate" 1

    combine :: AColor -> Double -> Double -> Double -> Diagram Cairo
    combine color r x y =
        circle r # fc (toColor color) # translate (r2 (x, -y))

Here is a video showing the example program in action:

#### Applicatives

The first half of the main function (named logic) requests four user inputs to parametrize the displayed circle:

• A radio button for selecting the circle's color
• A spin button for controlling radius which defaults to 100 (pixels)
• A spin button for controlling the x coordinate for the center of the circle
• A spin button for controlling the y coordinate for the center of the circle

Each of these inputs is an Updatable value and we can express that using Haskell's type system:

radioButton      "Color"        Red [Orange .. Purple] :: Updatable AColor
spinButtonAt 100 "Radius"       1                      :: Updatable Double
spinButton       "X Coordinate" 1                      :: Updatable Double
spinButton       "Y Coordinate" 1                      :: Updatable Double

The Updatable type implements Haskell's Applicative interface, meaning that you can combine smaller Updatable values into larger Updatable values using Applicative operations.

For example, consider this pure function that consumes four pure inputs and produces a pure diagram:

combine
:: AColor
-> Double
-> Double
-> Double
-> Diagram Cairo

Normally if we compute a pure function we would write something like this:

combine Green 40 10 20
:: Diagram Cairo

However, this result is static and unchanging. I would like to transform this function into one that accepts Updatable arguments and produces an Updatable result.

Fortunately, Haskell's Applicative interface lets us do just that. We can lift any pure function to operate on any type that implements the Applicative interface like the Updatable type. All we have to do is separate the function from the first argument using the (<\$>) operator and separate each subsequent argument using the (<*>) operator:

combine <\$> radioButton      "Color"        Red [Orange .. Purple]
        <*> spinButtonAt 100 "Radius"       1
        <*> spinButton       "X Coordinate" 1
        <*> spinButton       "Y Coordinate" 1
    :: Updatable (Diagram Cairo)

Now the combine function accepts four Updatable arguments and produces an Updatable result! I can then pass this result to the graphicalUI function which builds a user interface from any Updatable Diagram:

graphicalUI :: Text -> Updatable (Diagram Cairo) -> IO ()

main = graphicalUI "Example program" logic

The Applicative operations ensure that every time one of our primitive Updatable inputs changes, the composite Updatable Diagram immediately reflects that change.
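
The lifting pattern is not specific to Updatable. As a minimal sketch that runs without the GUI libraries, the following uses Maybe as a stand-in Applicative. The function `describe` is a hypothetical pure function playing the role of combine, and none of these names are part of typed-spreadsheet:

```haskell
-- A hypothetical pure function standing in for the post's `combine`.
describe :: String -> Double -> Double -> String
describe color radius x =
    color ++ " circle, radius " ++ show radius ++ ", x = " ++ show x

-- Lift `describe` over three `Maybe` inputs with (<$>) and (<*>).
allPresent :: Maybe String
allPresent = describe <$> Just "Red" <*> Just 100 <*> Just 1

-- If any input is missing, the combined result is missing too.
oneMissing :: Maybe String
oneMissing = describe <$> Just "Red" <*> Nothing <*> Just 1

main :: IO ()
main = do
    print allPresent
    print oneMissing
```

The expression has the same shape as logic: the pure function comes first, then one argument per operator. Only the Applicative changes.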

#### Charts

One reason I wanted diagrams integration was to begin building interactive charts for typed spreadsheets. I'll illustrate this using a running example where I build up a successively more complex chart piece by piece.

Let's begin with a simple rectangle with an adjustable height (starting at 100 pixels):

{-# LANGUAGE OverloadedStrings #-}

import Diagrams.Backend.Cairo (Cairo)
import Diagrams.Prelude
import Typed.Spreadsheet

import qualified Data.Text as Text

bar :: Int -> Updatable (Diagram Cairo)
bar i = fmap buildRect (spinButtonAt 100 label 1)
  where
    buildRect height = alignB (rect 30 height)

    label = "Bar #" <> Text.pack (show i)

main :: IO ()
main = graphicalUI "Example program" (bar 1)

When we run this example we get a boring chart with a single bar:

However, the beauty of Haskell is composable abstractions like Applicative. We can take smaller pieces and very easily combine them into larger pieces. Each piece does one thing and does it well, and we compose them to build larger functionality from sound building blocks.

For example, if I want to create a bar chart with five individually updatable bars, I only need to add a few lines of code to create five bars and connect them:

{-# LANGUAGE OverloadedStrings #-}

import Diagrams.Backend.Cairo (Cairo)
import Diagrams.Prelude
import Typed.Spreadsheet

import qualified Data.Text as Text

bar :: Int -> Updatable (Diagram Cairo)
bar i = fmap buildRect (spinButtonAt 100 label 1)
  where
    buildRect height = alignB (rect 30 height)

    label = "Bar #" <> Text.pack (show i)

bars :: Int -> Updatable (Diagram Cairo)
bars n = fmap combine (traverse bar [1..n])
  where
    combine bars = alignX 0 (hcat bars)

main :: IO ()
main = graphicalUI "Example program" (bars 5)

This not only creates a bar chart with five bars, but also auto-generates a corresponding input cell for each bar:

Even better, all the inputs are strongly typed! The program enforces that all inputs are well-formed and won't let us input non-numeric values.
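
The traverse call in bars is what turns a list of independent inputs into one combined value. Here is a small sketch of the same idea using only the base library, with Maybe in place of Updatable. The names `parseHeight`, `parseHeights`, and `totalHeight` are made up for illustration:

```haskell
import Text.Read (readMaybe)

-- Parse one bar height; `Nothing` models an ill-formed input.
parseHeight :: String -> Maybe Double
parseHeight = readMaybe

-- `traverse` runs `parseHeight` on every element and collects the results,
-- much like `traverse bar [1..n]` collects n updatable bars into one value.
parseHeights :: [String] -> Maybe [Double]
parseHeights = traverse parseHeight

-- Post-process the collected list with `fmap`, as `bars` does with `combine`.
totalHeight :: [String] -> Maybe Double
totalHeight = fmap sum . parseHeights

main :: IO ()
main = do
    print (parseHeights ["100", "80", "60"])
    print (totalHeight ["100", "80", "60"])
    print (parseHeights ["100", "oops", "60"])
```

With Maybe, a single bad input makes the whole result Nothing. With Updatable, the analogous effect is that a change to any one input refreshes the whole combined diagram.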

We also benefit from all the features of Haskell's diagrams library, which is a powerful Haskell library for building diagrams. Let's spruce up the diagram a little bit by adding color, spacing, and other embellishments:

{-# LANGUAGE FlexibleContexts  #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeFamilies      #-}

import Diagrams.Backend.Cairo (Cairo)
import Diagrams.Prelude
import Typed.Spreadsheet

import qualified Data.Text as Text

bar :: Int -> Updatable (Diagram Cairo)
bar i = fmap buildBar (spinButtonAt 100 label 0.2)
  where
    color = case i `mod` 3 of
        0 -> red
        1 -> green
        2 -> yellow

    buildBar height =
        (  alignB (   vine
                  <>  bubbles
                  )
        <> alignB (   roundedRect 25 (height - 5) 5 # fc white
                  <>  roundedRect 30  height      5 # fc color
                  )
        )
      where
        stepSize = 15

        vine = strokeP (fromVertices (map toPoint [0..height]))
          where
            toPoint n = p2 (5 * cos (pi * n / stepSize), n)

        bubble n =
            circle radius
            # translate (r2 (0, n * stepSize))
            # fc lightblue
          where
            radius = max 1 (min stepSize (height - n * stepSize)) / 5

        bubbles = foldMap bubble [1 .. (height / stepSize) - 1]

    label = "Bar #" <> Text.pack (show i)

bars :: Int -> Updatable (Diagram Cairo)
bars n = fmap combine (traverse bar [1..n])
  where
    combine bars = alignX 0 (hsep 5 [alignL yAxis, alignL (hsep 5 bars)])

    yAxis = arrowV (r2 (0, 150))

main :: IO ()
main = graphicalUI "Example program" (bars 5)

One embellishment is a little animation where bubbles fade in and out near the top of the bar:

We can customize the visuals to our heart's content because the spreadsheet and diagram logic are both embedded within a fully featured programming language.

#### Conclusion

The typed-spreadsheet library illustrates how you can use the Haskell language to build high-level APIs that abstract away low-level details, such as form building or reactive updates in this case.

In many languages high-level abstractions come at a price: you typically have to learn a domain-specific language in order to take advantage of high-level features. However, Haskell provides reusable interfaces like Applicatives that you learn once and apply over and over and over to each new library that you learn. This makes the Haskell learning curve very much like a "plateau": initially steep as you learn the reusable interfaces but then very flat as you repeatedly apply those interfaces in many diverse domains.

If you would like to contribute to the typed-spreadsheet library, you can contribute out-of-the-box charting functionality to help the library achieve feature parity with real spreadsheet software.

## Wednesday, November 11, 2015

I'm releasing the typed-spreadsheet library, which lets you build spreadsheets integrated with Haskell. I use the term "spreadsheet" a bit loosely, so I'll clarify what I mean by comparing and contrasting this library with traditional spreadsheets.

The best way to explain how this works is to begin with a small example:

{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative
import Typed.Spreadsheet

main :: IO ()
main = textUI "Example program" logic
  where
    -- Hate weird operators?  Read on!
    logic = combine <\$> checkBox   "a"     -- Input #1
                    <*> spinButton "b" 1   -- Input #2
                    <*> spinButton "c" 0.1 -- Input #3
                    <*> entry      "d"     -- Input #4

    combine a b c d = display (a, b + c, d) -- The output is a
                                            -- function of all
                                            -- four inputs

The above program builds a graphical user interface with four user inputs in the left panel and one output in the right panel:

The output is a text representation of a 3-tuple whose:

• 1st element is the checkbox state (False for unchecked, True for checked)
• 2nd element is the sum of the two numeric fields (labeled "b" and "c")
• 3rd element is the text entry field

The right panel immediately updates in response to any user input from the left panel. For example, every time we toggle the checkbox or enter numbers/text the right panel changes:

So in one sense this resembles a spreadsheet in that the output "cell" on the right (the text panel) updates immediately in response to the input "cell"s on the left (the checkbox, and numeric/text entry fields).

However, this also departs significantly from the traditional spreadsheet model: input controls reflect the type of input in order to make invalid inputs unrepresentable. For example, a Bool input corresponds to a checkbox.

#### Distribution

The generated executable is an ordinary binary so you can distribute the program to other users without needing to supply the Haskell compiler or toolchain. You can even fully statically link the executable for extra portability.

For example, say that I want to create a mortgage calculator for somebody else to use. I can just write the following program:

{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative
import Data.Monoid
import Data.Text (Text)
import Typed.Spreadsheet

payment :: Double -> Double -> Double -> Text
payment mortgageAmount numberOfYears yearlyInterestRate
    =  "Monthly payment: \$"
    <> display (mortgageAmount * (i * (1 + i) ^ n) / ((1 + i) ^ n - 1))
  where
    n = truncate (numberOfYears * 12)
    i = yearlyInterestRate / 12 / 100

logic :: Updatable Text
logic = payment <\$> spinButton "Mortgage Amount"          1000
                <*> spinButton "Number of years"             1
                <*> spinButton "Yearly interest rate (%)"    0.01

main :: IO ()
main = textUI "Mortgage payment" logic

... and compile that into an executable which I can give them. When they run the program they will get the following simple user interface:
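
Separately from the interface, the arithmetic inside payment can be checked on its own. The sketch below reuses the same formula but returns a plain Double instead of formatted Text, so it needs nothing beyond the Prelude. The name `monthlyPayment` and the sample figures are chosen only for illustration:

```haskell
-- Monthly payment for a fixed-rate loan, using the same formula as `payment`
-- but returning a number instead of formatted `Text`.
monthlyPayment :: Double -> Double -> Double -> Double
monthlyPayment mortgageAmount numberOfYears yearlyInterestRate =
    mortgageAmount * (i * (1 + i) ^ n) / ((1 + i) ^ n - 1)
  where
    n = truncate (numberOfYears * 12) :: Int  -- number of monthly payments
    i = yearlyInterestRate / 12 / 100         -- monthly rate as a fraction

main :: IO ()
main = print (monthlyPayment 200000 30 4)
```

For a 200000 loan over 30 years at 4% this prints a value of roughly 954.83. Note that an interest rate of exactly 0 makes the denominator zero, so a more defensive version would handle that case separately.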

Or maybe I want to write a simple "mad libs" program for my son to play:

{-# LANGUAGE OverloadedStrings #-}

import Data.Monoid
import Typed.Spreadsheet

noun = entry "Noun"

verb = entry "Verb"

adjective = entry "Adjective"

example =
    "I want to " <> verb <> " every " <> noun <> " because they are so " <> adjective

main :: IO ()
main = textUI "Mad libs" example

This generates the following interface:

All the above examples have one thing in common: they only generate a single Text output. The typed-spreadsheet library does not permit multiple outputs or outputs other than Text. If we want to display multiple outputs then we need to somehow format and render all of them within a single Text value.

In the future the library may provide support for diagrams output instead of Text but for now I only provide Text outputs for simplicity.

#### Applicatives

The central type of this library is the Updatable type, which implements the Applicative interface. This interface lets us combine smaller Updatable values into larger Updatable values. For example, a checkBox takes a single Text argument (the label) and returns an Updatable Bool:

checkBox :: Text -> Updatable Bool

Using Applicative operators, (<\$>) and (<*>), we can lift any function over an arbitrary number of Updatable values. For example, here is how I would lift the binary (&&) operator over two check boxes:

z :: Updatable Bool
z = (&&) <\$> checkBox "x" <*> checkBox "y"

... or combine their output into a tuple:

both :: Updatable (Bool, Bool)
both = (,) <\$> checkBox "x" <*> checkBox "y"

However, to the untrained eye these will look like operator soup. Heck, even to the trained eye they aren't that pretty (in my opinion).

Fortunately, ghc-8.0 will come with a new ApplicativeDo extension, which will greatly simplify programs that use the Applicative interface. For example, the above two examples would become much more readable:

z :: Updatable Bool
z = do
    x <- checkBox "x"
    y <- checkBox "y"
    return (x && y)

both :: Updatable (Bool, Bool)
both = do
    x <- checkBox "x"
    y <- checkBox "y"
    return (x, y)

Similarly, the very first example simplifies to:

{-# LANGUAGE ApplicativeDo     #-}
{-# LANGUAGE OverloadedStrings #-}

import Typed.Spreadsheet

main :: IO ()
main = textUI "Example program" (do
    a <- checkBox   "a"
    b <- spinButton "b" 1
    c <- spinButton "c" 0.1
    d <- entry      "d"
    return (display (a, b + c, d)) )

That's much easier on the eyes. ApplicativeDo helps the code look much less like operator soup and presents a comfortable syntax for people used to imperative programming.
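
To try the extension without the spreadsheet library, here is a small sketch that assumes GHC 8.0 or later. It uses ZipList from base because, like a purely Applicative interface, ZipList has no Monad instance, so the do block is only accepted when it desugars to Applicative operators. The names `paired` and `pairedWithOperators` are illustrative:

```haskell
{-# LANGUAGE ApplicativeDo #-}

import Control.Applicative (ZipList (..))

-- `ZipList` has an `Applicative` instance but no `Monad` instance, so this
-- `do` block only type-checks because ApplicativeDo desugars it to (<$>)/(<*>).
paired :: ZipList (Int, Bool)
paired = do
    x <- ZipList [1, 2, 3]
    y <- ZipList [True, False, True]
    pure (x, y)

-- The same value written with the operators directly.
pairedWithOperators :: ZipList (Int, Bool)
pairedWithOperators = (,) <$> ZipList [1, 2, 3] <*> ZipList [True, False, True]

main :: IO ()
main = do
    print (getZipList paired)
    print (getZipList paired == getZipList pairedWithOperators)
```

The desugaring only stays Applicative when no bound variable is used in a later statement's right-hand side and the block ends in return or pure.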

#### Conclusion

The obvious advantage is that you get the full power of the Haskell ecosystem. You can transform input to output using arbitrary Haskell code. You also get the benefit of the strong type system, so if you need extra assurance for critical calculations you can build that into your program.

The big disadvantage is that you have to relaunch the application in order to change the code. The library does not support live code reloading. This is technically feasible but requires substantially more work and would make the application much less portable.

If you follow my previous work this is very similar to a previous post of mine on spreadsheet-like programming in Haskell. This library simplifies many of the types and ideas from that previous post and packages them in a polished library.

If you would like to contribute to this library there are two main ways that you can help:

• Adding new types of input controls
• Adding new backends for output (like diagrams)

If you would like to use this code you can find the library on Github or Hackage.

## Sunday, October 18, 2015

### Explicit is better than implicit

Many of the limitations associated with Haskell type classes can be solved very cleanly with lenses. This lens-driven programming is more explicit but significantly more general and (in my opinion) easier to use.

All of these examples will work with any lens-like library, but I will begin with the lens-simple library to provide simpler types with better type inference and better type errors and then later transition to the lens library which has a larger set of utilities.

#### Case study #1 - fmap bias

Let's begin with a simple example - the Functor instance for Either:

fmap (+ 1) (Right 2   ) = Right 3

fmap (+ 1) (Left "Foo") = Left "Foo"

Some people object to this instance because it's biased to Right values. The only way we can use fmap to transform Left values is to wrap Either in a newtype.

These same people would probably like the lens-simple library which provides an over function that generalizes fmap. Instead of using the type to infer what to transform we can explicitly specify what we wish to transform by supplying _Left or _Right:

\$ stack install lens-simple --resolver=lts-3.9
\$ stack ghci --resolver=lts-3.9
>>> import Lens.Simple
>>> over _Right (+ 1) (Right 2)
Right 3
>>> over _Right (+ 1) (Left "Foo")
Left "Foo"
>>> over _Left (++ "!") (Right 2)
Right 2
>>> over _Left (++ "!") (Left "Foo")
Left "Foo!"

The inferred types are exactly what we would expect:

>>> :type over _Right
over _Right :: (b -> b') -> Either a b -> Either a b'
>>> :type over _Left
over _Left :: (b -> b') -> Either b b1 -> Either b' b1

Same thing for tuples. fmap only lets us transform the second value of a tuple, but over lets us specify which one we want to transform:

>>> over _1 (+ 1)    (2, "Foo")
(3,"Foo")
>>> over _2 (++ "!") (2, "Foo")
(2,"Foo!")

We can even transform both of the values in the tuple if they share the same type:

>>> over both (+ 1) (3, 4)
(4,5)

Again, the inferred types are exactly what we expect:

>>> :type over _2
over _2 :: (b -> b') -> (a, b) -> (a, b')
>>> :type over _1
over _1 :: (b -> b') -> (b, b1) -> (b', b1)
>>> :type over both
over both :: (b -> b') -> (b, b) -> (b', b')
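
To make the idea of "explicitly specifying what to transform" more concrete, here is a minimal sketch of how a function like over can be written using only base. The primed names are hypothetical stand-ins and deliberately simplified. They are not the definitions that lens-simple or lens actually ship:

```haskell
import Data.Functor.Identity (Identity (..))

-- A simplified setter-style `over`; the primed names are hypothetical
-- stand-ins, not the definitions used by lens-simple or lens.
over' :: ((a -> Identity b) -> s -> Identity t) -> (a -> b) -> s -> t
over' optic f = runIdentity . optic (Identity . f)

-- Focus on the first component of a pair.
first' :: Functor f => (a -> f b) -> (a, c) -> f (b, c)
first' f (a, c) = fmap (\b -> (b, c)) (f a)

-- Focus on the `Left` case of an `Either`, leaving `Right` untouched.
left' :: Applicative f => (a -> f b) -> Either a c -> f (Either b c)
left' f (Left a)  = fmap Left (f a)
left' _ (Right c) = pure (Right c)

main :: IO ()
main = do
    print (over' first' (+ 1) (2 :: Int, "Foo"))
    print (over' left' (++ "!") (Left "Foo" :: Either String Int))
    print (over' left' (++ "!") (Right 2 :: Either String Int))
```

The optic is an ordinary function argument, which is why the choice of target lives at the term level rather than in a type class instance.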

#### Case study #2 - length confusion

Many people have complained about the tuple instance for Foldable, which gives weird behavior like this in ghc-7.10 or later:

>>> length (3, 4)
1

We could eliminate all confusion by specifying what we intend to count at the term level instead of the type level:

>>> lengthOf _2   (3, 4)
1
>>> lengthOf both (3, 4)
2

This works for Either, too:

>>> lengthOf _Right (Right 1)
1
>>> lengthOf _Right (Left "Foo")
0
>>> lengthOf _Left  (Right 1)
0
>>> lengthOf _Left  (Left "Foo")
1

... and this trick is not limited to length. We can improve any Foldable function by taking a lens instead of a type class constraint:

>>> sumOf both (3, 4)
7
>>> mapMOf_ both print (3, 4)
3
4
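
A counting function in this style can be sketched in a few lines of base-only Haskell. This is an illustrative simplification, assuming a base version that provides Data.Functor.Const. The names `lengthOf'`, `both'`, and `second'` are hypothetical and not the library implementations:

```haskell
import Data.Functor.Const (Const (..))
import Data.Monoid (Sum (..))

-- Count the targets of an optic by mapping each one to `Sum 1`.
-- `lengthOf'` is a hypothetical simplified version, not the library's code.
lengthOf' :: ((a -> Const (Sum Int) a) -> s -> Const (Sum Int) s) -> s -> Int
lengthOf' optic = getSum . getConst . optic (\_ -> Const (Sum 1))

-- Visit both components of a homogeneous pair.
both' :: Applicative f => (a -> f b) -> (a, a) -> f (b, b)
both' f (x, y) = (,) <$> f x <*> f y

-- Visit only the second component of a pair.
second' :: Functor f => (b -> f c) -> (a, b) -> f (a, c)
second' f (a, b) = fmap (\c -> (a, c)) (f b)

main :: IO ()
main = do
    print (lengthOf' both'   (3 :: Int, 4 :: Int))
    print (lengthOf' second' (3 :: Int, 4 :: Int))
```

Swapping the optic changes what gets counted, while the counting function itself stays the same.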

#### Case study #3 - Monomorphic containers

fmap doesn't work on ByteString because ByteString is not a type constructor and has no type parameter that we can map over. Some people use the mono-foldable or mono-traversable packages to solve this problem, but I prefer to use lenses. These examples will require the lens library which has more batteries included.

For example, if I want to transform each character of a Text value I can use the text optic:

\$ stack install lens --resolver=lts-3.9  # For `text` optics
\$ stack ghci --resolver=lts-3.9
>>> import Control.Lens
>>> import Data.Text.Lens
>>> import qualified Data.Text as Text
>>> let example = Text.pack "Hello, world!"
>>> over text succ example
"Ifmmp-!xpsme\""

I can use the same optic to loop over each character:

>>> mapMOf_ text print example
'H'
'e'
'l'
'l'
'o'
','
' '
'w'
'o'
'r'
'l'
'd'
'!'

There are optics for ByteStrings, too:

>>> import Data.ByteString.Lens
>>> import qualified Data.ByteString as ByteString
>>> let example2 = ByteString.pack [0, 1, 2]
>>> mapMOf_ bytes print example2
0
1
2

The lens approach has one killer feature over mono-foldable and mono-traversable, which is that you can be explicit about what exactly you want to map over. For example, suppose that I want to loop over the bits of a ByteString instead of the bytes. Then I can just provide an optic that points to the bits and everything "just works":

>>> import Data.Bits.Lens
>>> mapMOf_ (bytes . bits) print example2
False
False
False
False
False
False
False
False
True
False
False
False
False
False
False
False
False
True
False
False
False
False
False
False

The mono-traversable or mono-foldable packages do not let you specify what you want to loop over. Instead, the MonoFoldable and MonoTraversable type classes guess what you want the elements to be, and if they guess wrong then you are out of luck.

#### Conclusion

Here are some more examples to illustrate how much more powerful and general the lens approach is than the type class approach:

>>> lengthOf (bytes . bits) example2
24
>>> sumOf (both . _1) ((2, 3), (4, 5))
6
>>> mapMOf_ (_Just . _Left) print (Just (Left 4))
4
>>> over (traverse . _Right) (+ 1) [Left "Foo", Right 4, Right 5]
[Left "Foo",Right 5,Right 6]

Once you get used to this style of programming you begin to prefer specifying things at the term level instead of relying on type inference or wrangling with newtypes.

## Wednesday, October 7, 2015

The Haskell community self-selects for people interested in unique things that Haskell can do that other languages cannot do. Consequently, a large chunk of Haskell example code in the wild uses advanced idioms (and I'm guilty of that, too).

So I would like to swing the pendulum in the other direction by just writing five small but useful programs without any imports, language extensions, or advanced features. These are programs that you could write in any other language and that's the point: you can use Haskell in the same way that you use other languages.

They won't be the prettiest or most type-safe programs, but that's okay.

#### Example #1: TODO program

This program is an interactive TODO list:

putTodo :: (Int, String) -> IO ()
putTodo (n, todo) = putStrLn (show n ++ ": " ++ todo)

prompt :: [String] -> IO ()
prompt todos = do
    putStrLn ""
    putStrLn "Current TODO list:"
    mapM_ putTodo (zip [0..] todos)
    command <- getLine
    interpret command todos

interpret :: String -> [String] -> IO ()
interpret ('+':' ':todo) todos = prompt (todo:todos)
interpret ('-':' ':num ) todos =
    case delete (read num) todos of
        Nothing -> do
            putStrLn "No TODO entry matches the given number"
            prompt todos
        Just todos' -> prompt todos'
interpret  "q"           todos = return ()
interpret  command       todos = do
    putStrLn ("Invalid command: `" ++ command ++ "`")
    prompt todos

delete :: Int -> [a] -> Maybe [a]
delete 0 (_:as) = Just as
delete n (a:as) = do
    as' <- delete (n - 1) as
    return (a:as')
delete _  []    = Nothing

main = do
    putStrLn "Commands:"
    putStrLn "+ <String> - Add a TODO entry"
    putStrLn "- <Int>    - Delete the numbered entry"
    putStrLn "q          - Quit"
    prompt []

Example usage:

\$ runghc todo.hs
Commands:
+ <String> - Add a TODO entry
- <Int>    - Delete the numbered entry
q          - Quit

Current TODO list:
+ Go to bed

Current TODO list:
0: Go to bed

Current TODO list:
1: Go to bed
+ Shampoo the hamster

Current TODO list:
0: Shampoo the hamster
2: Go to bed
- 1

Current TODO list:
0: Shampoo the hamster
1: Go to bed
q

#### Example #2 - Rudimentary TSV to CSV

The following program transforms tab-separated standard input into comma-separated standard output. The program does not handle more complex cases like quoting and is not standards-compliant:

tabToComma :: Char -> Char
tabToComma '\t' = ','
tabToComma  c   = c

main = interact (map tabToComma)

Example usage:

\$ cat file.tsv
1   2   3
4   5   6
\$ cat file.tsv | runghc tsv2csv.hs
1,2,3
4,5,6
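
If quoting matters, one possible extension is sketched below. It keeps to the "no imports" spirit of the post and assumes the common CSV convention of wrapping a field in double quotes when it contains a comma or a quote, and doubling any embedded quotes. The helper names are invented for this sketch, and it still does not try to interpret any escaping used by the TSV input:

```haskell
-- Split a string on a separator character (no imports needed).
splitOn :: Char -> String -> [String]
splitOn sep str = case break (== sep) str of
    (field, [])       -> [field]
    (field, _ : rest) -> field : splitOn sep rest

-- Wrap a field in double quotes when it contains a comma or a quote,
-- doubling any embedded quotes.
quoteField :: String -> String
quoteField field
    | any (`elem` ",\"") field = "\"" ++ concatMap escape field ++ "\""
    | otherwise                = field
  where
    escape '"' = "\"\""
    escape c   = [c]

-- Join fields with a separator character.
joinWith :: Char -> [String] -> String
joinWith _   []       = ""
joinWith _   [x]      = x
joinWith sep (x : xs) = x ++ [sep] ++ joinWith sep xs

-- Convert one tab-separated line into one comma-separated line.
tsvLineToCsv :: String -> String
tsvLineToCsv line = joinWith ',' (map quoteField (splitOn '\t' line))

main :: IO ()
main = interact (unlines . map tsvLineToCsv . lines)
```

Plain input such as the file above converts exactly as before. Only fields that contain a comma or a double quote are treated differently.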

#### Example #3 - Calendar

This program prints a calendar for 2015:

data DayOfWeek
    = Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday
    deriving (Eq, Enum, Bounded)

data Month
    = January | February | March     | April   | May      | June
    | July    | August   | September | October | November | December
    deriving (Enum, Bounded, Show)

next :: (Eq a, Enum a, Bounded a) => a -> a
next x | x == maxBound = minBound
       | otherwise     = succ x

pad day = case show day of
    [c] -> [' ', c]
    cs  -> cs

month :: Month -> DayOfWeek -> Int -> String
month m startDay maxDay = show m ++ " 2015\n" ++ week ++ spaces Sunday
  where
    week = "Su Mo Tu We Th Fr Sa\n"

    spaces currDay | startDay == currDay = days startDay 1
                   | otherwise           = "   " ++ spaces (next currDay)

    days Sunday    n | n > maxDay = "\n"
    days _         n | n > maxDay = "\n\n"
    days Saturday  n              = pad n ++ "\n" ++ days  Sunday    (succ n)
    days day       n              = pad n ++ " "  ++ days (next day) (succ n)

year = month January   Thursday  31
    ++ month February  Sunday    28
    ++ month March     Sunday    31
    ++ month April     Wednesday 30
    ++ month May       Friday    31
    ++ month June      Monday    30
    ++ month July      Wednesday 31
    ++ month August    Saturday  31
    ++ month September Tuesday   30
    ++ month October   Thursday  31
    ++ month November  Sunday    30
    ++ month December  Tuesday   31

main = putStr year

Example usage:

\$ runghc calendar.hs
January 2015
Su Mo Tu We Th Fr Sa
1  2  3
4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

February 2015
Su Mo Tu We Th Fr Sa
1  2  3  4  5  6  7
8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28

March 2015
Su Mo Tu We Th Fr Sa
1  2  3  4  5  6  7
8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31

April 2015
Su Mo Tu We Th Fr Sa
1  2  3  4
5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30

May 2015
Su Mo Tu We Th Fr Sa
1  2
3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31

June 2015
Su Mo Tu We Th Fr Sa
1  2  3  4  5  6
7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30

July 2015
Su Mo Tu We Th Fr Sa
1  2  3  4
5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31

August 2015
Su Mo Tu We Th Fr Sa
1
2  3  4  5  6  7  8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31

September 2015
Su Mo Tu We Th Fr Sa
1  2  3  4  5
6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30

October 2015
Su Mo Tu We Th Fr Sa
1  2  3
4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

November 2015
Su Mo Tu We Th Fr Sa
1  2  3  4  5  6  7
8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30

December 2015
Su Mo Tu We Th Fr Sa
1  2  3  4  5
6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31

#### Example #4 - Decode RNA

This program converts an RNA sequence read from standard input into the equivalent sequence of amino acids written to standard output, using the genetic code:

data RNA = A | U | C | G
    deriving (Read)

data AminoAcid
    = Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu
    | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr
    | Stop
    deriving (Show)

decode :: RNA -> RNA -> RNA -> AminoAcid
decode U U U = Phe
decode U U C = Phe
decode U U A = Leu
decode U U G = Leu
decode U C _ = Ser
decode U A U = Tyr
decode U A C = Tyr
decode U A A = Stop
decode U A G = Stop
decode U G U = Cys
decode U G C = Cys
decode U G A = Stop
decode U G G = Trp
decode C U _ = Leu
decode C C _ = Pro
decode C A U = His
decode C A C = His
decode C A A = Gln
decode C A G = Gln
decode C G _ = Arg
decode A U U = Ile
decode A U C = Ile
decode A U A = Ile
decode A U G = Met
decode A C _ = Thr
decode A A U = Asn
decode A A C = Asn
decode A A A = Lys
decode A A G = Lys
decode A G U = Ser
decode A G C = Ser
decode A G A = Arg
decode A G G = Arg
decode G U _ = Val
decode G C _ = Ala
decode G A U = Asp
decode G A C = Asp
decode G A A = Glu
decode G A G = Glu
decode G G _ = Gly

decodeAll :: [RNA] -> [AminoAcid]
decodeAll (a:b:c:ds) = decode a b c : decodeAll ds
decodeAll  _         = []

main = do
    str <- getContents
    let rna :: [RNA]
        rna = map (\c -> read [c]) str

    let aminoAcids :: [AminoAcid]
        aminoAcids = decodeAll rna

    putStrLn (concatMap show aminoAcids)

Example usage:

\$ echo "ACAUGUCAGUACGUAGCUAC" | runghc decode.hs
ThrCysGlnTyrValAlaThr

#### Example #5 - Bedtime story generator

This generates a "random" bedtime story:

stories :: [String]
stories = do
    let str0 = "There once was "

    str1 <- ["a princess ", "a cat ", "a little boy "]

    let str2 = "who lived in "

    str3 <- ["a shoe.", "a castle.", "an enchanted forest."]

    let str4 = "  They found a "

    str5 <- ["giant ", "frog ", "treasure chest " ]

    let str6 = "while "

    str7 <- ["when they got lost ", "while strolling along "]

    let str8 = "and immediately regretted it.  Then a "

    str9 <- ["lumberjack ", "wolf ", "magical pony "]

    let str10 = "named "

    str11 <- ["Pinkie Pie ", "Little John ", "Boris "]

    let str12 = "found them and "

    str13 <- ["saved the day.", "granted them three wishes."]

    let str14 = "  The end."

    return (  str0
           ++ str1
           ++ str2
           ++ str3
           ++ str4
           ++ str5
           ++ str6
           ++ str7
           ++ str8
           ++ str9
           ++ str10
           ++ str11
           ++ str12
           ++ str13
           ++ str14
           )

main = do
    let len = length stories
    putStrLn ("Enter a number from 0 to " ++ show (len - 1))
    n <- readLn
    putStrLn ""
    putStrLn (stories !! n)

Example usage:

\$ runghc story.hs
Enter a number from 0 to 971
238

There once was a princess who lived in an enchanted forest.  They found a giant
while while strolling along and immediately regretted it.  Then a lumberjack
named Boris found them and saved the day.  The end.
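
The upper bound of 971 comes from how do notation works for lists: each `<-` picks every element of its list in turn, so the result contains one story per combination of choices (3 × 3 × 3 × 2 × 3 × 3 × 2 = 972). A cut-down sketch with only two choice points, using the made-up name `miniStories`, shows the same behavior:

```haskell
-- A cut-down version of `stories` with two choice points.
miniStories :: [String]
miniStories = do
    who   <- ["a princess ", "a cat ", "a little boy "]
    place <- ["a shoe.", "a castle."]
    return ("There once was " ++ who ++ "who lived in " ++ place)

main :: IO ()
main = do
    print (length miniStories)   -- 3 * 2 = 6 combinations
    putStrLn (miniStories !! 3)  -- later choices vary fastest
```

The later choice varies fastest, so index 3 selects the second subject with the second place.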

#### Conclusion

If you would like to contribute a simple example of your own, try sharing a paste of your program under the #Haskell #BackToBasics hashtags.

## Friday, October 2, 2015

### Polymorphism for dummies

This tutorial explains how polymorphism is implemented under the hood in Haskell using the least technical terms possible.

The simplest example of a polymorphic function is the identity function:

id :: a -> a
id x = x

The identity function works on any type of value. For example, I can apply id to an Int or to a String:

\$ ghci
Prelude> id 4
4
Prelude> id "Test"
"Test"

Under the hood, the id function actually takes two arguments, not one.

-- Under the hood:

id :: forall a . a -> a
id @a x = x

The first argument of id (the @a) is the same as the a in the type signature of id. The type of the second argument (x) can refer to the value of the first argument (the @a).

If you don't believe me, you can prove this yourself by just taking the following module:

module Id where

import Prelude hiding (id)

id :: a -> a
id x = x

... and ask ghc to output the low-level "core" representation of the above id function:

\$ ghc -ddump-simpl id.hs
[1 of 1] Compiling Id               ( id.hs, id.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 4, types: 5, coercions: 0}

Id.id :: forall a_apw. a_apw -> a_apw
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]
Id.id = \ (@ a_aHC) (x_apx :: a_aHC) -> x_apx

The key part is the last line, which if you clean up looks like this:

id = \(@a) (x :: a) -> x

ghc prefixes types with @ when using them as function arguments.

In other words, every time we "generalize" a function (i.e. make it more polymorphic), we add a new hidden argument to that function corresponding to the polymorphic type.

#### Specialization

We can "specialize" id to a narrower type that is less polymorphic (a.k.a. "monomorphic"):

idString :: String -> String
idString = id

Under the hood, what actually happened was that we applied the id function to the String type, like this:

-- Under the hood:

idString :: String -> String
idString = id @String

We can prove this ourselves by taking this module:

module Id where

import Prelude hiding (id)

id :: a -> a
id x = x

idString :: String -> String
idString = id

... and studying the core that this module generates:

\$ ghc -ddump-simpl id.hs
[1 of 1] Compiling Id               ( id.hs, id.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 6, types: 8, coercions: 0}

Id.id :: forall a_apx. a_apx -> a_apx
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]
Id.id = \ (@ a_aHL) (x_apy :: a_aHL) -> x_apy

Id.idString :: GHC.Base.String -> GHC.Base.String
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]
Id.idString = Id.id @ GHC.Base.String

If we clean up the last line, we get:

idString = id @String

In other words, every time we "specialize" a function (i.e. make it less polymorphic), we apply it to a hidden type argument. Again, ghc prefixes types with @ when using them as values.

So back in the REPL, when we ran code like this:

>>> id 4
4
>>> id "Test"
"Test"

... ghc was implicitly inserting hidden type arguments for us, like this:

>>> id @Int 4
4
>>> id @String "Test"
"Test"

#### Conclusion

That's it! There's really nothing more to it than that.

The general trick for passing around type parameters as ordinary function arguments was first devised as part of System F.

This is the same way that many other languages encode polymorphism (like ML, Idris, Agda, Coq) except that some of them use a more general mechanism. However, the basic principles are the same:

• When you make something more polymorphic you add hidden type arguments to your function
• When you make something less polymorphic you apply your function to type values

In ghc-8.0, you will be allowed to explicitly provide type arguments yourself if you prefer. For example, consider the read function:

read :: Read a => String -> a

If you wanted to specify what type to read without a type annotation, you could provide the type you desired as an argument to read:

read @Int :: String -> Int

Here are some other examples:

[] @Int :: [Int]

show @Int :: Int -> String

Previously, the only way to pass a type as a value was to use the following Proxy type:

data Proxy a = Proxy

... which reified a type as a value. Now you will be able to specialize functions by providing the type argument directly instead of adding a type annotation.
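As a rough sketch of the difference, the following program reads the same string three ways. It assumes ghc-8.0 or later with the TypeApplications extension enabled, and readAs is a hypothetical helper invented for this illustration, not a library function:

```haskell
{-# LANGUAGE TypeApplications #-}

import Data.Proxy (Proxy (..))

-- Hypothetical helper: the Proxy argument carries no data. It exists only
-- to tell the compiler which type to read.
readAs :: Read a => Proxy a -> String -> a
readAs _ = read

main :: IO ()
main = do
  -- Older style: reify the type as a Proxy value.
  print (readAs (Proxy :: Proxy Int) "42")
  -- ghc-8.0 style: pass the type argument directly with @.
  print (read @Int "42")
  -- Type annotation style, for comparison.
  print (read "42" :: Int)
```

All three lines read the string as an Int; only the way the type is communicated to the compiler differs.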

## Thursday, September 17, 2015

This post summarizes a few tips that increase the readability of Haskell code in my anecdotal experience. Each guideline will have a corresponding rationale.

Do not take this post to mean that all Haskell code should be written this way. These are guidelines for code that you wish to use as expository examples to introduce people to the language without scaring them with unfamiliar syntax.

#### Rule #1: Don't use (\$)

This is probably the most controversial guideline but I believe this is the recommendation which has the highest impact on readability.

A typical example of this issue is something like the following code:

print \$ even \$ 4 * 2

... which is equivalent to this code:

print (even (4 * 2))

The biggest issue with the dollar sign is that most people will not recognize it as an operator! There is no precedent for using the dollar sign as an operator in any other language. Indeed, the vast majority of developers program in languages that do not support adding new operators at all, such as Javascript, Java, C++, or Python, so you cannot reasonably expect them to immediately recognize that the dollar symbol is an operator.

This then leads people to believe that the dollar sign is some sort of built-in language syntax, which in turn convinces them that Haskell's syntax is needlessly complex and optimized for being terse over readable. This perception is compounded by the fact that the most significant use of the dollar symbol outside of Haskell is in Perl (a language notorious for being write-only).

Suppose that they do recognize that the symbol represents an operator. They still cannot guess at what the operator means. There is no obvious mental connection between a symbol used for currency and function application. There is also no prior art for this connection outside of the Haskell language.

Even if a newcomer is lucky enough to guess that the symbol represents function application, it's still ambiguous because they cannot tell if the symbol is left- or right-associative. Even people who do actually take the time to learn Haskell in more depth have difficulty understanding how (\$) behaves and frequently confuse it with the composition operator, (.). If people earnestly learning the language have difficulty understanding (\$), what chance do skeptics have?

By this point you've already lost many people who might have been potentially interested in the language, and for what? The dollar sign does not even shorten the expression.

#### Rule #2: Use operators sparingly

Rule #1 is a special case of Rule #2.

My rough guideline for which operators to use is that associative operators are okay, and all other operators are not okay.

Okay:

• (.)
• (+) / (*)
• (&&) / (||)
• (++)

Not okay:

• (<\$>) / (<*>) - Use liftA{n} or ApplicativeDo in the future
• (^.) / (^..) / (%~) / (.~) - Use view / toListOf / over / set instead

You don't have to agree with me on the specific operators to keep or reject. The important part is just using them more sparingly when teaching Haskell.

The issues with operators are very similar in principle to the issue with the dollar sign:

• They are not recognizable as operators to some people, especially if they have no equivalent in other languages
• Their meaning is not immediately obvious
• Their precedence and fixity are not obvious, particularly for Haskell-specific operators

The main reason I slightly prefer associative operators is that their fixity does not matter and they usually have prior art outside the language as commonly used mathematical operators.
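To illustrate the trade-off, here is a small sketch that writes the same computation once with Applicative operators and once with the named function liftA2. The names sumWithOperators and sumWithLiftA2 are made up for this example:

```haskell
import Control.Applicative (liftA2)

-- Operator style: the reader must already know (<$>) and (<*>).
sumWithOperators :: Maybe Int
sumWithOperators = (+) <$> Just 1 <*> Just 2

-- Named-function style: the same computation using liftA2.
sumWithLiftA2 :: Maybe Int
sumWithLiftA2 = liftA2 (+) (Just 1) (Just 2)

main :: IO ()
main = do
  print sumWithOperators
  print sumWithLiftA2
```

Both definitions produce the same value; the second one gives a newcomer a name they can search for.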

#### Rule #3: Use do notation generously

Prefer do notation over (>>=) or fmap when available, even if it makes your code a few lines longer. People won't reject a language on the basis of verbosity (Java and Go are living proof of that), but they will reject languages on the basis of unfamiliar operators or functions.

This means that instead of writing this:

example = getLine >>= putStrLn . (++ "!")

You instead write something like this:

example = do
    str <- getLine
    putStrLn (str ++ "!")

If you really want a one-liner you can still use do notation, just by adding a semicolon:

example = do str <- getLine; putStrLn (str ++ "!")

do notation and semicolons are immediately recognizable to outsiders because they resemble subroutine syntax and in the most common case (IO) it is in fact subroutine syntax.

A corollary of this is to use the newly added ApplicativeDo extension, which was recently merged into the GHC mainline and will be available in the next GHC release. I believe that ApplicativeDo will be more readable to outsiders than the (<\$>) and (<*>) operators.
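Here is a minimal sketch of what that could look like, assuming a GHC release that ships the ApplicativeDo extension. It uses ZipList from base because ZipList is an Applicative but not a Monad, so the do block below only type-checks thanks to the extension. The names pairsWithOperators and pairsWithDo are illustrative:

```haskell
{-# LANGUAGE ApplicativeDo #-}

import Control.Applicative (ZipList (..))

-- Operator style.
pairsWithOperators :: ZipList (Int, Char)
pairsWithOperators = (,) <$> ZipList [1, 2, 3] <*> ZipList "abc"

-- The same value using do notation. The two binds are independent of each
-- other, so the extension desugars them to Applicative operations.
pairsWithDo :: ZipList (Int, Char)
pairsWithDo = do
  n <- ZipList [1, 2, 3]
  c <- ZipList "abc"
  pure (n, c)

main :: IO ()
main = do
  print (getZipList pairsWithOperators)
  print (getZipList pairsWithDo)
```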

#### Rule #4: Don't use lenses

Don't get me wrong: I'm one of the biggest advocates for lenses and I think they firmly belong as a mainstream Haskell idiom. However, I don't feel they are appropriate for beginners.

The biggest issues are that:

• It's difficult to explain to beginners how lenses work
• They require Template Haskell or boilerplate lens definitions
• They require separate names for function accessors and lenses, and one or the other is bound to look ugly as a result
• They lead to poor inferred types and error messages, even when using the more monomorphic versions in lens-family-core

Lenses are wonderful, but there's no hurry to teach them. There are already plenty of uniquely amazing things about the Haskell language worth learning before even mentioning lenses.

#### Rule #5: Use where and let generously

Resist the temptation to write one giant expression spanning multiple lines. Break it up into smaller sub-expressions each defined on their own line using either where or let.

This rule exists primarily to ease imperative programmers into functional programming. These programmers are accustomed to frequent visual "punctuation" in the form of statement boundaries when reading code. let and where visually simulate decomposing a larger program into smaller "statements" even if they are really sub-expressions and not statements.
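As an illustration, here is the same calculation written twice: once as a single large expression and once broken up with where. The function names are invented for this sketch:

```haskell
-- One giant expression: correct, but hard to scan.
stdDevTerse :: [Double] -> Double
stdDevTerse xs =
  sqrt (sum (map (\x -> (x - sum xs / fromIntegral (length xs)) ^ 2) xs) / fromIntegral (length xs))

-- The same calculation with each sub-expression named on its own line.
stdDev :: [Double] -> Double
stdDev xs = sqrt variance
  where
    count    = fromIntegral (length xs)
    mean     = sum xs / count
    squares  = map (\x -> (x - mean) ^ 2) xs
    variance = sum squares / count

main :: IO ()
main = do
  print (stdDevTerse [2, 4, 4, 4, 5, 5, 7, 9])
  print (stdDev [2, 4, 4, 4, 5, 5, 7, 9])
```

The second version reads top to bottom like a sequence of statements, even though every line is still just a sub-expression.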

#### Rule #6: Use point-free style sparingly

Every Haskell programmer goes through a phase where they try to see if they can eliminate all variable names. Spoiler alert: you always can, but this just makes the code terse and unreadable.

For example, I'll be damned if I know what this means without some careful thought and some pen and paper:

((==) <*>)

... but I can tell at a glance what this equivalent expression does:

\f x -> x == f x

This is a real example, by the way.

There's no hard and fast rule for where to draw the line, but when in doubt err on the side of being less point-free.
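If you want to convince yourself that the two expressions above really are the same function, a small sketch like this one lets you try them side by side. The names isFixedPointTerse and isFixedPoint are hypothetical:

```haskell
-- Point-free version, using the Applicative instance for functions.
isFixedPointTerse :: Eq a => (a -> a) -> a -> Bool
isFixedPointTerse = ((==) <*>)

-- Pointful version: asks whether applying f leaves x unchanged.
isFixedPoint :: Eq a => (a -> a) -> a -> Bool
isFixedPoint f x = x == f x

main :: IO ()
main = do
  print (isFixedPointTerse abs 3, isFixedPoint abs 3)
  print (isFixedPointTerse abs (-3), isFixedPoint abs (-3))
```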

#### Conclusion

That's it! Those six simple rules go a very long way towards improving the readability of Haskell code to outsiders.

Haskell is actually a supremely readable language once you familiarize yourself with the prevailing idioms, thanks to:

• purity
• minimization of needless side effects and state

However, we should make an extra effort to make our code readable even to complete outsiders with absolutely no familiarity or experience with the language. The entry barrier is one of the most widely cited criticisms of the language and I believe that a simple and clean coding style can lower that barrier.

## Monday, August 31, 2015

### State of the Haskell ecosystem - August 2015

Note: This went out as an RFC draft a few weeks ago, which is now a live wiki. See the Conclusions section at the end for more details.

In this post I will describe the current state of the Haskell ecosystem to the best of my knowledge and its suitability for various programming domains and tasks. The purpose of this post is to discuss both the good and the bad by advertising where Haskell shines while highlighting where I believe there is room for improvement.

This post is grouped into two sections: the first section covers Haskell's suitability for particular programming application domains (e.g. servers, games, or data science) and the second section covers Haskell's suitability for common general-purpose programming needs (such as testing, IDEs, or concurrency).

The topics are roughly sorted from greatest strengths to greatest weaknesses. Each programming area will also be summarized by a single rating of either:

• Best in class: the best experience in any language
• Mature: suitable for most programmers
• Immature: only acceptable for early-adopters

The more positive the rating the more I will support the rating with success stories in the wild. The more negative the rating the more I will offer constructive advice for how to improve things.

Disclaimer #1: I obviously don't know everything about the Haskell ecosystem, so whenever I am unsure I will make a ballpark guess and clearly state my uncertainty in order to solicit opinions from others who have more experience. I keep tabs on the Haskell ecosystem pretty well, but even this post is stretching my knowledge. If you believe any of my ratings are incorrect, I am more than happy to accept corrections (both upwards and downwards).

Disclaimer #2: There are some "Educational resource" sections below which are remarkably devoid of books, since I am not as familiar with textbook-related resources. If you have suggestions for textbooks to add, please let me know.

Disclaimer #3: I am very obviously a Haskell fanboy if you haven't guessed from the name of my blog and I am also an author of several libraries mentioned below, so I'm highly biased. I've made a sincere effort to honestly appraise the language, but please challenge my ratings if you believe that my bias is blinding me! I've also clearly marked Haskell sales pitches as "Propaganda" in my external link sections. :)

# Application Domains

## Compilers

Rating: Best in class

Haskell is an amazing language for writing your own compiler. If you are writing a compiler in another language you should genuinely consider switching.

Haskell originated in academia, and most languages of academic origin (such as the ML family of languages) excel at compiler-related tasks for obvious reasons. As a result the language has a rich ecosystem of libraries dedicated to compiler-related tasks, such as parsing, pretty-printing, unification, bound variables, syntax tree manipulations, and optimization.

Anybody who has ever written a compiler knows how difficult they are to implement, because by necessity they manipulate very weakly typed data structures (trees and maps of strings and integers). Consequently, there is a huge margin for error in everything a compiler does, from type-checking to optimization to code generation. Haskell knocks this out of the park, though, with a really powerful type system with many extensions that can eliminate large classes of errors at compile time.
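One way to picture this is a syntax tree whose type tracks the type of the expression it represents. The following is only a minimal sketch using the GADTs extension, with a made-up toy language, but it shows how an ill-typed tree such as adding a boolean to an integer is rejected at compile time:

```haskell
{-# LANGUAGE GADTs #-}

-- A toy expression language. The type parameter records what each
-- expression evaluates to.
data Expr a where
  IntLit  :: Int -> Expr Int
  BoolLit :: Bool -> Expr Bool
  Add     :: Expr Int -> Expr Int -> Expr Int
  IsZero  :: Expr Int -> Expr Bool
  If      :: Expr Bool -> Expr a -> Expr a -> Expr a

-- The evaluator needs no runtime type checks or error cases.
eval :: Expr a -> a
eval (IntLit n)       = n
eval (BoolLit b)      = b
eval (Add x y)        = eval x + eval y
eval (IsZero x)       = eval x == 0
eval (If cond yes no) = if eval cond then eval yes else eval no

main :: IO ()
main = print (eval (If (IsZero (IntLit 0)) (Add (IntLit 1) (IntLit 2)) (IntLit 0)))
```

Something like Add (BoolLit True) (IntLit 1) does not type-check, so that class of compiler bug never reaches runtime.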

I also believe that there are many excellent educational resources for compiler writers, both papers and books. I'm not the best person to summarize all the educational resources available, but the ones that I have read have been very high quality.

Finally, there are a large number of parsers and pretty-printers for other languages which you can use to write compilers to or from these languages.

Notable libraries:

Educational resources:

## Server-side programming

Rating: Mature

Haskell's second biggest strength is the back-end, both for web applications and services. The main features that the language brings to the table are:

• Server stability
• Performance
• Ease of concurrent programming
• Excellent support for web standards

The strong type system and polished runtime greatly improve server stability and simplify maintenance. This is the greatest differentiator of Haskell from other backend languages, because it significantly reduces the total-cost-of-ownership. You should expect that you can maintain Haskell-based services with significantly fewer programmers than other languages, even when compared to other statically typed languages.

However, the greatest weakness of server stability is space leaks. The most common solution that I know of is to use ekg (a process monitor) to examine a server's memory stability before deploying to production. The second most common solution is to learn to detect and prevent space leaks with experience, which is not as hard as people think.

Haskell's performance is excellent and currently comparable to Java. Both languages give roughly the same performance in beginner or expert hands, although for different reasons.

Where Haskell shines in usability is the runtime support for the following three features:

• software transactional memory (which differentiates Haskell from Go)
• garbage collection (which differentiates Haskell from Rust)

Many languages support two of the above three features, but Haskell is the only one that I know of that supports all three.

If you have never tried out Haskell's software transactional memory you should really, really, really give it a try, since it eliminates a large number of concurrency logic bugs. STM is far and away the most underestimated feature of the Haskell runtime.

Notable libraries:

• warp / wai - the low-level server and API that all server libraries share, with the exception of snap
• scotty - A beginner-friendly server framework analogous to Ruby's Sinatra
• spock - Lighter than the "enterprise" frameworks, but more featureful than scotty (type-safe routing, sessions, conn pooling, csrf protection, authentication, etc)
• yesod / yesod-* / snap / snap-* / happstack-server / happstack-* - "Enterprise" server frameworks with all the bells and whistles
• servant / servant-* - This server framework might blow your mind
• authenticate / authenticate-* - Shared authentication libraries
• ekg / ekg-* - Haskell service monitoring
• stm - Software-transactional memory

Propaganda:

Educational resources:

## Scripting / Command-line applications

Rating: Mature

Haskell's biggest advantage as a scripting language is that Haskell is the most widely adopted language that supports global type inference. Many languages support local type inference (such as Rust, Go, Java, C#), which means that function argument types and interfaces must be declared but everything else can be inferred. In Haskell, you can omit everything: all types and interfaces are completely inferred by the compiler (with some caveats, but they are minor).
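For instance, the following tiny script contains no type signatures at all, yet it is fully statically checked. The function name differences is just an example:

```haskell
-- No type signatures anywhere: the compiler infers all of them,
-- including the Num constraint on the list elements.
differences xs = zipWith (-) (tail xs) xs

main = print (differences [1, 4, 9, 16])
```

Asking GHCi for the type of differences shows that the compiler worked out a general numeric list type on its own.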

Global type inference gives Haskell the feel of a scripting language while still providing static assurances of safety. Script type safety matters in particular for enterprise environments where glue scripts running with elevated privileges are one of the weakest points in these software architectures.

The second benefit of Haskell's type safety is ease of script maintenance. Many scripts grow out of control as they accrete arcane requirements and once they begin to exceed 1000 LOC they become difficult to maintain in a dynamically typed language. People rarely budget sufficient time to create a sufficiently extensive test suite that exercises every code path for each and every one of their scripts. Having a strong type system is like getting a large number of auto-generated tests for free that exercise all script code paths. Moreover, the type system is more resilient to refactoring than a test suite.

However, the main reason I mark Haskell as mature is that the language is also usable even for simple one-off disposable scripts. These Haskell scripts are comparable in size and simplicity to their equivalent Bash or Python scripts. This lets you easily start small and finish big.

Haskell has one advantage over many dynamic scripting languages, which is that Haskell can be compiled into a native and statically linked binary for distribution to others.

Haskell's scripting libraries are feature complete and provide all the niceties that you would expect from scripting in Python or Ruby, including features such as:

• rich suite of Unix-like utilities
• POSIX support
• light-weight idioms for exception safety and automatic resource disposal

Notable libraries:

Some command-line tools written in Haskell:

Educational resources:

## Numerical programming

Rating: Immature? (Uncertain)

My main experience in this area was from a few years ago doing numerical programming for bioinformatics that involved a lot of vector and matrix manipulation and my rating is largely colored by that experience.

The biggest issues that the ecosystem faces are:

• Really clunky matrix library APIs
• Fickle rewrite-rule-based optimizations

When the optimizations work they are amazing and produce code competitive with C. However, small changes to your code can cause the optimizations to suddenly not trigger and then performance drops off a cliff.

There is one Haskell library that avoids this problem entirely which I believe holds a lot of promise: accelerate generates LLVM and CUDA code at runtime and does not rely on Haskell's optimizer for code generation, which side-steps the problem. accelerate has a large set of supported algorithms that you can find by just checking the library's reverse dependencies:

However, I don't have enough experience with accelerate or enough familiarity with numerical programming success stories in Haskell to vouch for this just yet. If somebody has more experience than me in this regard and can provide evidence that the ecosystem is mature then I might consider revising my rating upward.

Notable libraries:

Propaganda:

Educational Resources:

## Front-end web programming

Rating: Immature

This boils down to Haskell's ability to compile to Javascript. ghcjs is the front-runner, but for a while setting up ghcjs was non-trivial. However, ghcjs appears to be very close to having a polished setup story now that ghc-7.10.2 is out (Source).

One of the distinctive features of ghcjs compared to other competing Haskell-to-Javascript compilers is that a huge number of Haskell libraries work out of the box with ghcjs because it supports most Haskell primitive operations.

I would also like to mention that there are two Haskell-like languages that you should also try out for front-end programming: elm and purescript. These are both used in production today and have equally active maintainers and communities of their own.

Areas for improvement:

• There needs to be a clear story for smooth integration with existing Javascript projects
• There need to be many more educational resources targeted at non-experts explaining how to translate existing front-end programming idioms to Haskell
• There need to be several well-maintained and polished Haskell libraries for front-end programming

Notable libraries:

• reflex-dom - Functional reactive programming library for DOM manipulation

## Distributed programming

Rating: Immature

This is sort of a broad area since I'm using this topic to refer to both distributed computation (for analytics) and distributed service architectures. However, in both regards Haskell is lagging behind its peers.

The JVM, Go, and Erlang have much better support for this sort of thing, particularly in terms of libraries.

There has been a lot of work in replicating Erlang-like functionality in Haskell through the Cloud Haskell project, not just in creating the low-level primitives for code distribution / networking / transport, but also in assembling a Haskell analog of Erlang's OTP. I'm not that familiar with how far progress is in this area, but people who love Erlang should check out Cloud Haskell.

Areas for improvement:

• We need more analytics libraries. Haskell has no analog of scalding or spark. The most we have is just a Haskell wrapper around hadoop
• We need a polished consensus library (i.e. a high quality Raft implementation in Haskell)

Notable libraries:

## Standalone GUI applications

Rating: Immature

Haskell really lags behind the C# and F# ecosystem in this area.

My experience on this is based on several private GUI projects I wrote several years back. Things may have improved since then so if you think my assessment is too negative just let me know.

All Haskell GUI libraries are wrappers around toolkits written in other languages (such as GTK+ or Qt). The last time I checked the gtk bindings were the most comprehensive, best maintained, and had the best documentation.

However, the Haskell bindings to GTK+ have a strongly imperative feel to them. You do almost everything by communicating between callbacks through mutable IORefs. Also, you can't take extensive advantage of Haskell's awesome threading features because the GTK+ runtime is picky about what needs to happen on certain threads. I haven't really seen a Haskell library that takes this imperative GTK+ interface and wraps it in a more idiomatic Haskell API.
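To give a feel for that style without depending on any GUI toolkit, here is a sketch that only simulates it. The handlers onButtonClicked and onRedraw are hypothetical stand-ins for callbacks you would register with a toolkit; no real GTK+ functions are called:

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Stand-in for a "button clicked" handler: it communicates with the rest
-- of the program only by mutating shared state.
onButtonClicked :: IORef Int -> IO ()
onButtonClicked counter = modifyIORef' counter (+ 1)

-- Stand-in for a "redraw" handler: it reads the shared state back.
onRedraw :: IORef Int -> IO ()
onRedraw counter = do
  clicks <- readIORef counter
  putStrLn ("Clicks so far: " ++ show clicks)

main :: IO ()
main = do
  counter <- newIORef 0
  -- A real toolkit's event loop would invoke these; here we call them by hand.
  onButtonClicked counter
  onButtonClicked counter
  onRedraw counter
```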

My impression is that most Haskell programmers interested in applications programming have collectively decided to concentrate their efforts on improving Haskell web applications instead of standalone GUI applications. Honestly, that's probably the right decision in the long run.

Another post that goes into more detail about this topic is this post written by Keera Studios:

Areas for improvement:

• A GUI toolkit binding that is maintained, comprehensive, and easy to use
• Polished GUI interface builders

Notable libraries:

• gtk / glib / cairo / pango - The GTK+ suite of libraries
• wx - wxWidgets bindings
• X11 - X11 bindings
• threepenny-gui - Framework for local apps that use the web browser as the interface
• hsqml - A Haskell binding for Qt Quick, a cross-platform framework for creating graphical user interfaces.
• fltkhs - A Haskell binding to FLTK. Easy install/use, cross-platform, self-contained executables.

Some example applications:

Educational resources:

## Machine learning

Rating: Immature? (Uncertain)

This area has been pioneered almost single-handedly by one person: Mike Izbicki. He maintains the HLearn suite of libraries for machine learning in Haskell.

I have essentially no experience in this area, so I can't really rate it that well. However, I'm pretty certain that I would not rate it mature because I'm not aware of any company successfully using machine learning in Haskell.

For the same reason, I can't really offer constructive advice for areas for improvement.

Notable libraries:

• HLearn-*

## Data science

Rating: Immature

Haskell really lags behind Python and R in this area. Haskell is somewhat usable for data science, but probably not ready for expert use under deadline pressure.

I'll primarily compare Haskell to Python since that's the data science ecosystem that I'm more familiar with. Specifically, I'll compare to the scipy suite of libraries:

The Haskell analog of NumPy is the hmatrix library, which provides Haskell bindings to BLAS, LAPACK. hmatrix's main limitation is that the API is a bit clunky, but all the tools are there.

Haskell's charting story is okay. Probably my main criticism of most charting APIs is that their APIs tend to be large, the types are a bit complex, and they have a very large number of dependencies.

Fortunately, Haskell does integrate into IPython so you can use Haskell within an IPython shell or an online notebook. For example, there is an online "IHaskell" notebook that you can use right now located here:

The closest thing to Python's pandas is the frames library. I haven't used it that much personally so I won't comment on it much other than to link to some tutorials in the Educational Resources section.

I'm not aware of a Haskell analog to SciPy (the library) or sympy. If you know of an equivalent Haskell library then let me know.

One Haskell library that deserves honorable mention here is the diagrams library which lets you produce complex data visualizations very easily if you want something a little bit fancier than a chart. Check out the diagrams project if you have time:

Areas for improvement:

• Smooth user experience and integration across all of these libraries
• Simple types and APIs. The data science programmers I know dislike overly complex or verbose APIs
• Beautiful data visualizations with very little investment

Notable libraries:

## Game programming

Haskell has SDL and OpenGL bindings, which are actually quite good, but that's about it. You're on your own from that point onward. There is not a rich ecosystem of higher-level libraries built on top of those bindings. There is some work in this area, but I'm not aware of anything production quality.

There is also one really fundamental issue with the language, which is garbage collection, which runs the risk of introducing perceptible pauses in gameplay if your heap grows too large.

For this reason I don't see Haskell ever being used for AAA game programming. I suppose you could use Haskell for simpler games that don't require keeping a lot of resources in memory.

Haskell could maybe be used for the scripting layer of a game or to power the backend for an online game, but for rendering or updating an extremely large graph of objects you should probably stick to another language.

The company that has been doing the most to push the envelope for game programming in Haskell is Keera Studios, so if this is an area that interests you then you should follow their blog:

Areas for improvement:

• Improve the garbage collector and benchmark performance with large heap sizes
• Provide higher-level game engines
• Improve distribution of Haskell games on proprietary game platforms

Notable libraries:

## Systems / embedded programming

Rating: Bad / Immature (?) (See description)

Since systems programming is an abused word, I will clarify that I mean programs where speed, memory layout, and latency really matter.

Haskell fares really poorly in this area because:

• The language is garbage collected, so there are no latency guarantees
• Executable sizes are large
• Memory usage is difficult to constrain (thanks to space leaks)
• Haskell has a large and unavoidable runtime, which means you cannot easily embed Haskell within larger programs
• You can't easily predict what machine code Haskell code will compile to

Typically people approach this problem from the opposite direction: they write the low-level parts in C or Rust and then write Haskell bindings to the low-level code.

It's worth noting that there is an alternative approach: strongly typed Haskell DSLs that generate low-level code at runtime. This is the approach championed by the company Galois.
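The general idea can be sketched in a few lines. This toy example is not the API of atom, ivory, or copilot, and it is far less strongly typed than those libraries; the Expr type and the compile functions are invented here just to show a Haskell value being turned into C source text when the Haskell program runs:

```haskell
import Data.List (intercalate)

-- A toy arithmetic DSL over C ints.
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr

-- Render an expression as fully parenthesized C.
compileExpr :: Expr -> String
compileExpr (Lit n)   = show n
compileExpr (Var v)   = v
compileExpr (Add x y) = "(" ++ compileExpr x ++ " + " ++ compileExpr y ++ ")"
compileExpr (Mul x y) = "(" ++ compileExpr x ++ " * " ++ compileExpr y ++ ")"

-- Wrap an expression in a C function definition with int parameters.
compileFunction :: String -> [String] -> Expr -> String
compileFunction name args body =
  "int " ++ name ++ "(" ++ intercalate ", " (map ("int " ++) args) ++ ") { return "
    ++ compileExpr body ++ "; }"

main :: IO ()
main = putStrLn (compileFunction "scale" ["x"] (Add (Mul (Var "x") (Lit 2)) (Lit 1)))
```

The Haskell program never runs on the embedded device; only the generated C does.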

Notable libraries:

• atom / ivory - DSL for generating embedded programs
• copilot - Stream DSL that generates C code
• improve - High-assurance DSL for embedded code that generates C and Ada

Educational resources:

## Mobile apps

This greatly lags behind using the language that is natively supported by the mobile platform (i.e. Java for Android or Objective-C / Swift for iOS).

I don't know a whole lot about this area, but I'm definitely sure it is far from mature. All I can do is link to the resources I know of for Android and iPhone development using Haskell.

I also can't really suggest improvements because I'm pretty out of touch with this branch of the Haskell ecosystem.

Educational resources:

## ARM processor support

On hobbyist boards like the Raspberry Pi it's possible to compile Haskell code with ghc. However, some libraries have problems on the ARM platform, ghci only works on newer compilers, and the newer compilers are flaky.

If Haskell code builds, it runs with respectable performance on these machines.

Raspbian (Raspberry Pi, Pi 2, others)

• current version: ghc 7.4, cabal-install 1.14
• ghci doesn't work

Debian Jessie (Raspberry Pi 2)

• current version: ghc 7.6
• can install the current ghc 7.10.2 binary and ghci starts; however, it fails to build cabal with 'illegal instruction'

Arch (Raspberry Pi 2)

• current version: ghc 7.8.2, but llvm is 3.6, which is too new
• downgrade packages for llvm are not officially available
• with llvm downgraded to 3.4, ghc and ghci work, but there are problems compiling yesod and scotty
• compiler crashes, segfaults, etc.

Arch (Banana Pi)

• similar to Raspberry Pi 2: ghc is 7.8.2 and works with an llvm downgrade
• have had success compiling a yesod project on this platform

# Common Programming Needs

## Maintenance

Rating: Best in class

Haskell is unbelievably awesome for maintaining large projects. There's nothing that I can say that will fully convey how nice it is to modify existing Haskell code. You can only appreciate this through experience.

When I say that Haskell is easy to maintain, I mean that you can easily approach a large Haskell code base written by somebody else and make sweeping architectural changes to the project without breaking the code.

You'll often hear people say: "if it compiles, it works". I think that is a bit of an exaggeration, but a more accurate statement is: "if you refactor and it compiles, it works". This lets you move fast without breaking things.

Most statically typed languages are easy to maintain, but Haskell is on its own level for the following reasons:

• Strong types
• Global type inference
• Type classes
• Laziness

The latter two features are what differentiate Haskell from other statically typed languages.

If you've ever maintained code in other languages you know that usually your test suite breaks the moment you make large changes to your code base and you have to spend a significant amount of effort keeping your test suite up to date with your changes. However, Haskell has a very powerful type system that lets you transform tests into invariants that are enforced by the types so that you can statically eliminate entire classes of errors at compile time. These types are much more flexible than tests when modifying code and types require much less upkeep as you make large changes.
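As a small example of turning a test into a type, consider the invariant "this list is never empty". The sketch below uses NonEmpty from base (available in the base library that ships with ghc-8.0 and later); the function names are made up:

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- With a plain list the invariant lives in tests and comments, because
-- maximum throws an exception at runtime when given [].
highestScoreUnsafe :: [Int] -> Int
highestScoreUnsafe = maximum

-- With NonEmpty the invariant lives in the type: callers cannot construct
-- an empty argument, so no test for the empty case is needed.
highestScore :: NonEmpty Int -> Int
highestScore = maximum

main :: IO ()
main = print (highestScore (70 :| [85, 90]))
```

If a refactor ever tries to pass a possibly empty list to highestScore, the compiler rejects it instead of a test failing later.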

The Haskell community and ecosystem use the type system heavily to "test" their applications, more so than other programming language communities. That's not to say that Haskell programmers don't write tests (they do), but rather they prefer types over tests when they have the option.

Global type inference means that you don't have to update types and interfaces as you change the code. Whenever I do a large refactor the first thing I do is delete all type signatures and let the compiler infer the types and interfaces for me as I go. When I'm done refactoring I just insert back the type signatures that the compiler infers as machine-checked documentation.

Type classes also assist refactoring because the compiler automatically infers type class constraints (analogous to interfaces in other languages) so that you don't need to explicitly annotate interfaces. This is a huge time saver.

Laziness deserves special mention because many outsiders do not appreciate how laziness simplifies maintenance. Many languages require tight coupling between producers and consumers of data structures in order to avoid wasteful evaluation, but laziness avoids this problem by only evaluating data structures on demand. This means that if your refactoring process changes the order in which data structures are consumed or even stops referencing them altogether you don't need to reorder or delete those data structures. They will just sit around patiently waiting until they are actually needed, if ever, before they are evaluated.
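A tiny sketch of this decoupling: the producer below describes an infinite list and a value that would crash if evaluated, yet the program is fine because the consumer only demands three elements. All names here are illustrative:

```haskell
-- Producer: an infinite list. Nothing is computed until a consumer asks.
squares :: [Integer]
squares = map (^ 2) [1 ..]

-- A value that no code path demands. It is never evaluated, so the
-- error is never raised.
unused :: [Integer]
unused = error "this is never evaluated"

main :: IO ()
main = do
  let report = (take 3 squares, unused)
  -- Only the first component is consumed.
  print (fst report)
```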

## Single-machine Concurrency

Rating: Best in class

I give Haskell a "Best in class" rating because Haskell's concurrency runtime performs as well as or better than those of mainstream languages and is significantly easier to use due to the runtime support for software-transactional memory.

The best explanation of Haskell's threading module is the documentation in Control.Concurrent:

Concurrency is "lightweight", which means that both thread creation and context switching overheads are extremely low. Scheduling of Haskell threads is done internally in the Haskell runtime system, and doesn't make use of any operating system-supplied thread packages.

The best way to explain the performance of Haskell's threaded runtime is to give hard numbers:

• Each thread requires about 1 kB of memory, so the hard limit on thread count is memory (1 GB per million threads).
• Haskell channel overhead for the standard library (using TQueue) is on the order of one microsecond per message and degrades linearly with increasing contention
• Haskell channel overhead using the unagi-chan library is on the order of 100 nanoseconds (even under contention)
• Haskell's MVar (a low-level concurrency communication primitive) requires 10-20 ns to add or remove values (roughly on par with acquiring or releasing a lock in other languages)
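
To give a feel for how cheap threads are in practice, here is a minimal sketch that uses only `forkIO` and `MVar` from base. The helper name `countWithThreads` is hypothetical and the thread count is arbitrary; this is an illustration, not a benchmark.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM_, replicateM_)

-- | Spawn @n@ lightweight threads that each add 1 to a shared counter,
-- wait for all of them to finish, then return the final count.
-- (Hypothetical helper, written for this sketch.)
countWithThreads :: Int -> IO Int
countWithThreads n = do
  counter <- newMVar 0
  done <- newEmptyMVar
  forM_ [1 .. n] $ \_ -> forkIO $ do
    -- modifyMVar_ takes the MVar, applies the update, and puts it back
    modifyMVar_ counter (\x -> pure $! x + 1)
    -- signal completion; blocks until the main thread collects the signal
    putMVar done ()
  -- collect one completion signal per worker
  replicateM_ n (takeMVar done)
  readMVar counter

main :: IO ()
main = countWithThreads 10000 >>= print
```

Each worker here is a Haskell thread scheduled by the runtime rather than an operating system thread, which is why spawning ten thousand of them is unremarkable.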

Haskell also provides software-transactional memory, which allows programmers to build composable and atomic memory transactions. You can compose transactions together in multiple ways to build larger transactions:

• You can sequence two transactions to build a larger atomic transaction
• You can combine two transactions using alternation, falling back on the second transaction if the first one fails
• Transactions can retry, rolling back their state and sleeping until one of their dependencies changes in order to avoid wasteful polling
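
The three forms of composition above can be sketched with the `stm` package. The bank-account names (`withdraw`, `deposit`, `transfer`, `transferFromEither`) are hypothetical and chosen only for illustration.

```haskell
import Control.Concurrent.STM
  ( STM, TVar, atomically, modifyTVar', newTVarIO, orElse
  , readTVar, readTVarIO, retry, writeTVar )

-- | Retry: block (without polling) until the account holds enough funds.
withdraw :: TVar Int -> Int -> STM ()
withdraw account amount = do
  balance <- readTVar account
  if balance < amount
    then retry
    else writeTVar account (balance - amount)

deposit :: TVar Int -> Int -> STM ()
deposit account amount = modifyTVar' account (+ amount)

-- | Sequencing: both steps commit together or not at all.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  withdraw from amount
  deposit to amount

-- | Alternation: if the primary account would retry, use the backup instead.
transferFromEither :: TVar Int -> TVar Int -> TVar Int -> Int -> STM ()
transferFromEither primary backup to amount =
  transfer primary to amount `orElse` transfer backup to amount

main :: IO ()
main = do
  primary <- newTVarIO 10
  backup <- newTVarIO 100
  target <- newTVarIO 0
  -- primary cannot cover 50, so the transaction falls back to backup
  atomically (transferFromEither primary backup target 50)
  mapM readTVarIO [primary, backup, target] >>= print
```

Note that `transferFromEither` is built entirely out of smaller transactions; nothing runs until `atomically` commits the whole thing.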

A few other languages provide software-transactional memory, but Haskell's implementation has two main advantages over other implementations:

• The type system enforces that transactions only permit reversible memory modifications. This guarantees at compile time that all transactions can be safely rolled back.
• Haskell's STM runtime takes advantage of enforced purity to improve the efficiency of transactions, retries, and alternation.

Notable libraries:

• stm - Software transactional memory
• unagi-chan - High performance channels
• async - Futures library

Educational resources:

## Types / Type-driven development

Rating: Best in class

Haskell definitely does not have the most advanced type system (not even close if you count research languages) but out of all languages that are actually used in production Haskell is probably at the top. Idris is probably the closest thing to a language with a more powerful type system than Haskell's that has a realistic chance of use in production in the foreseeable future.

The killer features of Haskell's type system are:

• Type classes
• Global type and type class inference
• Light-weight type syntax

Haskell's type system really does not get in your way at all. You (almost) never need to annotate the type of anything. As a result, the language feels light-weight to use like a dynamic language, but you get all the assurances of a static language.

Many people are familiar with languages that support "local" type inference (like Rust, Java, C#), where you have to explicitly type function arguments but then the compiler can infer the types of local variables. Haskell, on the other hand, provides "global" type inference, meaning that the types and interfaces of all function arguments are inferred, too. Type signatures are optional (with some minor caveats) and are primarily for the benefit of the programmer.
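
As a small illustration of global inference, the two helpers below carry no type signatures at all. The names `average` and `describe` are made up for this sketch, and the types in the comments are roughly what GHCi's `:type` command reports.

```haskell
-- No signatures on the helpers: GHC infers both the types and the
-- type class constraints.

-- Inferred (roughly): average :: (Fractional a, Foldable t) => t a -> a
average xs = sum xs / fromIntegral (length xs)

-- Inferred (roughly): describe :: Show a => a -> String
describe x = "value: " ++ show x

main :: IO ()
main = do
  print (average [1, 2, 3, 4])
  putStrLn (describe (average [1.5, 2.5]))
```

Once the code settles down you can paste the inferred signatures back in as documentation.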

This really benefits projects where you need to prototype quickly but refactor painlessly when you realize you are on the wrong track. You can leave out all type signatures while prototyping but the types are still there even if you don't see them. Then when you dramatically change course those strong and silent types step in and keep large refactors painless.

Some Haskell programmers use a "type-driven development" programming style, analogous to "test-driven development":

• they specify desired behavior as a type signature which initially fails to type-check (analogous to adding a test which starts out "red")
• they create a quick and dirty solution that satisfies the type-checker (analogous to turning the test "green")
• they improve on their initial solution while still satisfying the type-checker (analogous to a "red/green refactor")

"Type-driven development" supplements "test-driven development" and has different tradeoffs:

• The biggest disadvantage of types is that they cannot test as many things as full-blown tests, especially because Haskell is not dependently typed
• The biggest advantage of types is that they can prove the complete absence of programming errors for all possible cases, whereas tests cannot examine every possibility
• Type-checking is much faster than running tests
• Type error messages are informative: they explain what went wrong and never get stale
• Type-checking never hangs and never gives flaky results

Haskell also provides the "Typed Holes" extension, which lets you add an underscore (i.e. "_") anywhere in the code whenever you don't know what expression belongs there. The compiler will then tell you the expected type of the hole and suggest terms in scope with related types that you can use to fill the hole.
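
Here is a rough sketch of that workflow on a toy function. The first step is shown as a comment because a program containing a hole does not compile, and the compiler message is paraphrased rather than quoted. The function `applyTwice` is just an example name.

```haskell
-- Step 1 (does not compile, so it is shown as a comment): write the
-- signature first and leave a hole for the body.
--
--   applyTwice :: (a -> a) -> a -> a
--   applyTwice f x = _
--
-- GHC rejects this and reports the type of the hole (here: a) along
-- with the bindings in scope, f :: a -> a and x :: a.

-- Step 2: fill the hole using those bindings.
applyTwice :: (a -> a) -> a -> a
applyTwice f x = f (f x)

main :: IO ()
main = print (applyTwice (+ 3) 10)
```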

Educational resources:

Propaganda:

## Domain-specific languages (DSLs)

Rating: Mature

Haskell rocks at DSL-building. While not as flexible as a Lisp language I would venture that Haskell is the most flexible of the non-Lisp languages. You can overload a large amount of built-in syntax for your custom DSL.

The most popular example of overloaded syntax is do notation, which you can overload to work with any type that implements the Monad interface. This syntactic sugar for Monads in turn led to a huge overabundance of Monad tutorials.

However, there are lesser known but equally important things that you can overload, such as:

• numeric and string literals
• if/then/else expressions
• list comprehensions
• numeric operators
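
As a minimal sketch of literal and operator overloading, the toy expression language below (the `Expr` type and `eval` function are hypothetical) reuses ordinary numeric literals, string literals, `+` and `*` to build a syntax tree instead of computing a number.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- | A tiny arithmetic expression language (made up for this sketch).
data Expr
  = Lit Integer
  | Var String
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- Overload numeric literals and operators: `2 * e + 1` now builds an Expr.
instance Num Expr where
  fromInteger = Lit
  (+) = Add
  (*) = Mul
  negate = Mul (Lit (-1))
  abs = error "abs is not part of this sketch"
  signum = error "signum is not part of this sketch"

-- Overload string literals: "x" now means Var "x".
instance IsString Expr where
  fromString = Var

-- | Evaluate an expression given values for its variables.
-- Returns Nothing if a variable is missing from the environment.
eval :: [(String, Integer)] -> Expr -> Maybe Integer
eval _ (Lit n) = Just n
eval env (Var v) = lookup v env
eval env (Add a b) = (+) <$> eval env a <*> eval env b
eval env (Mul a b) = (*) <$> eval env a <*> eval env b

example :: Expr
example = 2 * "x" + 1

main :: IO ()
main = do
  print example
  print (eval [("x", 5)] example)
```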

Educational resources:

## Testing

Rating: Mature

There are a few places where Haskell is the clear leader among all languages:

• property-based testing
• mocking / dependency injection

Haskell's QuickCheck is the gold standard which all other property-based testing libraries are measured against. QuickCheck works so smoothly in Haskell because of Haskell's type class system and purity. The type class system simplifies automatic generation of random data from the input type of the property test. Purity means that any failing test result can be automatically minimized by rerunning the check on smaller and smaller inputs until QuickCheck identifies the corner case that triggers the failure.
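
A minimal sketch of a property test, assuming the QuickCheck package is installed (it does not ship with GHC). The property names are made up; the second property is deliberately false so that you can watch QuickCheck shrink the failing input down to a small counterexample.

```haskell
import Test.QuickCheck (quickCheck)

-- | Property: reversing a list twice gives back the original list.
-- The argument type [Int] is all QuickCheck needs to generate inputs.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

-- | A deliberately false property: most lists are not palindromes.
prop_reverseIsIdentity :: [Int] -> Bool
prop_reverseIsIdentity xs = reverse xs == xs

main :: IO ()
main = do
  quickCheck prop_reverseTwice      -- expected to pass
  quickCheck prop_reverseIsIdentity -- expected to fail with a shrunk counterexample
```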

Mocking is another area where Haskell shines because you can overload almost all built-in syntax, including:

• do notation
• if/then/else expressions
• numeric literals
• string literals

Haskell programmers overload this syntax (particularly do notation) to write code that looks like it is doing real work:

example = do str <- readLine
             putLine str

... and the code will actually evaluate to a pure syntax tree that you can use to mock in external inputs and outputs:

example = ReadLine (\str -> PutStrLn str (Pure ()))
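
To make this concrete, here is a self-contained sketch of one way to get that behavior. It hand-rolls the syntax tree type (the name `Console` and both interpreters are hypothetical) instead of using the `free` library, so it only depends on base.

```haskell
-- | A pure syntax tree of console commands (hand-rolled for this sketch).
data Console a
  = ReadLine (String -> Console a)
  | PutStrLn String (Console a)
  | Pure a

instance Functor Console where
  fmap f (ReadLine k) = ReadLine (fmap f . k)
  fmap f (PutStrLn s next) = PutStrLn s (fmap f next)
  fmap f (Pure a) = Pure (f a)

instance Applicative Console where
  pure = Pure
  cf <*> ca = cf >>= \f -> fmap f ca

-- The Monad instance is what lets do notation build the tree.
instance Monad Console where
  ReadLine k >>= f = ReadLine (\s -> k s >>= f)
  PutStrLn s next >>= f = PutStrLn s (next >>= f)
  Pure a >>= f = f a

readLine :: Console String
readLine = ReadLine Pure

putLine :: String -> Console ()
putLine s = PutStrLn s (Pure ())

example :: Console ()
example = do
  str <- readLine
  putLine str

-- | Mock interpreter: feed canned input lines, collect the output lines.
runMock :: [String] -> Console a -> [String]
runMock (i : is) (ReadLine k) = runMock is (k i)
runMock [] (ReadLine _) = [] -- ran out of canned input
runMock is (PutStrLn s next) = s : runMock is next
runMock _ (Pure _) = []

-- | Real interpreter: run the same tree against stdin and stdout.
runIO :: Console a -> IO a
runIO (ReadLine k) = getLine >>= runIO . k
runIO (PutStrLn s next) = putStrLn s >> runIO next
runIO (Pure a) = pure a

main :: IO ()
main = print (runMock ["hello"] example)
```

The same `example` value can be handed to `runIO` in production and to `runMock` in tests, which is the sense in which the program logic is decoupled from its effects.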

Haskell also supports most testing functionality that you expect from other languages, including:

• standard package interfaces for testing
• unit testing libraries
• test result summaries and visualization

Notable libraries:

• QuickCheck - property-based testing
• doctest - tests embedded directly within documentation
• free - Haskell's abstract version of "dependency injection"
• hspec - Testing library analogous to Ruby's RSpec
• HUnit - Testing library analogous to Java's JUnit
• tasty - Combination unit / regression / property testing library

Educational resources:

## Data structures and algorithms

Rating: Mature

Haskell primarily uses persistent data structures, meaning that when you "update" a persistent data structure you just create a new data structure and you can keep the old one around (thus the name: persistent). Haskell data structures are immutable, so you don't actually create a deep copy of the data structure when updating; any new structure will reuse as much of the original data structure as possible.
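
For example, using `Data.Map.Strict` from the `containers` package (the binding names `original` and `updated` are just for illustration):

```haskell
import qualified Data.Map.Strict as Map

original :: Map.Map Int String
original = Map.fromList [(1, "one"), (2, "two")]

-- "Updating" returns a new map; `original` is left untouched and the
-- two maps can share structure internally.
updated :: Map.Map Int String
updated = Map.insert 3 "three" original

main :: IO ()
main = do
  print (Map.toList original)
  print (Map.toList updated)
```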

The Notable libraries section contains links to Haskell collections libraries that are heavily tuned. You should realistically expect these libraries to compete with tuned Java code. However, you should not expect Haskell to match expertly tuned C++ code.

The selection of algorithms is not as broad as in Java or C++ but it is still pretty good and diverse enough to cover the majority of use cases.

Notable libraries:

## Benchmarking

Rating: Mature

This boils down exclusively to the criterion library, which was done so well that nobody bothered to write a competing library. Notable criterion features include:

• Detailed statistical analysis of timing data
• Beautiful graph output: (Example)
• High-resolution analysis (accurate down to nanoseconds)
• Customizable HTML/CSV/JSON output
• Garbage collection insensitivity

Notable libraries:

Educational resources:

## Unicode

Rating: Mature

Haskell's Unicode support is excellent. Just use the text and text-icu libraries, which provide a high-performance, space-efficient, and easy-to-use API for Unicode-aware text operations.

Note that there is one big catch: the default String type in Haskell is inefficient. You should always use Text whenever possible.
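
A small sketch of working with `Text` instead of `String`, using the `text` package. The `normalize` helper is hypothetical; the escape `\223` is the German "ß", which upper-cases to the two-letter sequence "SS".

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T

-- | Collapse runs of whitespace and upper-case the result.
-- (Hypothetical helper, written for this sketch.)
normalize :: Text -> Text
normalize = T.toUpper . T.unwords . T.words

main :: IO ()
main = do
  -- OverloadedStrings lets a string literal be used directly as Text
  let input = "  stra\223e   nach   berlin " :: Text
  print (T.length input)
  -- Unicode-aware case conversion can change the length of the text
  print (normalize input)
```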

Notable libraries:

## Parsing / Pretty-printing

Rating: Mature

Haskell is amazing at parsing. Recursive descent parser combinators are far-and-away the most popular parsing paradigm within the Haskell ecosystem, so much so that people use them even in place of regular expressions. I strongly recommend reading the "Monadic Parsing in Haskell" functional pearl linked below if you want to get a feel for why parser combinators are so dominant in the Haskell landscape.

If you're not sure what library to pick, I generally recommend the parsec library as a default well-rounded choice because it strikes a decent balance between ease-of-use, performance, good error messages, and small dependencies (since it ships with GHC).
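
To show what parser combinators look like in practice, here is a minimal `parsec` sketch that parses a comma-separated list of integers. The parser names (`number`, `numbers`, `parseNumbers`) are made up for this example.

```haskell
import Text.Parsec (ParseError, char, digit, eof, many1, parse, sepBy, spaces)
import Text.Parsec.String (Parser)

-- | Parse one non-negative integer.
number :: Parser Int
number = read <$> many1 digit

-- | Parse a comma-separated list of integers, allowing spaces around
-- the commas, and require that the whole input is consumed.
numbers :: Parser [Int]
numbers = spaces *> ((number <* spaces) `sepBy` (char ',' <* spaces)) <* eof

parseNumbers :: String -> Either ParseError [Int]
parseNumbers = parse numbers "<input>"

main :: IO ()
main = do
  print (parseNumbers "1, 22 ,333") -- succeeds
  print (parseNumbers "1, x")       -- fails with a located error message
```

Small parsers like `number` are ordinary values, so larger grammars are built by combining them with the same Applicative and Monad operations used everywhere else in Haskell.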

attoparsec deserves special mention as an extremely fast backtracking parsing library. The speed and simplicity of this library will blow you away. The main deficiency of attoparsec is the poor error messages.

The pretty-printing front is also excellent. Academic researchers just really love writing pretty-printing libraries in Haskell for some reason.

Notable libraries:

• parsec - best overall "value"
• attoparsec - Extremely fast backtracking parser
• trifecta - Best error messages (clang-style)
• alex / happy - Like lex / yacc but with Haskell integration
• Earley - Earley parsing embedded within the Haskell language
• ansi-wl-pprint - Pretty-printing library
• text-format - High-performance string formatting

Educational resources:

Propaganda:

## Stream programming

Rating: Mature

Haskell's streaming ecosystem is mature. Probably the biggest issue is that there are too many good choices (and a lot of ecosystem fragmentation as a result), but each of the streaming libraries listed below has a sufficiently rich ecosystem including common streaming tasks like:

• Network transmissions
• Compression
• External process pipes
• High-performance streaming aggregation
• Concurrent streams
• Incremental parsing

Notable libraries:

• conduit / io-streams / pipes - Stream programming libraries (Full disclosure: I authored pipes and wrote the official io-streams tutorial)
• machines - Networked stream transducers library

Educational resources:

## Serialization / Deserialization

Rating: Mature

Haskell's serialization libraries are reasonably efficient and very easy to use. You can automatically derive serializers/deserializers for user-defined data types, and encoding/decoding values is just as easy.

Haskell's serialization does not suffer from any of the gotchas that object-oriented languages deal with (particularly Java/Scala). Haskell data types don't have associated methods or state to deal with so serialization/deserialization is straightforward and obvious. That's also why you can automatically derive correct serializers/deserializers.
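
Here is a minimal sketch of derived serialization using the `binary` package together with GHC generics. The `Person` type and the `roundTrip` helper are hypothetical.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Binary (Binary, decode, encode)
import qualified Data.ByteString.Lazy as BL
import GHC.Generics (Generic)

data Person = Person
  { name :: String
  , age :: Int
  }
  deriving (Eq, Show, Generic)

-- Empty instance body: the encoder and decoder are derived generically
-- from the structure of the data type.
instance Binary Person

-- | Serialize to bytes and back again.
roundTrip :: Person -> Person
roundTrip = decode . encode

main :: IO ()
main = do
  let alice = Person {name = "Alice", age = 30}
  print (BL.length (encode alice)) -- size of the encoded value in bytes
  print (roundTrip alice == alice)
```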

Serialization performance is pretty good. You should expect to serialize data at a rate between 100 Mb/s and 1 Gb/s with careful tuning. Serialization performance still has about 3x-5x room for improvement by multiple independent estimates. See the "Faster binary serialization" link below for details of the ongoing work to improve the serialization speed of existing libraries.

Notable libraries:

• binary / cereal - serialization / deserialization libraries

Educational resources:

## Support for file formats

Rating: Mature

Haskell supports all the common domain-independent serialization formats (i.e. XML/JSON/YAML/CSV). For more exotic formats Haskell won't be as good as, say, Python (which is notorious for supporting a huge number of file formats) but it's so easy to write your own quick and dirty parser in Haskell that this is not much of an issue.

Notable libraries:

• aeson - JSON encoding/decoding
• cassava - CSV encoding/decoding
• yaml - YAML encoding/decoding
• xml - XML encoding/decoding

## Package management

Rating: Mature

If you had asked me a few months back I would have rated Haskell immature in this area. This rating is based entirely on the recent release of the stack package tool by FPComplete which greatly simplifies package installation and dependency management. This tool was created in response to a broad survey of existing Haskell users and potential users where cabal-install was identified as the single greatest issue for professional Haskell development.

The stack tool is not just good by Haskell standards but excellent even compared to other language package managers. Key features include:

• Excellent project isolation (including compiler isolation)
• Global caching of shared dependencies to avoid wasteful rebuilds
• Easily add local repositories or remote Github repositories as dependencies

stack is also powered by Stackage, which is a very large Hackage mono-build that ensures that a large subset of Hackage builds correctly against each other and automatically notifies package authors to fix or update libraries when they break the mono-build. Periodically this package set is frozen as a Stackage LTS release which you can supply to the stack tool in order to select dependencies that are guaranteed to build correctly with each other. Also, if all your projects use the same or similar LTS releases they will benefit heavily from the shared global cache.

Educational resources:

Propaganda:

## Logging

Rating: Mature

Haskell has decent logging support. That's pretty much all there is to say.

Notable libraries:

• fast-logger - High-performance multicore logging system
• hslogger - Logging library analogous to Python's logging module

## Education

Rating: Immature

The primary reason for the "Immature" rating is two big deficiencies in Haskell learning materials:

• Intermediate-level books
• Beginner-level material targeted at people with no previous programming experience

Other than that the remaining learning resources are okay. If the above holes were filled then I would give a "Mature" rating.

The most important advice I can give to Haskell beginners is to learn by doing. I observe that many Haskell beginners dwell too long trying to learn by reading instead of trying to build something useful to hone their understanding.

Educational resources:

## Debugging

Rating: Immature

The main Haskell debugging features are:

• Memory and performance profiling
• Stack traces
• Source-located errors, using the assert function
• Breakpoints, single-stepping, and tracing within the GHCi REPL
• Informal printf-style tracing using Debug.Trace

The two reasons I still mark debugging "Immature" are:

• GHC's stack traces require profiling to be enabled
• There is only one IDE that I know of (leksah) that integrates support for breakpoints and single-stepping and leksah still needs more polish

ghc-7.10 also added preliminary support for DWARF symbols which allow support for gdb-based debugging and perf-based profiling, but there is still more work that needs to be done. See the following page for more details:

Educational resources:

## Cross-platform support

Rating: Immature

I give Haskell an "Immature" rating primarily due to poor user experience on Windows:

• Most Haskell tutorials assume a Unix-like system
• Several Windows-specific GHC bugs
• Poor IDE support (Most Windows programmers don't use a command-line editor)

This is partly a chicken-and-egg problem. Haskell has many Windows-specific issues because it has such a small pool of Windows developers to contribute fixes. Most Haskell developers are advised to use another operating system or a virtual machine to avoid these pain points, which exacerbates the problem.

The situation is not horrible, though. I know because I do half of my Haskell programming on Windows in order to familiarize myself with the pain points of the Windows ecosystem. Most of the issues affect beginners and can be worked around by more experienced developers. I wouldn't say any individual issue is an outright dealbreaker; it's more like a thousand papercuts which turn people off of the language.

If you're a Haskell developer using Windows, I highly recommend the following installs to get started quickly and with as few issues as possible:

• Git for Windows - A Unix-like command-line environment bundled with git that you can use to follow along with tutorials
• MinGHC - Use this for project-independent Haskell experimentation
• Stack - Use this for project development

Additionally, learn to use the command line a little bit until Haskell IDE support improves. Plus, it's a useful skill in general as you become a more experienced programmer.

For Mac, the recommended installation is:

• Haskell for Mac OS X - A self-contained relocatable GHC build for project-independent Haskell experimentation
• Stack - Use this for project development

For other operating systems, use your package manager of choice to install ghc and stack.

Educational resources:

## Databases and data stores

Rating: Immature

This is not one of my areas of expertise, but what I do know is that Haskell has bindings to most of the open source databases and datastores such as MySQL, Postgres, SQLite, Cassandra, Redis, DynamoDB and MongoDB. However, I haven't really evaluated the quality of these bindings other than the postgresql-simple library, which is the only one I've personally used and was decent as far as I could tell.

The "Immature" ranking is based on the recommendation of Stephen Diehl who notes:

Raw bindings are mature, but the higher level ORM tooling is a lot less mature than its Java, Scala, Python counterparts Source

However, Haskell appears to be deficient in bindings to commercial databases like Microsoft SQL server and Oracle. So whether or not Haskell is right for you probably depends heavily on whether there are bindings to the specific data store you use.

Notable libraries:

## Hot code loading

Rating: Immature

Haskell does provide support for hot code loading, although nothing in the same ballpark as in languages like Clojure. There are two main approaches to hot code loading:

• Compiling and linking object code at runtime (i.e. the plugins or hint libraries)
• Recompiling the entire program and then reinitializing the program with the program's saved state (i.e. the dyre or halive libraries)

You might wonder how Cloud Haskell sends code over the wire and my understanding is that it doesn't. Any function you wish to send over the wire is instead compiled ahead of time on both sides and stored in a shared symbol table which each side references when encoding or decoding the function.

Haskell does not let you edit a live program like Clojure does so Haskell will probably never be "Best in class" short of somebody releasing a completely new Haskell compiler built from the ground up to support this feature. The existing Haskell tools for hot code swapping seem as good as they are reasonably going to get, but I'm waiting for commercial success stories of their use before rating this "Mature".

The halive library has the best hot code swapping demo by far:

Notable libraries:

• plugins / hint - Runtime compilation and linking
• dyre / halive - Program reinitialization with saved state

## IDE support

Rating: Immature

I am not the best person to review this area since I do not use an IDE myself. I'm basing this "Immature" rating purely on what I have heard from others.

The impression I get is that the biggest pain point is that Haskell IDEs, IDE plugins, and low-level IDE tools keep breaking with every new GHC release.

Most of the Haskell early adopters have been vi/vim or emacs users so those editors have gotten the most love. Support for more traditional IDEs has improved recently with Haskell plugins for IntelliJ and Eclipse and also the Haskell-native leksah IDE.

FPComplete has also released a web IDE for Haskell programming that is worth checking out. It is reasonably polished but cannot be used offline.

Notable tools:

• hoogle - Type-based function search
• hlint - Code linter
• ghc-mod - editor agnostic tool that powers many IDE-like features
• ghcid - lightweight background type-checker that triggers on code changes
• codex - Tags file generator for cabal project dependencies.
• hdevtools - Persistent GHC-powered background server for development tools
• ghc-imported-from - editor agnostic tool that finds Haddock documentation page for a symbol

IDE plugins:

• IntelliJ (the official plugin or Haskforce)
• Eclipse (the EclipseFP plugin)

IDEs:

Educational resources:

# Conclusions

I originally hosted this post as a draft on Github in order to solicit review from people more knowledgeable than myself. In the process it turned into a collaboratively edited wiki which you can find here:

I will continue to accept pull requests and issues to make sure that it stays up to date and once or twice a year I will post announcements if there have been any major changes or improvements in the Haskell ecosystem.

The main changes since the draft initially went out were:

• The "Type system" section was upgraded to "Best in class" (originally ranked "Mature")
• The "Concurrency" section was renamed to "Single-machine concurrency" and upgraded to "Best in class" (originally ranked "Mature")
• The "Database" section was downgraded to "Immature" (originally ranked "Mature")

# Contributors

• Aaron Levin
• Alois Cochard
• Ben Kovach
• Benno Fünfstück
• Carlo Hamalainen
• Chris Allen
• Curtis Gagliardi
• Deech
• David Howlett
• David Johnson
• Edward Cho
• Greg Weber
• Gregor Uhlenheuer
• Juan Pedro Villa Isaza
• Kazu Yamamoto
• Kirill Zaborsky
• Liam O'Connor-Davis
• Luke Randall
• Marcio Klepacz
• Mitchell Rosen
• Nicolas Kaiser
• Oliver Charles