Monday, January 21, 2013

Introduction to Haskell IO

I fell in love with Haskell neither because of types nor functional programming. Rather, I admired Haskell's beautiful approach to I/O and I hope that after reading this you will, too. I'm writing this IO tutorial to underscore Haskell's simplicity and consistency.

If you want to follow along, download the Haskell Platform, which provides a Haskell compiler (ghc) and Haskell interpreter (ghci).


Hello, world!


Let's begin from the simplest possible Haskell program:
main = putStrLn "Hello, world"
If you save the above program to example.hs, you can compile and run it using:
$ ghc -O2 example.hs  # '-O2' is a good habit to learn
[1 of 1] Compiling Main             ( example.hs, example.o )
Linking example ...
$ ./example
Hello, world!

Types - Part 1


All Haskell values have types, but we usually don't need to supply any type signatures because the compiler can mechanically deduce the types for us.

This means we can ask the ghci Haskell interpreter to infer the types of arbitrary Haskell expressions. Let's fire up ghci so we can ask for the types of the values we used:
$ ghci
GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> 
Let's start off with something simple. What is the type of "Hello, world!"?
Prelude> :type "Hello, world!"
"Hello, world!" :: [Char]
ghci says that "Hello, world!"'s type is [Char], which means "a list of characters". Char is the type of a single character and the brackets around Char mean "list of".

Wait, shouldn't it be a String? Actually, String is just a type synonym for [Char], and we can verify that using ghci's :info command:
Prelude> :info String
type String = [Char]    -- Defined in `GHC.Base'
That means that we can use String or [Char] interchangeably in types. They are equal.

Now let's study the type of putStrLn:
Prelude> :t putStrLn  -- You can use ":t" instead of ":type"
putStrLn :: String -> IO ()
putStrLn is a function that takes a single argument of type String and produces a value of type IO (). The IO () signifies an executable action that returns a value of type (). In this case, the action just print the given String, and putStrLn is just a mnemonic for "put String with newLine.

All IO actions must return a value, so Haskell programmers return () when they have nothing useful to return. This exactly mirrors how imperative programmers mark a function void when they don't return anything useful. For example, if we were to translate putStrLn's type to other languages, it would look like:
// C
void putStrLn(char *str)

// Java
public static void putStrLn(String str)
We can ask ghci about the types of expressions, too:
Prelude> :t putStrLn "Hello, world!"
putStrLn "Hello, world!" :: IO ()
This tells us what we already knew: when we supply putStrLn with a String argument we get an executable IO action which will print the supplied String. All of this so far is very straight-forward and we haven't deviated yet from traditional programming languages.


Equality


Now let's compare our original program side-by-side with equivalent main functions in imperative languages:
-- Haskell
main = putStrLn "Hello, world!"

// C (non-standards-conformant, to simplify the comparison)
void main() {
    printf("Hello, world!");
}

// Java
public static void main(String[] args) {
    System.out.println("Hello, world!");
}
One difference should stand out: the Haskell version does not use traditional block syntax for defining the main function. Instead it uses the equals sign, which may seems quite curious to imperative eyes.

It turns out that Haskell takes equality very seriously. When you write:
x = 5
... you simply declare x to be synonymous with 5. You cannot change the value of x any more than you can change the value of the number 5. x and 5 become completely interchangeable since they both refer to the same thing. They are equal.

So when we write:
main = putStrLn "Hello, world!"
... all that says is that main is nothing more than a synonym for putStrLn "Hello, world!". Therefore, they must have the same type and we can prove this by loading our program into ghci:
$ ghci example.hs
GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Ok, modules loaded: Main.
Prelude Main> :t putStrLn "Hello, world!"
putStrLn "Hello, world!" :: IO ()
Prelude Main> :t main
main :: IO ()

Types - Part 2


Haskell has one and only one rule for what constitutes a valid main:
  • main must have type IO a (where a may be any return value)
That's it!

Anything of type IO a is executable, and that rule ensures that main is executable. If I define main to be something non-executable, I get a type error:

main = 'A'
$ ghc -O2 example.hs
[1 of 1] Compiling Main             ( example.hs, example.o )

example.hs:1:1:
    Couldn't match expected type `IO t0' with actual type `Char'
    In the expression: main
    When checking the type of the function `main'
The compiler tells us that a Char is not executable, which makes sense! Notice that we rejected the above main not on the basis of syntax or grammar, but solely on the basis of its type. The type alone communicates whether or not something is executable. If something has type is IO a, then it is executable. If something does not have type IO a, then it is not executable.

Wait... How does main do more than one thing, then? putStrLn "Hello, world" has type IO (), so does that mean that it uses up all of main's "executable juice"? Not at all!


do notation


Haskell lets you combine multiple IO actions into a single IO action using do notation. I'll introduce the following two functions to demonstrate this:
getLine  :: IO String        -- I read one line from stdin
putStrLn :: String -> IO ()  -- We've met
We can feed the String return value of getLine into putStrLn using do notation:
main = do str <- getLine
          putStrLn str
Haskell also supports semicolons and braces for those who don't like significant whitespace:
main = do {
    str <- getLine;
    putStrLn str;
}
do notation combines these two IO actions into a single IO action. This combined action prompts the user for one line of input and then echoes it back. Let's try it!
$ ./example
Test<Enter>
Test
We can also sequence two IO actions like this:
main = do putStrLn "Hello,"
          putStrLn "world!"
./example
Hello,
world!
We can combine as many actions as we please:
main = do putStrLn "Enter a string:"
          str <- getLine
          putStrLn str
$ ./example
Enter a string:
Apple<Enter>
Apple
Our do block only lets us combine IO commands, though. If we attempt to insert a non-IO statement, we get a type error:
main = do putStrLn "Enter a string:"
          str <- getLine
          "Hello"
example.hs:3:11:
    Couldn't match expected type `IO b0' with actual type `[Char]'
    In a stmt of a 'do' block: "Hello"
    In the expression:
      do { putStrLn "Enter a string:";
           str <- getLine;
           "Hello" }
    In an equation for `main':
        main
          = do { putStrLn "Enter a string:";
                 str <- getLine;
                 "Hello" }
The first line says it all. The do block expected another executable command on the last line, but we gave it a String, which isn't executable. That's a pretty reasonable type error if you ask me.


Types - Part 3


Haskell has one and only one rule for deciding if a do block is valid:
-- Given:
m :: IO a
f :: a -> IO b

-- Then the following 'do' block is valid
block :: IO b  -- This must match the return value of 'f'
block = do a <- m
           f a
do blocks are first class expressions in Haskell. Like all expressions, we can ask for ghci for their type:
Prelude> :t do { str <- getLine; putStrLn str; }
do { str <- getLine; putStrLn str; } :: IO ()
putStrLn str was the last action and it's type is IO (), so according to the rule the whole block has the same type: IO ().

Now, why on earth should a do block have a return value at all? That would only make sense if we could use a do block within another do block. Does that work?
main = do str <- do putStrLn "Enter a string:"
                    getLine
          putStrLn str
$ ./example
Enter a string:
Apple<Enter>
Apple
Huh.

do blocks use the same return value as their last command because they actually return the last command's value if you bind them within a larger do block. In the above example, the inner do block forwards the String returned by getLine to the outer do block, and that String is bound to the str variable.

This approach maintains consistency in the face of refactorings. Let's define a synonym for the inner do block so we can refer to it by name:
main = do str <- prompt
          putStrLn str

prompt :: IO String
prompt = do putStrLn "Enter a string:"
            getLine
prompt's type signature is entirely for our own benefit. Remember that the compiler can deduce the type mechanically. Haskell programmers still like to include these type signatures to organize their thoughts or to clearly communicate their code's intention to other programmers. prompt's type signature tells other programmers "This is an executable action that retrieves a String".


Impurity


Notice how we don't use the equals sign when we store the result of IO actions. Instead we use the <- symbol:
main = do str <- prompt
          ...
That's because str is not truly equal to prompt. In fact, they don't even have the same type. str has type String and prompt has type IO String. When we mark something as IO a, we communicate that its result may change each time we bind its result within a do block.

That is obvious in the case of prompt since the user may enter a different string each time:
main = do str1 <- prompt
          putStrLn str1
          str2 <- prompt
          putStrLn str2

prompt :: IO String
prompt = do putStrLn "Enter a string:"
            getLine
Enter a string:
String 1
String 1
Enter a string:
String 2
String 2
However, notice that we still use the equal sign when defining prompt:
prompt :: IO String
prompt = do putStrLn "Enter a string:"
            getLine
prompt is truly equal to that executable action, which seems strange. How does that reconcile with what I just said?

It turns out that Haskell distinguishes between an executable program and the program's return value, a distinction that almost no other language makes. A value of type IO a is an executable program that retrieves an a, possibly with side effects, but the program itself is not the same as its return value. So if getLine has type IO String, that means that getLine is an executable program that knows how to retrieve a String. It does not mean that getLine is a String

When I say prompt equals the given do block, I mean that they are equal in the sense that they are the same program. Since they are equal I can freely substitute one for the other. For example, I can go back and substitute the do block in place of prompt and trust that this substitution does not change anything:
main = do str1 <- do putStrLn "Enter a string:"
                     getLine
          putStrLn str1
          str2 <- do putStrLn "Enter a string:"
                     getLine
          putStrLn str2
This means that do notation does not actually do any computation. All it does is combine program descriptions into larger program descriptions. The compiler then translates the main program description into an executable.

This might seem like a trivial distinction at first until you realize that this allows you to separate your program into two parts:
  • The part you reason about at compile-time (i.e. non-IO code)
  • The part you reason about at run-time (i.e. IO code)
The more you factor your program into non-IO code, the more of your program that you can prove is correct at compile time, which means that no run-time circumstances will ever break it.

With traditional programming languages, all code only exists at run-time (i.e. it is all IO code by default), so you can never be entirely sure that it is truly correct. No matter how many tests you write, all you can prove to yourself is that a certain run-time snapshot of your program happens to be in a correct state.


Associativity


Haskell guarantees the following property of all IO programs:
-- Given:
m :: IO a
f :: a -> IO b
g :: b -> IO c

  do b <- do a <- m
             f a
     g b

= do a <- m
     b <- f a
     g b

= do a <- m
     do b <- f a
        g b
In other words, the program's behavior does not change no matter how you group IO actions. They will still always execute in the same order and preserve the same flow of information.

This guarantees that we can always safely refactor arbitrary code blocks without worrying that we will get weird behavior as a result of creating or removing blocks. Using the above example, I could choose to factor out the first two actions:
block12 = do a <- m
             f a

block123 = do b <- block12
              g b
... or I can factor out the last two actions:
block23 a = do b <- f a
               g b

block123 = do a <- m
              block23 a
I can similarly trust that "unfactoring" things is safe and won't cause weird behavior. In other words, I can inline the two refactored blocks safely and trust that it produces the same program:
block123 = do a <- m
              b <- f a
              g b
Notice that all these programs are all truly equal, which is why we use the equal sign when we say they are the same. This is an example of how we reason about correctness at compile time. We can't reason about the specific values of a, b, and c because they do not exist until run-time, but the program descriptions themselves exist at compile time, so we can statically reason about their correctness and prove that certain transformations are correct.


Identity


What if we want to create a "trivial" program that just returns a value without any side effects? That's what the return function does:
return :: a -> IO a
return takes a value and then defines a trivial program that has no side effects and just returns the given value.

Note that it does not behave like the traditional return command in imperative languages. It does not escape from the surrounding block because Haskell do notation has no concept of a "surrounding block". If it did, all the transformations I gave in the last section would not be correct. More generally, return does not affect control flow in any way. All it does is create an "empty" program that yields a value. That's it.

We call it return because we commonly use it as the last statement in a do block to combine the results of several previous IO actions:
twoStrings :: IO (String, String)
twoStrings = do str1 <- getLine
                str2 <- getLine
                return (str1, str2)
Other than that, it bears no resemblance to traditional returns.

What equalities might we expect return to satisfy? Well, we expect that:
do x <- m
   return x

= do m
In other words, there is no need to "re-return" the last action's value, because do notation already does that.

We also expect that:
do y <- return x
   f y

= do f x
This just formalizes what I already said: return is an empty program that just binds the same value that you gave it.


Ordering


Now let's define two programs that retrieve input from the outside world:
getLine :: IO String  -- Provided by the Prelude

getInt :: IO Int
getInt = do str <- getLine
            return (read str)
... and then a function designed to use both of those values:
f :: String -> Int -> IO ()
f str n = do
    if (n > 0)
        then putStrLn "n is positive"
        else putStrLn "n is not positive"
    putStrLn str
New Haskell programmers will commonly make the following mistake:
main = f getLine getInt
... which the compiler will promptly correct:
example.hs:12:10:
    Couldn't match expected type `String' with actual type `IO String'
    In the first argument of `f', namely `getLine'
    In the expression: f getLine getInt
    In an equation for `main': main = f getLine getInt
Imperative programmers make this mistake because imperative languages let them write:
f(getLine(), getInt())
... and the imperative compiler doesn't mind that you passed an impure function as an argument because imperative languages don't distinguish between the side effects and their result.

The imperative approach may seem acceptable until you ask yourself the question: "Which function will prompt the user first: getLine() or getInt()? Typically an imperative language will evaluate the arguments in order, first evaluating getLine followed by getInt. However, this evaluation order is entirely arbitrary and requires extending the language definition.

In fact, most imperative programmers would probably consider the above imperative example "bad form". We'd rather make the ordering more explicit by writing:
str = getLine(); // First prompt the user for a string
n   = getInt();  // Now prompt the user for an int
f(str, n);       // Now use both values
Haskell differs from imperative languages by forcing you to explicitly specify the order of all computations with side effects, so that there is never any ambiguity. Haskell forces us to write:
main = do str <- getLine
          n   <- getInt
          f str n
In other words, Haskell requires imperative best practices by forcing the programmer to order all side effects and not rely on any implicit ordering by the language. This improves the readability of Haskell code because you can always reason about the order of side effects: a do block guarantees that side effects always order from top to bottom.


Miscellany


Haskell prides itself on being an orthogonal language. Almost all concepts are just syntactic sugar on just a few core concepts. This section just describes a few last syntactic niceties that Haskell provides that build entirely on top of the previous concepts.

First off, when we sequence two commands:
do command1
   command2
... that is just equivalent to discarding the result of the first action:
do _ <- command1
   command2
Also, a do block with one command is just equal to that command:
main = do putStrLn "Hello, world!"

-- is equivalent to:

main = putStrLn "Hello, world!"
Finally, you can define pure computations within a do block using let:
do let x = y
   f x

= do f y
For example:
main = do
    str <- getLine
    let response = "You entered: " ++ str
    putStrLn response

{- equivalent to:
main = do
    str <- getLine
    putStrLn ("You entered: " ++ str)
-}
$ ./example
Orange<Enter>
You entered: Orange

Conclusions


Actually, all of that is just the tip of the iceberg because do notation is also just syntactic sugar! If you want to learn more about that, then I recommend you read You Could Have Invented Monads, which explains how do notation itself is built on top of an even more primitive core concept. The more you learn Haskell, the more you realize how it is just a bunch of syntactic sugar on a very small core of orthogonal concepts.

20 comments:

  1. I think I may have found a couple of small errors:

    "More generally, return does it affect control flow in anyway"
    Should be
    "More generally, return does not affect control flow in any way"

    ---

    do y <- return x
    f y
    = do f y

    The last line should be "do f x"

    ReplyDelete
    Replies
    1. «anyway» → «any way» is not fixed yet :P

      Delete
    2. Oh, I didn't notice there were two mistakes! Thanks for catching that.

      Delete
  2. I like how you describe do notation without ever resorting to the fact that it's just a sugar for monadic bind. And link to "You Could Have Invented Monads" just makes it all complete. Nice!

    You should not mix IO (Haskell type) with I/O (abbreviation for input/output) — it just feels wrong to read "IO" where there should be "I/O".

    Also, using :type on an expression that has explicit type annotation is not the best idea ever (I'm speaking about that part where you tell that do notation is first-class citizen of Haskell).

    ReplyDelete
    Replies
    1. Thanks!

      I only fixed one "IO" (the first one). I kept the other ones mainly because I wanted to convey that I was talking about the very specific Haskell notion of IO rather than general IO.

      I actually prefer not to overly associate Haskell IO with the general notion of I/O. The reason is that I actually consider free monads to be a better underlying model for how to think of Haskell IO (and I don't really like the State formulation), and that's why I try not to give too specific of a meaning to what IO is other than "a 'list' of instructions".

      I fixed the explicit type annotation. That was just a copy/paste gone wrong.

      Delete
  3. "we never need to supply any type signatures because the compiler can mechanically deduce every type for us."

    What about RankNTypes?

    ReplyDelete
  4. void main() isn't valid C, and makes this post rather suspect.

    ReplyDelete
    Replies
    1. I chose the closest C signature to the Haskell version that would still compile, so as not to distract from the comparison. The topic of this post is how to use IO in Haskell, not how to write a standards-conforming C program, so I believe I made an acceptable trade-off.

      Delete
  5. "It turns out that Haskell distinguishes between an executable program and the program's return value, a distinction that almost no other language makes."

    You could make a comparison with Unix commands, which often produce output, and also have an exit status. For instance, 'grep' outputs the lines that match its pattern, and has an exit status (stored in the shell variable $?) of either 0 or 1, indicating whether or not any lines matched.

    I don't know if the example would be more useful or confusing, but it is there.

    ReplyDelete
    Replies
    1. Actually, that analogy would be more appropriate for my `pipes` library, which distinguishes between the streaming output and the return value.

      However, you could say that it is the same as distinction between a program's source code and its exit code.

      Delete
  6. That was one of the clearest descriptions if Haskell IO types I've ever read.

    Thank you.

    ReplyDelete
  7. This is a great post, very understandable and it makes IO in Haskell look less convoluted. IO in Haskell if for me a source of great displeasure and it just defeated every try I have given to learn the language. But if put those difficulties aside, I find the language very great.

    ReplyDelete
    Replies
    1. Monadic IO is one of the *best* features of Haskell and probably the second most important thing that distinguishes it from other languages (the first being enforced purity). Haskell separates side effect order from evaluation order, which makes it significantly easier to reason about side effects in Haskell.

      For example, consider the getLine function. Notice how it accepts no arguments, not even a trivial () argument, so it is already fully evaluated. In another language this would cause all sorts of problems because you would accidentally re-trigger its side effect each time you passed it around or copied it. These other languages use a hack to work around the issue where they require all functions to take arguments (even an empty set of arguments) in order to prevent them from accidentally triggering. In contrast, you will notice that Haskell functions like get line are remarkably devoid of superfluous arguments because evaluating them fully has absolutely nothing to do with the order of their side effects. The only thing that determines their side effect order is their position in the main block relative to other IO actions.

      Most other languages conflate evaluation order with side effect order, which means that you need a non-trivial mental model of how the language evaluates code to understand its behavior. Haskell totally decouples the two, which makes code so much easier to reason about both informally and equationally.

      Delete
  8. Damn I wrote a comment but it disappeared when I hit 'Preview'. Oh well, the gist of it:

    I've found that a lot of informal descriptions of Haskell's I/O model don't emphasize enough the fact that values of type IO represent side-effecting actions that have yet to be executed. I've seen functions of the form 'a -> IO b' described as functions that produce side-effects, whereby the return value merely "tags" this fact. But semantically this couldn't be the case because otherwise the run-time couldn't perform optimizations such as memoization to such functions. Such functions are in fact pure because they merely produce the "recipe" for doing the side-effecting stuff.

    In a way it seems like Haskell cheats when it comes to I/O: one describes a pure program that evaluated to an impure program that can be run by the runtime. Seems like passing the buck in a way. Haskell you sneaky fucker!! ;)

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. typo: "one describes a pure program that *evaluates* to"

      Delete
    3. It's as if in Haskell you are not describing an executable program directly but rather describing how to construct such a program. I guess strictly this is true of all compiled programming languages however, as you say, in Haskell the evaluation order isn't tied to the order in which side-effects occur. It achieves a powerful separation of concerns.

      Delete