Haskell is a pure functional language, almost to a fault. Purity means that functions never have side effects, ever. Technically, you can write impure functions in Haskell, but the language design strongly discourages it and idiomatic Haskell programs will never use side effects.
This philosophy marks a significant departure from other functional programming languages like OCaml or Lisp. These languages ceded moral ground to imperative programming languages by allowing pure functions to have side effects. This leads to clunky and inelegant code that makes it difficult to program with side-effects and reason about them. So what did Haskell do different?
Haskell solves the problem of stateful programming by replacing normal functional outputs with stateful outputs. A stateful output is something of the form:
-- actually a newtype, and I've flipped the output type State s a = s -> (a, s)In other words, a stateful output takes some old state of type s and returns the output of type a along with a new updated state of type s:
old state output new state s -> ( a , s )So if a pure function has the type:
f :: a -> b... then the "impure" stateful version would have the type:
f' :: a -> State s b... which just translates to:
f' :: a -> s -> (b, s)In other words, a stateful function takes an input of type a and an old state and returns an output of type b along with a new updated state:
input old state output new state a -> s -> ( b , s )So "stateful" functions are actually still pure functions whose input and output just happen to include state.
We can express a lot of stateful functions (actually, all of them), using this trick. For example, the get function just takes the current state and returns it in the output slot, but doesn't change it.
get :: State s s get s = (s, s)The put function updates the state with a new value:
put :: s -> State s () put s = \_ -> ((), s)Now let's say that we want to use our get and put functions to create the Haskell equivalent of the following pseudocode:
add n = do x <- get put (x + n) quadruple = do x <- get add x y <- get add yWe'd be forced to write the following really awkward (but pure!) code:
add :: Int -> State Int () add n s0 = let (x, s1) = get s0 in put (x + n) s1 quadruple :: State Int () quadruple s0 = let (x, s1) = get s0 (_, s2) = add x s1 (y, s3) = get s2 in add y s3Yuck! I could have written shorter versions of those functions but I wanted to emphasize the explicit state passing. Functional programmers hate boilerplate, so they use functions to abstract away repetitive tasks. In our above example, the repetitive task is manually passing state around:
let (a, s') = m s in f a s'In other words, we apply some stateful output m to the current state, which generates the pure output a and a new updated state s'. We then feed both of these to some new stateful function f. Let's write a function to automate this repetitive task for us:
bind :: State s a -> (a -> State s b) -> State s b bind m f s = let (a, s') = m s in f a s'We can use bind to shorten our original code:
add :: Int -> State Int () add n = bind get (\x -> put (x + n) ) quadruple :: State Int () quadruple = bind get (\x -> bind (add x) (\_ -> bind get (\y -> add y )))Wow! This looks awfully similar to our original pseudocode:
add n = bind get (\x -> | add n = do x <- get put (x + n) ) | put (x + n) | quadruple = bind get (\x -> | quadruple = do x <- get bind (add x) (\_ -> | add x bind get (\y -> | y <- get add y ))) | add yIn fact, our original pseudocode wasn't even pseudocode. That's actually valid Haskell code. The do notation is just syntactic sugar for repeated calls to bind, except that in Haskell bind is denoted by the infix function >>=.
What will really surprise newcomers to Haskell is that do notation is not just syntactic sugar for stateful functions. It's syntactic sugar for anything that's a monad, and State s is just one of many kinds of monads. In fact, >>= generalizes bind:
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b bind :: State s a -> (a -> State s b) -> State s bThis is why Haskell has so many monad tutorials.
Haskell uses State for input and output, too. In fact, Haskell's IO type is just:
type IO a = State RealWorld a -- i.e. type IO a = RealWorld -> (a, RealWorld)I'm not making this up. Of course, RealWorld doesn't actually imply that we are masters of the universe capable of manipulating reality at will. It's just a placeholder to symbolize the state of the real world so that we can interact with it in a structured, type-safe, and pure way.
Earlier I made the bold statement that Haskell has no "side effects". This is because the "effects" of Haskell don't really happen on the "side". They are dealt with front and center using the type system and ordinary, pure functions rather than hacking in extra out-of-band language features as many other functional languages are wont to do.
IO differs from State s in that the state is never exposed. You can't get or set it (obviously, since we are neither omniscient nor omnipotent), nor can you store the state to be revisited at a later date (since we aren't time-travelers). However, we can still use our understanding of State to reason about how IO works. Take for example the getLine function:
getLine :: IO String -- i.e. getLine :: RealWorld -> (String, RealWorld)We can think of getLine as starting from some RealWorld state and outputting a a String, leaving RealWorld in a new state as a result of its operation.
From this type we can immediately deduce a couple of things about getLine's behavior:
- The string it embeds in its output may change with each invocation since getLine is a function of the outside world's state.
- getLine may change the outside world (i.e. have effects, like consuming input) since it returns a new state
myString :: String myString = "Haskell for all"We can similarly deduce that:
- myString always returns the exact same string since it is not a function of any state
- myString doesn't change any state when invoked since it doesn't provide a new state for some other function to use.
Advantages of Purity
Haskell's embeds all statefulness directly within its own type system, which leads to several advantages over other languages:
- We can distinguish between pure and "impure" functions just by looking at their type. We don't have to read a function's documentation to know if it has side effects. The type documents the function's behavior.
- The compiler doesn't have to optimize "impure" functions differently. It can use the same optimizations for IO code that it would have used for pure State s code.
- The compiler can more aggressively optimize pure code because the type guarantees no side effects. For example, the compiler can safely reduce the number of calls to a pure function by caching its output.
- Haskell abstracts imperative code even better than imperative languages. Haskell can implement high-level imperative combinators (such as iterators/iteratees or foreach) directly within the language because statefulness is implemented directly within the language. This is why Haskell blows away even imperative languages when it comes to topics like concurrency, coroutines, and reactive programming.
- The State s monad possesses a strong and elegant theoretical basis in category theory and you can reason about it mathematically if you ever learn category theory. Other functional languages just implement statefulness as a dirty hack that seems like an afterthought.