Saturday, December 31, 2011

Haskell for Mainstream Programmers - Code reuse

Haskell code reuse is extraordinarily high because the language builds on a strong foundation of category theory, which is the mathematician's version of "code reuse". Mathematicians developed category theory in order to identify patterns common to every branch of mathematics, and several brilliant computer scientists discovered that those same patterns appear in programming languages as well.

Haskell programmers work much more rapidly because of this phenomenally high code reuse. Other languages penalize you for using a new library because you must spend a considerable amount of time learning the library's API, reading documentation, and learning library-specific conventions. A proficient Haskeller adopts libraries in minutes because they all share common interfaces derived from category theory.

I'll illustrate this using Haskell's Functor interface, derived from functors in category theory.

Maybe Functor


Java programmers must constantly check for null pointers, which forces them to write code like the following:
String s = null;
if (null != x) // x is some possibly-null reference
{
    s = x.toString();
}
C programmers have the same problem, where a function might return a special value or null pointer to indicate failure:
FILE *fp = NULL;
fp = fopen("poem.txt", "w");
if (fp)
{
    fprintf(fp, "Roses are red\n");
}
In Haskell, you model values that might be empty using the following type:
data Maybe a = Just a | Nothing
Maybe a denotes something that "maybe" holds a value of type a because there are two ways you can build a value of type Maybe a. Either you provide a value of type a using the Just constructor or you provide no value at all with the Nothing constructor. For example, a map lookup might fail, so the type of Haskell's lookup function is:
lookup :: (Ord k) => k -> Map k a -> Maybe a
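For example, in a hypothetical ghci session (assuming Data.Map is in scope):
> import qualified Data.Map as Map
> let ages = Map.fromList [("alice", 30), ("bob", 25)]
> Map.lookup "alice" ages
Just 30
> Map.lookup "carol" ages
Nothing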
This works great except for one problem. All our existing functions that expect arguments of type a will reject arguments of type Maybe a. For example, the following code will not compile:
square :: Int -> Int
square x = x ^ 2

four = square (Just 2) -- wrong
This fails to compile because square expects its argument to be of type Int and not Maybe Int, so let's give square a hand with a helper function that converts it into a function that works on Maybes. Haskell uses the name fmap for this function:
fmap :: (a -> b) -> Maybe a -> b -- wrong
fmap f (Just x) = f x
Oops! We forgot to cover all possible constructors. fmap's second argument has type Maybe a, which means that fmap must also accept Nothing for its second argument. If we try to lift our function to operate on a Nothing, fmap will fail with a run-time error:
> fmap square (Just 2)
4
> fmap square Nothing
*** Exception: Non-exhaustive patterns in function fmap
Hmm, it seems that our fmap function can sometimes fail, so why not make its return type use Maybe to encapsulate that failure:
fmap :: (a -> b) -> Maybe a -> Maybe b
fmap f (Just x) = Just (f x)
fmap f Nothing  = Nothing
Much better! Now our function handles errors transparently by simply propagating them:
> fmap square (Just 2)
Just 4
> fmap square Nothing
Nothing
We just unwittingly defined our first functor, the Maybe functor. To see why, let's look at the definition of the Functor class:
class Functor f where
    fmap :: (a -> b) -> f a -> f b
If you substitute Maybe for f then you get the type signature for the fmap function we just derived. Using Haskell terminology, Maybe is an instance of the Functor class. Haskell classes are analogous to C#/Java interfaces, so using their terminology you would say that Maybe implements the Functor interface.
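In other words, the fmap we just wrote is precisely the Functor instance for Maybe, which the Prelude already provides:
instance Functor Maybe where
    fmap f (Just x) = Just (f x)
    fmap f Nothing  = Nothing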

Collection Functors


Haskell lists are ordered sequences of values, and a Haskell list containing values of type a is denoted [a], which is just syntactic sugar for [] a. You can treat [] as a Functor:
instance Functor [] where
    fmap :: (a -> b) -> [a] -> [b]
    fmap = map
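For example, reusing the square function from above in a hypothetical ghci session:
> fmap square [1, 2, 3]
[1,4,9]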
Python programmers might recognize this as their map function. Just like Python's map, fmap works on most homogeneous Haskell data structures and generator-like types (a.k.a. iterables in Python) because they implement the Functor interface:
instance Functor Set where
    fmap :: (a -> b) -> Set a -> Set b
    fmap = ... -- elided (illustrative; the real Data.Set.map needs an Ord constraint, so Set has no true Functor instance)
instance Functor Tree where
    fmap :: (a -> b) -> Tree a -> Tree b
    fmap = ... -- implementation elided
instance Functor (Map k) where
    fmap :: (a -> b) -> Map k a -> Map k b
    fmap = ... -- implementation elided
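fmap on a Map transforms every value while leaving the keys alone. Continuing the hypothetical ghci session from the Maybe section:
> fmap square (Map.fromList [("a", 2), ("b", 3)])
fromList [("a",4),("b",9)]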
Haskell differs from C++ and Java in that it does not have a unified collections interface, because many operations on collections are actually instances of more general classes. For example, list concatenation uses mappend from the Monoid class.
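For example, concatenating lists through the Monoid interface:
> mappend [1, 2] [3, 4]
[1,2,3,4]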

Since our fmap function surprisingly works on collections, you might wonder what else it works on.

State Functor


Haskell has a dead-simple implementation of state:
-- actually a newtype and I've flipped the output
type State s a = s -> (s, a)
In other words, a stateful function is one that takes a state s and returns a new state s along with an output a. In fact, this is how Haskell handles I/O, except that s is the state of the outside world. Seriously:
type IO a = State RealWorld a -- actually a newtype
As you might have guessed, State s and IO are functors:
instance Functor (State s) where
    fmap :: (a -> b) -> State s a -> State s b
    fmap f x s = second f (x s) -- a pure function
instance Functor IO where
    fmap :: (a -> b) -> IO a -> IO b
    fmap f x s = second f (x s) -- no different
I only included the implementations to emphasize that they are ordinary, pure functions and that there is nothing special or magic about how either functor works.
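If second is new to you, it simply applies a function to the second component of a pair. Here is a sketch specialized to pairs (the library version in Control.Arrow is more general), plus a hypothetical counter that shows fmap on State in action:
-- second, specialized to ordinary pairs:
second :: (a -> b) -> (x, a) -> (x, b)
second f (x, a) = (x, f a)

-- A hypothetical counter, using the flipped convention above:
tick :: State Int Int
tick s = (s + 1, s) -- returns the new state and the old state as output

-- fmap show tick :: State Int String
-- (fmap show tick) 0 evaluates to (1, "0")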

Haskell's getLine function reads a line from standard input and returns the String as the output of type a that I mentioned above. That means its type is:
getLine :: IO String -- i.e. (RealWorld -> (RealWorld, String))
Because it is a functor, I can apply a function to its output (the String) before I even run it, and it will return the transformed value when it is finally executed. So if I have a function named parseInteger that converts Strings to Integers, then I can just write:
parseInteger :: String -> Integer

getInteger :: IO Integer -- i.e. (RealWorld -> (RealWorld, Integer))
getInteger = fmap parseInteger getLine
Voila! I just created a new function that reads a line from standard input and returns the parsed Integer. I don't think the standard libraries of mainstream languages have anything matching the generality of fmap.
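For completeness, a minimal parseInteger might be the following sketch, which assumes well-formed input (a real program would handle parse failures):
parseInteger :: String -> Integer
parseInteger = read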

Parser Functor


In Haskell, a function that parses Strings has the type:
type Parser a = String -> [(a, String)] -- actually a newtype
To understand the type: a parser takes a String and returns a list of possible parsed values, each paired with its unconsumed remainder of the input. This is a functor, too, which means that if I have something that parses a list of values and I want to convert that list to a tree, I just write:
convertToTree :: [a] -> Tree a

parseList :: Parser [a]

(<$>) = fmap -- an infix operator synonymous with fmap

convertToTree <$> parseList :: Parser (Tree a)
Convenient! If you squint, the infix version of fmap, <$>, resembles function application. It's almost as if we were applying convertToTree directly to parseList, and <$> makes the type incompatibility just magically go away.
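Pretending, as with State, that we can ignore the newtype wrapper, the Parser instance is a one-liner (a sketch):
instance Functor Parser where
    -- apply f to every successful parse; leave the unconsumed input alone
    fmap f p = \s -> [(f a, rest) | (a, rest) <- p s]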

Functors


So if you see something resembling f a in a type signature and you guess that f is some sort of functor, chances are you will be right. However, you don't have to guess. You can just fire up ghci and type something like:
> :info Maybe
...
instance Functor Maybe -- Defined in Data.Maybe
...
... or if f is a type variable in a function's type signature, the signature itself will say something like:
functorsForAll :: (Functor f, ...) => ... -> f a -> ...
Functors are just the tip of the iceberg. Haskell programmers reuse classes like crazy, and Functor isn't even the most heavily used class (that honor arguably goes to the Monad class). The typical Haskell programmer reuses a few functions like fmap over and over again to great effect, and once you learn those few functions your Haskell productivity will soar. These functions "just work" and always do the "right thing", all in a completely type-safe manner.
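To make that reuse concrete, here is one hypothetical definition that works unchanged on every functor in this post:
describe :: (Functor f) => f Int -> f String
describe = fmap (\n -> "got " ++ show n)

-- describe (Just 2)              == Just "got 2"
-- describe [1, 2, 3]             == ["got 1","got 2","got 3"]
-- describe (fmap length getLine) :: IO String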

Equally important, Haskell programmers reason more easily about their programs because they reuse the same concepts over and over again. They can rapidly form a mental model of how a function or library works because everything is assembled from the same set of core primitives derived from category theory.

Haskell classes, like C#/Java interfaces, promote generality and code reuse, but they are much more general and reusable than their mainstream counterparts because they have a strong mathematical foundation in category theory. However, you need no actual knowledge of category theory to use them.

15 comments:

  1. Is there an error in the following?

    "type State s a = s -> (s, a) -- actually a newtype"

    Should (s,a) be (a,s)? I just read http://www.haskellforall.com/2012/01/haskell-for-mainstream-programmers_04.html and there it is

    "type State s a = s -> (a, s) -- actually a newtype, but whatever"


    Also, on http://www.haskellforall.com/2012/01/haskell-for-mainstream-programmers_04.html, there is an error in one of the add functions, where it should say "put (x+n) s1" instead of "put x s1".

    I much appreciate your blog posts, by the way, and I'm working my way through them. :)

    1. Yeah, I actually take liberty with the State type and reverse the output to simplify the implementation of the monad instance. Both versions form a monad, but the official one is not as clean. However, I'll change both posts so that they make clear that I'm reversing the convention for simplicity.

      There's another reason I prefer this different ordering, which is that when you define the state monad using the Kleisli category formulation (where `return` is `id` and `(<=<)` is `(.)`), then you get:

      return = curry id
      f <=< g = curry (uncurry f . uncurry g)

      ... whereas if those variables are rearranged then it ends up being:

      return = (flip . curry) id
      f <=< g = (flip . curry) ((uncurry . flip) f . (uncurry . flip) g)

      You are also right about the mistake in the other post. I will fix it.

      Some of the more recent posts are more advanced, but I will write some more intermediate posts soon.

    2. Ok, thanks. :) I realized just after hitting submit that you could most probably define it either way ((a,s) or (s,a)).

      I actually noticed this difference in the process of trying to work through your post on state in GHCi. I got stuck on the conversion to do notation and tried to find out what I was doing wrong. If I do "let add = (\n -> do x <- get; put (x+n)) :: Int -> State Int ()" at the prompt, I get an error. The other versions of add have the type "Int -> State Int ()". Could you give me some hints? I guess there is some difference between your "bind" and "(>>=)".

      I'm quite new to Haskell and find some of the details of state to be a bit difficult, although I understand the main concept.

    3. Jonathan Johnsson: It should work when you add a 'return ()' to the definition of 'add' in the following manner:

      "let add = (\n -> do x <- get; put (x+n); return ()) :: Int -> State Int ()",

      or equivalently:

      "let add = (\n -> get >>= put . (n +) >> return ()) :: Int -> State Int ()"

  2. This is a good attempt to explain fmap. I've been reading Real World Haskell and it kept referring to fmap, dropping it here and there with not much of an explanation except a laconic "here, we use fmap", as if that should be obvious.

    But your explanation only goes halfway. You say that State s and IO are functors, but fail to explain what fmap actually does in each case, instead resorting to just showing the definition and a simple example.

    Now, throwing an equation at the reader this way is probably ok for people who have an intuitive grasp of math; I'm not one of them. And math people won't be reading this particular blog entry anyway - it's all obvious to them. For me this was annoying.

    My suggestion: Why not explicitly say, in plain English:
    1) fmap as defined for the State function takes the current state and the current result, changes the result using the given function, and returns the same state but with a different result.

    2) fmap as defined for IO takes the result of the IO action, changes it using the given function, and returns another IO action with the new result.

    Here are some questions which popped up into my mind while reading this:

    - Why is fmap changing the State's result, and not the state itself?
    - What if I need a more complex transformation of state or IO, something that is more complex than a simple (a -> b) ?
    - How IS exactly changing the type of a tree, a set, a list or a map, "the same" as changing the result of a state? Those are obviously two different things.
    - Are there any rules for knowing what fmap will DO, given a certain context?
    All the examples given here seem to indicate they change "something" "inside" the "context", but this is very, very vague. Is there a better definition? Also, "inside" is poorly defined.
    - Speaking of which, I don't see any example of reuse here. Care to elaborate?
    - What IS the result of the state, anyway? Why is it necessary? Isn't it a state like in a state machine? All you need then, in terms of data, is the state variable which is some of an enum, is this the "s" thing here?
    - What's so special about the getInteger example? I

    I think I know the answers to these questions by now, I'm just mentioning them because that's the kind of questions I asked myself when first reading about the state monad and the IO monad in RWH. I'm not too happy with the answers, BTW, but I'm willing to accept them.

    So, I dunno, you seem to have a sort of disconnect from what a "real-world" imperative programmer thinks like. Again, throwing equations at the reader and then waxing about

    Also, Real World Haskell calls these fmap uses "lifting operators". This should be mentioned here, I think, unless RWH's name is not the conventional one.

    1. Sorry for the delayed response. I agree with all your points and I will update the post to build a better practical intuition of what is going on. It takes a bit of time because I'm a slow writer, which is why I haven't fixed it yet.

    2. Ok, before I rewrite this post, let me first take an initial stab at answering your questions and once I've answered them to your satisfaction I'll work it into the post.

      I have to split these answers into multiple replies because they are so long:

      > Why is fmap changing the State's result, and not the state itself?

      The best way to understand a lot of category theory concepts is that they are just a way of "organizing" types and operations on those types.

      For example, a Functor is nothing more than a type that exposes some way to modify its very last type variable. So when somebody says that `State s` is a `Functor`, that's just a fancy way of saying "I can change the `a` in `State s a`".

      For example, let's say that we reorganized the type variables and instead put them in the order `State a s`. You wouldn't be able to make a `Functor` out of that, because there is no function that has the type: `(s1 -> s2) -> (State a s1 -> State a s2)`.

      Note that a type can be `Functor` in more than one way. The simplest example is the pair:

      data Pair a b = P a b

      It's obviously a functor over `b` since there exists a function:

      second :: (b1 -> b2) -> (Pair a b1 -> Pair a b2)

      ... but if we switch the type variables around it is also a `Functor` over `a`, too:

      data Pair' b a = P' a b

      first :: (a1 -> a2) -> (Pair' b a1 -> Pair' b a2)

      Category theory is totally abstract. It doesn't say what that last type variable should represent. All it says is that if it is a functor then we can modify whatever that type variable points to. Functors are just a structured way of interfacing with specific type variables.

      This is one of the reasons that you almost never get a straight answer when you ask "But what IS a Functor and what does that last type variable represent?" The real answer is that it can be anything that supports modification of some sort.

      - What if I need a more complex transformation of state or IO, something that is more complex than a simple (a -> b) ?

      By that I assume you mean that instead of:

      fmap :: (a -> b) -> f a -> f b

      You instead want something like:

      fmap :: C a b -> C (f a) (f b)

      ... where `C` is some arbitrary instance of the `Category` class. Yes, that's perfectly reasonable. The only reason the `Functor` class uses just Haskell functions is just a historical artifact. With multi-parameter type classes you could generalize the `Functor` class to do what you ask:

      class GFunctor c f where
          fmap :: (Category c) => c a b -> c (f a) (f b)

    3. > How IS exactly changing the type of a tree, a set, a list or a map, "the same" as changing the result of a state? Those are obviously two different things.

      They aren't! That's the point! The reason that Functor is so useful is precisely because it is so abstract. The definition of a Functor is completely "positional". The trick to "getting" category theory is to just ignore what everything means and think of it as just a way of positionally manipulating types. Once you get in the "Haskell zone", you start approaching type-chasing "spatially". For example, when I program I might say to myself:

      "I have an IO Int, but I want an IO String. All I need to change is the Int and I don't actually care whether or not it is wrapped in IO or anything else. That means I want to use a Functor so I can just ignore the whole IO part, and now I only need to supply fmap with a function of type `Int -> String`."

      When you think in terms of these higher-order abstractions, you basically bypass all the complex machinery you don't need to deal with (i.e. the IO in the above example) and focus on the part you actually care about (i.e. the Int). By doing so, I guarantee that I don't get tripped up by some accidental complexity involving `IO`.

      For example, if I write a function of type: `IO Int -> IO String`, I can't actually be sure that such a function isn't doing something weird involving `IO`, but when I force myself to only use `fmap` to modify the `Int`, I'm guaranteeing to myself that the ONLY thing I'm modifying is the `Int` and I'm not accidentally using the extra complexity that `IO` provides. In short, many category theory abstractions like Functor are a principled way to reduce software complexity by sandboxing some function's effect and reducing complex problems into simpler, narrower problems that have fewer opportunities to screw things up.


      > Are there any rules for knowing what fmap will DO, given a certain context?

      Yes! Actually, there are exactly two rules that all instances of the Functor class must obey, no exceptions:

      fmap id = id
      -- i.e.: fmap id x = x

      fmap (f . g) = fmap f . fmap g
      -- i.e.: fmap (f . g) x = fmap f (fmap g x)

      In plain English, those two rules just say:

      * If you don't modify the wrapped value, you don't modify anything else either

      * If you do two modifications at once, it's the same as doing each one successively

      The first rule is the "sandboxing" rule. That's the rule that guarantees that `fmap` doesn't actually trigger any accidental complexity from the surrounding context. The idea is that you give 'fmap' an empty modification function (i.e. 'id'), sort of like a test to see if it changes anything by mistake. If it doesn't modify anything, then you wrote the `Functor` instance correctly and you can guarantee that `fmap`'s effect is truly sandboxed.

      The second rule is the "refactoring" rule. It basically says that we can always group up or split apart multiple uses of `fmap` so that we can define whatever modularity boundaries we require.

      The really fascinating part is that it turns out that those are the ONLY two rules you ever need to prove. Those two rules alone always magically guarantee that you wrote it correctly and they always uniquely define the one correct implementation.

      The more important point is that those are the ONLY two rules you ever need to know about `fmap`. After all, the whole point of `fmap` is that it completely ignores the surrounding context (i.e. the `IO` or `State`), so by definition, any functions that modify the wrapped value cannot use any features from that surrounding context in their implementation.

    4. > All the examples given here seem to indicate they change "something" "inside" the "context", but this is very, very vague. Is there a better definition? Also, "inside" is poorly defined.

      The definition of inside is literally "whatever is the last type variable". Any intuition more specific than that is bound to fail for some functor. All you need to know is that `fmap` changes the last type variable in something's type, whatever that type variable happens to be. That type variable *could* be a value that is wrapped, but that's not necessarily the case. For example, here's a Functor that doesn't wrap anything:

      data U a = U

      instance Functor U where
          fmap _ U = U

      Notice that the above definition of `fmap` satisfies the functor laws I described:

      fmap id = id

      fmap (f . g) = fmap f . fmap g

      ... but there is no `a` value stored inside the `U`. It's purely a phantom type.

      > Speaking of which, I don't see any example of reuse here. Care to elaborate?

      Yes, that was an omission on my part. The reuse part comes about when we program generically over the `Functor` class. For example, I can write:

      setToOne :: (Functor f) => f a -> f Int
      setToOne = fmap (\_ -> 1)

      ... and that will work on anything that implements `Functor`, so I can reuse it for a wide range of types.

      An even more powerful example of Functor reuse is the `lens` library, but I can't fit the explanation of that within this comment.

      > What IS the result of the state, anyway? Why is it necessary? Isn't it a state like in a state machine? All you need then, in terms of data, is the state variable which is some of an enum, is this the "s" thing here?

      In `State s a`, the result is the `a`, not the `s`. The reason is that the `s` type is fixed throughout the computation and cannot change. So, for example, our state type might be an `Int` (i.e. the `s` is `Int`), but we may be interested in computing things using that state which are not `Int`s. This is why we distinguish between the state type and the result we are computing.

      There are a few cases when the state and the result coincide, like the `get` function:

      get :: State s s

      ... but that's just a special case.

      > What's so special about the getInteger example?

      The special part is that I was able to modify the result of an `IO` action without even understanding how `IO` works under the hood. Even more importantly, if the internal implementation details of `IO` were to change, `parseInteger` would still work because it doesn't use any implementation-specific details about `IO`. It just uses `fmap` and then lets `fmap` do all the heavy lifting to make sure that the parser gets applied to the correct place (the result, in this case).

      If those answers are still confusing, feel free to ask more questions. I'd like to improve my explanation of this.

  3. Hi, sorry for not responding for such a long time; I just checked and saw you answered me back in March.

    Unfortunately, recently I've had very little time left for Haskell between my two young children and an insane workload (in fact, I wrote half the post above back in February with my baby son slung over my shoulder trying to get him to sleep, while very tired myself; that's why it has some incomplete sentences).

    Anyway, thanks for the detailed answers, I'll try to read them again later this week and respond. From reading them just now some things were clearly explained, but others were even more confusing than the original post. I'll try reading it again next week and see if I can point out the confusing parts.

    By the way, back in February I also wrote a response to your blog post about the State monad, however due to some bug (in Blogger, I assume) it got lost somehow when I pressed "Publish", and I was too tired to rewrite everything.

    To summarize, my response there essentially said that your explanation there was the best one I've read about the State monad yet. It just seemed a shame to me that you never saw it. I think you're doing a great job on this blog.

    I also started reading your series of posts on your Lenses package, however I didn't get very far, and I realized I obviously missed some concepts, so I went back to try and finish reading Real World Haskell.

    1. I have two kids, too, so I understand. :)

      The lens packages are not my packages. They both belong to Edward Kmett.

      The key thing you need to read if you want to understand StateT is a paper called "Monad Transformers - Step by step". Google that and the PDF will be the first result. It is very beginner friendly.

      If you need more introductory concepts, then I recommend "Learn You a Haskell for Great Good", which walks through all the basic Haskell concepts for people new to functional programming.

      Also, thanks for the kind words!

  4. I am trying to understand the State example but there is something which puzzles me both with the type of fmap and its implementation.

    Shouldn't
    instance Functor (State s) where
    fmap :: (a -> b) -> State a -> State b
    fmap f x s = second f (x s) -- a pure function
    actually be:

    instance Functor (State s) where
    fmap :: (a -> b) -> State s a -> State s b
    fmap f x s = (s, second f (x s))

    Also, something bothers me about defining fmap using three parameters when it only has two. I kind of understand that this has to do with the fact that fmap is, in this case, a function which returns a function, so the last parameter is meant for the returned function. But being a complete rookie, I am not too sure that this is the correct interpretation, and if it is, then I have to admit that this formulation bothers me, since I cannot (yet?) grasp how the compiler can generate code for a function using parameters meant for the function it returns.

    I suspect that if State s a was expanded into its equivalent type then this would likely make sense but it is quite a steep shortcut to take for me at this stage.

    Finally I have two remarks regarding explanations you posted in the comments:
    You say: "there is no function that has the type: `(s1 -> s2) -> (State a s1 -> State a s2)", but I guess you mean that it would not make much sense right? It seems to me that it is possible to write such function even if I have a hard time figuring out if it could be useful at all.

    Also, I'm confused when you say that the result of State s a is the a and not the s. I do not understand why since the state seems to me the most important part of the transition and a seems secondary since it's not taken into account for future transitions.

    1. I actually have to correct my first remark. After looking up "second" I realized that I mistook it for "and" and that it actually seems to be a completely different beast (arrow), one which is completely out of my league (rookie).

      Maybe then, is it a little bit too advanced for an example?

    2. Damn autocorrect, by "and" I meant "snd".

    3. Sorry for the really delayed reply!

      Yes, the real `fmap` for `State` only has two parameters if you insert all the newtype wrapping and unwrapping.

      However, to answer your more general question, it's perfectly legitimate to define a function in terms of the parameters of the function it returns. A great example of this is how function composition is defined:

      (.) :: (b -> c) -> (a -> b) -> (a -> c)

      There are two equally valid ways to define it:

      (f . g) = \x -> f (g x)

      (f . g) x = f (g x)

      The reason this works is that the latter notation is syntactic sugar for the former version. All function definitions with parameters just get desugared to lambdas:

      f x y = z

      -- is equivalent to:

      f = \x -> \y -> z

      Regarding the `State` function question, I should have been more precise and said that there is no way to define a function of type:

      (s1 -> s2) -> (s1 -> (a, s1)) -> (s2 -> (a, s2))

      The way I phrased it before was really confusing.

      Regarding `second`, I find it helps to ignore the whole `Arrow` class stuff and just think of it as a function of the following type:

      second :: (a -> b) -> (x, a) -> (x, b)

      This is a special case of the `Arrow` type, which is:

      second :: Arrow arr => arr a b -> arr (x, a) (x, b)

      Just replace `arr` with `(->)` and you get the simpler version. I personally find this `Arrow` generalization of `second` to be really unfortunate since it's very non-obvious to beginners that it can be used in the simpler way.
