Sunday, January 29, 2012

Haskell for Engineers - Unicode

The Glasgow Haskell Compiler supports Unicode variable names, which I find dramatically simplifies geometric and numerical code. To enable this extension, just include the following pragma before the module line in your source file:
{-# LANGUAGE UnicodeSyntax #-}
... then you are good to go! Of course, you still need a good way to enter Unicode symbols, so you should consult Wikipedia's great article on unicode input.

There is one catch, namely that the first character of all variables cannot be upper-case and that includes Unicode characters like Δ. The only way to get around this is to prefix it with some unintrusive character like _ or 'This also means that you should use lower-case theta, lower-case phi, lower-case psi, and so on.

I've provided some examples to whet your appetite for greek variables:
spherical (x, y, z) = (r, θ, φ)
    where r = sqrt $ x^2 + y^2
          θ = acos $ z / r
          φ = atan2 y x

π = pi

α = δω / δt
However, always keep in mind that Unicode makes it harder for other people to read and edit your source code, so don't go crazy.

Saturday, January 28, 2012


Haskell's Lens type generalizes properties (i.e. accessors/mutators) found in other programming languages. For example, C# provides language support to treat properties as ordinary variables:
class Point {
    public double x { get; set; }
    public double y { get; set; } }

class Circle {
    public Point  center { get; set; }
    public double radius { get; set; } }

public void goRight(ref Circle c) { += 10; }
The initial naive implementation of the same Haskell data types and accessor/mutator functions would be:
data Point = Point Double Double

getX :: Point -> Double
getX (Point a _) = a

setX :: Double -> Point -> Point
setX x' (Point _ y) = Point x' y

getY :: Point -> Double
getY (Point _ b) = b

setY :: Double -> Point -> Point
setY y' (Point x _) = Point x y'

data Circle = Circle Point Double

getCenter :: Circle -> Point
getCenter (Circle p _) = p

setCenter :: Point -> Circle -> Circle
setCenter p' (Circle _ r) = Circle p' r

getRadius :: Circle -> Double
getRadius (Circle _ r) = r

setRadius :: Double -> Circle -> Circle
setRadius r' (Circle p _) = Circle p r'
Hmm, this seems like a lame way to declare accessor and mutator functions. You could imagine that this would get out of hand pretty quickly if our record had lots of fields:
getField8 (Record _ _ _ _ _ _ _ x _ _) = x
Everybody recognized that this introduced a ridiculous amount of boilerplate, so Haskell's language designers added some syntactic sugar to automatically derive accessor and mutator functions for complicated records:
data Point  = Point  { x :: Double, y :: Double }
    deriving (Show)
data Circle = Circle { center :: Point, radius :: Double }
    deriving (Show)
Using this syntactic sugar, you can modify individual fields using their names:
set42Radius c = c { radius = 42.0 }
... and you can use the field names as accessor functions:
>>> radius (Circle (Point 3.0 4.0) 5.0)
But what if we wanted to implement the Haskell equivalent of the goRight function I defined in C#? Using record syntax, we would write:
goRight c = c { center = (\p -> p { x = x p + 10 }) (center c) }
Uuuuuuuggggly! As you can see, Haskell's record system is not "first-class", which is a polite way of saying it's terrible. For example, you can't use record syntax to create point-free functions:
setX10 :: Point -> Point
setX10 p = p { x = 10 } -- valid
setX10'  =   { x = 10 } -- invalid!
However, we can work around this by writing get and set functions, in the spirit of Java:
getX :: Point -> Double
getX = x

setX :: Double -> Point -> Point
setX x' p = p { x = x' }
... but this forces us to write the Haskell equivalent of ugly Java code:
Java   : p.setX(p.getX() + 10)
Haskell: setX (getX p + 10) p
Because of these limitations, Haskell's record syntax crashes and burns on more complicated record hierarchies, becoming unmaintainable in a hurry.


However, Haskell doesn't require language support for implementing properties! After all, a property consists of a getter and setter, so we can package them both up into a single data type, which we call a Lens instead of a property.
                  getter,    setter
type Lens a b = ( a -> b , b -> a -> a )
So now we can write more general get and set functions:
getL :: Lens a b -> a -> b
getL (g, _) = g
-- or  getL = fst

setL :: Lens a b -> b -> a -> a
setL (_, s) = s
-- or  setL = snd

-- Apply a function to the field
-- I consider modL more fundamental than setL
modL :: Lens a b -> (b -> b) -> a -> a
modL l f a = setL l (f (getL l a)) a

x' :: Lens Point Double
x' = (getX, setX)

>>> getL x' (Point 3.0 4.0)
>>> setL x' 10.0 (Point 3.0 4.0)
Point {x = 10.0, y = 4.0}
We are slightly improving on Java notationally, but still way behind C#. However, we're just getting started. First off, we can take advantage of Haskell's syntactic support for operators, which are just infix functions:
(^.) :: a -> Lens a b -> b
a ^. p     = getL p a
-- or (^.) = flip getL

(^=) :: Lens a b -> b -> a -> a
(p ^= b) a = setL p b a
 -- or (^=) = setL

>>> (Point 3.0 4.0) ^. x'
>>> (x' ^= 10.0) (Point 3.0 4.0)
Point {x = 10.0, y = 4.0}
That looks better, but it still resembles a functional style, especially the setter. What we really want is something stateful, in the spirit of imperative programming. Fortunately for us, we can use the State monad to implement an imperative programming style:
-- stateful version of (^=)
(^:=) :: Lens a b -> b -> State a ()
p ^:= b = do
    a <- get -- read the current state
    let a' = setL p b a
    put a'   -- set the current state
-- or p ^:= b = modify (setL p b)

access :: Lens a b -> State a b
access p = do
    a <- get -- read the current state
    let b = getL p a
    return b -- return the value
-- or access p = gets (getL p)
Now we can write all our favorite imperative functions using the above two primitives:
-- modify field 'p' using function f
(%=) :: Lens a b -> (b -> b') -> State a ()
p %= f = do
    b <- access p
    p ^:= f b

p += x = p %= (+ x)
p *= x = p %= (* x)

y' :: Lens Point Double
y' = (getY, setY)

goUp :: State Point ()
goUp = y' += 7.0

>>> :type (execState goUp)
execState goUp :: Point -> Point
>>> execState goUp (Point 3.0 4.0)
Point {x = 3.0, y = 11.0}
Fortunately for you, Edward Kmett already wrote all these functions for you and provides them in the data-lens package. The only difference is that he factored the Lens type into a more efficient and elegant form and some of the operators use different symbols.


So if all we could do was ape C# syntax I wouldn't be writing this blog post, so it's time to dig a little deeper.

Haskell's greatest strength stems from the fact that it derives all of its design patterns from mathematics, specifically category theory. Haskell borrows the Category design pattern from category theory, which is defined as anything that implements composition and identity:
class Category c where
    (.) :: c y z -> c x y -> c x z
    id :: c x x
Obviously functions form a Category:
instance Category (->) where
    (f . g) x = f (g x)
    id x = x
However, lenses form a Category, too!
instance Category Lens where
    ... -- exercise for the reader!
Any Category instance must satisfy three laws (know as the Category Laws):
  • Left Identity: id . f = f
  • Right Identity: f . id = f
  • Associativity: (f . g) . h = f . (g . h)
Other than that, anything goes, which is why categories appear in so many unexpected places. Since Lenses are categories we can compose them, which results in a new Lens which composes their their accessor and mutator functions. In other words:
getL id = id
getL (lens1 . lens2) = getL lens1 . getL lens2
-- i.e. getL defines a Functor

modL l id = id
modL l (f . g) = modL l f . modL l g
-- (modL l) defines a Functor, too
In fact, you can use the above two definitions to derive the Category instance for Lens.

This leads us to Haskell's verbatim translation of our original goRight function:
goRight = (x' . center') += 10

>>> :type x'
x' :: Lens Point Double
>>> :type center'
center' :: Lens Circle Point
>>> :type (x' . center')
x' . center' :: Lens Circle Double
>>> let c = Circle (Point 3.0 4.0) 5.0
>>> c ^. (x' . center)
>>> c ^. center' ^. x'
>>> execState goRight c
Circle {center = Point {x = 13.0, y = 4.0}, radius = 5.0}
The Category class enforces seamless composition because of the Category laws. For example, when we compose two lenses, we can't "uncompose" the result because it is absolutely indistinguishable from the equivalent hand-written lens. This is what "seamless" means: the "seam" between the two lenses completely vanishes when we compose them. The Category design pattern guarantees leak-proof abstraction.

Haskell lenses provide several advantages over object-oriented languages:
  • Haskell's lenses behave like "first-class l-values": You can pass lenses as arguments to functions, modify them or compose them and despite all of this you can still assign to the resulting expression. You don't need to read a long language specification to know whether or not a certain expression qualifies as something assignable. If it's a Lens, it's assignable, period.
  • Haskell lenses are implemented using ordinary, pure functions. Haskell language designers never built Lens into the language, which means that if you disagree with its implementation, you can implement your own version! In fact, Hackage lists several lens packages, although fclabels and data-lens are in my opinion the two best implementations at the moment.
  • Lenses are easier to combine. You don't have to write a new getter and setter to combine two lenses. Just compose them and you're done. Use all that time you save to do whatever it is Haskell programmers do with all the free time they supposedly have.
Lenses still require boilerplate lens definitions, but many solutions exist to this problem. Most lens packages provide Template Haskell macros to automatically derive lenses for records. That means that the final idiomatic Haskell equivalent to the original C# code would be:
data Point = Point { _x :: Double, _y :: Double }
data Circle = Circle { _center :: Point, _radius :: Double }
$( makeLenses [''Point, ''Circle] ) -- <- Template Haskell

goRight = (x . center) += 10
Template Haskell is poorly thought out in my opinion, but there are some ideas to replace it with a much more powerful template system that can be statically checked. That makes a very interesting topic in its own right, which I'll visit in a later post.

Wednesday, January 4, 2012

Haskell for Mainstream Programmers - State

Simon Peyton Jones touts Haskell as "the world's finest imperative programming language" and I strongly agree with that statement. In fact, I fell in love with Haskell precisely because of how elegantly it managed state so I decided to do a write-up of why I think Haskell got it right.


Haskell is a pure functional language, almost to a fault. Purity means that functions never have side effects, ever. Technically, you can write impure functions in Haskell, but the language design strongly discourages it and idiomatic Haskell programs will never use side effects.

This philosophy marks a significant departure from other functional programming languages like OCaml or Lisp. These languages ceded moral ground to imperative programming languages by allowing pure functions to have side effects. This leads to clunky and inelegant code that makes it difficult to program with side-effects and reason about them. So what did Haskell do different?


Haskell solves the problem of stateful programming by replacing normal functional outputs with stateful outputs. A stateful output is something of the form:
-- actually a newtype, and I've flipped the output
type State s a = s -> (a, s)
In other words, a stateful output takes some old state of type s and returns the output of type a along with a new updated state of type s:
old state     output   new state
    s     -> (   a   ,     s    )
So if a pure function has the type:
f :: a -> b
... then the "impure" stateful version would have the type:
f' :: a -> State s b
... which just translates to:
f' :: a -> s -> (b, s)
In other words, a stateful function takes an input of type a and an old state and returns an output of type b along with a new updated state:
input    old state     output   new state
  a   ->     s     -> (   b   ,     s    )
So "stateful" functions are actually still pure functions whose input and output just happen to include state.

We can express a lot of stateful functions (actually, all of them), using this trick. For example, the get function just takes the current state and returns it in the output slot, but doesn't change it.
get :: State s s
get s = (s, s)
The put function updates the state with a new value:
put :: s -> State s ()
put s = \_ -> ((), s)
Now let's say that we want to use our get and put functions to create the Haskell equivalent of the following pseudocode:
add n = do
    x <- get
    put (x + n)

quadruple = do
    x <- get
    add x
    y <- get
    add y
We'd be forced to write the following really awkward (but pure!) code:
add :: Int -> State Int ()
add n s0 = let (x, s1) = get s0
            in put (x + n) s1

quadruple :: State Int ()
quadruple s0 = let (x, s1) = get s0
                   (_, s2) = add x s1
                   (y, s3) = get s2
                in add y s3
Yuck! I could have written shorter versions of those functions but I wanted to emphasize the explicit state passing. Functional programmers hate boilerplate, so they use functions to abstract away repetitive tasks. In our above example, the repetitive task is manually passing state around:
let (a, s') = m s
 in f a s'
In other words, we apply some stateful output m to the current state, which generates the pure output a and a new updated state s'. We then feed both of these to some new stateful function f. Let's write a function to automate this repetitive task for us:
bind :: State s a -> (a -> State s b) -> State s b
bind m f s = let (a, s') = m s
              in f a s'
We can use bind to shorten our original code:
add :: Int -> State Int ()
add n = bind get           (\x ->
        put (x + n)        )

quadruple :: State Int ()
quadruple = bind get     (\x ->
            bind (add x) (\_ ->
            bind get     (\y ->
            add y        )))
Wow! This looks awfully similar to our original pseudocode:
add n = bind get           (\x ->   |   add n = do x <- get
        put (x + n)        )        |              put (x + n)
quadruple = bind get     (\x ->     |   quadruple = do x <- get
            bind (add x) (\_ ->     |                  add x
            bind get     (\y ->     |                  y <- get
            add y        )))        |                  add y
In fact, our original pseudocode wasn't even pseudocode. That's actually valid Haskell code. The do notation is just syntactic sugar for repeated calls to bind, except that in Haskell bind is denoted by the infix function >>=.

What will really surprise newcomers to Haskell is that do notation is not just syntactic sugar for stateful functions. It's syntactic sugar for anything that's a monad, and State s is just one of many kinds of monads. In fact, >>= generalizes bind:
(>>=) :: (Monad m) =>       m a -> (a ->       m b) ->       m b
bind  ::              State s a -> (a -> State s b) -> State s b
This is why Haskell has so many monad tutorials.


Haskell uses State for input and output, too. In fact, Haskell's IO type is just:
type IO a = State RealWorld a
-- i.e. type IO a = RealWorld -> (a, RealWorld)
I'm not making this up. Of course, RealWorld doesn't actually imply that we are masters of the universe capable of manipulating reality at will. It's just a placeholder to symbolize the state of the real world so that we can interact with it in a structured, type-safe, and pure way.

Earlier I made the bold statement that Haskell has no "side effects". This is because the "effects" of Haskell don't really happen on the "side". They are dealt with front and center using the type system and ordinary, pure functions rather than hacking in extra out-of-band language features as many other functional languages are wont to do.

IO differs from State s in that the state is never exposed. You can't get or set it (obviously, since we are neither omniscient nor omnipotent), nor can you store the state to be revisited at a later date (since we aren't time-travelers). However, we can still use our understanding of State to reason about how IO works. Take for example the getLine function:
getLine :: IO String
-- i.e. getLine :: RealWorld -> (String, RealWorld)
We can think of getLine as starting from some RealWorld state and outputting a a String, leaving RealWorld in a new state as a result of its operation.

From this type we can immediately deduce a couple of things about getLine's behavior:
  • The string it embeds in its output may change with each invocation since getLine is a function of the outside world's state.
  • getLine may change the outside world (i.e. have effects, like consuming input) since it returns a new state
Contrast this with an ordinary String:
myString :: String
myString = "Haskell for all"
We can similarly deduce that:
  • myString always returns the exact same string since it is not a function of any state
  • myString doesn't change any state when invoked since it doesn't provide a new state for some other function to use.

Advantages of Purity

Haskell's embeds all statefulness directly within its own type system, which leads to several advantages over other languages:
  • We can distinguish between pure and "impure" functions just by looking at their type. We don't have to read a function's documentation to know if it has side effects. The type documents the function's behavior.
  • The compiler doesn't have to optimize "impure" functions differently. It can use the same optimizations for IO code that it would have used for pure State s code.
  • The compiler can more aggressively optimize pure code because the type guarantees no side effects. For example, the compiler can safely reduce the number of calls to a pure function by caching its output.
  • Haskell abstracts imperative code even better than imperative languages. Haskell can implement high-level imperative combinators (such as iterators/iteratees or foreach) directly within the language because statefulness is implemented directly within the language. This is why Haskell blows away even imperative languages when it comes to topics like concurrency, coroutines, and reactive programming.
  • The State s monad possesses a strong and elegant theoretical basis in category theory and you can reason about it mathematically if you ever learn category theory. Other functional languages just implement statefulness as a dirty hack that seems like an afterthought.

Tuesday, January 3, 2012

Haskell for Intermediate Programmers - Algebraic Data Types

Haskell uses "algebraic" data types, but none of the tutorials or introductions really explain why they are called "algebraic". Let's go over the real origin of their name and perhaps gain a deeper understanding of them.

Algebraic data types are types that can be modeled using simple algebra, namely addition and multiplication. Wait, what? I can add types? Yes:
a + b = Either a b
Addition is associative, and Eithers are "sort of" associative, which is to say that if we squint, the following two types are essentially the same:
Either (Either a b) c ~ Either a (Either b c)
(a + b) + c = a + (b + c)
... where I use ~ to denote "essentially the same". We can show they are equivalent by creating functions that transform each type to the other:
forward  x = case x of
    Left  e -> case e of
        Left  a -> Left a
        Right b -> Right (Left b)
    Right c -> Right (Right c)
reverse x = case x of
    Left  a -> Left (Left a)
    Right e -> case e of
        Left  b -> Left (Right b)
        Right c -> Right c

-- or more tersely:
forward  = either (either Left (Right . Left)) (Right . Right)
reverse = either (Left . Left) (either (Left . Right) Right)
We can also multiply types:
a * b = (a, b)
This is also associative:
((a, b), c) ~ (a, (b, c))
(a * b) * c = a * (b * c)

forward  ((a, b), c) = (a, (b, c))
reverse (a, (b, c)) = ((a, b), c)
More importantly, multiplication distributes over addition, just like in normal algebra:
(a, Either b c) ~ Either (a, b) (a, c)
a * (b + c) = a * b + a * c

forward (a, e) = either (Left . (a ,)) (Right . (a ,)) e
reverse = (either fst fst, either (Left . snd) (Right . snd))
We can also create an analog of 0:
data Zero -- a data type with no constructor

Either Zero a ~ a
0 + a = a

forward (Right a) = a
reverse a = Right a -- or reverse = Right
The forward function doesn't need to check the alternative case for the Left constructor because we can't create a value of type Zero.

The analog of 1 is any type with a single empty constructor, such as the () type:
type One = ()
(One, a) ~ a
1 * a = a

forward (One, a) = a
reverse a = (One, a)
Multiplying by Zero has the expected effect:
(Zero, a) ~ Zero
0 * a = 0

-- Neither type can be built
Incredibly, we can "essentially" create any Haskell data type using only Either, (,), (), and Zero. Some examples of Haskell types and their algebraic equivalents:
data Bool = True | False
Bool ~ 1 + 1 -- i.e. Either () ()
     = 2

data Maybe = Just a | Nothing
Maybe a ~ a + 1 -- i.e. Either a ()

data Trool = True_ | False_ | Empty -- "troolean"
Trool ~ 1 + 1 + 1 -- i.e. Either (Either () ()) ()
      = 3

-- or
type Trool = Maybe Bool
Trool ~ Bool + 1 -- i.e. Either (Either () ()) ()
      ~ 2 + 1
      = 3

data TwoBools = TwoBools Bool Bool
TwoBools ~ Bool * Bool -- i.e. (Bool, Bool)
         ~ 2 * 2
         = 4
Each type corresponds to an algebraic number which tells us how many states that type can represent. The Bool type represents 2 states. The two Trool types represent 3 states.

This state-counting explains why the algebraic equivalence works: If two types represent the same number of states, you can always reversibly convert them back and forth. When we multiply two types, we multiply the number of states they represent. When we add two types, we add the number of states they represent.

We can follow some simple rules to translate Haskell data types into algebraic expressions:
  • Replace | with +
  • Replace a constructor (or tuple) that holds multiple values with a product
  • Replace an empty constructor with 1.
  • Replace a type with no constructor with 0
For example:
data Sum a b c = S1 a | S2 b | S3 c
Sum a b c ~ a + b + c

type Sum2 a b c = Either a (Either b (Either c Zero)))
Sum2 a b c = a + (b + (c + 0))
           = a + b + c

data Product a b c d e = P a b c d e
Product a b c d e ~ a * b * c * d * e

type Product2 a b c d e = (a, b, c, d, e)
Product2 ~ a * b * c * d * e

type Product3 a b c d e = (a, (b, (c, (d, (e, ())))))
Product 3 ~ a * (b * (c * (d * (e * 1))))
          = a * b * c * d * e

data Term a b c = T1 a b | T2 c 
Term a b c ~ a * b + c

type Term2 a b c = Either (a, b) c
Term2 a b c ~ a * b + c
In the above example, we can think of the type constructors as algebraic functions. Sum computes the sum of its three arguments. Product computes the product of its five arguments.

But what about recursive types? They work, too!
data List a = Nil | Cons a (List a)
List a ~ 1 + a * List a
       ~ 1 + a * (1 + a * List a)
       ~ 1 + a * (1 + a * (1 + a * List a))
       ~ ...
       ~ (1) + (a) + (a * a) + (a * a * a) + ...
This tells us that a List is either empty (1) or (+) has 1 element (a) or (+) has 2 elements (a * a) or (+) has 3 elements (a * a * a) and so on. That's pretty slick!

So far I've only mentioned algebraic equivalents to data types, but we can even represent functions as algebraic expressions:
a -> b = b ^ a
To see why, imagine that our function has a finite domain (i.e. a has n possible states, where n is finite). Then, we could encode our function by just asking what will its result be for all possible values of a and storing those results in an n-tuple.

For example, if we write a function of type Bool -> Trool, then we could instead rewrite our function as a simple data type of type (Trool, Trool) (one Trool for each state of Bool). This immediately tells us that there are only 9 unique functions with that type, and the algebraic translation agrees:
Bool -> Trool ~ Trool^Bool
              ~ 3^2
              = 9
For example, if our function were:
data Trool = True_ | False_ | Empty

f :: Bool -> Trool
f x = case x of
    True -> Empty
    False -> False_
I could instead encode that function as a tuple, as long as I use a consistent encoding/decoding convention:
f' :: (Trool, Trool)
f' = (Empty, False_)

f' ~ f
If -> really exponentiated types, we'd expect the following two types to be essentially equal:
a -> b -> c ~ (a, b) -> c
(c^b)^a = c^(a * b)

forward = uncurry
reverse = curry
... and they are!

Obviously, we work with many types that do not have a finite number of states, such as Integer or List a, but it doesn't mean the algebra is wrong. It's just that now we are adding and multiplying infinities, so we will always get an infinite number of possible states. However, the algebra still gives us a good intuition of how we can "essentially" construct any type just from just two base types and three orthogonal functions on types:
-- Base types
data Zero
type One = ()

-- Type Functions
a + b = Either a b
a * b = (a, b)
a -> b = b^a
Unfortunately, an "essentially equal" type won't cut it in Haskell, since it won't type-check, but I hope this helps explain some of the motivation behind Haskell's data type system.

It would be really elegant if Haskell's type system was just syntactic sugar on top of the above minimal type system. In other words, alternate constructors would just get translated into a sum fold:
data For = A a | B b | C c
-- gets translated into
newtype For = For
    { unFor :: Either a (Either b (Either c Zero))) }
... and constructors that contained several values would get translated into a product fold:
data All = A a b c d e
-- gets translated into
newtype All = All
    { unAll :: (a, (b, (c, (d, (e, ()))))) }
It would be even nicer if these newtypes could automatically derive class instances from their components types based on the algebra (i.e. a more powerful version of GeneralizedNewtypeDeriving). To elaborate, just imagine having a type H that was an instance of some interface:
class HasHeight t where getHeight :: t -> Double

instance HasHeight H where getHeight = ...
Then you define some other type based off of H:
data Object x y = Person H x | Building H y | Chair H
    deriving (HasHeight)
Then the class system does some algebra behind the scenes to confirm that it can treat Object as a super-set of H:
Object x y = H * x + H * y + H
               = H * (x + y + 1) -- confirmed!
Then it automatically derives the instance:
instance HasHeight Object where
    getHeight o = case o of
        Person   h _ -> getHeight h
        Building h _ -> getHeight h
        Char     h   -> getHeight h
Unfortunately, Haskell's class system isn't strong enough to express this kind of behavior, so we are stuck bickering over various record implementations instead of enjoying a truly algebraic type system.

Sunday, January 1, 2012

Haskell for C Programmers - For Loops

C programmers make incredibly heavy use of for loops and when they switch over to Haskell they have trouble writing idiomatic code because Haskell doesn't provide an analagous programming construct. Simon Peyton Jones calls Haskell "the world's finest imperative programming language", so I'll take a few simple C code examples and translate them to exactly equivalent Haskell and then compare them to idiomatic Haskell.


Let's begin with a sum function in C:
double sum(double *xs, size_t num_xs)
    size_t i;
    double result = 0.0;
    for (i = 0; i < num_xs; i++)
    { result += xs[i]; }
    return result;
For an exact translation, I'll first define a while and for function in Haskell:
while :: (Monad m) => m Bool -> m a -> m ()
while cond action = do
    c <- cond
    when c $ do
        while cond action

for :: (Monad m) => m a -> m Bool -> m b -> m c -> m ()
for init cond post action = do
    while cond $ do
Then we need to define a global state with default values:
data Status = Status { _i :: Int, _result :: Double }

class Default a where def :: a

instance Default Int    where def = 0
instance Default Double where def = 0.0
instance Default Status where def = Status def def
... and we can use the data-lens package, specifically Data.Lens.Lazy, to make state updates much easier if we define a few lenses first:
i      = lens _i      (\x s -> s { _i      = x })
result = lens _result (\x s -> s { _result = x })
Now we are good to go! The final translation is:
sum :: [Double] -> Int -> Double
sum xs n = flip evalState def $ do
    result ~= 0
    for (i ~= 0) (liftM (< n) (access i)) (i += 1) $
        i' <- access i
        result += (xs !! i')
    access result
However, we don't really need to use the length, because Control.Monad provides a way to call an action once for each element of a list, namely forM. This eliminates the n and i variables completely and removes any chance of an out-of-bounds run-time error by providing the wrong list length. forM resembles other languages' foreach constructs, except it is more general:
sum :: [Double] -> Double
sum xs = flip evalState def $ do
    -- i.e. foreach (double x in xs) { ... }
    forM xs $ \x ->
        modify (\a -> a + x)
Much better! But we're just getting started. For monadic functions that fold a list into a single value, Control.Monad provides the foldM function:
sum xs = flip evalState def $
    foldM (\a x -> return (a + x)) 0.0 xs
But now we're not even using any features of the State monad anymore. The above function is equivalent to:
sum xs = foldl (\x a -> x + a) 0.0 xs
... and we can contract it further to:
sum = foldl (+) 0.0
... which is how the Prelude defines it. So any time you need to simply fold a bunch of values into a single value, take a look at foldr/foldl first.


What if I want to print a list of values, one value per line? In C, it would be:
void printer(double *xs, size_t num_xs)
    for (i = 0; i < num_xs; i++)
    { printf("%f\n", xs[i]); }
And in Haskell, the exact translation would be:
printer xs n = flip execStateT 0 $
    for (put 0) (liftM (< n) get) (modify (+ 1)) $ do
        i' <- get
        print $ xs !! i'
Using forM_, we can do better:
printer xs = forM_ xs $ \x -> print x
... or just:
printer xs = forM_ xs print
... or we can use mapM_, which is forM_ with its arguments reversed:
printer = mapM_ print
Hopefully you can see that Haskell's forM/mapM replace C's for loops for monadic code, but you shouldn't rely on them like a crutch. For example, if I wanted to print all pair-wise combinations of two numbers from 1 to 10, I could write:
forM_ [1..10] $ \i ->
    forM_ [1..10] $ \j ->
        print (i, j)
... which looks an awful lot like a nested for-loop you'd write in C. But I could also write it like:
mapM_ print $ do
    i <- [1..10]
    j <- [1..10]
    return (i, j)

-- or use list comprehension syntactic sugar
mapM_ print [(i, j) | i <- [1..10], j <- [1..10]]
... or you could get even fancier and use the Applicative class:
mapM_ print $ (,) <$> [1..10] <*> [1..10]
... which almost seems like cheating.


Idiomatic Haskell separates data generation from data consumption and the last function demonstrates that. You can partition the function into a generator and consumer of data:
generator = (,) <$> [1..10] <*> [1..10]
consumer = mapM_ print
This design pattern separate concerns and creates more modular code.

You might think this doesn't work when you need to process data before all of it is available, but you'd be wrong. Haskell uses enumerators and "iteratees" to solve precisely that problem. You write generator and consumer code independently, yet they still magically "fuse" into the correct data-processing pipeline when you combine them, handling data as it is generated. I will discuss these in a later post.

This leads to a good rule of thumb: strive to separate data generation from consumption, even when not programming in Haskell.