Monday, May 6, 2013

pipes-3.3.0: Folds and uniting ListT with Proxy

pipes-3.3.0 simultaneously resolves two long-standing problems in the library:
  • Not all proxy transformers implemented ListT
  • Folds required using the base monad
It turns out that fixing ListT for the remaining hold-outs proved to solve the fold problem as well, and this post will detail that a bit more.


ListT


pipes-2.4 first identified the existence of three extra categories, two of which I call the "request" and "respond" categories. These categories are enormously useful, especially since you can implement ListT with both of them, but then I discovered that you couldn't implement the (/>/) and (\>\) composition operators for certain proxy transformers, specifically:
  • MaybeP
  • EitherP
  • StateP
  • WriterP
This was really disconcerting, and it seemed really odd that every proxy transformer could lift the identities for those two categories (i.e. request and respond), but not always lift the corresponding composition operators. I settled for a temporary solution, which was to split out (/>/) and (\>\) into a separate ListT type class.

However, several recent events made me suspect that something was amiss and caused me to revisit this solution. I received my first clue while working on the pipes-directory library, where I wanted to model getDirectoryContents using the following type:
getDirectoryContents
    :: (Proxy p)
    => FilePath -> ProduceT (ExceptionP p) SafeIO FilePath
This would let users bind directories non-deterministically in ProduceT so that they could describe effectful directory traversals at a high-level. This required pipes-safe so that the directory stream would be properly finalized in the event of exceptions of termination, which is why it uses ExceptionP and SafeIO.

However, ExceptionP is just a type synonym for EitherP, and EitherP did not implement the ListT type class, which meant that I could not use the ProduceT monad. So I revisited EitherP and discovered that there was a law-obiding ListT instance for EitherP that I had missed the first time around. Moreover, I could use the exact same trick to implement ListT for MaybeP, too.

This meant that only two proxy transformers remained which did not implement ListT:
  • StateP
  • WriterP
Moreover, WriterP was internally implemented using StateP under the hood, meaning that if I could solve StateP then I could finally merge the ListT class back into the Proxy class.

Simultaneously, while working on pipes-parse I encountered several buggy corner cases with StateP, all of which gave the wrong behavior. Similarly, WriterP also gave the wrong behavior in a wide variety of cases and this Stack Overflow question gives a great example of how useless WriterP was. This suggested that I had implemented both of those two proxy transformers incorrectly, since both of them gave the wrong behavior in many corner cases and both of them resisted a correct ListT implementation.

This observation led me to discover the correct solution: make StateP and WriterP share their effects globally across the pipeline, instead of locally. This fix solved both problems:
  • Both of them now implement List and obey the ListT laws
  • Both of them now give correct behavior in all corner cases
Consequently, I can now merge the ListT class into the Proxy class and reunite request and respond with their respective composition operators. Also, now all proxy transformers lift all four categories correctly.


Folds


The WriterP fix leads to a big improvement in the pipes folding API. Now you can do folds using WriterP within the pipeline and without using the base monad.

For example, if you want to fold all positive elements from upstream, you can now write:
somePipe = do
    -- The unitU discards values that 'toListD' reforwards
    xs <- execWriterK (takeWhileD (> 0) >-> toListD >-> unitU) ()
    respond xs
    ...
You can now access the result of folds within pipes! You no longer have to wait until the Session is complete to retrieve the folded data.

Also, since folds don't use the base monad you no longer need to hoist stages that you compose with a fold. For example, if you want to sum the first ten lines of user input, you can just write:
runProxy $ execWriterK $ readLnS >-> takeB_ 10 >-> sumD

Deprecation


I've also started to deprecate several parts of the API in preparation for an eventual pipes-4.0.0 release. These are the main things I deprecated:
  • The classic pipes API (i.e. await, yield, and (>+>))
  • raise functions (i.e. raise and raiseP)
  • K functions, like hoistK and liftK (exception: I keep the run...K functions)
  • Many bidirectional utilities from the pipes prelude and some upstream utilities
  • idT and coidT are renamed to pull and push
If you disagree with any of these deprecations, please let me know since I'm always open to suggestions to keep them or migrate them to a pipes-extras library.

I renamed idT and coidT because this allows for a nice convention where every category is named after its identity operator. Also, the new names are more suggestive of their behavior: idT begins by pulling information, while coidT first pushes information.

This rename becomes even more advantageous when you use the io-streams style I discussed in a previous post, but I will save the full explanation of why for later.


Modules


I've also tightened up the module hierarchy, which has gone down from 23 modules to 18, and will go down further to 16 when I remove the deprecated Control.Pipe and Control.Proxy.Pipe modules in pipes-4.0.0. Hopefully this makes the library a bit less intimidating to newcomers and easier to navigate.


Future Work


As always, I'm still working on pipes-parse. The big holdup is that I have been experimenting with more elegant solutions to pushback, mainly because I would like to implement many non-trivial features like nested sub-parsers. If worse comes to worse, I will just drop those advanced features and push the simpler version out the door in the next few weeks.

Saturday, May 4, 2013

Program imperatively using Haskell lenses

Haskell gets a lot of flack because it has no built-in support for state and mutation. Consequently, if we want to bake a stateful apple pie in Haskell we must first create a whole universe of stateful operations. However, this principled approach has paid off and now Haskell programmers enjoy more elegant, concise, and powerful imperative code than you can find even in self-described imperative languages.


Lenses


Your ticket to elegant code is the lens library. You define your data types as usual, but you prefix each field with an underscore. For example, I can define a Game:
data Game = Game
    { _score :: Int
    , _units :: [Unit]
    , _boss  :: Unit
    } deriving (Show)
... full of Units:
data Unit = Unit
    { _health   :: Int
    , _position :: Point
    } deriving (Show)
... whose locations are represented by Points:
data Point = Point
    { _x :: Double
    , _y :: Double
    } deriving (Show)
We prefix these fields with an underscore because we will not be using them directly. Instead, we will use them to build lenses, which are much more pleasant to work with.

We can build these lenses in two ways. Our first option is to define lenses manually using the lens convenience function from Control.Lens. For example, we can define a score lens to replace the _score field accessor:
import Control.Lens

score :: Lens' Game Int
score = lens _score (\game v -> game { _score = v })
A Lens is like a map which you use to navigate complex data types. We use the above score lens to navigate from our Game type to its _score field.

The type reflects where we begin and end: Lens' Game Int means we must begin on a value of type Game and end on a value of type Int (the score, in this case). Similarly, our other lenses will clearly indicate their starting and ending points in their types:
units :: Lens' Game [Unit]
units = lens _units (\game v -> game { _units = v })

boss :: Lens' Game Unit
boss = lens _boss (\game v -> game { _boss = v })

health :: Lens' Unit Int
health = lens _health (\unit v -> unit { _health = v })

position :: Lens' Unit Point
position = lens _position (\unit v -> unit { _position = v })

x :: Lens' Point Double
x = lens _x (\point v -> point { _x = v })

y :: Lens' Point Double
y = lens _y (\point v -> point { _y = v })
However, we don't have to write out all this boilerplate if we're lazy. Our second option is to use Template Haskell to define all these lenses for us:
{-# LANGUAGE TemplateHaskell #-}

import Control.Lens

data Game = Game
    { _score :: Int
    , _units :: [Unit]
    , _boss  :: Unit
    } deriving (Show)

data Unit = Unit
    { _health   :: Int
    , _position :: Point
    } deriving (Show)

data Point = Point
    { _x :: Double
    , _y :: Double
    } deriving (Show)

makeLenses ''Game
makeLenses ''Unit
makeLenses ''Point
Just remember that Template Haskell requires these makeLenses declarations to go after your data types.


Initial State


The next thing we need is a test initial game state:
initialState :: Game
initialState = Game
    { _score = 0
    , _units =
        [ Unit
            { _health = 10
            , _position = Point { _x = 3.5, _y = 7.0 }
            }
        , Unit
            { _health = 15
            , _position = Point { _x = 1.0, _y = 1.0 }
            }
        , Unit
            { _health = 8
            , _position = Point { _x = 0.0, _y = 2.1 }
            }
        ]
    , _boss = Unit
        { _health = 100
        , _position = Point { _x = 0.0, _y = 0.0 }
        }
    }
We've enlisted three valiant heroes to slay the dungeon boss. Let the battle begin!


First Steps


Now we can use our lenses! Let's create a routine for our warriors to strike at the boss:
import Control.Monad.Trans.Class
import Control.Monad.Trans.State

strike :: StateT Game IO ()
strike = do
    lift $ putStrLn "*shink*"
    boss.health -= 10
strike prints an evocative sound to the console, then decrements the boss's health by 10 hit points.

strike's type indicates that it operates within the StateT Game IO monad. You can think of this as a DSL where we layer our pure game state (i.e. StateT Game) on top of side effects (i.e. IO) so that we can both mutate our game and also print cute battle effects to the console. All you have to remember is that any time we need side effects, we will use lift to invoke them.

We'll test out strike in ghci. In order to run strike, we must supply it with an initialState:
>>> execStateT strike initialState 
*shink*
Game {_score = 0, _units = [Unit {_health = 10, _position = Poin
t {_x = 3.5, _y = 7.0}},Unit {_health = 15, _position = Point {_
x = 1.0, _y = 1.0}},Unit {_health = 8, _position = Point {_x = 0
.0, _y = 2.1}}], _boss = Unit {_health = 90, _position = Point {
_x = 0.0, _y = 0.0}}}
execStateT takes our stateful code and an initial state, and then runs that code to produce a new state. ghci automatically shows the return value as a convenience so we can inspect the newly returned state. The output is a bit of a mess, but if you strain your eyes you can see that the boss now only has 90 health.

We can view this more easily by storing the new state in a variable:
>>> newState <- execStateT strike initialState 
*shink*
... and then we can query newState for the part we actually care about:
>>> newState^.boss.health
90

Composition


This syntax very strongly resembles imperative and object-oriented programming:
boss.health -= 10
What is going on here? Haskell is decidely not a multi-paradigm language, yet we have what appears to be multi-paradigm code.

Amazingly, nothing on that line is a built-in language feature!
  • boss and health are just the lenses we defined above
  • (-=) is an infix function
  • (.) is function composition from the Haskell Prelude!
Wait, (.) is function composition? Really?

This is where the lens magic comes in. Lenses are actually ordinary functions, and our "multi-paradigm" code is actually functions all the way down!

In fact, Lens' a b is actually a type synonym for a certain type of higher-order function:
type Lens' a b =
    forall f . (Functor f) => (b -> f b) -> (a -> f a)
You don't need to understand the details of that. Just remember that Lens' a b is a higher-order function that accepts a function of type (b -> f b) as an argument, and returns a new function of type (a -> f a). The Functor part is the theoretically-inspired "magic".

Armed with that knowledge, let's make sure the types check out by expanding out the Lens' type synonyms for boss and health
boss :: Lens' Game Unit
-- expands to:
boss :: (Functor f) => (Unit -> f Unit) -> (Game -> f Game)

health :: Lens' Unit Int
-- expands to:
health :: (Functor f) => (Int -> f Int) -> (Unit -> f Unit)
Now let's review the definition of function composition:
(.) :: (b -> c) -> (a -> b) -> (a -> c)
(f . g) x = f (g x)
Notice that if we specialize our type variables to:
a ~ (Int  -> f Int)
b ~ (Unit -> f Unit)
c ~ (Game -> f Game)
... then this has exactly the right type to compose our two lenses:
(.) :: ((Unit -> f Unit) -> (Game -> f Game))
    -> ((Int  -> f Int ) -> (Unit -> f Unit))
    -> ((Int  -> f Int ) -> (Game -> f Game))
If we put the Lens' type synonyms back in, we get:
(.) :: Lens' Game Unit -> Lens' Unit Int -> Lens' Game Int

boss . health :: Lens' Game Int
So function composition is also lens composition! In fact, lenses form a category where (.) is the category's composition operator and the identity function id is also the identity lens:
(.) :: Lens' x y -> Lens' y z -> Lens' x z

id  :: Lens' x x
What's so beautiful about this is that Haskell lets us remove the spaces around the function composition operator so that it looks exactly like object-oriented accessor notation!

Categories make it really easy to connect and group components on the fly. For example, if I anticipate that I will be modifying the Boss's health frequently, I can just define a composite lens:
bossHP :: Lens' Game Int
bossHP = boss.health
... and now I can use it wherever I previously used boss.health:
strike :: StateT Game IO ()
strike = do
    lift $ putStrLn "*shink*"
    bossHP -= 10
... or similarly use it as an accessor:
>>> newState^.bossHP
90

Traversals


Lenses are grounded in some really elegant theory, and as a result they get a lot of things right that imperative languages normally don't!

For example, let's say that our boss is a dragon and breathes fire, which damages all heroes. Using lenses, I can decrement the entire party's health using a single instruction:
fireBreath :: StateT Game IO ()
fireBreath = do
    lift $ putStrLn "*rawr*"
    units.traversed.health -= 3
This makes use of a new lens!
traversed :: Traversal' [a] a
traversed lets us "dig in" to the values in a list so that we can manipulate them as a single unit instead of manually looping over the list. However, this time the type is a Traversal' instead of a Lens'.

A Traversal is a like a Lens' except weaker:
type Traversal' a b =
    forall f . (Applicative f) => (b -> f b) -> (a -> f a)
If you compose Lens' with a Traversal', you get the weaker of the two: a Traversal'. This works no matter which order you compose them in:
(.) :: Lens' a b -> Traversal' b c -> Traversal' a c

(.) :: Traversal' a b -> Lens' b c -> Traversal' a c
units                  :: Lens'      Game [Unit]
units.traversed        :: Traversal' Game  Unit
units.traversed.health :: Traversal' Game  Int
In fact, we don't need to figure this out. The compiler will infer the correct type all by itself:
>>> :t units.traversed.health
units.traversed.health
  :: Applicative f =>
     (Int -> f Int) -> Game -> f Game
That's exactly the right type to be a Traversal' Game Int!

Actually, why not just compose these lenses into a single lens:
partyHP :: Traversal' Game Int
partyHP = units.traversed.health

fireBreath :: StateT Game IO ()
fireBreath = do
    lift $ putStrLn "*rawr*"
    partyHP -= 3
Let's also use partyHP lens to retrieve the new party hitpoints:
>>> newState <- execStateT fireBreath initialState 
*rawr*
>>> newState^.partyHP

<interactive>:3:11:
    No instance for (Data.Monoid.Monoid Int)
      arising from a use of `partyHP'
    Possible fix:
      add an instance declaration for (Data.Monoid.Monoid Int)
    In the second argument of `(^.)', namely `partyHP'
    In the expression: newState ^. partyHP
    In an equation for `it': it = newState ^. partyHP
Oops! This is a type error because there is no single health to get! This is why a Traversal' is weaker than a Lens': traversals may point to multiple values, so they do not support a well-defined way to get just one value. The type system saved us from a potential bug!

Instead, we must specify that we actually want a list of values using the toListOf function:
toListOf :: Traversal' a b -> a -> [b]
This gives the desired result:
>>> toListOf partyHP newState 
[7,12,5]
... and there's an infix operator equivalent to toListOf: (^..):
>>> initialState^..partyHP
[10,15,8]
>>> newState^..partyHP
[7,12,5]
Now we can clearly see at a glance that fireBreath worked the way we intended.

Now I want to get really fancy. I want to define a traversal over a geographic area. Can I do that?
around :: Point -> Double -> Traversal' Unit Unit
around center radius = filtered (\unit ->
    (unit^.position.x - center^.x)^2
  + (unit^.position.y - center^.y)^2
  < radius^2 )
Sure I can! Now I can limit the dragon's fire breath to a circular area!

Edit: filtered is apparently not a theoretically valid traversal because it does not preserve the number of elements. See this /r/haskell thread for details.
fireBreath :: Point -> StateT Game IO ()
fireBreath target = do
    lift $ putStrLn "*rawr*"
    units.traversed.(around target 1.0).health -= 3
Notice how expressive that code is: we want to decrement the health of all units around the target. That code conveys our intention much more clearly than the equivalent mainstream imperative code and it leaves much less room for error.

Anyway, back to breathing fire. First, let's see where the units are located:
> initialState^..units.traversed.position
[Point {_x = 3.5, _y = 7.0},Point {_x = 1.0, _y = 1.0},Point {_x
 = 0.0, _y = 2.1}]
Hmmm, the latter two units are close by, so I will aim the fireball in between them:
>>> newState <- execStateT (fireBreath (Point 0.5 1.5)) initialState 
*rawr*
>>> (initialState^..partyHP, newState^..partyHP)
([10,15,8],[10,12,5])
Nailed it!


Zooming


We can do more unique things with lenses, like zoom in on subsets of our global state:
retreat :: StateT Game IO ()
retreat = do
    lift $ putStrLn "Retreat!"
    zoom (units.traversed.position) $ do
        x += 10
        y += 10
As before, we can combine these lenses into a single lens if we want to reuse it later on:
partyLoc :: Traversal' Game Point
partyLoc = units.traversed.position

retreat :: StateT Game IO ()
retreat = do
    lift $ putStrLn "Retreat!"
    zoom partyLoc $ do
        x += 10
        y += 10
Let's try it out:
>>> initialState^..partyLoc
[Point {_x = 3.5, _y = 7.0},Point {_x = 1.0, _y = 1.0},Point {_x
 = 0.0, _y = 2.1}]
>>> newState <- execStateT retreat initialState 
Retreat!
>>> newState^..partyLoc
[Point {_x = 13.5, _y = 17.0},Point {_x = 11.0, _y = 11.0},Point
 {_x = 10.0, _y = 12.1}]
Let's look at the type of zoom in the context of this particular example:
zoom :: Traversal a b -> StateT b IO r -> StateT a IO r
zoom has some nice theoretical properties. For example, we'd expect that if we zoom using two successive lenses, it should behave the same as zooming using the composite lens:
zoom lens1 . zoom lens2 = zoom (lens1 . lens2)
... and if we zoom in on the empty lens, we end up back where we started:
zoom id = id
In other words, zoom defines a functor, and those equations are the functor laws!


Combining commands


So far I've only shown a single command at a time, but now let's take all of these concepts and imperatively assemble a battle from them:
battle :: StateT Game IO ()
battle = do
    -- Charge!
    forM_ ["Take that!", "and that!", "and that!"] $ \taunt -> do
        lift $ putStrLn taunt
        strike

    -- The dragon awakes!
    fireBreath (Point 0.5 1.5)
    
    replicateM_ 3 $ do
        -- The better part of valor
        retreat

        -- Boss chases them
        zoom (boss.position) $ do
            x += 10
            y += 10
Let's try it out!
>>> execStateT battle initialState 
Take that!
*shink*
and that!
*shink*
and that!
*shink*
*rawr*
Retreat!
Retreat!
Retreat!
Game {_score = 0, _units = [Unit {_health = 10, _position = Poin
t {_x = 33.5, _y = 37.0}},Unit {_health = 12, _position = Point 
{_x = 31.0, _y = 31.0}},Unit {_health = 5, _position = Point {_x
 = 30.0, _y = 32.1}}], _boss = Unit {_health = 70, _position = P
oint {_x = 30.0, _y = 30.0}}}
I guess people really aren't joking when they say Haskell is the finest imperative language.


Conclusions


This really just scratches the surface of the lens library, which is one of the crown jewels of the Haskell ecosystem. You can use lenses for pure programming, too, and compress very powerful and complex computations into very readable and elegant code. When I have more time I will write even more about this amazing library.