Haskell for all: pipes-3.2: ListT, Codensity, ArrowChoice, and performance

pipes-3.2 is out and it boasts several cool new features. The important highlights are:

A correct-by-construction ListT implementation that converts to and from proxies
The CodensityP proxy transformer, which improves the time complexity of left-associated binds
ArrowChoice operations for selectively applying pipes to subsets of an input stream
"Pointful" operators
Many performance improvements

This post is mainly a changelog, so if you are completely new to pipes, then I recommend you begin from the pipes tutorial, which is probably the longest tutorial on Hackage at this point.

ListT

Many people know that ListT in transformers is broken, thanks to the wonderful ListT done right article. Fewer people know that there is a correct-by-construction implementation of ListT up on Hackage in the List package. However, there are a couple of problems with the List package:

It's difficult to build custom ListT actions.
It's difficult to read out the result of List.

The generator package tries to solve the first problem, and remarkably resembles a Producer from pipes. It provides a monad equipped with a yield command and you use generate to compile it to a ListT action. However, neither List nor generator solve the second problem.

If you thought to yourself "Oh, no.... Gabriel added a List dependency to pipes", you'd be wrong! In fact, pipes has had ListT support since version 2.4 and I didn't even realize it until I was working on a perfect backtracking parser for the upcoming pipes-parse library.

The key lies in the Interact class that I introduced in pipes-2.4, which I've renamed to the ListT class in this release. People familiar with pipes know that this mysterious class provided two extra operators: (/>/) and (\>\), and that these operators happened to form two extra categories, with respond and request as their respective identities:

respond />/ f = f

f />/ respond = f

(f />/ g) />/ h = f />/ (g />/ h)


request \>\ f = f

f \>\ request = f

(f \>\ g) \>\ h = f \>\ (g \>\ h)

However, at the time I discovered these two additional categories I dismissed them as less interesting than the proxy composition categories, even mentioning in the tutorial that "they are more exotic and you probably never need to use them". Little did I know how wrong I was!

I later discovered that these two categories were both ListT Kleisli categories. The "respond" category (i.e. respond and (/>/))) is actually a monad over the downstream output of proxies, or in other words:

respond corresponds to return
(/>/) corresponds to (>=>)

I call this monad RespondT and Control.Proxy.ListT exports the following newtype which lets you convert between the Proxy monad and the RespondT monad:

newtype RespondT p a' a b' m b
    = RespondT { runRespondT :: p a' a b' b m b' }

instance (Monad m, ListT p) => Monad (RespondT p a' a b' m) where
    return a = RespondT (respond a)
    m >>= f  = RespondT (
        runRespondT m //> \a ->
        runRespondT (f a))
    -- (//>) is the "pointful" version of (/>/),
    -- just like (>>=) is the "pointful" version of (>=>)

Using RespondT, you can bind a proxy's output as if you were binding a list within the list monad:

import Control.Proxy

twoNumbers :: (Proxy p) => () -> Producer p Int IO ()
twoNumbers =
     readLnS
 >-> execU (putStrLn "Enter a number: ")
 >-> takeB_ 2

stringsTillQuit :: (Proxy p) => () -> Producer p String IO ()
stringsTillQuit =
     stdinS
 >-> execU (putStrLn "Enter a string: ")
 >-> takeWhileD (/= "quit")

-- 'ProduceT' is a convenient type synonym around 'RespondT'
exampleListT :: (ListT p) => () -> ProduceT p IO (Int, String)
exampleListT () = do
    n   <- RespondT $ twoNumbers ()
    str <- RespondT $ stringsTillQuit ()
    return (n, str)

You can then compile RespondT back to a proxy just by unwrapping the newtype, generating a proxy that produces one output per permutation of bound values:

exampleProxy :: (List p) => () -> Producer p (Int, String) IO ()
exampleProxy = runRespondK exampleListT

More often, you would just combine these two steps into one:

exampleProxy :: (List p) => () -> Producer p (Int, String) IO ()
exampleProxy () = runRespondT $ do
    n   <- RespondT $ twoNumbers ()
    str <- RespondT $ stringsTillQuit ()
    return (n, str)

>>> runProxy $ exampleProxy >-> printD
Enter a number: 
1<Enter>
Enter a string: 
Hello<Enter>
(1,"Hello")
Enter a string: 
world<Enter>
(1,"world")
Enter a string: 
quit<Enter>
Enter a number: 
2<Enter>
Enter a string: 
Testing<Enter>
(2,"Testing")
Enter a string: 
123<Enter>
(2,"123")
Enter a string: 
quit<Enter>

Notice how reading out the ListT is trivial. You just use the pipes machinery you know and love to read out the resulting lazy stream of values.

As you might have guessed, there is a symmetric ListT monad over upstream outputs, too, which I've named RequestT. RespondT and RequestT are correct by construction, meaning that they always satisfy the monad and monad transformer laws.

However, RespondT and RequestT are much more powerful than meets the eye. For example, you need not limit yourself to Producers when you use RespondT, as the following example demonstrates:

pipeT :: (ListT p) => () -> Pipe p Int (Int, Int) IO ()
pipeT () = runRespondT $ do
    x <- RespondT $ takeB_ 2 ()
    y <- RespondT $ takeB_ 3 ()
    return (x, y)

You can non-deterministically select outputs from any Proxy type and RespondT just magically does the right thing:

>>> runProxy $ enumFromS 1 >-> pipeT >-> printD
(1,2)
(1,3)
(1,4)
(5,6)
(5,7)
(5,8)

Similarly, you need not restrict yourself to unidirectional pipes. RespondT and RequestT won't bat an eyelash if you try to use them for bidirectional general-purpose proxies. pipes goes above and beyond a traditional ListT implementation.

The proxy prelude provides convenience functions for common operations, such as ranges or iterating over lists:

pairs :: (ListT p) => () -> Producer p (Int, Int) IO ()
pairs () = runRespondT $ do
    x <- rangeS 1 2
    lift $ putStrLn $ "x = " ++ show x
    y <- eachS [4, 6, 8]
    lift $ putStrLn $ "y = " ++ show y
    return (x, y)

eachS is named after Ruby's each function and rangeS is named after Python's range function. You can bind each one within RespondT to non-deterministically select from a list or range, respectively.

>>> runProxy $ pairs >-> printD
x = 1
y = 4
(1,4)
y = 6
(1,6)
y = 8
(1,8)
x = 2
y = 4
(2,4)
y = 6
(2,6)
y = 8
(2,8)

It also wouldn't be a "ListT done right" unless it got the examples from that article correct, too:

myTest :: (ListT p) => Int -> () -> ProduceT p IO (Int, Int)
myTest n () = do
    let squares = eachS $ takeWhile (<= n) $ map (^2) [0..]
    x <- squares
    y <- squares
    lift $ print (x, y)
    guard $ x + y == n
    lift $ putStrLn "Sum of squares."
    return (x, y)

However, that example had a much more difficult time reading out just the first result. We can do so quite easily just by using the headD_ fold, which only drives the RespondT block long enough to retrieve the first result:

>>> let p = raiseK (runRespondK (myTest 5)) >-> headD_
>>> execWriterT $ runProxy p
(0,0)
(0,1)
(0,4)
(1,0)
(1,1)
(1,4)
Sum of squares.
First {getFirst = Just (1,4)}

If you want to learn more, you can read the newly added ListT section of the tutorial, which provides even more code examples.

So now proxies now possess two symmetric ListT implementations you can add to your toolbox, and they improve on the state of the art by reusing the elegant pipes machinery for both building ListT actions and reading out their values.

Codensity Proxy Transformer

In addition to conduit, I must also contend with Edward's machines library (currently on Github). Until recently, machines possessed one notable advantage over pipes: it used a codensity transformation of its internal free monad to avoid a quadratic blowup from a large series of left-associated binds. Normally these do not arise commonly in practice, but it is still a nice feature to have.

Now pipes has assimilated this feature, too. You can improve the time complexity of any pipe just by wrapping the pipe in runCodensityP or runCodensityK, both of which behave like the improve function from the free package. These automatically fix any quadratic time complexity of left-associated binds.

replicateM is a great example of a function that generates lots of left-associated binds. If I try to use it within a pipe, I will get a quadratic blowup:

import Control.Monad
import Control.Proxy

leftAssociate () = replicateM 10000 (request ())

main = do
    xs <- runProxy $ enumFromS (1 :: Int) >-> leftAssociate
    print xs

$ time ./main >/dev/null

real    0m3.773s
user    0m3.716s
sys     0m0.052s

... but if you wrap the pipeline in runCodensityK, it switches to linear time complexity:

import Control.Monad
import Control.Proxy
import Control.Proxy.Trans.Codensity

leftAssociate () = replicateM 10000 (request ())

main = do
    xs <- runProxy $ runCodensityK $
        enumFromS (1 :: Int) >-> leftAssociate
    print xs

$ time ./main >/dev/null

real    0m0.031s
user    0m0.024s
sys     0m0.000s

Even better, you can wrap just the pathological pipe in runCodensityK:

main = do
    xs <- runProxy $
        enumFromS (1 :: Int) >-> runCodensityK leftAssociate
    print xs

... which gives a minor performance improvement (more easily detectable for larger numbers of requests):

time ./main >/dev/null

real    0m0.027s
user    0m0.024s
sys     0m0.000s

My own performance measurements show that while the codensity transformation does improve time complexities for left-associated binds, it yields worse constant factors (about 6-fold slower for entirely pure code, but much less for IO-bound code), which is why I do not enable it on by default. The main reason the naive free monad in the ProxyFast implementation outperforms the codensity version is that it can use rewrite RULES to rewrite your code into the optimal tuned form, but the codensity-transformed version cannot.

ArrowChoice

Ever since I first released pipes, I've received numerous questions about whether or not pipes can be made an instance of Arrow. While you can't make Arrow work, you CAN make proxies implement ArrowChoice, although I don't provide an actual instance because there are two such instances (one for downstream and one for upstream) and the requisite newtypes would be very cumbersome.

You can find these combinators in the ArrowChoice section of the proxy prelude, which provides left{D/U} and right{D/U}. Using these combinators, you can selectively apply pipes to a subset of a stream:

stream
    :: (Monad m, Proxy p)
    => () -> Producer p (Either Int Char) m ()
stream = fromListS
    [Left 3, Right 'C', Right 'D', Left 4, Right 'E', Left 5]

>>> let p = stream >-> leftD (takeB_ 2 >-> mapD show) >-> printD
>>> runProxy p
Left "3"
Right 'C'
Right 'D'
Left "4"

This lets you dynamically switch behavior in response to stream values. For example, one person recently asked me how you would switch content handling in the middle of a pipeline (for example, after negotiating encryption and compression). One option I proposed is that they could use Either to distinguish between values before and after negotiation and then filter them differently using the ArrowChoice combinators:

negotiation () = do
     (before >-> mapD Left ) ()
     (after  >-> mapD Right) ()

main = runProxy $
     source
 >-> negotiation
 >-> leftD idT >-> rightD (decompress >-> decrypt)
 >-> sink

However, that's purely a theoretical idea I threw out there. I haven't actually tried this solution in practice, yet.

Point-ful operators

All type classes now use the "point-ful" equivalents of their original point-free composition operators, and the point-free operators are now derived from the point-ful ones. There are two significant advantages of this:

Types. Some previous perfectly safe code required the following hack: ((\_ -> p) >-> k) undefined. These point-ful operators now naturally lead to the correct and more general types of their corresponding point-free composition operators.
Performance. I actually used these operators internally to get pipes performance so high. Exposing them directly in the type class removes a lot of code indirection and improves the performance of the base proxy implementations (typically by about 10-15%) and the proxy transformers because there is less indirection.

Also, some people have told me that they found these point-ful operators to be more intuitive to work with, although I still personally prefer the point-free operators.

Performance

Other performance improvements include better rewrite RULES. I found several cases where the original rules were not firing, which is why the proxy prelude still depended on the manual worker/wrapper code to ensure that they fired. I recently found a more general set of rewrite RULES that work even more reliably and the proxy prelude now seems to optimize correctly even when written using the most naive code. However, I haven't committed the newer naive versions yet, as a precaution.

Stability

Many people probably want to know when I still stabilize the pipes package so that it can eventually go in the Haskell platform. The answer is that I won't stabilize it officially until I complete three upcoming libraries:

pipes-parse, which provides a native parsing extension and a standard set of end-of-input machinery for pipes
pipes-free, which will expose the underlying free monad and also provide an iostreams-like interface to pipes
pipes-stm, which will be the basis of an FRP system built on top of pipes

These are the three libraries that I expect to considerably exercise the API of the main library and uncover any significant omissions in its design. For example, just working on pipes-parse alone gave rise to the ListT machinery and the ArrowChoice combinators. However, the core API has so far proven to be particularly solid and future-proof, thanks to the enormously useful proxy transformer system. Proxy transformers let me continue to release new features (like CodensityP and the upcoming ParseP) without impacting any existing code. The only major backwards-incompatible change in this last release was just renaming the Interact class to ListT for clarity, since I figured this would be my last chance to rename it now that people will probably use it much more heavily now.

Open Design Issues

There remains one major performance bottleneck in the fast proxy implementation: hoist. I can dramatically speed it up by removing an normalization step, but doing so means that you can only safely supply hoist with monad morphisms (such as lift), otherwise you will violate the monad transformer laws. In theory, you should really only use monad morphisms for hoist anyway, but in practice people violate this all the time and try to do "weird" things like hoist (runStateT 0), which is very insensible, and I can't easily use the types to forbid that kind of thing. So I would like feedback on whether people prefer speed or safety for the ProxyFast implementation. Alternatively, I could release a third base Proxy instance not selected by default that is identical to ProxyFast in all respects except for providing the faster hoist.

Another question I have is whether or not to merge the proxy prelude into a single module. I welcome any feedback on that. Some people comment that the the module hierarchy is a little-bit too fine-grained and that's one opportunity to condense four modules into one.

Upcoming libraries

The next library on the docket is pipes-parse, followed shortly by pipes-bytestring. I know that many people need the pipes-bytestring library to progress further with pipes, but it depends on pipes-parse, which establishes some important higher-level idioms for the pipes ecosystem in general. My rate of progress has slowed recently mainly because I'm wrapping up my PhD, but I expect that I can finish both within about two months even at my current rate.

Haskell for all

Thursday, March 21, 2013

pipes-3.2: ListT, Codensity, ArrowChoice, and performance