Wednesday, October 31, 2012

pipes-2.5: Faster and slimmer

Introduction


I optimized the entire pipes library very aggressively for version 2.5, and now the library runs faster than conduit on my micro-benchmarks. I'll begin with the purest benchmark, which shows the greatest difference in speed since it only measures the efficiency of each library's implementation without any IO bottlenecks:
import Control.Proxy
import Data.Conduit
import Data.Conduit.List as L

n = 1000000 :: Int

-- Pipes
main = runProxy $ discard <-< enumFromToS 1 n

-- Conduit
main = L.enumFromTo 1 n $$ L.sinkNull
Note some differences from last time. This time I'm using conduit's built-in optimized discard equivalent: sinkNull. Also, I've multiplied n by 10 to more accurately measure throughput. I compile both implementations with -O2.

pipes now spends about 112 ns per round-trip:
real    0m0.112s
user    0m0.104s
sys     0m0.004s
... while conduit spends about 167 ns per round-trip:
real    0m0.167s
user    0m0.156s
sys     0m0.008s
I achieved this speed increase by reverting to the pipes-1.0 trick of making the base monad optional, at the expense of breaking the monad transformer laws. I spent a considerable amount of effort trying to get the correct version to work, but I was led inexorably to the same conclusion that Michael already reached, which was that the original approach was best and that the gain in performance is worth bending the monad transformer laws.
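To give a rough sense of that trick, here is a hedged sketch (not the actual pipes-2.5 internals): give the proxy type a dedicated Pure constructor, so that pure steps never pay for a bind in the base monad:
import Control.Monad (liftM)

-- A sketch of the "optional base monad" idea, not the real
-- pipes-2.5 definition: pure steps use the Pure constructor
-- directly, so they never touch the base monad 'm'.
data Step a' a b' b m r
    = Request a' (a  -> Step a' a b' b m r)
    | Respond b  (b' -> Step a' a b' b m r)
    | M (m (Step a' a b' b m r))  -- the base monad is opt-in
    | Pure r

instance (Monad m) => Monad (Step a' a b' b m) where
    return = Pure
    Request a' fa  >>= f = Request a' (\a  -> fa  a  >>= f)
    Respond b  fb' >>= f = Respond b  (\b' -> fb' b' >>= f)
    M m            >>= f = M (liftM (>>= f) m)
    Pure r         >>= f = f r

-- The law-bending: lift (return x) = M (return (Pure x)), which is
-- morally, but not structurally, equal to return x = Pure x
lift' :: (Monad m) => m r -> Step a' a b' b m r
lift' m = M (liftM Pure m)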

Note that the above benchmark exaggerates the differences and is not indicative of real-world performance. For typical code you will not observe measurable differences between pipes and conduit when IO is the bottleneck.

There is also one area in which conduit may still give (very slightly) better performance, which is in speeding up user-defined pipes. One goal I did not complete for this release was copying Michael's trick of using rewrite RULES to inline the Monad instance for user-defined pipes. I plan to copy this same trick in a separate release because I want to take the time to ensure that I can get the rewrite RULES to always fire without interfering with other optimizations.


Light-weight


The big focus of this release was to make pipes a very light-weight dependency, both in terms of performance and transitive dependencies. In the rewrite I dropped the free dependency so now the package only has two non-base dependencies:
  • transformers >= 0.2.0.0
  • index-core
... and I plan on dropping index-core along with Frames once I complete my resource management solution, leaving just a transformers dependency, which is about as light-weight as it could possibly get.

I also stole a page from Michael's book by removing the -O2 flag from the pipes.cabal file. This flag no longer has any effect on performance after the rewrite, so you should see quicker compile times, making the pipes dependency even lighter.

There is still one other way I could make the pipes dependency even lighter, which is to remove the MFunctor class. The Control.MFunctor module requires Rank2Types, which might rule out pipes for projects that use non-GHC compilers, so if this is an issue for you, just let me know and I will try to migrate MFunctor to a separate library. Frames use a lot of extensions, but those will be on the way out, leaving behind just FlexibleContexts and KindSignatures, which are very mild extensions.
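For reference, the class looks roughly like this (a sketch, so the details may not match the released module exactly):
{-# LANGUAGE Rank2Types #-}

-- A sketch of the class in question: 'hoist' applies a monad
-- morphism to the base monad of a transformer. The 'forall a.'
-- in the argument is exactly what requires Rank2Types.
class MFunctor t where
    hoist :: (Monad m) => (forall a. m a -> n a) -> t m b -> t n b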


Resource management


I also wanted to use this update to point out that you can get deterministic resource management with pipes today if you use Michael's ResourceT in the base monad. So if you want to use pipes and all you care about is resource determinism then you can switch over already.

However, that alone will NOT give you prompt finalization and if you want promptness you will have to wait until I complete my own resource management extension. The extension I have in mind will be released as a proxy transformer that you can layer in any proxy transformer stack, so any proxy code you currently write can be transparently upgraded to work with resource management later on when I release the extension.

Another thing I want to mention is that while I will release the tools to manage resources promptly and deterministically, I do not plan on using these tools in the proxy standard libraries that I will release, for two main reasons:
  • There is no one true solution to finalization and I don't want people to have to buy in to my finalization approach to use the standard libraries I provide.
  • Most people I've talked to who care about finalization usually take the initiative to write their own higher-level abstractions on top of whatever finalization primitives I provide them.
So my plan is that the standard libraries I write will focus purely on the streaming part of the problem and leave initialization/finalization to the end user, who can optionally implement it using my resource management solution or whatever other approach they prefer (such as ResourceT, for example).

If you want to know what I personally use in my own projects at the moment, I just use the following pattern:
withResource $ \h ->
    runProxy $ ... <-< streamFromResource h
This gives "good enough" behavior for my purposes and out of all the finalization alternatives I've tried, it is by far the easiest one to understand and use.

The other reason I'm trying out this agnostic approach to finalization is due to discussion with Gregory Collins about his upcoming io-streams library, where he takes a very similar approach to the one I just described of leaving initialization/finalization to the end user to avoid cross-talk between abstractions and to emphasize handling the streaming aspect correctly.


Goals for the near future


I focused on improving the performance of pure code because I plan to release bytestring/text standard libraries and their corresponding parsing proxy transformers very soon, which demand exceptional pure performance. The goal of the upcoming proxy-based parsing libraries is not to beat attoparsec in speed (which I'm reasonably sure is impossible), but rather to:
  • Interleave parsing with effects
  • Provide a low-memory streaming parser by allowing the user to selectively control backtracking
  • Still be really fast
The first two features are sorely missing from attoparsec, which can't interleave effects and always backtracks, so the input isn't freed from memory until parsing completes. For my own projects I need the second feature the most because I get a lot of requests to parse huge files (e.g. 20 GB) that do not fit in my computer's memory. More generally, I want pipes to be the fallback parser of choice for all problems that attoparsec does not solve.

Saturday, October 20, 2012

"Hello, core!"

Haskell optimization is very opaque and getting tight inner loops is something of a black art for people who don't take the time to learn GHC's core output. If you like fine-grained optimization, this post will walk you through a very simple example to help you learn to read core.

I'm planning to write some posts in the future discussing optimizing Haskell code and I would like to use this post as a foundation for those latter discussions.


ghc-core


When you practice reading core, stick to really simple programs. In fact, let's study the core generated by the simplest possible Haskell program:
main :: IO ()
main = return ()
The ghc-core tool produces a (slightly) readable core output. To install it, just type:
cabal install ghc-core
... and cabal should install it to ~/.cabal/bin/ghc-core.

We generate core output by running:
~/.cabal/bin/ghc-core --no-cast --no-asm program.hs
The --no-cast flag improves the readability by removing type casts and the --no-asm flag says not to include the final assembly code.

The above command automatically enables all optimizations and outputs to a pager the following two sections:
  • A list of optimizations that fired
  • The core section, which contains the intermediate core representation
The optimization section should look something like this:
0 Lets floated to top level; 0 Lets floated elsewhere; from ...

0 Lets floated to top level; 0 Lets floated elsewhere; from ...

Total ticks:     15

2 PreInlineUnconditionally
  1 x_abD
  1 s_abE
5 UnfoldingDone
  1 returnIO
  1 runMainIO
  1 main
  1 returnIO1
  1 $fMonadIO_$creturn
1 RuleFired 1 Class op return
2 LetFloatFromLet 2
2 EtaExpansion
  1 :main
  1 main
3 BetaReduction
  1 a_abC
  1 x_abD
  1 s_abE
10 SimplifierDone 10
... and the core should look something like this:
Result size = 20

main1
  :: State# RealWorld
     -> (# State# RealWorld, () #)
[GblId,
 Arity=1,

 Unf=Unf{Src=, TopLvl=True, Arity=1, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=ALWAYS_IF(unsat_ok=True,boring_ok=True)}]
main1 =
  \ (eta_B1 :: State# RealWorld) ->
    (# eta_B1, () #)

main :: IO ()
[GblId,
 Arity=1,

 Unf=Unf{Src=, TopLvl=True, Arity=0, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=ALWAYS_IF(unsat_ok=True,boring_ok=True)}]
main =
  main1
  

main2
  :: State# RealWorld
     -> (# State# RealWorld, () #)
[GblId,
 Arity=1,

 Unf=Unf{Src=, TopLvl=True, Arity=1, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=IF_ARGS [0] 30 0}]
main2 =
  \ (eta_X7 :: State# RealWorld) ->
    runMainIO1
      @ ()
      (main1
       )
      eta_X7

:main :: IO ()
[GblId,
 Arity=1,

 Unf=Unf{Src=, TopLvl=True, Arity=0, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=ALWAYS_IF(unsat_ok=True,boring_ok=True)}]
:main =
  main2
That still looks pretty ugly, so let's filter it a bit.


Core


You'll notice that every function has a type signature, without exception. For example, main1's type is:
main1
  :: State# RealWorld
     -> (# State# RealWorld, () #)
However, each function also has a set of attributes that ghc uses to guide various optimization passes. main1's attributes are:
[GblId,
 Arity=1,

 Unf=Unf{Src=, TopLvl=True, Arity=0, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=ALWAYS_IF(unsat_ok=True,boring_ok=True)}]
For example, Cheap=True says that it is cheap to inline main1. Similarly, ConLike=True says it is okay if a rewrite rule requires duplicating main1's code.

However, for the purposes of this post, we will completely ignore these annotations and just focus on the type signatures and code. I'll remove the attributes to leave behind something easier on the eyes:
main1
  :: State# RealWorld
     -> (# State# RealWorld, () #)
main1 =
  \ (eta_B1 :: State# RealWorld) ->
    (# eta_B1, () #)

main :: IO ()
main =
  main1

main2
  :: State# RealWorld
     -> (# State# RealWorld, () #)
main2 =
  \ (eta_X7 :: State# RealWorld) ->
    runMainIO1
      @ ()
      (main1
       )
      eta_X7

:main :: IO ()
:main =
  main2
Now we see that there are four functions total, but a lot of noise still remains.


Primitive types and values


Let's walk through this core, beginning with the type signature of main1:
main1
  :: State# RealWorld
     -> (# State# RealWorld, () #)
Somebody sprinkled #s all over our type signature. These #s denote primitive types and values exposed by the compiler that are not defined within the Haskell language. You can check out the GHC.Prim module which declares these values and types. Notice how it cheats and declares all the types empty and sets all the values to let x = x in x just so that everything type-checks.

You can learn more about these primitive types and operations by reading the section on Unboxed types and primitive operations in the GHC user guide.


IO is a state monad


GHC implements IO as a newtype wrapper around a state monad:
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
Notice that this is just the right type to wrap main1, which uses the same stateful pattern:
main1 :: State# RealWorld -> (# State# RealWorld, () #)
IO's state-based implementation ensures proper sequencing of computations: the State# RealWorld token is an empty token that exists solely to make each sequenced computation depend on the previous one, so that the compiler does not rearrange, merge, or duplicate them.
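To make the sequencing concrete, here is a self-contained sketch of return and bind for such a state-token representation (equivalent in spirit to what GHC does, not its actual code):
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Prim (State#, RealWorld)

-- IO' mirrors the IO newtype shown above
newtype IO' a = IO' (State# RealWorld -> (# State# RealWorld, a #))

returnIO' :: a -> IO' a
returnIO' a = IO' (\s -> (# s, a #))

-- Bind threads the state token: the second action receives the
-- token produced by the first, which forces the ordering
bindIO' :: IO' a -> (a -> IO' b) -> IO' b
bindIO' (IO' m) f = IO' (\s0 ->
    case m s0 of
        (# s1, a #) -> case f a of
            IO' m' -> m' s1 )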


The chain of command


Now, let's study the definition of main1:
main1 =
  \ (eta_B1 :: State# RealWorld) ->
    (# eta_B1, () #)
If we tidy this up and remove the type annotation, we get:
main1 = \s -> (s, ())
This is just return () in the State (State# RealWorld) monad. The compiler translated our original return () statement into the equivalent stateful function under the hood.

However, the next function seems to accomplish nothing:
main :: IO ()
main =
  main1
... but that's only because the --no-cast flag removed type casts. If we add them back in, we get:
main =
  main1
  `cast` (Sym (NTCo:IO <()>)
          :: (State# RealWorld
              -> (# State# RealWorld, () #))
               ~#
             IO ())
That `cast` statement corresponds to the IO constructor from our IO newtype:
IO :: (State# RealWorld -> (# State# RealWorld, a #)) -> IO a
This constructor hides the underlying stateful representation behind the opaque IO newtype. However, unlike data constructors, newtype constructors cost nothing and don't translate into functions with a run-time overhead. Instead, they translate into free compile-time cast statements which the compiler erases once compilation is complete.

This explains why main has a different type from main1:
main :: IO ()
main packages our impure code inside the IO newtype, whereas main1 contains the raw underlying stateful representation.

In fact, main corresponds exactly to the main function we wrote in our original code (which is why it has the exact same name):
-- The original program
main :: IO ()
main = return ()
The next function appears to run our code:
main2 =
  \ (eta_X7 :: State# RealWorld) ->
    runMainIO1
      @ ()
      (main1
       )
      eta_X7
With a little tidying up, this becomes:
main2 = \s -> runMainIO1 main1 s
But wait, it runs main1 and not main! I'm guessing that the main token exists solely to establish the correspondence to the user-written code, but perhaps ghc doesn't actually care about it and instead uses the raw main1 directly.

Now, to be honest, I don't actually know what runMainIO1 does, but according to Edward Yang, it initializes some things like interrupt handlers before running the program.

Finally, we reach the true entry point of our program:
:main :: IO ()
:main =
  main2 -- There is a hidden cast here
This passes our program to the Haskell runtime to be executed. That's it! We're done.

Monday, October 15, 2012

Parsing chemical substructures

This is the second in a series of coding examples from my own work. /r/programming asked for non-trivial, yet digestible, Haskell examples, so I hope this fits the bill.

In this post I will show how learning Haskell changes the way you think. I will take a common Haskell idiom (monadic parsing) and apply it in a new way to solve a structural bioinformatics problem. To make this post more accessible to non-Haskell programmers, I'm going to gloss over implementation details and instead try to discuss at a high-level how I apply Haskell idioms to solve my problem.


The Parser type


When we approach parsing problems we normally focus on the text as the central element of our algorithm and design our program around combining and manipulating text. Monadic parsing turns this approach on its head and places the parser front and center, where we instead combine and manipulate parsers.

This means we need to define a Parser type:
type Parser a = String -> [(a, String)]
A Parser a takes a starting String as its input and parses it to return a single value of type a and the remaining unconsumed String. There might be multiple valid parses, so we actually return a list of possible parsings instead of a single one. If the parse fails, we return an empty list signifying no valid parsings.

We can also use Haskell's newtype feature to encapsulate the implementation and hide it from the user:
newtype Parser a
  = Parser { runParser :: String -> [(a, String)] }
This defines the Parser constructor which we use to wrap our parsing functions:
Parser :: (String -> [(a, String)]) -> Parser a
... and the runParser function which unwraps the Parser to retrieve the underlying function:
runParser :: Parser a -> (String -> [(a, String)])
Additionally, this encapsulation enables some "magic" later on.


Parsers


Haskell has a Bool data type, defined as:
data Bool = False | True
... so let's define a Parser that parses a True value:
import Data.List

parseTrue :: Parser Bool
parseTrue = Parser (\str ->
    if (isPrefixOf "True" str)
    then [(True, drop 4 str)] -- Parse succeeds: 1 result
    else []                   -- Parse fails   : 0 results
    )
Similarly, we can define a Parser for False:
parseFalse :: Parser Bool
parseFalse = Parser (\str ->
    if (isPrefixOf "False" str)
    then [(False, drop 5 str)] -- Parse succeeds: 1 result
    else []                    -- Parse fails   : 0 results
    )
Let's test out our parsers using ghci:
>>> runParser parseTrue "True Story"
[(True, " Story")]
>>> runParser parseFalse "True Story" -- Fails: not False
[]
>>> runParser parseFalse "Falsehood"
[(False, "hood")]
>>> runParser parseFalse " Falsehood" -- Fails: leading space
[]
What if we want to also skip initial spaces? We can define a parser that always succeeds and returns the string trimmed of all leading spaces:
skipSpaces :: Parser ()
skipSpaces = Parser (\str ->
    [((), dropWhile (== ' ') str)] -- Always succeeds: 1 result
    )
Let's confirm it works:
>>> runParser skipSpaces "          Falsehood"
[((), "Falsehood")]
>>> runParser skipSpaces "Apple"
[((), "Apple")]

Monads


Now we want to combine our Parsers so we can parse both a True and a False with optional spaces in between. This means we need some elegant way to take the unconsumed input from each Parser and feed it directly into the next Parser in the chain.

Fortunately, Haskell solves this problem cleanly using monads. A Monad defines an interface to two functions:
class Monad m where
    return :: a -> m a
    (>>=)  :: m a -> (a -> m b) -> m b
Like interfaces in any other language, we can program generically to that interface. Haskell's do notation works with this generic Monad interface, so we can use the imperative do syntax to manipulate anything that implements Monad.

Our Parser implements the Monad interface quite nicely:
instance Monad Parser where
    return a = Parser (\str  -> [(a, str)])
    m >>= f  = Parser (\str1 ->
        -- This is a list comprehension, basically
        [(b, str3) | (a, str2) <- runParser m     str1,
                     (b, str3) <- runParser (f a) str2]
        )
This is not a monad tutorial, so I'm glossing over why that is the correct definition or what it even means, but if you want to learn more about monads, I highly recommend: You could have invented monads.

When we make Parser a Monad, we gain the ability to assemble Parsers using do notation, so let's use do notation to combine multiple parsers in an imperative style:
trueThenFalse :: Parser (Bool, Bool)
trueThenFalse = do
    t <- parseTrue
    skipSpaces
    f <- parseFalse
    return (t, f)
That reads just like imperative code: parse a True, skip some spaces, then parse a False. Finally, return the two values you parsed. This seems straightforward enough until you realize we haven't actually parsed any text yet! All we've done is combine our smaller parsers into a larger parser poised to be run on as many inputs as we please.
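Under the hood, that do block is nothing more than syntax for the (>>=) we just defined:
-- The same parser with the do notation desugared by hand
trueThenFalse' :: Parser (Bool, Bool)
trueThenFalse' =
    parseTrue  >>= \t ->
    skipSpaces >>= \_ ->
    parseFalse >>= \f ->
    return (t, f)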

Let's make sure it works as advertised:
>>> runParser trueThenFalse "True   False leftovers"
[((True, False), " leftovers")]
>>> runParser trueThenFalse "False   True"
[]

Alternatives


Sometimes we want to try multiple parsing alternatives. For example, what if I want to parse a True or a False? I can define a (<|>) operator that tries both parsers and then returns the union of their results:
(<|>) :: Parser a -> Parser a -> Parser a
p1 <|> p2 = Parser (\str ->
    runParser p1 str ++ runParser p2 str
    )
Now I can parse a Bool value without specifying which one and the parser will return which one it parsed:
parseBool :: Parser Bool
parseBool = parseFalse <|> parseTrue
>>> runParser parseBool "True Story"
[(True, " Story")]
>>> runParser parseBool "Falsehood"
[(False, "hood")]
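Note that (<|>) returns the union of both parsers' results, so if both alternatives succeed you get every parse back:
eitherWay :: Parser Bool
eitherWay = parseTrue <|> (skipSpaces >> return False)
>>> runParser eitherWay "True"
[(True, ""), (False, "True")]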


Parsing chemistry


Parsers have one more feature that might surprise you: There is nothing String-specific about them! With one tiny modification, we can generalize them to accept any type of input:
newtype Parser s a
  = Parser { runParser :: s -> [(a, s)] }

instance Monad (Parser s) where
    <EXACT same code as before>

(<|>) :: Parser s a -> Parser s a -> Parser s a
(<|>) = <EXACT same code as before>
Since s is "polymorphic", we can set it to any conceivable type and the above code still works. The only String-specific behavior lies within the specific parser definitions, and their new compiler-inferred types reflect that:
-- These parsers only accept Strings as input
parseFalse    :: Parser String Bool
parseTrue     :: Parser String Bool
trueThenFalse :: Parser String (Bool, Bool)
But there's no reason we can't define entirely different Parsers that accept completely different non-textual input, such as chemical structures. So I'll switch gears and define parsers for chemical Structures, where a Structure is some sort of labeled graph:
data Structure = Structure {
    graph :: Graph          , -- Adjacency list
    atoms :: Vector AtomName} -- The node labels
... with some convenience functions I've defined for manipulating the Graph:
-- Return the edges of the graph
bonds :: Graph -> [Edge]

-- Remove an edge from the graph
deleteBond :: Edge -> Graph -> Graph
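For reference, here is roughly how these could be implemented, assuming Graph is Data.Graph's adjacency-list array (a sketch, not necessarily the definitions I use):
import Data.Array ((!), (//), assocs)
import Data.Graph (Graph, Edge)

-- A sketch, assuming Data.Graph's Graph (an Array from each vertex
-- to its list of neighbors)
bonds :: Graph -> [Edge]
bonds g = [ (i, j) | (i, js) <- assocs g, j <- js ]

-- Remove the directed edge (i, j) by filtering j out of i's neighbors
deleteBond :: Edge -> Graph -> Graph
deleteBond (i, j) g = g // [(i, filter (/= j) (g ! i))]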
Now, I can define new Parsers that operate on Structures instead of Strings.


Parsing bonds


The most primitive parser I'm interested in parses a single bond. It requires two AtomNames which specify what kind of bond to look for (e.g. a carbon-carbon bond, except it can be even more specific). Then, it outputs which two indices in the graph matched that bond specification:
parseBond :: AtomName -> AtomName -> Parser Structure (Int, Int)
I can use the list monad (i.e. a list comprehension) to define this primitive parser (and don't worry if you can't precisely follow this code):
parseBond name1 name2
  = Parser $ \(Structure oldGraph atoms) -> do
    -- The first atom must match "name1"
    i1 <- toList (findIndices (== name1) atoms)

    -- Some neighboring atom must match "name2"
    i2 <- filter (\i -> atoms ! i == name2) (oldGraph ! i1)

    -- Remove our matched bond from the graph
    let newGraph = deleteBond (i1, i2) oldGraph

    -- .. and return the matched indices
    return ((i1, i2), Structure newGraph atoms)
Haskell strongly encourages a pure functional style, which keeps me from "cheating" and using side effects or mutation to do the parsing. By sticking to a pure implementation, I gain several bonus features for free:
  • If our bond occurs more than once, this correctly matches each occurrence, even if some matches share an atom
  • If both AtomNames are identical, this correctly returns both orientations for each matched bond
  • This handles backtracking with (<|>) correctly
  • I can parallelize the search easily since every search branch is pure
I got all of that for just 6 lines of code!


Parsing substructures


Now I can build more sophisticated parsers on top of this simple bond parser. For example, I can build a generic substructure parser which takes a sub-Structure to match and returns a list of matched indices:
parseSubstructure :: Structure -> Parser Structure [Int]
Again, if you don't precisely understand the code, that's okay:
parseSubstructure (Structure graph as)
    -- Use the State monad to keep track of matches
  = (`evalStateT` (V.replicate (V.length as) Nothing)) $ do

        -- foreach (i1, i2) in (bonds graph):
        forM_ (bonds graph) $ \(i1, i2) -> do

            -- Match the bond
            (i1', i2') <- lift $ parseBond (as ! i1) (as ! i2)

            -- The match must be consistent with other matches
            matches    <- get
            let consistent i1 i1' = case (matches ! i1) of
                    Nothing   -> True
                    Just iOld -> iOld == i1'
            guard (consistent i1 i1' && consistent i2 i2')

            -- Update the match list
            put (matches // [(i1, Just i1'), (i2, Just i2')])

        -- Return the final list of matches
        matchesFinal <- get
        justZ . sequence . toList $ matchesFinal
Like before, the code detects all matching permutations and backtracks if any step fails.


Reusable abstractions


I find it pretty amazing that you can build a substructure parser in just 18 lines of Haskell code. You might say I'm cheating because I'm not counting the number of lines it took to define the Parser type, the Monad implementation, and the (<|>) operator. However, the truth is I can actually get all of those features using 1 line of Haskell:
type Parser s = StateT s []
-- and rename all 'Parser' constructors to 'StateT'
So I lied: it's actually 19 lines of code.

I don't expect the reader to know what StateT or [] are, but what you should take away from this is that both of them are part of every Haskell programmer's standard repertoire of abstractions.

Moreover, when I combine them I automatically get a correct Monad implementation (i.e. do notation) and a correct Alternative implementation (which provides the (<|>) function), both for free!
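To see why the one-liner works, unwrap the types: a StateT s [] a is exactly an s -> [(a, s)], so the parsers translate directly (a sketch using transformers' StateT):
import Control.Applicative ((<|>))
import Control.Monad.Trans.State (StateT(StateT), runStateT)
import Data.List (isPrefixOf)

type Parser s = StateT s []

-- runStateT plays the role of runParser: Parser s a -> s -> [(a, s)]

parseTrue :: Parser String Bool
parseTrue = StateT (\str ->
    if (isPrefixOf "True" str)
    then [(True, drop 4 str)]
    else []
    )

-- (<|>) now comes from the Alternative instance of StateT s [], for free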


Conclusions


This is just one of many abstractions I used to complete a structural search engine for proteins. Now that it's done, I'll be blogging more frequently about various aspects of the engine's design to give people ideas for how they could use Haskell in their own projects. I hope these kinds of code examples pique people's interest in learning Haskell.


Appendix


I've included the full code for the String-based Parsers. The Structure-based Parsers depend on several project-specific data types, so I will just release them later as part of my protein search engine and perhaps factor them out into their own library.

Also, as a stylistic note, I prefer to use ($) to remove dangling final parentheses like so:
Parser (\str ->        =>  Parser $ \str ->
    someCode           =>      someCode
    )                  =>
... but I didn't want to digress from the post's topic by explaining how the ($) operator behaves.
import Data.List

newtype Parser a
  = Parser { runParser :: String -> [(a, String)] }

parseTrue :: Parser Bool
parseTrue = Parser (\str ->
    if (isPrefixOf "True" str)
    then [(True, drop 4 str)]
    else []
    )

parseFalse :: Parser Bool
parseFalse = Parser (\str ->
    if (isPrefixOf "False" str)
    then [(False, drop 5 str)]
    else []
    )

skipSpaces :: Parser ()
skipSpaces = Parser (\str ->
    [((), dropWhile (== ' ') str)]
    )

instance Monad Parser where
    return a = Parser (\str  -> [(a, str)])
    m >>= f  = Parser (\str1 ->
        [(b, str3) | (a, str2) <- runParser m     str1,
                     (b, str3) <- runParser (f a) str2]
        )

trueThenFalse :: Parser (Bool, Bool)
trueThenFalse = do
    t <- parseTrue
    skipSpaces
    f <- parseFalse
    return (t, f)

(<|>) :: Parser a -> Parser a -> Parser a
p1 <|> p2 = Parser (\str ->
    runParser p1 str ++ runParser p2 str
    )

parseBool :: Parser Bool
parseBool = parseFalse <|> parseTrue

Saturday, October 6, 2012

pipes-2.4: Proxy transformers, extra categories, utilities, and benchmarks

This release packs a LOT of new features, so I will begin with the most significant feature: proxy transformers. The proxy transformer pattern provides a very simple extension framework that cleanly solves many problems that iteratee library authors face.

Users of the library should read Control.Proxy.Trans.Tutorial, which explains how proxy transformers work. However, this post provides a decent introduction to them, too.

Introduction


Wouldn't it be nice if you could catch and handle errors within a proxy? Now you can! It's as simple as:
import Control.Monad (forever)
import Control.Monad.Trans (lift)
import Control.Proxy
import Control.Proxy.Trans.Either as E
import Safe (readMay)

promptInts :: () -> EitherP String Proxy C () () Int IO r
promptInts () = recover $ forever $ do
    str <- lift getLine
    case readMay str of
        Nothing -> E.throw "Could not parse an integer"
        Just n  -> liftP $ respond n

recover p =
    p `E.catch` (\str -> lift (putStrLn str) >> recover p)

main = runProxy $ runEitherK $ mapP printD <-< promptInts
>>> main
1<Enter>
1
Test<Enter>
Could not parse an integer
Apple<Enter>
Could not parse an integer
5<Enter>
5
The above program condenses many new features of this release into a nice compact example, and I'll use it to showcase each feature.


Proxy transformers


The above program uses the EitherP proxy transformer. To access this feature, you just import the transformer you wish to use:
import Control.Proxy.Trans.Either as E
Control.Proxy imports all the remaining machinery you need.

EitherP extends any proxy-like type with the ability to throw and catch errors locally, as if it lived inside a native EitherT block. It does so in a way that preserves composition (and the category laws!), so you can directly compose the result without unwrapping the EitherP.

When you are done composing, just use runEitherK to convert it back to the underlying proxy:
runEitherK
  :: (q -> EitherP e p a' a b' b m r )
  -> (q -> p a' a b' b m (Either e r))

Utilities


This release introduces the "proxy prelude", a set of convenience functions for users of the library. Control.Proxy automatically exports these and they don't clash with the Prelude or any common libraries.

Our old friend printer got a name change and now goes by printD. This utility function prints all values bound downstream:
printD :: Show a => x -> Proxy x a x a IO r
I provide many more utility functions under the Control.Proxy.Prelude hierarchy, and people who enjoyed my functor design pattern post will also enjoy the abundance of cute trivial examples of the functor pattern in the documentation for Control.Proxy.Prelude.Base.


Proxy transformers are functors


However, this release includes a far more sophisticated set of functors: the proxy transformers themselves. Each proxy transformer implements the ProxyTrans class which defines two functions: mapP and liftP, related by the equation:
mapP = (liftP .)
mapP defines two separate functors.

The first functor behaves like a traditional monad transformer, converting the base Kleisli category to the extended Kleisli category:
mapP return = return

mapP (f >=> g) = mapP f >=> mapP g
You can write these laws using liftP to see that our proxy transformers behave like ordinary monad transformers:
liftP $ return x = return x

do x <- liftP m
   liftP $ f x
= liftP $ do x <- m
             f x
The above program uses this capacity of liftP to lift operations from the Proxy monad to the EitherP String Proxy monad.

The second functor lifts the base proxy composition to the extended proxy composition:
mapP idT = idT

mapP (p1 >-> p2) = mapP p1 >-> mapP p2
This latter functor lets you compose simpler proxies with extended proxies. The above program uses mapP in this capacity to promote printD for composition with promptInts:
mapP printD <-< promptInts
This demonstrates a concrete application of the functor design pattern, allowing seamless interoperability between proxies written to varying feature sets. The proxy transformers lift both the monad instance and the composition instance correctly so that simpler proxies play nicely with extended proxies.


Type signatures


The above program demos the new replacement for Void: C. This shortens type signatures and also removes the dependency on the void package.

Also, now the Proxy type is a newtype around the underlying FreeT implementation. This gives nicer type errors when things go wrong.


Proxy Transformer Stacks


Just like monad transformers, you can stack proxy transformers to automatically combine their effects. By combining the StateP and EitherP proxy transformers, you can implement non-backtracking parsers for free:
{-# LANGUAGE GeneralizedNewtypeDeriving, OverloadedStrings #-}

import Control.Monad.Trans
import Control.Proxy
import Control.Proxy.Trans.Either as E
import Control.Proxy.Trans.State
import Data.Text as T hiding (take)

newtype ParseP p a' a b' b m r =
    ParseP { unParseP ::
        StateP Text (EitherP Text p) a' a b' b m r }
    deriving (Monad, MonadTrans, Channel)

instance ProxyTrans ParseP where
    liftP = ParseP . liftP . liftP

runParseK
  :: (q -> ParseP p a' a b' b m r)
  -> (q -> p a' a b' b m (Either Text (r, Text)))
runParseK = runEitherK . runStateK T.empty . (unParseP .)
The Channel type class defines proxy composition, so we can compose our parsing proxies seamlessly.

Let's write a few parsing primitives:
import Data.Monoid
import Data.Text.IO as T
import Prelude hiding (take)

take n = ParseP go where
    go = do
        s <- get
        if (T.length s < n)
        then do
            s' <- liftP $ liftP $ request ()
            put (s <> s')
            go
        else do
            let (h, t) = T.splitAt n s
            put t
            return h

parseFail str = ParseP $ liftP $ E.throw str

string str = do
    str' <- take (T.length str)
    if (str' == str)
    then return str
    else parseFail $
        "Expected: " <> str <> " -- Found: " <> str'
You wouldn't even know those were proxies if it were not for that single request statement.

Let's write a contrived parser based off of those primitives:
parser () = do
    string "Hello"
    str <- take 5
    lift $ T.putStrLn str
... and supply it with some input:
source () = do
    respond "Hell"
    respond "o, world!"
Now compose!
>>> runProxy $ runParseK $ parser <-< mapP source
, wor
Right ((),"ld!")
Let's see how failed parses turn out:
invalid () = do
    respond "A"
    respond "AAAAAAAA"
>>> runProxy $ runParseK $ parser <-< mapP invalid
Left "Expected: Hello -- Found: AAAAA"
I didn't include parsers in the library because I didn't want to add a bytestring or text dependency to the main pipes package. Instead, I will release the parsing extension as a separate library. This library will provide you with the streaming benefits of attoparsec with the ability to interleave effects.


Pushback


The above parsing example suggests my solution to push-back, which is to give each proxy its own local state using the StateP proxy transformer. You can then use the local state to keep track of unused input, as the above parsing example did.

Like all proxy transformers, this extension requires no special integration with the underlying proxy type and you can layer it anywhere within a proxy transformer stack with no special considerations.
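For example, here is a hedged sketch of what pushback primitives could look like on top of StateP (draw and unDraw are hypothetical names, assuming get and put from Control.Proxy.Trans.State):
import Control.Proxy
import Control.Proxy.Trans.State

-- Hypothetical pushback primitives: the local state holds values
-- that were pushed back upstream and not yet consumed
draw :: (Monad m) => StateP [a] Proxy () a b' b m a
draw = do
    s <- get
    case s of
        []     -> liftP (request ())
        a:rest -> put rest >> return a

unDraw :: (Monad m) => a -> StateP [a] Proxy () a b' b m ()
unDraw a = do
    s <- get
    put (a:s)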


Extra categories


The library now provides two additional categories for interacting with the Proxy type. These are term-rewriting categories (I believe the technical term is "sesquicategory", but I may be mistaken).

The first category's composition operator replaces all request statements within a Proxy with a suitably typed replacement:
f /</ g -- Replace all occurrences of 'request' in 'f' with 'g'
request is the identity of this category, so we expect that:
-- Replacing 'request' with 'request' changes nothing
f /</ request = f

-- Replacing 'request' with 'f' gives 'f'
request /</ f = f
Also, this substitution is associative:
(f /</ g) /</ h = f /</ (g /</ h)
Similarly, the respond command has its own substitution operator, (\<\), and they form their own category:
f \<\ g  -- Replaces all 'respond's in 'g' with 'f'

f \<\ respond = f

respond \<\ f = f

(f \<\ g) \<\ h = f \<\ (g \<\ h)
Each category distributes in one direction over the Kleisli category:
-- Distributivity
r \<\ (f <=< g) = (r \<\ f) <=< (r \<\ g)

-- Zero
r \<\ return = return

-- Distributivity
(f <=< g) /</ r = (f /</ r) <=< (g /</ r)

-- Zero
return /</ r = return

Lifting request and respond


I originally envisioned that proxy transformers would also automatically lift request and respond statements. The laws for this lifting are quite simple:
mapP request = request

mapP respond = respond
In other words, these are the functor laws applied to the identities of the two new categories I just introduced. Unfortunately, Haskell's type class system got severely in my way and I could not solve the issue before the release. I have a tentative plan for how to solve this using Edward's constraints package, but it will take time. Until then, you will have to manually lift request and respond statements from the base Proxy type.

Overall, I was pretty disappointed with Haskell's type class system (more so than usual). This library really exercised it considerably and I even had to drop an additional proxy transformer because it was unimplementable due to the broken constraint system.


Performance


Raw proxies give performance comparable to conduit when doing simple IO:
import Control.Monad
import Control.Monad.Trans
import Control.Proxy hiding (await)
import Data.Conduit
import Data.Conduit.List as L
import Data.Maybe (fromJust) -- You did not see this

n = 100000 :: Int

-- Choose your poison
main = runProxy $ printD <-< enumFromToS 1 n
main = L.enumFromTo 1 n
    $$ forever (await >>= lift . print . fromJust)
Using pipes:
real    0m1.761s
user    0m0.384s
sys     0m0.712s
Using conduit:
real    0m1.528s
user    0m0.224s
sys     0m0.660s
Conduit is 15% faster.

The margin is substantially larger for entirely pure code:
import Control.Monad
import Control.Monad.Trans
import Control.Proxy hiding (await)
import Data.Conduit
import Data.Conduit.List as L

n = 100000 :: Int

main = runProxy $ discard <-< enumFromToS 1 n

discard' = do
    a <- await
    case a of
        Nothing -> return ()
        Just _  -> discard'

main = L.enumFromTo 1 n $$ discard'
Using pipes:
real    0m0.085s
user    0m0.088s
sys     0m0.000s
Using conduit:
real    0m0.011s
user    0m0.004s
sys     0m0.004s
Conduit is almost 8(!) times faster.

Conduit dramatically improves for entirely pure code since it bends the monad transformer laws to skip binds in the base monad. This is one reason that this pipes release abstracts all the Proxy operations behind type classes. If people request that I copy conduit's approach, I will release a separate library that copies conduit's optional monad bind and have it implement all the same type classes. Then all the proxy transformers are guaranteed to work transparently with it because they abstract completely over the type classes.

Additionally, I want to note that the pipes library currently has only one optimization PRAGMA in the entire library:
{-# INLINABLE mapK #-} -- An obscure utility function
... whereas conduit uses a considerable number of rewrite rules and INLINABLE statements. I don't know how much these contribute to conduit's speed, but I will copy Michael's optimizations in the next few releases and benchmark how much they contribute to performance.

I've also benchmarked the overhead of proxy transformers. First, compare performance for some trivial IO:
import Control.Monad
import Control.Monad.Trans
import Control.Proxy
import Control.Proxy.Trans.Writer
import Data.Monoid

n = 100000 :: Int

main = runProxy $ without <-< enumFromToS 1 n

main :: IO ((), Sum Int)
main = runProxy $ runWriterK $ with <-< mapP (enumFromToS 1 n)

with
 :: (Monoid w, Show a)
 => () -> WriterP w Proxy () a () C IO r
with () = forever $ do
    n <- liftP $ request ()
    lift $ print n

without :: (Show a) => () -> Proxy () a () C IO r
without () = forever $ do
    n <- request ()
    lift $ print n
Using the bind in the WriterP w Proxy monad (i.e. with):
real    0m1.739s
user    0m0.396s
sys     0m0.680s
Using the bind in the Proxy monad (i.e. without):
real    0m1.704s
user    0m0.368s
sys     0m0.668s
A difference of 2%(!).

Again, the difference widens if you switch to pure code:
import Control.Monad
import Control.Monad.Trans
import Control.Proxy
import Control.Proxy.Trans.Writer
import Data.Monoid

n = 100000 :: Int

main = runProxy $ without <-< enumFromToS 1 n

main :: IO ((), Sum Int)
main = runProxy $ runWriterK $ with <-< mapP (enumFromToS 1 n)

with
 :: (Monoid w, Show a)
 => () -> WriterP w Proxy () a () C IO r
with () = forever $ liftP $ request ()

without :: (Show a) => () -> Proxy () a () C IO r
without () = forever $ request ()
Using WriterP w Proxy's bind:
real    0m0.134s
user    0m0.124s
sys     0m0.008s
Using Proxy's bind:
real    0m0.084s
user    0m0.076s
sys     0m0.004s
Now it's about a factor of 2.

So I can summarize these benchmarks by saying that if you are doing even a little bit of IO, the performance differences are pretty small, and as I aggressively optimize the library, they should get even smaller.


Switch to free


Edward was kind enough to migrate my transformers-free functionality into his free package, so now pipes uses free for its free monad transformer dependency.


Resource management


I plan on releasing a Proxy-like type that implements resource management that will replace the Frame type. This type will include functions to promote existing Proxy code to this resource-managed version. Until then, you will have to manually manage resources by opening all file handles before composition, and closing them all afterwards, like so:
import Control.Proxy
import System.IO

main = do
    h <- openFile "test.txt" WriteMode
    runProxy $ hPrintD h <-< enumFromToS 1 10
    hClose h
... or you can use Michael's ResourceT in the base monad, if that is your thing.

You won't get the benefit of conserving handles, but you will still get predictable streaming performance.


Library writers


If you are considering building off the pipes library, I recommend implementing any functionality using the Proxy type, which I guarantee will be promotable to any future extensions, and I plan on personally writing several Proxy-based libraries over the next few months.

While I still preserve the Pipe type, I fully endorse the Proxy type as the type to standardize on, as it has nicer theoretical properties than the Pipe type and also supports greater functionality.


Conclusions


This release is very close to the final state I envisioned for the core pipes library. Most existing features won't disappear, with the exception of Control.Frame, which I will phase out once I release a suitable replacement in a separate library.

Most additional features that I plan on implementing will go into separate libraries that build on top of this one. I only plan on adding functionality to the core library if I discover additional interesting structure for the Proxy type.