tag:blogger.com,1999:blog-1777990983847811806.post6720598980634542475..comments2024-03-16T16:29:29.582-07:00Comments on Haskell for all: pipes-2.5: Faster and slimmerGabriella Gonzalezhttp://www.blogger.com/profile/01917800488530923694noreply@blogger.comBlogger13125tag:blogger.com,1999:blog-1777990983847811806.post-45761980152921113952012-12-12T09:43:29.527-08:002012-12-12T09:43:29.527-08:00Yes, but probably not before the end of the year. ...Yes, but probably not before the end of the year. The highest priority libraries for me at the moment are bytestring/text support and parsing.<br /><br />Jeremy Shaw is working on a pipes-http server, but he is waiting on the parsing library which would make his life a lot easier. The parsing extension is not rocket science, and I already demonstrated how it works in principle in the pipes-2.4 announcement post. It's just a matter of writing it up and polishing it.<br /><br />Also, I just released pipes-3.0, which will be the base library for all these expansion libraries, so please check that out. It makes everything a lot cleaner.Gabriella Gonzalezhttps://www.blogger.com/profile/01917800488530923694noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-85314834870942307772012-12-11T17:54:54.533-08:002012-12-11T17:54:54.533-08:00Will you release an http client that uses pipes un...Will you release an http client that uses pipes underneath?David Sargeantnoreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-56791424651342559192012-11-08T20:44:46.099-08:002012-11-08T20:44:46.099-08:00Yeah, the parsing thing was a complete oversight. ...Yeah, the parsing thing was a complete oversight. I've been so focused on comparisons to conduit lately that I forget about the rest of the field. I'll fix it after I post this comment.<br /><br />You are mostly right about IO, which is a sufficiently large bottle-neck that the pure differences don't matter. 
In fact, that was my original justification for switching to the transformers version, because I thought that the differences were negligible for IO code. However, there are other reasons I've been concentrating on improving pure performance lately.<br /><br />The first important reason is that I would like this library to replace lazy bytestrings and lazy text, even the purely-generated kind. For example, sometimes you want to purely generate a bytestring but you don't want to bring the entire bytestring into memory immediately, so you use a lazy bytestring. You could in principle use the already-existing lazy bytestrings for this purpose (since it is safe, no lazy IO is involved), but then you would have to use a separate data type for bytestrings generated by IO and bytestrings generated by pure computations. I would like pipes to be used everywhere one needs a generator of any sort, pure or not.<br /><br />The second reason is that I want people to feel as comfortable about performance as possible, so comfortable that they use proxies pervasively. I consider the underlying abstraction of proxies to be a very very fundamental building block that should play a central role in almost all programs. 
This is why I don't contaminate the core implementation with cross-cutting domain-specific concerns like state/error-handling/finalization/parsing and instead decompose those features out as extensions, because I see the underlying abstraction as being much more general in scope than what people envisioned for it and I think its use should be completely ubiquitous in modern Haskell programming, even for completely pure computations.<br /><br />I encounter these pure applications of the pipe abstraction all the time, but just haven't blogged about them because all of my efforts right now are on making the library as feature complete as possible before I begin promoting its use really heavily.Gabriella Gonzalezhttps://www.blogger.com/profile/01917800488530923694noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-90867947463663159162012-11-08T19:05:40.133-08:002012-11-08T19:05:40.133-08:00I agree that benchmarks which perform no IO are us...I agree that benchmarks which perform no IO are useful for comparing the overheads of libraries, but I often wonder how well they translate to real-world (i.e. IO-performing) code. In particular, I suspect GHC's optimizer does a much better job of unrolling/inlining/etc. pure code, so I wouldn't be surprised if there are some major differences in the generated core of IO-performing code and non-IO code. I've seen this in the past, but haven't checked recently.<br /><br />Partly because of this, I think it's important to focus on the performance of IO-based code. This is especially true as one of the major benefits of these libraries, deterministic interleaving of effects, isn't relevant to non-IO code, so a user can often write traditional, lazy functions in those cases (or more likely, already has lazy functions that can be used directly).<br /><br />I'm also not entirely sure what you mean when you state that other iteratee implementations can't implement arbitrary effects interleaved with parsing. 
In iteratee at least, an iteratee *is* a parser, and can be interleaved freely with other arbitrary effects. This has been true since the beginning. I'm probably misunderstanding your argument here.Anonymoushttps://www.blogger.com/profile/14086288462709102194noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-83721982527851876812012-11-02T08:13:48.052-07:002012-11-02T08:13:48.052-07:00Thanks for the in-depth explanation!Thanks for the in-depth explanation!Tom Enoreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-83519494801221939192012-11-01T12:47:02.054-07:002012-11-01T12:47:02.054-07:00Cool, thanks. Good stuff, looking forward to more...Cool, thanks. Good stuff, looking forward to more.Richard Wallacehttps://www.blogger.com/profile/06782237293774470606noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-61235089473888341272012-11-01T12:22:26.872-07:002012-11-01T12:22:26.872-07:00The issue isn't so much that the binds are int...The issue isn't so much that the binds are intrinsically expensive, but rather that they interfere with a LOT of compiler optimizations since the compiler doesn't know about the monad laws or the monad transformer laws. For example, in the correct version I would end up having to return a value in the base monad like:<br /><br /> let p1' = return x<br /><br />... and consequently rebind the result in the next step of composition, and the compiler doesn't know that binding a return's result is equivalent to just a pure let binding, so a lot of optimizations just disappear.<br /><br />The monad transformer also gets in the way in a lot of other respects, too. In the non-monad-transformer version I can very aggressively rewrite composition into a very tight and compact form that generates really excellent core. 
With the monad transformer version, even if I take advantage of as many laws as possible I cannot achieve anywhere near as compact and simple of an implementation by hand.<br /><br />I also tried lots of rewrite rule tricks to try to help the compiler along and try to do this kind of work automatically, but getting them to reliably fire is very difficult and even when they did work it still didn't help that much.<br /><br />The best solution is to offer both implementations and let the user choose whether they want monad transformer laws or speed. However, I can't do that until I can type-class the utilities and that requires some form of polymorphic constraints. I did a lot of work with Edward's constraints library and other tricks to try to type-class my utilities, but that was not working for various reasons that I am still trying to figure out and I felt I needed to move on to fleshing out the standard libraries for now.Gabriella Gonzalezhttps://www.blogger.com/profile/01917800488530923694noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-55056989065173751192012-11-01T11:57:07.643-07:002012-11-01T11:57:07.643-07:00Is this something you are certain there is no solu...Is this something you are certain there is no solution to, or is it possible there is a nice fix?Tom Ehttp://web.jaguarpaw.co.uk/~tom/contact/noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-15375952696664606772012-11-01T10:34:25.839-07:002012-11-01T10:34:25.839-07:00Oh, I forgot to mention that I answered the questi...Oh, I forgot to mention that I answered the question about bending the monad transformer laws in response to the comment before you, so just read that. 
Long story short, this won't affect you if you don't mind that the internal implementation keeps track of how many times you invoked the base monad.Gabriella Gonzalezhttps://www.blogger.com/profile/01917800488530923694noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-4140440960033673552012-11-01T10:32:26.391-07:002012-11-01T10:32:26.391-07:00The main advantages of pipes (I really should renam...The main advantages of pipes (I really should rename it to proxies at this point) are:<br /><br />* Proxies let you communicate bidirectionally<br />* The API is small: only one way to compose them<br />* Careful attention to laws to avoid bugs from corner cases.<br />* An extension framework that doesn't change the user-facing API.<br /><br />That last one is actually quite significant and probably the most underappreciated feature of the library. For example, using the EitherP extension you can throw and catch errors locally within a pipe without disturbing the rest of the pipeline (see the `pipes-2.4` release post for an example), something that no other iteratee implementation can do.<br /><br />The advantage of this approach will be more apparent when I release the parsing extension that lets you interleave arbitrary effects with parsing, something that other iteratee implementations cannot do. What they do is simply convert an attoparsec parser into a consumer rather than let you mix parsing commands with other effects. However, I will probably provide that feature, too, just for completeness.<br /><br />None of these extensions change the composition operation or the request/respond commands. For example, you'll notice that conduit provides the $$+/$$++/$$- operations for dealing with leftovers. With proxies, the `StateP` extension handles all leftover state for you correctly, guaranteeing, without any additional operators, that it is never dropped (you still use <-< for composition). 
I also will release a push-back-specific proxy transformer that is just a glorified wrapper around StateP [a] that makes it easier for users to understand what is going on and how pushback works with proxies.<br /><br />A lot of features that conduit provides can be replicated in proxies, but I haven't had the time to explain how yet, mainly because I'm focusing on getting the basic standard libraries out. Once the standard libraries are out I will post a "proxy cookbook" that explains the one true way to do everything you are used to doing with conduit, like folding or pushback or parsing. Until then, I mainly field questions from users and explain to them how to do things, so if you have any specific questions for how to do a particular task using proxies just let me know.<br /><br />There is one feature that I also plan to release for pipes-3.0 that will also vastly simplify the API, which is to improve the way I structure all the type synonyms. Once I do this, you will be able to use the Pipe type synonyms from Control.Proxy.Pipe transparently with proxy transformers so that you can still get the nicer unidirectional API to play nice with everything else. However, changing the type synonyms to fix this is a sufficiently breaking change that I'm delaying it until I bump the major version number.Gabriella Gonzalezhttps://www.blogger.com/profile/01917800488530923694noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-24428692663938659372012-11-01T10:01:19.703-07:002012-11-01T10:01:19.703-07:00I first described the issue here, back when I was ...I first described the issue here, back when I was first considering using the free package for pipes:<br /><br />https://github.com/ekmett/free/issues/3<br /><br />It is not exactly the same, but it is still the same basic idea: The monad transformer laws basically say that you should not be able to distinguish how many calls in a row were made to the base monad. 
The faster implementation does keep track of how many times you invoke the base monad, which violates the laws.<br /><br />When you call runProxy, though, that information vanishes, because it fuses all calls to the base monad together. This is called observational equivalence: the laws hold when viewed through the lens of runProxy.<br /><br />The only way you will notice a law violation is if you apply some function that counts the number of calls to the base monad.<br /><br />If you are worried about code breaking, that will not happen if you don't assume the monad transformer laws hold. I would have preferred to use a separate function to replace lift and make clear the laws do not hold, but I believe users would not like that.<br /><br />Also, even if you DO assume the monad transformer laws hold, it still won't break if you don't use any functions that distinguish how many times you invoke the base monad. However, I would prefer that users just don't assume the laws hold in order to be safe.Gabriella Gonzalezhttps://www.blogger.com/profile/01917800488530923694noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-49296839381892503012012-11-01T09:22:29.517-07:002012-11-01T09:22:29.517-07:00I'm also curious about the specifics of the mo...I'm also curious about the specifics of the monad transformer law bending. 
A link to read up on it would be great.<br /><br />It sounds like pipes and conduit are moving closer together in many ways, so I'm curious how they still differ. The resource finalization being abstracted away is obviously one difference, and a good one, but where else do they differ?Richard Wallacehttps://www.blogger.com/profile/06782237293774470606noreply@blogger.comtag:blogger.com,1999:blog-1777990983847811806.post-56090340137153541402012-10-31T23:31:12.815-07:002012-10-31T23:31:12.815-07:00> I spent a considerable amount of effort tryin...> I spent a considerable amount of effort trying to get the correct version to work, but I was led inexorably to the same conclusion that Michael already reached, which was that the original approach was best and that the gain in performance is worth bending the monad transformer laws.<br /><br />Can you go into more detail about exactly what the difficulty is and what "law bending" had to take place to overcome it?Tom Ehttp://web.jaguarpaw.co.uk/~tom/contact/noreply@blogger.com
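The "law bending" asked about in this thread concerns the monad transformer law `lift (return x) = return x`. The following is a minimal sketch of the idea, not the pipes source: `Step`, `liftStep`, `countLayers` and `runStep` are hypothetical names invented for illustration, and the type is a deliberately stripped-down stand-in for a proxy that has no request/respond commands. It shows a representation that records every call to the base monad, a function that can observe the difference, and a run function under which the difference disappears.

```haskell
import Data.Functor.Identity (Identity (..))

-- Toy stand-in for a proxy's internal representation (an assumption for
-- illustration only): either a finished result or one recorded call to
-- the base monad.
data Step m r
  = Pure r
  | M (m (Step m r))

instance Functor m => Functor (Step m) where
  fmap f (Pure r) = Pure (f r)
  fmap f (M m)    = M (fmap (fmap f) m)

instance Functor m => Applicative (Step m) where
  pure = Pure
  Pure f <*> x = fmap f x
  M m    <*> x = M (fmap (<*> x) m)

instance Functor m => Monad (Step m) where
  Pure r >>= f = f r
  M m    >>= f = M (fmap (>>= f) m)

-- A lift that always adds an M layer, so the structure remembers that the
-- base monad was invoked.
liftStep :: Functor m => m r -> Step m r
liftStep = M . fmap Pure

-- A function that counts recorded calls to the base monad. This is the
-- kind of observer that can detect the law violation.
countLayers :: Step Identity r -> Int
countLayers (Pure _) = 0
countLayers (M m)    = 1 + countLayers (runIdentity m)

-- Running the computation fuses all recorded calls into the base monad,
-- which erases the count.
runStep :: Monad m => Step m r -> m r
runStep (Pure r) = return r
runStep (M m)    = m >>= runStep

main :: IO ()
main = do
  let lhs = liftStep (return 'x') :: Step Identity Char
      rhs = return 'x'            :: Step Identity Char
  -- The two sides of the law differ structurally...
  print (countLayers lhs, countLayers rhs)
  -- ...but agree once they are run.
  print (runIdentity (runStep lhs) == runIdentity (runStep rhs))
```

In this sketch the first line printed is `(1,0)` and the second is `True`: the law fails for any code that inspects the layers directly, but holds when both sides are compared through `runStep`, which plays the role that runProxy plays in the discussion above.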