## Sunday, October 6, 2013

### Manual proofs for the `pipes` laws

Out of all of Haskell's streaming libraries, `pipes` is the only that does not have a test suite. This is because I prefer to use equational reasoning to prove the correctness of `pipes`. For those new to Haskell, equational reasoning is the use of equations to prove code is correct without ever running it. Haskell makes this possible by strictly enforcing purity.

Equational reasoning works well for a library like `pipes` because most properties of interest can easily be specified as category theory laws. If you are unfamiliar with category theory, you can read this short primer to learn how it relates to programming.

Every primitive in the `pipes` library is related to a category in some way:
• `yield` and `(~>)` are the identity and composition operator of a category
• `await` and `(>~)` are the identity and composition operator of a category
• `cat` and `(>->)` are the identity and composition operator of a category
This means that we state the expected behavior for all six primitives just by stating their category laws:
```-- Left Identity
yield ~> f = f

-- Right Identity
f ~> yield = f

-- Associativity
(f ~> g) ~> h = f ~> (g ~> h)

-- Left Identity
await >~ f = f

-- Right Identity
f ~> await = f

-- Associativity
(f >~ g) >~ h = f >~ (g >~ h)

-- Left Identity
cat >-> f = f

-- Right Identity
f >-> cat = f

-- Associativity
(f >-> g) >-> h = f >-> (g >-> h)
```
Category theory laws are like tests: they are properties that we expect our code to obey if we wrote it correctly. These kinds of laws tend to have the nice property that they tend to uncannily summarize many useful properties we would like to know in a remarkably small set of equations. For example, `(~>)` is defined in terms of `for` like this:
```(f ~> g) x = for (f x) g
```
When you restate the category laws for `yield` and `(~>)` in terms of `for`, you get the "for loop laws":
```-- Looping over a single yield simplifies to function
-- application

for (yield x) \$ \e -> do
f e

= f x

-- Re-yielding every element of a generator returns the
-- original generator

for gen \$ \e -> do
yield e

= gen

-- Nested for loops can become a sequential for loops if the
-- inner loop body ignores the outer loop variable

for gen0 \$ \i -> do
for (f i) \$ \j -> do
g j

= for gen1 \$ \j -> do
g j
where
gen1 = for gen0 \$ \i -> do
f i
```
These are common sense laws that we would expect any language with generators and for loops to obey. Amazingly, the for loop laws emerge naturally from category laws, almost as if they were handed down to us on stone tablets.

For a while I've been writing these proofs out by hand informally in notebooks, but now that I'm writing up my thesis I needed to organize my notes and fill in all the little steps that I would previously skip over. You can find the full set of my notes as an organized markdown file that I've committed to the `pipes` repository that contains all my manual proofs of the pipe laws.

#### Caveats

There are several caveats that I need to state about these proofs:

First, there is the obvious caveat that these proofs are still not machine-checked, but that's the next step. Paolo Capriotti has been patiently teaching me how to encode my reasoning into Agda. However, I believe this should be possible and that the proofs are very close to sound, taking into account the following caveats below.

Second, these proofs do not account for bottom (i.e. `_|_`). My intuition is that the `pipes` laws do hold even in the presence of bottom, by I am still new to reasoning about bottom, so I omitted that for now. If anybody can point me to a good reference on this I would be grateful.

Third, these proofs use a non-standard coinductive reasoning. To explain why, it's important to note that typical coinductive reasoning in Haskell requires the result to be productive and to guard coinduction behind the production of a constructor. However, not all pipelines are productive, as the following examples illustrate:
```infinite = forever \$ yield ()

infinite >-> forever await = _|_

```
So instead of using productivity to impose a total ordering on coinductive reasoning I use "consuming" to order my coinductive proofs, meaning that all uses of coinduction are guarded behind the consumption of a constructor. In other words, my proofs do not guarantee that pipe composition makes progress in production, but they do guarantee that pipe composition makes progress in consumption.

Fourth, another departure from typical coinductive proofs is notation. Coinductive reasoning uses bisimilarity to prove "equality". I also use bisimilarity, but notationally I just write it down as a single chain of equations that is bisected by a reference to coinduction. The two halves of the equation chain, above and below the bisection, constitute the two bisimilar halves.

Fifth, proofs notationally discharge other proofs using headers as the namespace. If you see a reference like `[Kleisli Category - Left Identity Law - Pointful]`, that means you should look under top-level header `Kleisli Category` for sub-header `Left Identity Law` and sub-sub-header `Pointful`.

Sixth, I've only proven the laws for the bidirectional API in the `Pipes.Core` module. However, the category laws for the unidirectional API are very easy to derive from the laws for the bidirectional API, so I leave those as an exercise until I have time to add them myself.

#### Conclusions

My goal is to eventually machine-check these proofs so that `pipes` will be a practical and instructive example of a library with a statically verified kernel. Moreover, I hope that these proofs will allow other people to equationally reason about even more sophisticated systems built on top of `pipes` and prove higher-level properties by discharging the proofs that I have assembled for them.

1. Much as I respect your use of equational reasoning and your CT-inspired design philosophy, I can't help but disagree with you here. Not having a test suite is not something to be proud of.

I'm sure you're aware of the following quotation:

> Beware of bugs in the above code; I have only proved it correct, not tried it.
> -- Donald Knuth

Proofs are not a replacement for tests, in the same way that a type system is not. Proofs definitely reduce the burden on the test suite, but there are things that can be tested that cannot be reasonably proved (that is, it is enough effort to prove them that it would be better to have tests, at least in the interim, than to wait).

Here are some things I would expect to see in a test suite if you develop it:

* Do your combinators behave as expected in the cases of bottom? In the case of async exceptions?
* Do they use and free resources as you expect?
* Are fixed bugs still fixed (regression tests)?

To reiterate: I love the formal reasoning approach, but I think it's naive to eschew tests in light of it.

1. Knuth's comment should be taken as a joke. A proof of correctness is just as strong as a test suite. Neither guarantees that a program works. Tests lack exhaustiveness and proofs require faith of model-correctness. In an ideal world, you'd want both.

That said, if the maintainer is taking the time to concoct a formal proof of correctness, there should be little stopping them from porting over those lemmas into quickcheck.

2. Sorry Alex, I accidentally replied as a top level comment so read my response below.

For the Anonymous commenter, there is one limitation when transporting things to QuickCheck, which is that I would need to write an Arbitrary instance for the underlying type (i.e. `Proxy`) and there is no way to do that.

2. What makes you think I'm proud of it? I've had an open issue for adding a test suite to pipes for ages:

Also, note that some downstream libraries (like `pipes-concurrency`) do have test suites:

I meant that first sentence in the following way: "Here's my excuse for not having written unit tests for so long". I was prioritizing equational reasoning before tackling unit tests.

There is also a second reason why I have delayed writing unit tests, particularly QuickCheck, which is that it is very difficult to do property-based testing of pipes because every constructor holds either a polymorphic functor or a function, and I don't have a good way to enumerate polymorphic functors or functions. In other words, I can't write an `Arbitrary` instance for the `Proxy` type.

I could also do non-property-based testing, too, and if somebody can contribute patches for that then I would welcome them.

3. Forgive. I misread the opening sentence as pride.

I was actually advocating non-property-based testing (or at least, not randomized property-based testing); quickcheck is sort of a middle ground between unit tests and proofs.

1. It is my fault, too. I have a history of taking extreme positions.

You're right that testing specific cases is feasible and useful until I figure out how to do property-based testing. I will try that out.

4. Hi Gabriel, thanks for this enlightening post! I do have one question. Do you design your programs with these proofs as an end result, and if so what do you do during the design phase to facilitate that?

1. As a library builder, I want my libraries to be as reusable as possible. However, it's difficult to anticipate in advance how people might use my libraries. Obviously, *I* know how I want to use it within my own applications, but one use case doesn't justify separating something out into a library.

This is where category theory comes in. Category theory is all about abstractions that have proven themselves among several different disciplines (i.e. math/physics/programming), so I figured that if I stick to these abstractions then I would be more likely to produce something that could be reused in more ways than I originally planned for.

That is in fact what happened. `pipes` ended up modeling things that I had not originally designed it for, like io-streams[1], conduits[2], iteratees (using `Consumer`s and `(>~)`), and for loops (see `Pipes.Tutorial`). These were all just emergent properties of the theory.

So the answer to your question is that I design with category theory as the end result, so that novel and unanticipated use cases can emerge from my work.

You asked how I facilitate that. I don't have a really good answer for that. All I can say is that I experiment a lot and keep playing with lots of ideas until it "snaps" to something elegant. However, probably the best practical advice I can give for how to produce elegant Haskell code is to only use type classes inspired by category theory (i.e. `Functor`/`Monad`) and to never create type classes of your own unless you are sure they are also theoretically inspired. Almost all the inelegant Haskell code I see in the wild is just senseless type class abuse. If you force yourself to avoid inelegant type classes then you just sort of naturally steer yourself towards elegant solutions by necessity.