## Friday, November 1, 2013

### Test stream programming using Haskell's `QuickCheck`

`pipes` is a stream programming library built on top of a foundation of basic category theory. The core of the library consists of a set of five categories that all intersect in a single streaming data type and the library's contract is the set of associated category laws.

For example, one such category is the "respond category", which `pipes` uses to implement `for` loops and `ListT`. The two key operations are `(/>/)`, which is the category composition operator, and `respond`, which is the identity. These must satisfy the following category laws:

``````respond />/ f = f                  -- Left Identity

f />/ respond = f                  -- Right Identity

(f />/ g) />/ h = f />/ (g />/ h)  -- Associativity``````

Previously, I described how I manually proved the category laws using equational reasoning, which elicited a strong response from readers that I should also verify the laws empirically using Haskell's `QuickCheck` library. After all, Haskell's `QuickCheck` library shines at automated property-based testing. Why not just fire up `QuickCheck` and test something like:

``> quickCheck \$ \f -> f />/ respond == f``

However, this leads to several problems:

• You can't compare pipes for `Eq`uality
• `QuickCheck` can't `Show` pipes when it discovers a counter-example
• You can't generate `Arbitrary` pipes

The latter is the most challenging problem of the three to solve: how do we generate random pipes to test? This has to be done in such a way that it exercises the system and efficiently discovers corner cases.

#### Randomizing pipes

I decided to try encoding a random pipe as random sequence of operations (specifically Kleisli arrows). This actually works out quite well, because we already have two natural operations built into the core bidirectional API: `request` and `respond`. Both of them allow information to pass through twice, as illustrated by the following ASCII diagrams:

``````request :: Monad m => a' -> Proxy a' a y' y m a

a'
|
+----|----+
|    |    |
a' <=====/    |
|         |
a  ======\    |
|    |    |
+----|----+
v
a

respond :: Monad m => a  -> Proxy x' x a' a m a'

a
|
+----|----+
|    |    |
|    \ /==== a'
|     X   |
|    / \===> a
|    |    |
+----|----+
v
a'``````

If we generate a random sequence of them them we just connect them head to toe. For simplicity, I will just assume that all inputs and outputs are of type `Int`:

``````pRandom1 = respond >=> request >=> respond

Int
|
+----|----+
|    |    |
|    \ /==== Int
|     X   |
|    / \===> Int
|    |    |
+----|----+
v
Int
|
+----|----+
|    |    |
Int <=====/    |
|         |
Int ======\    |
|    |    |
+----|----+
v
Int
|
+----|----+
|    |    |
|    \ /==== Int
|     X   |
|    / \===> Int
|    |    |
+----|----+
v
Int``````

When we compose proxies using something like `(>+>)` (the bidirectional generalization of `(>->)`), we conceptually place these random chains side by side and match inputs with outputs. For example, if we generate the following two random sequences of `request` and `respond`:

``````pRandom1 = respond >=> request >=> respond
pRandom2 = request >=> request``````

... then this generates the following flow of information, beginning from the top-right `Int`.

``````pRandom1 >+> pRandom2:

Int
|
+----|----+
|    |    |
Int<======================/    |
|                   |         |
+----|----+            /======\    |
|    |    |           /  |    |    |
|    \ /==== Int <=\ /   +----|----+
|     X   |         X         |
|    / \===> Int ==/ \        v
|    |    |          |       Int
+----|----+          |        |
v               \   +----|----+
Int               \  |    |    |
|                 \======/    |
+----|----+              |         |
|    |    |            /======\    |
... <=====/    |           /  |    |    |
|         |          /   +----|----+
... ======\    |          |        v
|    |    |          |       Int
+----|----+          |
v               |
Int              |
|               |
+----|----+          |
|    |    |          |
|    \ /==== Int     /
|     X   |         /
|    / \===> Int ==/
|    |    |
+----|----+
v
Int``````

This shows how you can easily generate complex flows of information to exercise the API with just a few simple permutations of `request` and `respond`.

However, notice how the `Int` flows through the pipeline without interruption until the pipeline terminates. This means that unless we do something else we will always return the same `Int` we began with.

So we add two additional actions to our random repertoire which will provide diagnostic information about which paths we took. The first one is `inc`, which increments the `Int` flowing through it:

``````inc :: (Monad m) => Int -> m Int
inc n = return (n + 1)``````

The second is `log`, which stores the current value of the `Int` using a `Writer []` base monad and reforwards the `Int` further along:

``````log :: Int -> Proxy a' a b' b (Writer [Int]) Int
log n = do
lift (tell [n])
return n``````

Now we can easily generate random `Proxy`s just by assembling random chains of these four actions:

``pRandom = log >=> inc >=> request >=> inc >=> inc >=> ...``

We can use this same trick to generate arbitrary `Server`s and `Client`s (i.e. one-sided `Proxy`s) by just selectively removing either `request` or `respond` from the list of permissible actions.

Once we have random `Proxy`s, `Server`s, and `Client`s, we can test for equality by testing the values that each side returns and logs when given an initial value of `0`:

``````p1 === p2 = run p1 == run p2
where
run p = runWriter (runEffect (p 0))

infix 4 ===``````

This comparison is pure because the `Writer []` base monad is pure, so we can pass it as suitable property that `QuickCheck` can test. Well, almost...

We also need to be able to `Show` the randomized values that we selected so that `QuickCheck` can print any counter-examples it discovers. The solution, though, is pretty simple. We can use an intermediate representation that is just an enumeration. This just stores placeholders for each action in our chain, and these placeholders are `Show`able:

``````-- Place holder for a single `Proxy` action
data ProxyStep
= ProxyRequest
| ProxyRespond
| ProxyLog
| ProxyInc
deriving (Enum, Bounded)

instance Arbitrary ProxyStep where
arbitrary = arbitraryBoundedEnum
shrink _  = []

instance Show ProxyStep where
show x = case x of
ProxyRequest -> "request"
ProxyRespond -> "respond"
ProxyLog     -> "log"
ProxyInc     -> "inc"

-- Place holder for a single `Client` action
data ClientStep
= ClientRequest
| ClientLog
| ClientInc
deriving (Enum, Bounded)

instance Arbitrary ClientStep where
arbitrary = arbitraryBoundedEnum
shrink _  = []

instance Show ClientStep where
show x = case x of
ClientRequest -> "request"
ClientLog     -> "log"
ClientInc     -> "inc"

-- Place holder for a single `Server` action
data ServerStep
= ServerRespond
| ServerLog
| ServerInc
deriving (Enum, Bounded)

instance Arbitrary ServerStep where
arbitrary = arbitraryBoundedEnum
shrink _  = []

instance Show ServerStep where
show x = case x of
ServerRespond -> "respond"
ServerLog     -> "log"
ServerInc     -> "inc"``````

Then we tell `QuickCheck` to generate random lists of these actions which will form our complete `Proxy`s, `Client`s, or `Server`s:

``````correct :: String -> String
correct str = case str of
[] -> "return"
_  -> str

-- Place holder for a `Proxy`
newtype AProxy = AProxy { unAProxy :: [ProxyStep] }

instance Arbitrary AProxy where
arbitrary = fmap AProxy arbitrary
shrink = map AProxy . shrink . unAProxy

instance Show AProxy where
show = correct . intercalate " >=> " . map show . unAProxy

-- Place holder for a `Client`
newtype AClient = AClient { unAClient :: [ClientStep] }

instance Arbitrary AClient where
arbitrary = fmap AClient arbitrary
shrink = map AClient . shrink . unAClient

instance Show AClient where
show = correct . intercalate " >=> " . map show . unAClient

-- Place holder for a `Server`
newtype AServer = AServer { unAServer :: [ServerStep] }

instance Arbitrary AServer where
arbitrary = fmap AServer arbitrary
shrink = map AServer . shrink . unAServer

instance Show AServer where
show = correct . intercalate " >=> " . map show . unAServer``````

Once `QuickCheck` generates these randomized lists of actions we can convert them to honest-to-god `Proxy`s/`Server`s/`Client`s for testing purposes using the following conversion functions:

``````aProxy
:: AProxy -> Int -> Proxy Int Int Int Int (Writer [Int]) Int
aProxy = foldr (>=>) return . map f . unAProxy
where
f x = case x of
ProxyRequest -> request
ProxyRespond -> respond
ProxyLog     -> log
ProxyInc     -> inc

aClient :: AClient -> Int -> Client Int Int (Writer [Int]) Int
aClient = foldr (>=>) return . map f . unAClient
where
f x = case x of
ClientRequest -> request
ClientLog     -> log
ClientInc     -> inc

aServer :: AServer -> Int -> Server Int Int (Writer [Int]) Int
aServer = foldr (>=>) return . map f . unAServer
where
f x = case x of
ServerRespond -> respond
ServerLog     -> log
ServerInc     -> inc``````

#### Test-driving `pipes`

Combined with the previous `(===)` function for testing equality, I can now test high-level properties like associativity:

``````main = quickCheck \$ \p1' p2' p3' ->
let p1 = aServer p1'
p2 = aProxy  p2'
p3 = aClient p3'
in  p1 >+> (p2 >+> p3) === (p1 >+> p2) >+> p3``````

`QuickCheck` then generates 100 random test cases and verifies that all of them obey the associativity law:

``````>>> main
++ OK, passed 100 tests.``````

However, this is still not enough. Perhaps my randomization scheme is simply not exercising corner cases sufficiently well. To really convince myself that I have a good randomization scheme I must try a few negative controls to see how effectively `QuickCheck` uncovers property violations.

For example, let's suppose that some enterprising young NSA employee were to try to commit a modification to the identity pipe `pull'` to try to `log` the second value flowing upstream. We could set up a test case to warn us if the modified `pull'` function failed to obey the identity law:

``````main = quickCheck \$ \p1' p2' ->
let p1    = aServer p1'

-- The maliciously modified `pull` function
pull' = request >=> respond >=> log >=> pull'

p2    = aClient p2'
in  p1 >+> (pull' >+> p2) === p1 >+> p2``````

`QuickCheck` would then snoop this out in a hurry and catch the law violation.

``````*** Failed! Falsifiable (after 12 tests and 6 shrinks):
respond
request >=> request``````

Not only does `QuickCheck` detect violations, but it also goes out of its way to minimize the violation to the minimum reproducing test case. In this case, the way you read the `QuickCheck` output is that the minimal code necessary to trigger the violation is when:

``````p1 = respond
p2 = request >=> request``````

In other words, `QuickCheck` detects a spurious `log` on the left-hand side of the equation if `p2` `request`s two values and `p1` `respond`s with at least one value. Notice that if `p1` did not `respond` with at least one value then the left pipeline would terminate before `p2`'s second request and avoid triggering the `log` statement.

`QuickCheck` can do this kind of minimization because of purity. Since our test case is pure, `QuickCheck` can safely run it repeatedly as it tries to `shrink` the counter-example, without having to worry that repeated runs will interfere with each other because of side effects or statefulness.

Here's another example, where we accidentally wrote `pull` wrong and inserted one extra `request` too many:

``````main = quickCheck \$ \p1' p3' ->
let p1    = aServer p1'
pull' = request >=> respond >=> request >=> pull
p3    = aClient p3'
in  p1 >+> (pull' >+> p3) === p1 >+> p3``````

`QuickCheck` immediately pulls out this clever counter-example:

``````*** Failed! Falsifiable (after 15 tests and 8 shrinks):
respond >=> respond
request >=> request >=> inc``````

This instructs `p1` and `p2` to exchange information twice. This triggers `pull'` to accidentally `request` one value too many after the second exchange and terminate early before `p2` can call `inc`.

Examples like these give me confidence that permutations on these four actions suffice to build most useful counter-examples. I could probably even narrow it down to three commands by eliminating `log`, but for now I will keep it.

#### Conclusion

I believe this is the first example of a useful `Arbitrary` instance for randomizing a stream programming data type. This allows `pipes` to test more powerful properties than most stream programming libraries, which would normally settle for randomizing input to the system instead of randomizing the flow of control.

I want to thank Csernik Flaviu Andrei who took the time to write up all the `pipes` laws into a `QuickCheck` suite integrated with `cabal test`. Thanks to him you can now have even greater confidence in the correctness of the `pipes` library.

The next step would be to machine-check the `pipes` laws using Agda, which would provide the greatest assurance of correctness.