Monday, May 18, 2015

The internet of code

In this post I will introduce a proof-of-concept implementation for distributing typed code over the internet, where the unit of compilation is the individual expression.

The core language

To motivate this post, consider this Haskell code:

data Bool = True | False

and :: Bool -> Bool -> Bool
and b1 b2 = if b1 then b2 else False

or :: Bool -> Bool -> Bool
or b1 b2 = if b1 then True else b2

data Even = Zero | SuccE Odd

data Odd = SuccO Even

four :: Even
four = SuccE (SuccO (SuccE (SuccO Zero)))

doubleEven :: Even -> Even
doubleEven (SuccE o) = SuccE (SuccO (doubleOdd o))
doubleEven  Zero     = Zero

doubleOdd :: Odd -> Even
doubleOdd (SuccO e) = SuccE (SuccO (doubleEven e))

I will encode each one of the above types, terms, and constructors as separate, closed, non-recursive expressions in the calculus of constructions. You can think of the calculus of constructions as a typed assembly language for functional programs which we will use to distribute program fragments over the internet. You can learn more about the calculus of constructions and other pure type systems by reading this clear paper by Simon Peyton Jones: "Henk: a typed intermediate language".

For example, here is how you encode the True constructor in the calculus of constructions:

λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True

Note that the entire expression is the True constructor, not just the right-hand side:

             This is the True constructor
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True
                                                 ^^^^
                                               Not this

I just chose the variable names so that you can tell at a glance what constructor you are looking at from the right-hand side of the expression.

Similarly, here is how you encode the type Bool in the calculus of constructions:

               This is the `Bool` type
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool
                                                 ^^^^
                                               Not this

Again, the entire expression is the Bool type, but I chose the variable names so that you can tell which type you are looking at from the right-hand side.

You can learn more about the full set of rules for translating data types to System F (a subset of the calculus of constructions) by reading the paper: "Automatic synthesis of typed Λ-programs on term algebras". Also, I will soon release a compiler named annah that automates this translation algorithm, and I used this compiler to translate the above Haskell code to the equivalent expressions in the calculus of constructions.
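
To build intuition for this encoding, here is a rough Haskell rendering of the same idea using RankNTypes. This is my own sketch, not annah's output; the names BoolC, true, false, and andC are made up for illustration:

{-# LANGUAGE RankNTypes #-}

-- ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool
type BoolC = forall bool. bool -> bool -> bool

true, false :: BoolC
true  t f = t    -- λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True
false t f = f    -- λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False

-- and b1 b2 = if b1 then b2 else False
andC :: BoolC -> BoolC -> BoolC
andC b1 b2 t f = b1 (b2 t f) f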

Distribution

We can distribute these expressions by hosting each expression as text source code on the internet. For example, I encoded all of the above types, terms and constructors in the calculus of constructions and hosted them using a static web server. You can browse these expressions by visiting sigil.place/post/0/.

Click on one of the expressions in the directory listing to see how they are encoded in the calculus of constructions. For example, if you click the link to four you will find an ordinary text file whose contents look like this (formatted for clarity):

  λ(Even : *)                        -- This entire
→ λ(Odd : *)                         -- expression is
→ λ(Zero : Even)                     -- the number `four`
→ λ(SuccE : ∀(pred : Odd) → Even)    --
→ λ(SuccO : ∀(pred : Even) → Odd)    --
→ SuccE (SuccO (SuccE (SuccO Zero))) -- Not just this last line

Each one of these expressions gets a unique URL, and we can embed any expression in our code by referencing the appropriate URL.

Remote imports

We can use the morte compiler to download, parse, and super-optimize programs written in the calculus of constructions. The morte compiler reads in a program from standard input, outputs the program's type to standard error, then super-optimizes the program and outputs the optimized program to standard output.

For example, we can compute and True False at compile time by just replacing and, True, and False by their appropriate URLs:

$ cabal install 'morte >= 1.2'
$ morte
#http://sigil.place/post/0/and
    #http://sigil.place/post/0/True
    #http://sigil.place/post/0/False

When we hit <Ctrl-D> to signal the end of standard input, morte will compile the program:

$ morte
#http://sigil.place/post/0/and
    #http://sigil.place/post/0/True
    #http://sigil.place/post/0/False
<Ctrl-D>
∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False

The program's type is Bool and morte optimizes away the program to False at compile time. Both the type (Bool) and the value (False) are encoded in the calculus of constructions.

Here we are using morte as a compile-time calculator, mainly because Morte does not yet compile to a backend language. When I release a backend language I will go into more detail about how to embed expressions to evaluate at runtime instead of compile time.

Local imports

We can shorten this example further because morte also lets you import expressions from local files using the same hashtag syntax. For example, we can create local files that wrap remote URLs like this:

$ echo "#http://sigil.place/post/0/Bool"  > Bool
$ echo "#http://sigil.place/post/0/True"  > True
$ echo "#http://sigil.place/post/0/False" > False
$ echo "#http://sigil.place/post/0/or"    > or

We can then use these local files as convenient short-hand aliases for remote imports:

$ morte
#or #True #False
<Ctrl-D>
∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True

We can use imports anywhere in our program, even as types! For example, in the calculus of constructions you encode if as the identity function on #Bools:

λ(b : #Bool ) → b  -- Note: Imports must end with whitespace

We can then save our implementation of if to a file named if, except using the ASCII symbols \ and -> instead of λ and →:

$ echo "\(b : #Bool ) -> b" > if

Now we can define our own and function in terms of if. Remember that the Haskell definition of and is:

and b1 b2 = if b1 then b2 else False

Our definition won't be much different:

$ echo "\(b1 : #Bool ) -> \(b2 : #Bool ) -> #if b1 #Bool b2 #False" > and

Let's confirm that our new and function works:

$ echo "#and #True #False" | morte
∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False

We can also ask morte to resolve all imports for our and function and optimize the result:

$ morte < and
  ∀(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ ∀(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

  λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ λ(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
     b2
     (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

We can then compare our version with the and expression hosted online, which is identical:

$ curl sigil.place/post/0/and
  λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ λ(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
     b2
     (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

Reduction

When we write an expression like this:

#or #True #False

The compiler resolves all imports transitively until all that is left is an expression in the calculus of constructions, like this one:

-- or
( λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
     (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
)
-- True
(λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True )
-- False
(λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

Then the compiler reduces this expression using β-reduction and η-reduction. We can safely reduce these expressions at compile time because these reductions always terminate in the calculus of constructions, which is a total and non-recursive language.

For example, the above expression reduces to:

  ( λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
  → b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
       (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
  )
  (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True )
  (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

-- β-reduce
= (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True )
  (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
  (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
  (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

-- β-reduce
= ( λ(True : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
  → λ(False : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
  → True
  )
  (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
  (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

-- β-reduce
= ( λ(False : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
  → λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True
  )
  (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

-- β-reduce
= λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True

Linking

The and we defined is "dynamically linked", meaning that the file we saved has not yet resolved all imports:

$ cat and
\(b1 : #Bool ) -> \(b2 : #Bool ) -> #if b1 #Bool b2 #False

The morte compiler will resolve these imports every time we import this expression within a program. To be precise, each import is resolved once per program and then cached and reused for subsequent duplicate imports. That means that the compiler only imports #Bool once for the above program and not three times. Also, we can transparently cache these expressions just like any other web resource by providing the appropriate Cache-Control HTTP header. For example, my static web server sets max-age to a day so that expressions can be cached for up to one day.
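
In HTTP terms, that corresponds to a response header along these lines (86400 seconds being one day):

Cache-Control: max-age=86400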

If our imported expressions change then our program will reflect those changes, which may or may not be desirable. For the above program dynamic linking is undesirable because if we change the file #False to point to sigil.place/post/0/True then we would break the behavior of the and function.

Alternatively, we can "statically link" the and function by resolving all imports using the morte compiler. For example, I statically linked my remote and expression because the behavior should never change:

$ curl sigil.place/post/0/and
  λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ λ(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
     b2
     (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

In other scenarios you might want to dynamically link expressions if you want to automatically pull in upgrades from trusted upstream sources. This is the same rationale behind service-oriented architectures which optimize for transparent system-wide updates, except that instead of updating a service we update an expression.

Partial application

We can store partially applied functions in files, too. For example, we could store and True in a statically linked file named example using morte:

$ echo "#and #True" | morte > example
  ∀(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
→ ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

The type still goes to standard error, but the partially applied function goes to the example file. We can use the partially applied function just by referencing our new file:

$ morte
#example #False  -- Same as: #and #True #False
<Ctrl-D>
∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False

We can even view example and see that it's still just an ordinary text source file:

$ cat example
λ(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → b2

We can also see that morte was clever and optimized #and #True to the identity function on #Bools.

If we wanted to share our example code with our friends, we'd just host the file using any static web server. I like to use Haskell's warp server (from the wai-app-static package) for static hosting, but even something like python -m SimpleHTTPServer would work just as well:

$ cabal install wai-app-static
$ warp
Serving directory /tmp/code on port 3000 with ["index.html","index.htm"] index files.

Then we could provide our friends with a URL pointing to the example file and they could embed our code within their program by pasting in our URL.

Types

The calculus of constructions is typed, so if you make a mistake, you'll know immediately:

$ morte
#True #True
<Ctrl-D>
Expression:
(λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
(λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)

Error: Function applied to argument of the wrong type

Expected type: *
Argument type: ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

Types are what differentiate morte from piping curl into sh. You can use the type system to whitelist the set of permissible values to import.

For example, in this code, there are only two values of #x that will type-check (up to α-conversion):

(λ(b : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → b) #x

Therefore, we can safely import a remote value knowing that the type-checker will reject attempts to inject arbitrary code.

When building a program with effects, we can similarly refine the set of permissible actions using the types. I introduced one such example in my previous post on morte, where the recursive.mt program restricts the effects to reading and printing lines of text and nothing else. You could then import a remote expression of type:

  ∀(String : *)
→ ∀(U : *)
→ ∀(Unit : U)
→ ∀(IO : *)
→ ∀(GetLine : (String → IO) → IO)
→ ∀(PutStrLn : String → IO → IO)
→ ∀(Return : U → IO)
→ IO

... which is the type of an effect syntax tree built from GetLine/PutStrLn/Return constructors. The type-checker will then enforce that the imported syntax tree cannot contain any other constructors and therefore cannot be interpreted to produce any other effects.
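
As a rough Haskell analogue of that type (my own sketch using RankNTypes, not something morte or annah provides), you can think of it as the type of a polymorphic fold over the effect constructors. The names Program and echo are made up for illustration:

{-# LANGUAGE RankNTypes #-}

type Program
    =  forall string u io
    .  u                        -- Unit
    -> ((string -> io) -> io)   -- GetLine
    -> (string -> io -> io)     -- PutStrLn
    -> (u -> io)                -- Return
    -> io

-- An example program: read one line, echo it back, then return
echo :: Program
echo unit getLine' putStrLn' return' =
    getLine' (\str -> putStrLn' str (return' unit))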

Recursive data types

You can encode recursive data types and functions in the calculus of constructions. This is all the more amazing when you realize that the calculus of constructions does not permit recursion! morte's import system forbids recursion, too. If you try to recurse using imports you will get an error:

$ echo "#foo" > bar
$ echo "#bar" > foo
$ morte < foo
morte: 
⤷ #bar 
⤷ #foo 
Cyclic import: #bar 

Joe Armstrong once proposed that the core language for an internet of code would require built-in support for recursion (via letrec or something similar), but that's actually not true! The paper "Automatic synthesis of typed Λ-programs on term algebras" spells out how to encode recursive data types in the non-recursive System F language. What's amazing is that the algorithm works even for mutually recursive data types like Even and Odd from our original Haskell example.
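
For intuition, here is a sketch of the same trick in Haskell using RankNTypes (my own illustration, not annah's output; EvenC, zeroC, and fourC are made-up names): the mutually recursive Even/Odd types become non-recursive polymorphic types, and four is an ordinary closed term:

{-# LANGUAGE RankNTypes #-}

-- ∀(Even : *) → ∀(Odd : *) → ∀(Zero : Even)
--   → ∀(SuccE : Odd → Even) → ∀(SuccO : Even → Odd) → Even
type EvenC = forall even odd. even -> (odd -> even) -> (even -> odd) -> even

zeroC :: EvenC
zeroC zero succE succO = zero

fourC :: EvenC
fourC zero succE succO = succE (succO (succE (succO zero)))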

You don't have to take my word for it! You can verify for yourself that the Even and Odd types and the Zero/SuccE/SuccO constructors that I hosted online are not recursive.

Let's create local aliases for the constructors so we can build our own Even or Odd values:

$ echo "#http://sigil.place/post/0/Zero"  > Zero
$ echo "#http://sigil.place/post/0/SuccE" > SuccE
$ echo "#http://sigil.place/post/0/SuccO" > SuccO

We can then assemble the number four using these constructors:

$ morte
#SuccE (#SuccO (#SuccE (#SuccO #Zero )))
<Ctrl-D>
  ∀(Even : *)
→ ∀(Odd : *)
→ ∀(Zero : Even)
→ ∀(SuccE : ∀(pred : Odd) → Even)
→ ∀(SuccO : ∀(pred : Even) → Odd)
→ Even

  λ(Even : *)
→ λ(Odd : *)
→ λ(Zero : Even)
→ λ(SuccE : ∀(pred : Odd) → Even)
→ λ(SuccO : ∀(pred : Even) → Odd)
→ SuccE (SuccO (SuccE (SuccO Zero)))

The result is identical to the four that I hosted:

$ curl sigil.place/post/0/four
  λ(Even : *)
→ λ(Odd : *)
→ λ(Zero : Even)
→ λ(SuccE : ∀(pred : Odd) → Even)
→ λ(SuccO : ∀(pred : Even) → Odd)
→ SuccE (SuccO (SuccE (SuccO Zero)))

We can even encode functions over mutually recursive types like doubleEven and doubleOdd. You can verify that the ones I hosted are not recursive, either.

We can then test that they work by doubling the number four:

$ morte
#http://sigil.place/post/0/doubleEven
    #http://sigil.place/post/0/four
<Ctrl-D>
  ∀(Even : *)
→ ∀(Odd : *)
→ ∀(Zero : Even)
→ ∀(SuccE : ∀(pred : Odd) → Even)
→ ∀(SuccO : ∀(pred : Even) → Odd)
→ Even

  λ(Even : *)
→ λ(Odd : *)
→ λ(Zero : Even)
→ λ(SuccE : ∀(pred : Odd) → Even)
→ λ(SuccO : ∀(pred : Even) → Odd)
→ SuccE (SuccO (SuccE (SuccO (SuccE (SuccO (SuccE (SuccO Zero)))))))

We get back the Even number eight encoded in the calculus of constructions.

Stack traces

morte will provide a "stack trace" if there is a type error or parse error:

$ echo "\(a : *) ->" > foo  # This is a malformed program
$ echo "#foo" > bar
$ echo "#bar" > baz
$ echo "#baz" > qux
$ morte < qux
morte: 
⤷ #qux 
⤷ #baz 
⤷ #bar 
⤷ #foo 

Line:   2
Column: 1

Parsing: EOF

Error: Parsing failed

You can learn more about how morte's import system works by reading the newly added "Imports" section of the morte tutorial.

Comparison to other software architectures

morte's code distribution system most closely resembles the distribution model of Javascript, meaning that code can be downloaded from any URL and is compiled or interpreted client-side. The most important difference between the two is the granularity of imports and the import mechanism.

In morte the units of distribution are individual types, terms, and constructors, and you can inject a remote expression anywhere in the syntax tree by referencing its URL. This is why we can do crazy things like use a URL for a type:

λ(b : #http://sigil.place/post/0/Bool ) → ...

The second difference is that morte is designed from the ground up to be typed and highly optimizable (analogous to asm.js, a restricted subset of Javascript designed for ease of optimization).

The third difference is that morte lets you precisely delimit what remote code can do using the type system, unlike Javascript.

Future directions

This is just one piece of the puzzle in a long-term project of mine to build a typed and distributed intermediate language that we can use to share code across language boundaries. I want to give people the freedom to program in the language of their choice while still interoperating freely with other languages. In other words, I'm trying to build a pandoc for programming languages.

However, this project is not yet usable in anger. There are several missing features, some of which will be provided by my upcoming annah library:

Requirement #1: There needs to be a way to convert between restricted subsets of existing programming languages and the calculus of constructions

annah currently provides logic to encode medium-level language abstractions to and from the calculus of constructions. In fact, that's how I converted the Haskell example at the beginning of this post into the calculus of constructions. For example, I used annah to derive how to encode the SuccE constructor in the calculus of constructions:

$ annah compile
type Even
data Zero
data SuccE (pred : Odd)

type Odd
data SuccO (pred : Even)

in SuccE
<Ctrl-D>

... and annah correctly deduced the type and value in the calculus of constructions:

  ∀(pred : ∀(Even : *)
         → ∀(Odd : *)
         → ∀(Zero : Even)
         → ∀(SuccE : ∀(pred : Odd) → Even)
         → ∀(SuccO : ∀(pred : Even) → Odd)
         → Odd )
→ ∀(Even : *)
→ ∀(Odd : *)
→ ∀(Zero : Even)
→ ∀(SuccE : ∀(pred : Odd) → Even)
→ ∀(SuccO : ∀(pred : Even) → Odd)
→ Even

  λ(pred : ∀(Even : *)
         → ∀(Odd : *)
         → ∀(Zero : Even)
         → ∀(SuccE : ∀(pred : Odd) → Even)
         → ∀(SuccO : ∀(pred : Even) → Odd)
         → Odd )
→ λ(Even : *)
→ λ(Odd : *)
→ λ(Zero : Even)
→ λ(SuccE : ∀(pred : Odd) → Even)
→ λ(SuccO : ∀(pred : Even) → Odd)
→ SuccE (pred Even Odd Zero SuccE SuccO)

Among other things, annah automates the algorithm from "Automatic synthesis of typed Λ-programs on term algebras", which is known as "Böhm-Berarducci encoding".

Requirement #2: There must be an efficient way to transmit bytes, text, and numbers alongside code.

My plan is to transmit this information out-of-band as a separate file rather than embedding the data directly within the code and annah will provide a systematic convention for distributing data and referencing that data within source code.

Requirement #3: There needs to be a standard library of types, data structures, functions, and side effects that all target languages must support.

In other words, there needs to be some sort of thrift for code so that languages can maximize code sharing.

Requirement #4: There must be better tooling for mass installation and hosting of expressions.

For example, I'd like to be able to alias all imports within a remote directory to local files with a single command.

Requirement #5: I need to figure out a way to mesh type inference with an expression-level distribution system.

As far as I can tell this is still an open research problem, and it is most likely going to be the greatest obstacle to making this usable in practice.

Resources

If you would like to learn more about Morte or contribute, then check out the Morte library on Hackage or Github.

Wednesday, May 6, 2015

Haskell content spinner

Recently somebody posted a template for generating blog comment spam, so I thought: "What sillier way to show how elegant Haskell is than generating comment spam?!"

The first "stanza" of the template looks like this:

{I have|I've} been {surfing|browsing} online 
more than {three|3|2|4} hours today, yet
I never found any interesting article like yours. 
{It's|It is}
pretty worth enough for me. {In my 
opinion|Personally|In my view},
if all {webmasters|site owners|website owners|web 
owners} and bloggers made good content as you 
did, the {internet|net|web} will be {much more|a 
lot more} useful than ever before.|
I {couldn't|could not} {resist|refrain from} 
commenting.
{Very well|Perfectly|Well|Exceptionally well} 
written!|
{I will|I'll} {right away|immediately} {take 
hold of|grab|clutch|grasp|seize|snatch} your 
{rss|rss feed} as I {can not|can't} {in 
finding|find|to find} your {email|e-mail} 
subscription {link|hyperlink} or 
{newsletter|e-newsletter} service.
Do {you have|you've} any? {Please|Kindly} 
{allow|permit|let} me 
{realize|recognize|understand|recognise|know} {so 
that|in order that} I {may
just|may|could} subscribe. Thanks.|
{It is|It's} {appropriate|perfect|the best} 
time to
make some plans for the future and {it is|it's} 
time to be happy. 

Anything of the form {x|y|z} represents a choice between alternative text fragments x, y, and z. The above template has four large alternative comments to pick from, each with their own internal variations. The purpose of these alternatives is to evade simple spam detection algorithms, much like how some viruses evade the immune system by mutating antigens.

I wanted to write a Haskell program that selected a random template from one of the provided alternatives and came up with this:

{-# LANGUAGE OverloadedStrings #-}

import Control.Foldl (random)  -- Requires `foldl-1.0.10` or higher
import Turtle

main = do
    x <- foldIO spam random
    print x

spam :: Shell Text
spam =  -- 1st major template
       ""
     * ("I have" + "I've")
     *  " been "
     * ("surfing" + "browsing")
     *  " online more than "
     * ("three" + "3" + "2" + "4")
     *  " hours today, yet I never found any interesting article like yours. "
     * ("It's" + "It is")
     *  " pretty worth enough for me. "
     * ("In my opinion" + "Personally" + "In my view")
     *  ", if all "
     * ("webmasters" + "site owners" + "website owners" + "web owners")
     *  " and bloggers made good content as you did, the "
     * ("internet" + "net" + "web")
     *  " will be "
     * ("much more" + "a lot more")
     *  " useful than ever before."

       -- 2nd major template
   +   " I "
     * ("couldn't" + "could not")
     *  " "
     * ("resist" + "refrain from")
     *  " commenting. "
     * ("Very well" + "Perfectly" + "Well" + "Exceptionally well")
     *  " written!"

        -- 3rd major template
   +    " "
     * ("I will" + "I'll")
     *  " "
     * ("right away" + "immediately")
     *  " "
     * ("take hold of" + "grab" + "clutch" + "grasp" + "seize" + "snatch")
     *  " your "
     * ("rss" + "rss feed")
     *  " as I "
     * ("can not" + "can't")
     *  " "
     * ("in finding" + "find" + "to find")
     *  " your "
     * ("email" + "e-mail")
     *  " subscription "
     * ("link" + "hyperlink")
     *  " or "
     * ("newsletter" + "e-newsletter")
     *  " service. Do "
     * ("you have" + "you've")
     *  " any? "
     * ("Please" + "Kindly")
     *  " "
     * ("allow" + "permit" + "let")
     *  " me "
     * ("realize" + "recognize" + "understand" + "recognise" + "know")
     *  " "
     * ("so that" + "in order that")
     *  " I "
     * ("may just" + "may" + "could")
     *  " subscribe. Thanks."

        -- 4th major template
   +    " "
     * ("It is" + "It's")
     *  " "
     * ("appropriate" + "perfect" + "the best")
     *  " time to make some plans for the future and "
     * ("it is" + "it's")
     *  " time to be happy."

Conceptually, all I did to embed the template in Haskell was to:

  • add a quote to the beginning of the template: "
  • replace all occurrences of { with "*(" (including quotes)
  • replace all occurrences of } with ")*" (including quotes)
  • replace all occurrences of | with "+" (including quotes)
  • add a quote to the end of the template: "

In fact, I mechanically transformed the template to Haskell code using simple sed commands within vi and then just formatted the result to be more readable.
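
The substitutions looked roughly like this in vi (a reconstruction, not the literal keystrokes):

:%s/{/" * ("/g
:%s/}/") * "/g
:%s/|/" + "/g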

Before explaining why this works, let's run the program and verify that it behaves as expected:

$ ghc -O2 spam.hs
$ ./spam
Just " I will right away grab your rss feed as I 
can not find your email subscription hyperlink or 
newsletter service. Do you've any? Please let me 
recognise in order that I could subscribe. 
Thanks."
$ ./spam
Just " I'll immediately seize your rss as I can not find 
your email subscription link or e-newsletter service. Do 
you have any? Please allow me realize in order that I may 
subscribe. Thanks."

You might wonder: how does the above program work?

Types

Let's begin from the type of the top-level utility named foldIO:

foldIO
    :: Shell a       -- A stream of `a`s
    -> FoldM IO a b  -- A fold that reduces `a`s to a single `b`
    -> IO b          -- The result (a `b`)

foldIO connects a producer of as (i.e. a Shell) to a fold that consumes as and produces a single b (i.e. a FoldM). For now we will ignore how they are implemented. Instead we will play type tetris to see how we can connect things together.

The first argument we supply to foldIO is spam, whose type is:

spam :: Shell Text

Think of a Shell as a stream, and spam is a stream whose elements are Text values. Each element of this stream corresponds to one possible alternative for our template. For example, a template with exactly one alternative would be a stream with one element.

When we supply spam as the first argument to foldIO, the compiler infers that the first a in the type of foldIO must be Text:

foldIO :: Shell a -> FoldM IO a b -> IO b
                ^
                |
                |
                |
spam   :: Shell Text

... therefore, the second a must also be Text:

foldIO :: Shell a -> FoldM IO a b -> IO b
                ^             ^
                |             |
                +-------------+
                |
spam   :: Shell Text

... so in this context foldIO has the more specialized type:

foldIO :: Shell Text -> FoldM IO Text b -> IO b

... and when we apply foldIO to spam we get the following narrower type:

foldIO spam :: FoldM IO Text b -> IO b

Now all we need to do is to provide a fold that can consume a stream of Text elements. We choose the random fold, which uses reservoir sampling to pick a random element from the stream. The type of random is:

random :: FoldM IO a (Maybe a)

In other words, given an input stream of as, this fold reduces the stream to a single Maybe a. The Maybe is either Nothing if the stream is empty or Just some random element from the stream if the stream is non-empty.
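
As an aside, random is not tied to turtle: you can run any FoldM over an ordinary list using Control.Foldl's foldM. The selected element below is only illustrative, since it is chosen at random:

>>> import qualified Control.Foldl as Foldl
>>> Foldl.foldM Foldl.random [1..10 :: Int]
Just 3
>>> Foldl.foldM Foldl.random ([] :: [Int])
Nothing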

When we supply random as the second argument to foldIO, the compiler infers that the a in random must be Text:

foldIO spam :: FoldM IO Text        b -> IO b
                        |
                        |
                        |
                        v
random      :: FoldM IO a    (Maybe a)

... therefore the second a must also be Text:

foldIO spam :: FoldM IO Text        b -> IO b
                        |
                        +-----------+
                        |           |
                        v           v
random      :: FoldM IO a    (Maybe a)

So the specialized type of random becomes:

foldIO spam :: FoldM IO Text        b     -> IO b

random      :: FoldM IO Text (Maybe Text)

Now we can apply type inference in the opposite direction! The compiler infers that the b in the type of foldIO must be Maybe Text:

foldIO spam :: FoldM IO Text        b     -> IO b
                                    ^
                                    |
                                    |
                                    |
random      :: FoldM IO Text (Maybe Text)

... therefore the other b must also be Maybe Text:

foldIO spam :: FoldM IO Text        b     -> IO b
                                    ^           ^
                                    |           |
                                    +-----------+
                                    |
random      :: FoldM IO Text (Maybe Text)

... so we specialize foldIO's type even further to:

foldIO spam :: FoldM IO Text (Maybe Text) -> IO (Maybe Text)

... and when we apply that to random the type simplifies down to:

foldIO spam random :: IO (Maybe Text)

The end result is a subroutine that loops over the stream using reservoir sampling, selects a random element (or Nothing if the stream is empty), and then returns the result.

All that's left is to print the result:

main = do
    x <- foldIO spam random
    print x

So that explains the top half of the code, but what about the bottom half? What is up with the addition and multiplication of strings?

Overloading

The first trick is that the strings are actually not strings at all! Haskell lets you overload string literals using the OverloadedStrings extension so that they type-check as any type that implements the IsString type class. The Shell Text type is one such type. If you provide a string literal where the compiler expects a Shell Text then the compiler will instead build a 1-element stream containing just that string literal.
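
Concretely, the literal desugars to a call to fromString from Data.String; this little sketch (hello and hello' are made-up names) shows the two spellings side by side:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.String
import Turtle

hello, hello' :: Shell Text
hello  = "Hello"                         -- with OverloadedStrings, this literal ...
hello' = Data.String.fromString "Hello"  -- ... means this 1-element stream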

The second trick is that Haskell lets you overload numeric operations to work on any type that implements the Num type class. The Shell Text type implements this type class, so you can add and multiply streams of text elements.

The behavior of addition is stream concatenation. In our template, when we write:

"Very well" + "Perfectly" + "Well" + "Exceptionally well"

... we are really concatenating four 1-element streams into a combined 4-element stream representing the four alternatives.

The behavior of multiplication is to sequence two templates. Either template may be a stream of multiple alternatives so when we sequence them we take the "cartesian product" of both streams. When we multiply two streams we concatenate every alternative from the first stream with every alternative from the second stream and return all possible combinations.

For example, when we write:

("couldn't " + "could not ") * ("resist" + "refrain from")

This reduces to four combinations:

  "couldn't resist"
+ "couldn't refrain from"
+ "could not resist"
+ "could not refrain from"

You can actually derive this using the rules of addition and multiplication:

("couldn't " + "could not ") * ("resist" + "refrain from")

-- Multiplication distributes over left addition
  "couldn't "  * ("resist" + "refrain from")
+ "could not " * ("resist" + "refrain from")

-- Multiplication distributes again
  "couldn't "   * "resist"
+ "couldn't "   * "refrain from"
+ "could not "  * "resist"
+ "could not "  * "refrain from"

-- Multiplying 1-element templates just sequences them
  "couldn't resist"
+ "couldn't refrain from"
+ "could not resist"
+ "could not refrain from"

Notice how if we sequence two 1-element templates

"I have " * "been"

... it's identical to string concatenation:

"I have been"

And that's it! We build the template using arithmetic and then we fold the results using random to select one template at random. That's the complete program! Or is it?

Weighting

Actually there's one catch: our spam generator is very heavily biased towards the third major template. This is because the generator weights all alternatives equally, but each major template has a different number of alternatives:

  • 1st template: 2304 alternatives
  • 2nd template: 16 alternatives
  • 3rd template: 829440 alternatives
  • 4th template: 12 alternatives

As a result we're 360 times more likely to get the 3rd template than the next most common template (the 1st one). How can we weight each template to undo this bias?

The answer is simple: we can weight each template by using multiplication, scaling each template by the appropriate numeric factor.

In this case, the weights we will apply are:

  • 1st template: Increase frequency by 360x
  • 2nd template: Increase frequency by 51840x
  • 3rd template: Keep frequency the same (1x)
  • 4th template: Increase frequency by 69120x

Here's the implementation:

spam =  -- 1st major template
        360
     *  ""
     * ("I have" + "I've")
     *  " been "
     * ("surfing" + "browsing")
     *  " online more than "
     * ("three" + "3" + "2" + "4")
     *  " hours today, yet I never found any interesting article like yours. "
     * ("It's" + "It is")
     *  " pretty worth enough for me. "
     * ("In my opinion" + "Personally" + "In my view")
     *  ", if all "
     * ("webmasters" + "site owners" + "website owners" + "web owners")
     *  " and bloggers made good content as you did, the "
     * ("internet" + "net" + "web")
     *  " will be "
     * ("much more" + "a lot more")
     *  " useful than ever before."

        -- 2nd major template
   +    51840
     *  " I "
     * ("couldn't" + "could not")
     *  " "
     * ("resist" + "refrain from")
     *  " commenting. "
     * ("Very well" + "Perfectly" + "Well" + "Exceptionally well")
     *  " written!"

        -- 3rd major template
   +    1
     *  " "
     * ("I will" + "I'll")
     *  " "
     * ("right away" + "immediately")
     *  " "
     * ("take hold of" + "grab" + "clutch" + "grasp" + "seize" + "snatch")
     *  " your "
     * ("rss" + "rss feed")
     *  " as I "
     * ("can not" + "can't")
     *  " "
     * ("in finding" + "find" + "to find")
     *  " your "
     * ("email" + "e-mail")
     *  " subscription "
     * ("link" + "hyperlink")
     *  " or "
     * ("newsletter" + "e-newsletter")
     *  " service. Do "
     * ("you have" + "you've")
     *  " any? "
     * ("Please" + "Kindly")
     *  " "
     * ("allow" + "permit" + "let")
     *  " me "
     * ("realize" + "recognize" + "understand" + "recognise" + "know")
     *  " "
     * ("so that" + "in order that")
     *  " I "
     * ("may just" + "may" + "could")
     *  " subscribe. Thanks."

        -- 4th major template
   +    69120
     *  " "
     * ("It is" + "It's")
     *  " "
     * ("appropriate" + "perfect" + "the best")
     *  " time to make some plans for the future and "
     * ("it is" + "it's")
     *  " time to be happy."

Now this produces a fairer distribution between the four major alternatives:

$ ./spam
Just "I have been surfing online more than three 
hours today, yet I never found any interesting 
article like yours. It's pretty worth enough for 
me. In my view, if all web owners and bloggers 
made good content as you did, the internet will 
be a lot more useful than ever before."
$ ./spam
Just " It's the best time to make some plans for 
the future and it's time to be happy."
$ ./spam
Just " I will right away clutch your rss feed as 
I can't in finding your e-mail subscription link 
or newsletter service. Do you have any? Kindly 
let me understand so that I may just subscribe. 
Thanks."
$ ./spam
Just " I could not refrain from commenting. Exceptionally well written!"

Remember how we said that Shell Text implements the Num type class in order to get addition and multiplication? Well, you can also use the same Num class to overload integer literals. Any time the compiler sees an integer literal where it expects a Shell Text it will replace that integer with a stream of empty strings whose length is the given integer.

For example, if you write the number 3, it's equivalent to:

-- Definition of 3
3 = 1 + 1 + 1

-- 1 = ""
3 = "" + "" + ""

So if you write:

3 * "some string"

... that expands out to:

("" + "" + "") * "some string"

... and multiplication distributes to give us:

("" * "some string") + ("" * "some string") + ("" * "some string")

... which reduces to three copies of "some string":

"some string" + "some string" + "some string"

This trick works even when multiplying a number by a template with multiple alternatives:

2 * ("I have" + "I've")

-- 2 = 1 + 1
= (1 + 1) * ("I have" + "I've")

-- 1 = ""
= ("" + "") * ("I have" + "I've")

-- Multiplication distributes
= ("" * "I have") + ("" * "I've") + ("" * "I have") + ("" * "I've")

-- Simplify
= "I have" + "I've" + "I have" + "I've"

Arithmetic

In fact, Shell Text obeys all sort of arithmetic laws:

-- `0` is the identity of addition
0 + x = x
x + 0 = x

-- Addition is associative
(x + y) + z = x + (y + z)

-- `1` is the identity of multiplication
1 * x = x
x * 1 = x

-- Multiplication is associative
(x * y) * z = x * (y * z)

-- Multiplication right-distributes over addition
(x + y) * z = (x * z) + (y * z)

-- 1-element streams left-distribute over addition
"string" * (x + y) = ("string" * x) + ("string" * y)

I'm not sure what the mathematical name is for this sort of structure. I usually call this a "semiring", but that's technically not correct because in a semiring we expect addition to commute, but here it does not because Shell Text preserves the ordering of results. For the case of selecting a random element ordering does not matter, but there are other operations we can perform on these streams that are order-dependent.
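
If you want to see one of these laws hold observably, you can fold both sides down to lists of elements and compare them. Here is a quick sketch using turtle's fold and Control.Foldl.list (the stream contents and names are made up):

{-# LANGUAGE OverloadedStrings #-}

import qualified Control.Foldl as Foldl
import Turtle

lhs, rhs :: Shell Text
lhs = ("a" + "b") * "!"            -- left-hand side of right-distribution
rhs = ("a" * "!") + ("b" * "!")    -- right-hand side

-- Both folds should produce ["a!", "b!"]
checkRightDistribution :: IO Bool
checkRightDistribution = do
    xs <- fold lhs Foldl.list
    ys <- fold rhs Foldl.list
    return (xs == ys)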

In fact, the laws of arithmetic enforce that you weigh all fine-grained alternatives equally, regardless of the frequencies of the top-level coarse-grained alternatives. If you weighted things based on the relative frequencies of top-level alternatives you would get very inconsistent behavior.

For example, if you tried to be more "fair" for outer alternatives, then addition stops being associative, meaning that these templates would no longer behave the same:

{x|{y|z}}
{{x|y}|z}
{x|y|z}

Conclusion

The Haskell template generator was so concise for two main reasons:

  • We embedded the template directly within Haskell, so we skipped having to parse the template
  • We reuse modular and highly generic components (like random, foldIO, (+), (*), and numeric literals) instead of writing our own custom code

Also, our program is easily modifiable. For example, if we want to collect all the templates, we just replace random with vector:

import Control.Foldl (vector)
import Data.Vector (Vector)

main = do
    x <- foldIO spam vector
    print (x :: Vector Text)

That efficiently builds a vector in place using mutation that stores all the results, then purifies the result.
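
As another small tweak (a sketch, assuming the spam stream from above is in scope), we could count how many elements the template generates by lifting a pure length fold into FoldM with Foldl.generalize; note that with the weighted version each weighted copy counts separately:

import qualified Control.Foldl as Foldl

-- Count how many alternatives the `spam` stream produces
countAlternatives :: IO Int
countAlternatives = foldIO spam (Foldl.generalize Foldl.length)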

However, these sorts of tricks come at a cost. Most of the awesome tricks like this are not part of the standard library and instead live in third-party libraries, which makes them significantly harder to discover. Worse, you're never really "done" learning the Haskell language. The library ecosystem is a really deep rabbit hole full of jewels, and at some point you just have to stop digging through libraries and build something useful ...

... or useless, like a comment spam generator.

Monday, April 6, 2015

Mathematical APIs

You can use mathematical APIs to smooth the onboarding process for new users of your library by reusing their existing intuition for algebraic operations.

Let's use an example from my turtle library for shell scripting, which uses the Shell Text type to represent a stream of lines. You can forward such a stream to the console using stdout:

stdout :: Shell Text -> IO ()

Thanks to Haskell's OverloadedStrings extension, we can use a string literal to encode a 1-element stream that emits just that one string literal:

>>> :set -XOverloadedStrings
>>> import Turtle
>>> stdout "Hello, world!"
Hello, world!
>>>

Now let's drop a thin mathematical API on top of these streams so that we can treat streams like ordinary numbers. These examples will only work with turtle-1.1.0 and later.

If we add streams, we concatenate their elements in order. Here we concatenate three 1-element streams to generate a combined 3-element stream:

>>> stdout ("Line 1" + "Line 2" + "Line 3")
Line 1
Line 2
Line 3
>>>

0 will be the empty stream:

>>> stdout 0
>>> -- This prints nothing

1 is a stream that emits a single empty string:

>>> stdout 1

>>>

So what is 2? Well, it stands to reason that 2 must be 1 + 1:

>>> stdout 2


>>> stdout (1 + 1) -- Same thing


>>>

Similarly, 3 must be 1 + 1 + 1:

>>> stdout 3



>>>

... and so forth.

If we multiply two 1-element streams, we generate a new 1-element stream that concatenates their strings.

>>> stdout ("123" * "456")
123456
>>>

Now what do you think will happen if I do this?

>>> let x = "Line 1" + "Line 2"  -- A stream with two elements
>>> let y = "Line A" + "Line B"  -- A stream with two elements
>>> stdout (x * ", " * y)        -- What this will print?

Well, we can reason about this using the same rules of addition and multiplication we learned in school.

First we substitute x and y with their definitions:

x * ", " * y

= ("Line 1" + "Line 2") * ", " * ("Line A" + "Line B")

Then we expand out the multiplication to get four terms:

= "Line 1" * ", " * "Line A"
+ "Line 1" * ", " * "Line B"
+ "Line 2" * ", " * "Line A"
+ "Line 2" * ", " * "Line B"

Then we use the rule that multiplying 1-element streams just concatenates their strings:

= "Line 1, Line A"
+ "Line 1, Line B"
+ "Line 2, Line A"
+ "Line 2, Line B"

... and there's our final result! We expect our stream to have four elements, one for each combination of an element from the left stream with an element from the right stream.

Let's verify that:

>>> stdout (x * ", " * y)
Line 1, Line A
Line 1, Line B
Line 2, Line A
Line 2, Line B

We can even leverage our algebraic intuition when working with impure streams:

>>> stdout (stdin * ("!" + "?"))
Test<Enter>
Test!
Test?
ABC<Enter>
ABC!
ABC?
...

This is why I love mathematical APIs: they are concise, general, and expressive.

People judge programming languages using mindshare, but the mindshare of mathematics dwarfs all programming languages combined. By specifying programs mathematically we can unlock a large and latent pool of programming talent.

Friday, March 27, 2015

Algebraic side effects

Haskell differentiates itself from most other functional languages by letting you reason mathematically about programs with side effects. This post begins with a pure Haskell example that obeys algebraic equations and then generalizes the example to impure code that still obeys the same equations.

Algebra

In school, you probably learned algebraic rules like this one:

f * (xs + ys) = (f * xs) + (f * ys)

Now let's make the following substitutions:

  • Replace the mathematical multiplication with Haskell's map function
  • Replace the mathematical addition with Haskell's ++ operator

These two substitutions produce the following Haskell equation:

map f (xs ++ ys) = (map f xs) ++ (map f ys)

In other words, if you concatenate the list xs with the list ys and then map a function named f over the combined list, the result is indistinguishable from mapping f over each list individually and then concatenating them.

Let's test this equation out using the Haskell REPL:

>>> map (+ 1) ([2, 3] ++ [4, 5])
[3,4,5,6]
>>> (map (+ 1) [2, 3]) ++ (map (+ 1) [4, 5])
[3,4,5,6]

Evaluation order

However, the above equation does not hold in most other languages. These other languages use function evaluation to trigger side effects, and therefore if you change the order of evaluation you change the order of side effects.

Let's use Scala to illustrate this. Given the following definitions:

>>> def xs() = { print("!"); Seq(1, 2) }
>>> def ys() = { print("?"); Seq(3, 4) }
>>> def f(x : Int) = { print("*"); x + 1 }

... the order of side effects differs depending on whether I concatenate or map first:

>>> (xs() ++ ys()).map(f)
!?****res0: Seq[Int] = List(2, 3, 4, 5)
>>> (xs().map(f)) ++ (ys().map(f))
!**?**res1: Seq[Int] = List(2, 3, 4, 5)

On line #1, the two lists are evaluated first, printing "!" and "?", followed by evaluating the function f on all four elements, printing "*" four times. On line #2, we call f on each element of xs before beginning to evaluate ys. Since evaluation order matters in Scala we get two different programs which print the punctuation characters in a different order.

The solution

Haskell, on the other hand, strictly separates evaluation order from side effect order using a two-phase system. In the first phase you evaluate your program to build an abstract syntax tree of side effects. In the second phase the Haskell runtime executes the tree by interpreting it for its side effects. This phase distinction ensures that algebraic laws continue to hold even in the presence of side effects.
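
As a toy illustration of that phase distinction (my own sketch, not how GHC actually implements IO), picture a pure tree of commands plus a separate interpreter:

-- Phase 1 builds a pure value; nothing runs yet
data Effect = Print String | Effect :>> Effect | Done

program :: Effect
program = Print "!" :>> Print "?"

-- Phase 2: the interpreter walks the tree, performing the effects
interpret :: Effect -> IO ()
interpret (Print s) = putStr s
interpret (a :>> b) = interpret a >> interpret b
interpret Done      = return ()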

To illustrate this, we'll generalize our original Haskell code to interleave side effects with list elements and show that it still obeys the same algebraic properties as before. The only difference from before is that we will:

  • Generalize pure lists to their impure analog, ListT
  • Generalize functions to impure functions that wrap side effects with lift
  • Generalize (++) (list concatenation) to (<|>) (ListT concatenation)
  • Generalize map to (=<<), which streams elements through an impure function

This means that our new equation becomes:

-- f  *  (xs  +  ys) = (f  *  xs)  +  (f  *  ys)
   f =<< (xs <|> ys) = (f =<< xs) <|> (f =<< ys)

You can read this as saying: if we concatenate xs and ys and then stream their values through the impure function f, the behavior is indistinguishable from streaming each individual list through f first and then concatenating them.

Let's test this equation out with some sample definitions for xs, ys, and f that mirror their Scala analogs:

>>> import Control.Applicative
>>> import Pipes
>>> let xs = do { lift (putChar '!'); return 1 <|> return 2 }
>>> let ys = do { lift (putChar '?'); return 3 <|> return 4 }
>>> let f x = do { lift (putChar '*'); return (x + 1) }
>>> runListT (f =<< (xs <|> ys))  -- Note: `runListT` discards the result
!**?**>>> runListT ((f =<< xs) <|> (f =<< ys))
!**?**>>>

The resulting punctuation order is identical. Many people mistakenly believe that Haskell's mathematical elegance breaks down when confronted with side effects, but nothing could be further from the truth.

Conclusion

Haskell preserves algebraic equations even in the presence of side effects, which simplifies reasoning about impure code. Haskell separates evaluation order from side effect order so that you spend less time reasoning about evaluation order and more time reasoning about your program logic.

Thursday, January 29, 2015

Use Haskell for shell scripting

Right now dynamic languages are popular in the scripting world, to the dismay of people who prefer statically typed languages for ease of maintenance.

Fortunately, Haskell is an excellent candidate for statically typed scripting for a few reasons:

  • Haskell has lightweight syntax and very little boilerplate
  • Haskell has global type inference, so all type annotations are optional
  • You can type-check and interpret Haskell scripts very rapidly
  • Haskell's function application syntax greatly resembles Bash

However, Haskell has had a poor "out-of-the-box" experience for a while, mainly due to:

  • Poor default types in the Prelude (specifically String and FilePath)
  • Useful scripting utilities being spread over a large number of libraries
  • Insufficient polish or attention to user experience (in my subjective opinion)

To solve this, I'm releasing the turtle library, which provides a slick and comprehensive interface for writing shell-like scripts in Haskell. I've also written a beginner-friendly tutorial targeted at people who don't know any Haskell.

Overview

turtle is a reimplementation of the Unix command line environment in Haskell. The best way to explain this is to show what a simple "turtle script" looks like:

#!/usr/bin/env runhaskell

{-# LANGUAGE OverloadedStrings #-}

import Turtle

main = do
    cd "/tmp"
    mkdir "test"
    output "test/foo" "Hello, world!"  -- Write "Hello, world!" to "test/foo"
    stdout (input "test/foo")          -- Stream "test/foo" to stdout
    rm "test/foo"
    rmdir "test"
    sleep 1
    die "Urk!"

If you make the above file executable, you can then run the program directly as a script:

$ chmod u+x example.hs
$ ./example.hs
Hello, world!
example.hs: user error (Urk!)

The turtle library renames a lot of existing Haskell utilities to match their Unix counterparts and places them under one import. This lets you reuse your shell scripting knowledge to get up and going quickly.

Shell compatibility

You can easily invoke an external process or shell command using proc or shell:

#!/usr/bin/env runhaskell

{-# LANGUAGE OverloadedStrings #-}

import Turtle

main = do
    mkdir "test"
    output "test/file.txt" "Hello!"
    proc "tar" ["czf", "test.tar.gz", "test"] empty

    -- or: shell "tar czf test.tar.gz test" empty

Even people unfamiliar with Haskell will probably understand what the above program does.

Portability

"turtle scripts" run on Windows, OS X and Linux. You can either compile scripts as native executables or interpret the scripts if you have the Haskell compiler installed.

Streaming

You can build or consume streaming sources. For example, here's how you print all descendants of the /usr/lib directory in constant memory:

#!/usr/bin/env runhaskell

{-# LANGUAGE OverloadedStrings #-}

import Turtle

main = view (lstree "/usr/lib")

... and here's how you count the number of descendants:

#!/usr/bin/env runhaskell

{-# LANGUAGE OverloadedStrings #-}

import qualified Control.Foldl as Fold
import Turtle

main = do
    n <- fold (lstree "/usr/lib") Fold.length
    print n

... and here's how you count the number of lines in all descendant files:

#!/usr/bin/env runhaskell

{-# LANGUAGE OverloadedStrings #-}

import qualified Control.Foldl as Fold
import Turtle

descendantLines = do
    file <- lstree "/usr/lib"
    True <- liftIO (testfile file)
    input file

main = do
    n <- fold descendantLines Fold.length
    print n

Exception Safety

turtle ensures that all acquired resources are safely released in the face of exceptions. For example, if you acquire a temporary directory or file, turtle will ensure that it's safely deleted afterwards:

example = do
    dir <- using (mktempdir "/tmp" "test")
    liftIO (die "The temporary directory will still be deleted!")

However, exception safety comes at a price. turtle forces you to consume all streams in their entirety so you can't lazily consume just the initial portion of a stream. This was a tradeoff I chose to keep the API as simple as possible.

Patterns

turtle supports Patterns, which are like improved regular expressions. Use Patterns as lightweight parsers to extract typed values from unstructured text:

$ ghci
>>> :set -XOverloadedStrings
>>> import Turtle
>>> data Pet = Cat | Dog deriving (Show)
>>> let pet = ("cat" *> return Cat) <|> ("dog" *> return Dog) :: Pattern Pet
>>> match pet "dog"
[Dog]
>>> match (pet `sepBy` ",") "cat,dog,cat"
[[Cat,Dog,Cat]]

You can also use Patterns as arguments to commands like sed, grep, find and they do the right thing:

>>> stdout (grep (prefix "c") "cat")             -- grep '^c'
cat
>>> stdout (grep (has ("c" <|> "d")) "dog")      -- grep 'cat\|dog'
dog
>>> stdout (sed (digit *> return "!") "ABC123")  -- sed 's/[[:digit:]]/!/g'
ABC!!!

Unlike many Haskell parsers, Patterns are fully backtracking, no exceptions.
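
For example, combining a Pattern with input and stdout gives a rough analogue of an everyday shell one-liner (the file name here is just a made-up example):

#!/usr/bin/env runhaskell

{-# LANGUAGE OverloadedStrings #-}

import Turtle

-- Roughly: grep error log.txt
main = stdout (grep (has "error") (input "log.txt"))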

Formatting

turtle supports typed printf-style string formatting:

>>> format ("I take "%d%" "%s%" arguments") 2 "typed"
"I take 2 typed arguments"

turtle even infers the number and types of arguments from the format string:

>>> :type format ("I take "%d%" "%s%" arguments")
format ("I take "%d%" "%s%" arguments") :: Text -> Int -> Text

This uses a simplified version of the Format type from the formatting library. Credit to Chris Done for the great idea.

The reason I didn't reuse the formatting library was that I spent a lot of effort keeping the types as simple as possible to improve error messages and inferred types.

Learn more

turtle doesn't try to ambitiously reinvent shell scripting. Instead, turtle just strives to be a "better Bash". Embedding shell scripts in Haskell gives you the benefits of easy refactoring and basic sanity checking for your scripts.

You can find the turtle library on Hackage or Github. Also, turtle provides an extensive beginner-friendly tutorial targeted at people who don't know any Haskell at all.

Saturday, January 10, 2015

total-1.0.0: Exhaustive pattern matching using traversals, prisms, and lenses

The lens library provides Prisms, which are a powerful way to decouple a type's interface from its internal representation. For example, consider the Either type from the Haskell Prelude:

data Either a b = Left a | Right b

Instead of exposing the Left and Right constructors, you could instead expose the following two Prisms:

_Left  :: Prism (Either a b) (Either a' b ) a a'

_Right :: Prism (Either a b) (Either a  b') b b'

Everything you can do with a constructor, you can do with the equivalent Prism. For example, if you want to build a value using a Prism, you use review:

review _Left  :: a -> Either a b
review _Right :: b -> Either a b

You can also kind-of pattern match on the Prism using preview, returning a Just if the match succeeds or Nothing if the match fails:

preview _Left  :: Either a b -> Maybe a
preview _Right :: Either a b -> Maybe b

However, preview is not quite as good as real honest-to-god pattern matching, because if you wish to handle every branch you can't prove in the types that your pattern match was exhaustive:

nonTypeSafe :: Either Char Int -> String
nonTypeSafe e =
    case preview _Left e of
        Just c  -> replicate 3  c
        Nothing -> case preview _Right e of
            Just n  -> replicate n '!'
            Nothing -> error "Impossible!"  -- Unsatisfying

However, this isn't a flaw with Prisms, but rather a limitation of preview. Prisms actually preserve enough information in the types to support exhaustive pattern matching! The trick for this was described a year ago by Tom Ellis, so I decided to finally implement his idea as the total library.

Example

Here's an example of total pattern matching using the total library:

import Lens.Family.Total
import Lens.Family.Stock

total :: Either Char Int -> String       -- Same as:
total = _case                            -- total = \case
    & on _Left  (\c -> replicate 3  c )  --     Left  c -> replicate 3 c
    & on _Right (\n -> replicate n '!')  --     Right n -> replicate n '!'

The style resembles LambdaCase. If you want to actually supply a value in the same expression, you can also write:

e & (_case
    & on _Left  ...
    & on _Right ... )

There's nothing unsafe going on under the hood. on is just 3 lines of code:

on p f g x = case p Left x of
    Left  l -> f l
    Right r -> g r

(&) is just an operator for post-fix function application:

x & f = f x

... and _case is just a synonym for a type class method that checks if a type is uninhabited.

class Empty a where
    impossible :: a -> x

_case = impossible

The rest of the library is just code to auto-derive Empty instances using Haskell generics. The entire library is about 40 lines of code.

Generics

Here's an example of exhaustive pattern matching on a user-defined data type:

{-# LANGUAGE DeriveGeneric   #-}
{-# LANGUAGE TemplateHaskell #-}

import Control.Lens.TH (makePrisms)
import GHC.Generics (Generic)
import Lens.Family.Total

data Example a b c = C1 a | C2 b | C3 c deriving (Generic)

makePrisms ''Example

instance (Empty a, Empty b, Empty c) => Empty (Example a b c)

example :: Example String Char Int -> String  -- Same as:
example = _case                               -- example = \case
    & on _C1 (\s -> s              )          --     C1 s -> s
    & on _C2 (\c -> replicate 3  c )          --     C2 c -> replicate 3  c
    & on _C3 (\n -> replicate n '!')          --     C3 n -> replicate n '!'

This uses two features to remove most of the boiler-plate:

  • deriving (Generic) auto-generates the code for the Empty instance
  • makePrisms uses Template Haskell to auto-generate Prisms for our type.

For those who prefer the lens-family-* suite of libraries over lens, I opened up an issue against lens-family-th so that Lens.Family.TH.makeTraversals can be used in place of makePrisms for exhaustive pattern matching.

When we write this instance in our original code:

instance (Empty a, Empty b, Empty c) => Empty (Example a b c)

... that uses Haskell generics to auto-generate the implementation of the impossible function:

instance (Empty a, Empty b, Empty c) => Empty (Example a b c) where
    impossible (C1 a) = impossible a
    impossible (C2 b) = impossible b
    impossible (C3 c) = impossible c

That says that the type Example a b c is uninhabited if and only if the types a, b, and c are themselves uninhabited.

When you write makePrisms ''Example, that creates the following three prisms:

_C1 :: Prism (Example a b c) (Example a' b  c ) a a'
_C2 :: Prism (Example a b c) (Example a  b' c ) b b'
_C3 :: Prism (Example a b c) (Example a  b  c') c c'

These Prisms are useful in their own right. You can export them in lieu of your real constructors, and they are more powerful in several respects; the next section shows one such advantage.

Backwards compatibility

There is one important benefit to exporting Prisms and hiding constructors: if you export Prisms you can change your data type's internal representation without changing your API.

For example, let's say that I change my mind and implement Example as two nested Eithers:

type Example a b c = Either a (Either b c)

Had I exposed my constructors, this would be a breaking change for my users. However, if I expose Prisms, then I can preserve the old API by just changing the Prism definition. Instead of auto-generating them using Template Haskell, I can just write:

import Lens.Family.Stock (_Left, _Right)

_C1 = _Left
_C2 = _Right . _Left
_C3 = _Right . _Right

Done! That was easy, and the users of my Prisms are completely unaffected by the changes to the internals.

Under the hood

The trick that makes total work is that every branch of the pattern match subtly alters the type. The value that you match on starts out as:

Example String Char Int

... and every time you pattern match against a branch, the on function replaces one of the type parameters with Void. The pattern matching flows from bottom to top:

example = _case
    & on _C1 (\s -> s              )  -- Step 3: Example Void   Void Void
    & on _C2 (\c -> replicate 3  c )  -- Step 2: Example String Void Void
    & on _C3 (\n -> replicate n '!')  -- Step 1: Example String Char Void
                                      -- Step 0: Example String Char Int

Finally, _case just checks that the final type (Example Void Void Void in this case) is uninhabited, by requiring it to be an instance of the Empty type class, which only has instances for uninhabited types. Example Void Void Void is definitely uninhabited because Void (from Data.Void) is itself uninhabited:

instance Empty Void where
    impossible = absurd

Lenses

What's neat is that this solution works not only for Prisms and Traversals, but for Lenses, too! Check this out:

example :: (Int, Char) -> String     -- Same as:
example = _case                      -- example = \case
    & on _1 (\n -> replicate n '1')  --     (n, _) -> replicate n '1'

... and the identity lens (which is just id) works exactly the way you'd expect:

example :: (Int, Char) -> String        -- Same as:
example = _case                         -- example = \case
    & on id (\(n, c) -> replicate n c)  --     (n, c) -> replicate n c
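
For example, with the id version above you'd expect:

>>> example (3, 'x')
"xxx"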

I didn't intend that when I wrote the library: it just fell naturally out of the types.

Conclusion

The total library now allows library authors to hide their constructors and export a backwards compatible Prism API all while still preserving exhaustive pattern matching.

Note that I do not recommend switching to a lens-only API for all libraries indiscriminately. Use your judgement when deciding whether perfect backwards compatibility is worth the overhead of a lens-only API, both for the library maintainers and your users.

I put the code under the Lens.Family.Total module, not necessarily because I prefer the Lens.Family.* hierarchy, but rather because I would like to see Edward Kmett add these utilities to lens, leaving my library primarily for users of the complementary lens-family-* ecosystem.

You can find the total library on Hackage or on GitHub.

Saturday, December 6, 2014

A very general API for relational joins

Maps and tuples are useful data types for modeling relational operations. For example, suppose we have the following table, indexed by the Id column:

| Id | First Name | Last Name |
|----|------------|-----------|
|  0 | Gabriel    | Gonzalez  |
|  1 | Oscar      | Boykin    |
|  2 | Edgar      | Codd      |

We can model that as a Map where the key is the Id column and the value is a tuple of FirstName and LastName:

import Data.Map (Map) -- from the `containers` library
import qualified Data.Map as Map

type Id        = Int
type FirstName = String
type LastName  = String

names :: Map Id (FirstName, LastName)
names = Map.fromList
    [ (0, ("Gabriel", "Gonzalez"))
    , (1, ("Oscar"  , "Boykin"  ))
    , (2, ("Edgar"  , "Codd"    ))
    ]

Now suppose we have another table containing Twitter handles, also indexed by Id:

| Id | Twitter Handle |
|----|----------------|
| 0  |  GabrielG439   |
| 1  |  posco         |
| 3  |  avibryant     |

We can also encode that as a Map:

type TwitterHandle = String

handles :: Map Id TwitterHandle
handles = Map.fromList
    [ (0, "GabrielG439")
    , (1, "posco"      )
    , (3, "avibryant"  )
    ]

One of the nice properties of Maps is that you can join them together:

import Control.Applicative

-- I'm using `join2` to avoid confusion with `Control.Monad.join`
join2 :: Map k v1 -> Map k v2 -> Map k (v1, v2)
join2 = ... -- Implementation elided for now
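
For ordinary Maps from containers, one reasonable implementation (not spelled out in this post) is Map.intersectionWith, at the cost of an Ord k constraint:

-- Keep only keys present in both maps and pair up their values
join2 :: Ord k => Map k v1 -> Map k v2 -> Map k (v1, v2)
join2 = Map.intersectionWith (,)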

What if we could generalize join2 to work on types other than Map? Perhaps we could use the Applicative interface to implement join2:

join2 :: Applicative f => f a -> f b -> f (a, b)
join2 = liftA2 (,)

However, the Map type cannot implement Applicative in its current form. The reason why is that there is no sensible definition for pure:

pure :: v -> Map k v
pure = ???

This would require a Map that was defined for every key, which we cannot encode. Or can we?

Tables

Well, who says we need to use the Map type from containers? What if I were to encode my Map this way:

import Prelude hiding (lookup)

-- | A map encoded as a lookup function
newtype Table k v = Table { lookup :: k -> Maybe v }

-- | Encode a traditional map as a lookup function
from :: Ord k => Map k v -> Table k v
from m = Table (\k -> Map.lookup k m)

This new type of Map permits only a single operation, lookup, but because we constrain our API to that single operation we can now implement the full Applicative interface:

instance Functor (Table k) where
    fmap f (Table g) = Table (\k -> fmap f (g k))
           -- Same as: Table (fmap (fmap f) g)

instance Applicative (Table k) where
    pure v            = Table (\k -> Just v)
            -- Same as: Table (pure (pure v))

    Table f <*> Table x = Table (\k -> f k <*> x k)
              -- Same as: Table (liftA2 (<*>) f x)

We can promote conventional Maps to this new Table type using the above from function:

names' :: Table Id (FirstName, LastName)
names' = from names

handles' :: Table Id TwitterHandle
handles' = from handles

... and now the more general Applicative join2 will work on these two tables:

>>> let table = join2 names' handles'
>>> :type table
table :: Table Id ((FirstName, LastName), TwitterHandle)
>>> lookup table 0
Just (("Gabriel","Gonzalez"),"GabrielG439")
>>> lookup table 1
Just (("Oscar","Boykin"),"posco")
>>> lookup table 2
Nothing

However, in its present form we can't dump the table's contents because we don't know which keys are present in the table. Let's fix that by adding an additional field to the Table type that lists the keys. We will treat tables backed by plain functions as being defined for all keys:

import Data.Set (Set)
import qualified Data.Set as Set 

data Keys k = All | Some (Set k)

instance Ord k => Num (Keys k) where
    fromInteger 0         = Some Set.empty
    fromInteger n | n > 0 = All

    All     + _       = All
    _       + All     = All
    Some s1 + Some s2 = Some (Set.union s1 s2)

    All     * ks      = ks
    ks      * All     = ks
    Some s1 * Some s2 = Some (Set.intersection s1 s2)

-- | A map encoded as a lookup function
data Table k v = Table
    { keys   :: Keys k
    , lookup :: k -> Maybe v
    }

-- | Encode a traditional map as a lookup function
from :: Ord k => Map k v -> Table k v
from m = Table
    { keys   = Some (Set.fromList (Map.keys m))
    , lookup = \k -> Map.lookup k m
    }
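
As an aside, this representation also lets us treat any plain function as a table defined for every key. A hypothetical helper (not part of the post's code) could look like:

-- Hypothetical: a total function is a table defined for all keys
fromFunction :: (k -> v) -> Table k v
fromFunction f = Table
    { keys   = All
    , lookup = Just . f
    }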

Even after extending Table with keys we can still implement the Applicative interface:

instance Functor (Table k) where
    fmap f (Table ks g) = Table ks (fmap (fmap f) g)

instance Ord k => Applicative (Table k) where
    pure v =
        Table 1 (pure (pure v))

    Table s1 f <*> Table s2 x =
        Table (s1 * s2) (liftA2 (<*>) f x)

... and now we can add a Show instance, too!

instance (Show k, Show v) => Show (Table k v) where
    show (Table ks f) = case ks of
        All    -> "<function>"
        Some s -> unlines (do
            k <- Set.toList s
            let Just v = f k
            return (show (k, v)) )

Let's give it a test drive:

>>> names'
(0,("Gabriel","Gonzalez"))
(1,("Oscar","Boykin"))
(2,("Edgar","Codd"))

>>> handles'
(0,"GabrielG439")
(1,"posco")
(3,"avibryant")

>>> join2 names' handles'
(0,(("Gabriel","Gonzalez"),"GabrielG439"))
(1,(("Oscar","Boykin"),"posco"))

So far, so good!

Alternative

However, we need to support more than just inner joins: we'd like to support left, right, and outer joins, too.

Conceptually, a left join is one in which values from the right table may be optionally present. One way we could implement this would be to define a function that converts a finite map to a function defined on all keys. This function will return Nothing for keys not present in the original finite map and Just for keys that were present:

optional :: Table k v -> Table k (Maybe v)
optional (Table ks f) =
    Table All (\k -> fmap Just (f k) <|> pure Nothing)

Then we could define leftJoin in terms of join2 and optional:

leftJoin :: Table k a -> Table k b -> Table k (a, Maybe b)
leftJoin t1 t2 = join2 t1 (optional t2)

However, if we try to compile the above code, the compiler will give us a really interesting error message:

Ambiguous occurrence ‘optional’
It could refer to either ‘Main.optional’,
                      or ‘Control.Applicative.optional’

Apparently, Control.Applicative has an optional function, too. Let's pause to check out the type of this mysterious function:

optional :: Alternative f => f v -> f (Maybe v)

Wow! That type signature is surprisingly similar to the one we wrote. In fact, if Table k implemented the Alternative interface, the types would match.

Alternative is a type class (also provided by Control.Applicative) that greatly resembles the Monoid type class:

class Applicative f => Alternative f where
    empty :: f a

    (<|>) :: f a -> f a -> f a

... and the core Alternative laws are identical to the Monoid laws:

x <|> empty = x

empty <|> x = x

(x <|> y) <|> z = x <|> (y <|> z)

Let's dig even deeper and see how optional uses this Alternative type class:

optional v = fmap Just v <|> pure Nothing

Even the implementation is eerily similar! This strongly suggests that we should find a way to make our Table type implement Alternative. Perhaps something like this would work:

instance Ord k => Alternative (Table k) where
    empty =
        Table 0 (pure empty)

    Table ks1 f1 <|> Table ks2 f2 =
        Table (ks1 + ks2) (liftA2 (<|>) f1 f2)

Compare this to our Applicative instance (duplicated here):

instance Ord k => Applicative (Table k) where
    pure v =
        Table 1 (pure (pure v))

    Table s1 f <*> Table s2 x =
        Table (s1 * s2) (liftA2 (<*>) f x)

The Alternative instance mirrors our Applicative instance except that we:

  • replace pure v with empty
  • replace (<*>) with (<|>)
  • replace 1 with 0
  • replace (*) with (+)

... and what's really neat is that now the optional function from Control.Applicative behaves just like the original optional function we wrote. (Exercise: Use equational reasoning to verify this)

Derived joins

Armed with this Alternative instance, we can now implement leftJoin in terms of the optional provided by Control.Applicative:

leftJoin t1 t2 = join2 t1 (optional t2)

Sure enough, it works:

>>> leftJoin names' handles'
(0,(("Gabriel","Gonzalez"),Just "GabrielG439"))
(1,(("Oscar","Boykin"),Just "posco"))
(2,(("Edgar","Codd"),Nothing)

Let's check out the type that the compiler infers for leftJoin:

>>> :type leftJoin
leftJoin :: Alternative f => f a -> f b -> f (a, Maybe b)

Notice how there's no longer anything Table-specific about leftJoin. It works for anything that implements the Alternative interface. I could leftJoin two Maybes if I really wanted to:

>>> leftJoin (Just 1) (Just 2)
Just (1,Just 2)
>>> leftJoin (Just 1) Nothing
Just (1,Nothing)
>>> leftJoin Nothing (Just 1)
Nothing

... or two lists:

>>> leftJoin [1, 2] [3, 4]
[(1,Just 3),(1,Just 4),(1,Nothing),(2,Just 3),(2,Just 4),(2,Nothing)]

In fact, I don't even really need specialized leftJoin or rightJoin functions. optional is sufficiently light-weight that I could inline a right join on the fly:

>>> join2 (optional names') handles'
(0,(Just ("Gabriel","Gonzalez"),"GabrielG439"))
(1,(Just ("Oscar","Boykin"),"posco"))
(3,(Nothing,"avibryant"))

What happens if I try to do an "outer join"?

>>> -- DISCLAIMER: Technically not the same as an SQL outer join
>>> let o = join2 (optional names') (optional handles')
>>> o
<function>

The above "outer join" is defined for all keys (because both sides are optional), so we get back a function! While we can't list the Table (because it's conceptually infinite), we can still perform lookups on it:

>>> lookup o 0
Just (Just ("Gabriel","Gonzalez"),Just "GabrielG439")
>>> lookup o 2
Just (Just ("Edgar","Codd"),Nothing)
>>> lookup o 3
Just (Nothing,Just "avibryant")
>>> lookup o 4
Just (Nothing,Nothing)

... and if we were to join our "infinite" table against a finite table, we get back a finite table (Exercise: Try it! Define a new finite table to join against o and see what happens)

What's nice about optional is that we can easily left-join or right-join in multiple tables at a time. If I had four tables of types:

t1 :: Table k a
t2 :: Table k b
t3 :: Table k c
t4 :: Table k d

... I could left join t2, t3, and t4 into t1 by just writing:

liftA4 (,,,) t1 (optional t2) (optional t3) (optional t4)
    :: Table k (a, Maybe b, Maybe c, Maybe d)

Now that I think about it, I don't even really need to provide join2/join3/join4/join5 since they are not much shorter than using the liftA family of functions in Control.Applicative:

-- Exercise: What would `join1` be?
join2 = liftA2 (,)
join3 = liftA3 (,,)
join4 = liftA4 (,,,)
join5 = liftA5 (,,,,)
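
One caveat: Control.Applicative only exports liftA, liftA2, and liftA3, so the liftA4 and liftA5 used above are assumed helpers. If you want them, they are one-liners:

import Control.Applicative ((<$>), (<*>))  -- redundant on newer GHCs

liftA4 :: Applicative f => (a -> b -> c -> d -> e) -> f a -> f b -> f c -> f d -> f e
liftA4 f a b c d = f <$> a <*> b <*> c <*> d

liftA5 :: Applicative f => (a -> b -> c -> d -> e -> r) -> f a -> f b -> f c -> f d -> f e -> f r
liftA5 f a b c d e = f <$> a <*> b <*> c <*> d <*> e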

In other words, I can implement almost any imaginable join just by using liftA{n} and some permutation of optionals. I don't even know what I'd call this join:

liftA5 (,,,,) t1 (optional t2) t3 (optional t4) t5

... but the beauty is that I don't have to give a name for it. I can easily write anonymous joins on the fly using the Control.Applicative module. Moreover, the above code will work for anything that implements the Alternative interface.

Conclusion

Control.Applicative provides a very general API for relational joins: the Alternative type class (which includes Applicative, since Applicative is a super-class of Alternative). Perhaps Control.Applicative could be improved slightly by providing the join{n} family of functions listed above, but it's still highly usable in its present state.

Note that this trick only works for relational abstractions embedded within Haskell. This API can be generalized to external relational data stores (e.g. Postgres), which I will cover in a subsequent post.