Saturday, June 14, 2014

Spreadsheet-like programming in Haskell

What if I told you that a spreadsheet could be a library instead of an application? What would that even mean? How do we distill the logic behind spreadsheets into a reusable abstraction? My mvc-updates library answers this question by bringing spreadsheet-like programming to Haskell using an intuitive Applicative interface.

The central abstraction is an Applicative Updatable value, which is just a Fold sitting in front of a Managed Controller:

data Updatable a =
    forall e . On (Fold e a) (Managed (Controller e))

The Managed Controller originates from my mvc library and represents a resource-managed, concurrent source of values of type e. Using Monoid operations, you can interleave these concurrent resources together while simultaneously merging their resource management logic.

The Fold type originates from my foldl library and represents a reified left fold. Using Applicative operations, you can combine multiple folds together in such a way that they still pass over the data set just once without leaking space.

To build an Updatable value, just pair up a Fold with a Managed Controller:

import Control.Foldl (last, length)
import MVC
import MVC.Updates
import MVC.Prelude (stdinLines, tick)
import Prelude hiding (last, length)

-- Store the last line from standard input
lastLine :: Updatable (Maybe String)
lastLine = On last stdinLines

-- Increment every second
seconds :: Updatable Int
seconds = On length (tick 1.0)

What's amazing is that when you stick a Fold in front of a Controller, you get a new Applicative. This Applicative instance lets you combine multiple Updatable values into new derived Updatable values. For example, we can combine lastLine and seconds into a single data type that tracks updates to both values:

import Control.Applicative ((<$>), (<*>))

data Example = Example (Maybe String) Int deriving (Show)

example :: Updatable Example
example = Example <$> lastLine <*> seconds

example will update every time lastLine or seconds updates, caching and reusing portions that do not update. For example, if lastLine updates then only the first field of Example will change. Similarly, if seconds updates then only the second field of Example will change.

When we're done combining Updatable values we can plug them into mvc using the updates function:

updates :: Buffer a -> Updatable a -> Managed (Controller a)

This gives us back a Managed Controller we can feed into our mvc application:

viewController :: Managed (View Example, Controller Example)
viewController = do
    controller <- updates Unbounded example
    return (asSink print, controller)

model :: Model () Example Example
model = asPipe $
    Pipes.takeWhile (\(Example str _) -> str /= Just "quit")

main :: IO ()
main = runMVC () model viewController

This program updates in response to two concurrent inputs: time and standard input.

$ ./example
Example Nothing 0
Test<Enter>
Example (Just "Test") 0  <-- Update: New input
Example (Just "Test") 1  <-- Update: 1 second passed
Example (Just "Test") 2  <-- Update: 1 second passed
ABC<Enter>
Example (Just "ABC") 2   <-- Update: New input
Example (Just "ABC") 3   <-- Update: 1 second 
quit<Enter>
$

Spreadsheets

The previous example was a bit contrived, so let's step it up a notch. What better way to demonstrate spreadsheet-like programming than .. a spreadsheet!

I'll use Haskell's gtk library to set up the initial API, which consists of a single exported function:

module Spreadsheet (spreadsheet) where

spreadsheet
    :: Managed                   -- GTK setup
        ( Updatable Double       -- Create input  cell
        , Managed (View Double)  -- Create output cell
        , IO ()                  -- Start spreadsheet
        )
spreadsheet = ???

You can find the full source code here.

Using spreadsheet, I can now easily build my own spread sheet application:

{-# LANGUAGE TemplateHaskell #-}

import Control.Applicative (Applicative, (<$>), (<*>))
import Lens.Family.TH (makeLenses)
import MVC
import MVC.Updates (updates)
import Spreadsheet (spreadsheet)

-- Spreadsheet input (4 cells)
data In  = I
    { _i1 :: Double
    , _i2 :: Double
    , _i3 :: Double
    , _i4 :: Double
    }

-- Spreadsheet output (4 cells)
data Out = O
    { _o1 :: Double
    , _o2 :: Double
    , _o3 :: Double
    , _o4 :: Double
    }
makeLenses ''Out

-- Spreadsheet logic that converts input to output
model :: Model () In Out
model = asPipe $ loop $ \(I i1 i2 i3 i4) -> do
    return $ O (i1 + i2) (i2 * i3) (i3 - i4) (max i4 i1)

main :: IO ()
main = runMVC () model $ do
    (inCell, outCell, go) <- spreadsheet

    -- Assemble the four input cells
    c <- updates Unbounded $
        I <$> inCell <*> inCell <*> inCell <*> inCell

    -- Assemble the four output cells
    v <- fmap (handles o1) outCell
      <> fmap (handles o2) outCell
      <> fmap (handles o3) outCell
      <> fmap (handles o4) outCell

    -- Run the spread sheet
    liftIO go

    return (v, c)

You can install this program yourself by building my mvc-updates-examples package on Github:

$ git clone https://github.com/Gabriel439/Haskell-MVC-Updates-Examples-Library.git
$ cd Haskell-MVC-Updates-Examples-Library
$ cabal install
$ ~/.cabal/bin/mvc-spreadsheet-example

Or you can watch this video of the spreadsheet in action:

The key feature I want to emphasize is how concise this spreadsheet API is. We provide our user an Applicative input cell builder and a Monoid output cell builder, and we're done. We don't have to explain to the user how to acquire resources, manage threads, or combine updates. The Applicative instance for Updatable handles all of those trivial details for them. Adding extra inputs or outputs is as simple as chaining additional inCell and outCell invocations.

Reactive animations

We don't have to limit ourselves to spread sheets, though. We can program Updatable graphical scenes using these same principles. For example, let's animate a cloud that orbits around the user's mouse using the sdl library. Just like before, we will begin from a concise interface:

-- Animation frames for the cloud
data Frame = Frame0 | Frame1 | Frame2 | Frame3

-- To draw a cloud we specify the frame and the coordinates
data Cloud = Cloud Frame Int Int

-- mouse coordinates
data Mouse = Mouse Int Int

sdl :: Managed         -- SDL setup
    ( View Cloud       -- Draw a cloud
    , Updatable Mouse  -- Updatable mouse coordinates
    )

The full source is located here.

In this case, I want to combine the Updatable mouse coordinates with an Updatable time value:

main :: IO ()
main = runMVC () (asPipe cat) $ do
    (cloudOut, mouse) <- sdl
    let seconds = On length (tick (1 / 60))

        toFrame n = case (n `div` 15) `rem` 4 of
            0 -> Frame0
            1 -> Frame1
            2 -> Frame2
            _ -> Frame3

        cloudOrbit t (Mouse x y) = Cloud (toFrame t) x' y'
          where
            x' = x + truncate (100 * cos (fromIntegral t / 10))
            y' = y + truncate (100 * sin (fromIntegral t / 10))

    cloudIn <- updates Unbounded (cloudOrbit <$> seconds <*> mouse)
    return (cloudOut, cloudIn)

cloudOrbit is defined as a pure function from the current time and mouse coordinates to a Cloud. With the power of Applicatives we can lift this pure function over two Updatable values (mouse and seconds) to create a new Updatable Cloud that we pass intact to our program's View.

Like before, you can either run this program yourself:

$ git clone https://github.com/Gabriel439/Haskell-MVC-Updates-Examples-Library.git
$ cd Haskell-MVC-Updates-Examples-Library
$ cabal install
$ ~/.cabal/bin/mvc-spreadsheet-example

... or you can watch the video:

Under the hood

mvc-updates distinguishes itself from similar libraries in other languages by not relying on a semantics for concurrency. The Applicative instance for Updatable uses no concurrent operations, whatsoever:

instance Applicative Updatable where
    pure a = On (pure a) mempty

    (On foldL mControllerL) <*> (On foldR mControllerR)
        = On foldT mControllerT
      where
        foldT = onLeft foldL <*> onRight foldR

        mControllerT = fmap (fmap Left ) mControllerL
                    <> fmap (fmap Right) mControllerR

        onLeft (Fold step  begin done) =
                Fold step' begin done
          where
            step' x (Left a) = step x a
            step' x  _       = x

        onRight (Fold step  begin done) =
                 Fold step' begin done
          where
            step' x (Right a) = step x a
            step' x  _        = x

In fact, this Applicative instance only assumes that the Controller type is a Monoid, so this trick generalizes to any source that forms a Monoid.

This not only simplifies the proof of the Applicative laws, but it also greatly improves efficiency. This Applicative instance introduces no new threads or buffers. The only thread or buffer you will incur is in the final call to the updates function, but expert users can eliminate even that overhead by inlining the logic of the updates function directly into their mvc program.

Lightweight

The mvc-updates library is incredibly small. Here's the entire API:

data Updatable = forall e . On (Fold e a) (Controller e)

instance Functor     Updatable
instance Applicative Updatable

updates :: Buffer a -> Updatable a -> Managed (Controller a)

The library is very straightforward to use:

  • Build Updatable values
  • Combine them using their Applicative instance
  • Convert them back to a Managed Controller when you're done

That's it!

The small size of the library is no accident. The Updatable abstraction is an example of a scalable program architecture. When we combine Updatable values together, the end result is a new Updatable value. This keeps the API small since we always end up back where we started and we never need to introduce additional abstractions.

There is no need to distinguish between "primitive" Updatable values or "derived" Updatable values or "sheets" of Updatable values. The Applicative interface lets us unify these three concepts into a single uniform concept. Moreover, the Applicative interface is one of Haskell's widely used type classes inspired by category theory, so we can reuse people's pre-existing intuition for how Applicatives work. This is a common theme in Haskell where once you learn the core set of mathematical type classes they go a very, very long way.

Conclusion

Hopefully this post will get you excited about the power of Applicative programming. If you would like to learn more about Applicatives, I highly recommend the "Applicative Programming with Effects" paper by Conor McBride and Ross Paterson.

I would like to conclude by saying that there many classes of problems that the mvc-updates library does not solve well, such as:

  • build systems,
  • programs with computationally expensive Views, and:
  • Updatable values that share state.

However, mvc-updates excels at:

  • data visualizations,
  • control panels, and:
  • spread sheets (of course).

You can find the mvc-updates library up on Hackage or on Github.

12 comments:

  1. Look up constraint programming... :-)

    ReplyDelete
    Replies
    1. I'm a little bit familiar with constraint programming, but the main thing I look for in a programming paradigm are programming interfaces inspired by category theory or abstract algebra (i.e. monoids, functors, categories, etc.). Are there analogs of that in constraint programming?

      Delete
    2. Constraint programming systems show you that "spreadsheets" can be implemented (simply) as libraries without the need for any of category theory or abstract algebra.

      So these don't add anything substantive here, apart from pleasing your personal palate (which is a perfectly fine thing to do).

      Delete
    3. I don't use mathematics for the sake of using mathematics. The purpose behind structuring programs mathematically is to compose small bits of mathematical functionality, each of which is correct in isolation, to build larger mathematical structures which are still correct.

      Sure, you can always whip up some specialized and non-mathematical solution, but these will rarely generalize to more complex problems well. They will usually solve some very specific problem very well, but the moment you deviate from the problem it was intended to solve it will become very brittle.

      Even the very example you give (constraint programming systems) demonstrates this issues. Constraint programming lacks the resource management sophistication of `mvc-updates`, where as you combine updatable values it automatically merges their resource management logic, and it's not clear to me how I would extend it with this feature, whereas with `mvc-updates` it was trivial because it took the principled approach.

      Delete
    4. No one cares about constraint programming. Nice article though.

      Delete
    5. This comment has been removed by the author.

      Delete
    6. @Russel: considering that Gabriel implemented spreadsheet-like programming (and spreadsheet = one-way dataflow constraints), I'd say at least one person cares about constraint programming, enough to do a (somewhat simplistic) implementation.

      Then there are the million plus iOS/Mac programmers using AutoLayout and KVO/Bindings. AutoLayout is based on the Cassowary constraint solver, KVO/Bindings is equivalent to a simple one-way constraint solver (without formulae)

      And finally, spreadsheets are the most wide-spread form of programming, and again, spread-sheets = one-way dataflow constraints.

      So you have a funny definition of "nobody" :-)

      @Gabriel: what do you mean with "resource management"? Considering the wide variety of constraint systems, are you certain that none have this? In fact, mvc-updates seems quite limited compared to most constraint systems I am aware of.

      Delete
  2. This comment has been removed by the author.

    ReplyDelete
  3. Looks great! I don't understand it yet, but I will, soon! ;)

    Has anyone went so far and implemented a spreadsheet app (expanding on your example)? I would like to do that because I always wanted to have my own light-version of Excel.

    Also it seems, there are similarities to functional reactive programming (maybe solving similar problems).

    ReplyDelete
    Replies
    1. Nobody has implemented an actual spread sheet application on top of this yet.

      There are some similarities to functional reactive programming. This answer on Stack Overflow was part of the inspiration for this library (and the second example):

      http://stackoverflow.com/a/1028642/1026598

      Delete
  4. Gabriel, you say that mvc-updates "excel at spread sheets" but as far as I see it deals only with finite and fixed number of Updatables which does not look like spread sheets which could contain variable and possibly huge number of cells.
    Am I correct at that of if I'm wrong how could you model variable number of updatables using mvc-updates? Or maybe there is something outside of this library which could make it possible in some comparatively easy way?
    BTW Thanks for all your writings - I (and probably many others too) get a lot from reaging them.

    ReplyDelete
    Replies
    1. You can model a variable number of updatables if they all store the same type of value. For example, assuming you have a sequence of updatable values of type:

      example :: Seq (Updatable a)

      ... you can turn that into an updatable sequence of values by using `Data.Traversable.sequenceA`:

      sequenceA example :: Updatable (Seq a)

      Under the hood this will generate a minimal update tree where each primitive update of an `a` triggers O(log N) downstream updates to build the new `Seq a`, where N is the size of the original sequence.

      The reason I use `Seq` instead of list is because it has a much more efficient implementation of `sequenceA`.

      Delete