Haskell for all: First-class modules without defaults

Thursday, July 19, 2012

First-class modules without defaults

Recently Chris Doner proposed a first-class module approach which uses the Default type-class and then I revised his approach to not use type-classes at all, encoding all the information in dictionaries. I'm using this post to expand upon my variation of Chris's approach and show how one would translate his approach to my approach and explain what I believe are the advantages of this improvement.

Dictionaries

This approach builds off the classic encoding of a type-class as a dictionary. The trick is simple, first you convert a module to an equivalent type-class representing that module's interface, then you convert that type-class to a dictionary.

As an example, I'm going to start from Dan Burton's Modular Prelude and rework his ByteStringModule type to use my improvement and then show how they differ.

Dan's ByteStringModule looks like this:

data ByteStringModule = S
  { map :: (Word8 -> Word8) -> ByteString -> ByteString
  , concatMap :: (Word8 -> ByteString) -> ByteString -> ByteString
  , filter :: (Word8 -> Bool) -> ByteString -> ByteString
  , length :: ByteString -> Int
  , singleton :: Word8 -> ByteString
  , null :: ByteString -> Bool
  , pack :: [Word8] -> ByteString
  , unpack :: ByteString -> [Word8]
  , empty :: ByteString
  , readFile :: FilePath -> IO ByteString
  , writeFile :: FilePath -> ByteString -> IO ()
  , break :: (Word8 -> Bool) -> ByteString -> (ByteString, ByteString)
  , span :: (Word8 -> Bool) -> ByteString -> (ByteString, ByteString)
  , dropWhile :: (Word8 -> Bool) -> ByteString -> ByteString
  , takeWhile :: (Word8 -> Bool) -> ByteString -> ByteString
  , any :: (Word8 -> Bool) -> ByteString -> Bool
  , all :: (Word8 -> Bool) -> ByteString -> Bool
  , splitAt :: Int -> ByteString -> (ByteString, ByteString)
  }


_Data_ByteString_ :: ByteStringModule
_Data_ByteString_ = S
  { null = ...
    ... }

instance Default ByteStringModule where
  def = _Data_ByteString_

To use my approach, you instead parametrize the dictionary on the type of the "string-like" thing:

data StringModule s = String
  { map :: (Word8 -> Word8) -> s -> s
  , concatMap :: (Word8 -> s) -> s -> s
  , filter :: (Word8 -> Bool) -> s -> s
  , length :: s -> Int
  , singleton :: Word8 -> s
  , null :: s -> Bool
  , pack :: [Word8] -> s
  , unpack :: s -> [Word8]
  , empty :: s
  , readFile :: FilePath -> IO s
  , writeFile :: FilePath -> s -> IO ()
  , break :: (Word8 -> Bool) -> s -> (s, s)
  , span :: (Word8 -> Bool) -> s -> (s, s)
  , dropWhile :: (Word8 -> Bool) -> s -> s
  , takeWhile :: (Word8 -> Bool) -> s -> s
  , any :: (Word8 -> Bool) -> s -> Bool
  , all :: (Word8 -> Bool) -> s -> Bool
  , splitAt :: Int -> s -> (s, s)
  }

Using this approach you can encode String, ByteString and Text, all using the same dictionary type:

import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import qualified Data.Text as T
import qualified Prelude as P

lazyByteString :: StringModule L.ByteString
lazyByteString = String {
    null = L.null,
    ... }

strictByteString :: StringModule S.ByteString
strictByteString = String {
    null = S.null,
    ... }

text :: StringModule T.Text
text = String {
    null = T.null,
    ... }

string :: StringModule P.String
string = String {
    null = P.null,
    ... }

Here I've parametrized the dictionary on the "string-like" type, so we can reuse the same dictionary type for all of them. This represents the dictionary equivalent of the following type-class:

class StringModule s where
    map :: (Word8 -> Word8) -> s -> s
    concatMap :: (Word8 -> s) -> s -> s
    filter :: (Word8 -> Bool) -> s -> s
    length :: s -> Int
    singleton :: Word8 -> s
    null :: s -> Bool
    pack :: [Word8] -> s
    unpack :: s -> [Word8]
    empty :: s
    readFile :: FilePath -> IO s
    writeFile :: FilePath -> s -> IO ()
    break :: (Word8 -> Bool) -> s -> (s, s)
    span :: (Word8 -> Bool) -> s -> (s, s)
    dropWhile :: (Word8 -> Bool) -> s -> s
    takeWhile :: (Word8 -> Bool) -> s -> s
    any :: (Word8 -> Bool) -> s -> Bool
    all :: (Word8 -> Bool) -> s -> Bool
    splitAt :: Int -> s -> (s, s)

Comparison

This improvement has several advantages over Chris's original approach:

You don't have to define a new data structure for each "instance" of the module.
The shared dictionary type guarantees that all "instance"s expose the same signature
You can program generically over the StringModule "class"
No type-classes are required, so you can define multiple competing "instances" of the same module without conflicts.
You can qualify first-class modules instead of completely unpacking them.

Syntax

Now I'll show how you syntactically translate all the features of Chris's modules to my improved version. Let's assume we have some module Data.String, that exports the above four StringModule instances (i.e. lazyByteString, strictByteString, text, and string.

Using Chris's approach, you distinguish which module you wish to unpack into the current scope by choosing which constructor you unpack into:

zot :: (L.ByteString, S.ByteString)
zot = (a,b) where
  a = pack [1,2,3] where L{..} = def
  b = pack [1,2,3] where B{..} = def

Using the improved version, you distinguish which module you wish to unpack by explicitly selecting which dictionary you unpack:

zot = (a,b) where
  a = pack [1,2,3] where String{..} = lazyByteString
  b = pack [1,2,3] where String{..} = strictByteString

Hmmm, those are long names. Wouldn't it be nice if we could somehow alias them?

l = lazyByteString
s = strictByteString

zot = (a,b) where
  a = pack [1,2,3] where String{..} = l
  b = pack [1,2,3] where String{..} = s

Oh yeah! Dictionaries are first-class because they are ordinary Haskell values, so renaming them is easy. We can even locally alias modules, something which the ordinary module system cannot do:

zot = (a,b) where
  l = lazyByteString
  s = strictByteString
  a = pack [1,2,3] where String{..} = l
  b = pack [1,2,3] where String{..} = s

Also, using my improvement we can program generically over the string dictionary type:

zot :: StringModule s -> (s, s)
zot s = (a,b) where
  String{..} = s
  a = pack [1,2,3]
  b = pack [1,2,3]

This is just the dictionary version of a class constraint, where you pass the instance as an ordinary parameter. This is not possible using Chris's approach, which is one reason I am advocating this change.

Another advantage of this improvement is that you don't even need to unpack the module at all. You get qualification for free:

l = lazyByteString
s = strictByteString

zot = (a,b) where
  a = pack l [1,2,3]
  b = pack s [1,2,3]

In fact, if you alias first-class modules to single-letter names (as you might do with ordinary modules), then the syntactic overhead is identical to using ordinary modules: 2 extra characters, except as a suffix instead of a prefix.

Using Chris's approach, you would have to use the ordinary module system to qualify which pack you meant:

zot s = (a, b) where
  a = L.pack def [1,2,3]
  b = S.pack def [1,2,3]

So using his approach there is actually more syntactic overhead for qualifying imports, plus you must rely on the ordinary module system to namespace qualified imports.

Also, using his approach there is no good way to nest a module within a module and still qualify a nested import, since you can't program generically over the outer module. So there would be no good way to do something like this:

data OuterModule a = Out { outVal :: a }
data InnerModule a = In  {  inVal :: a }

dict :: OuterModule (InnerModule String)
dict = Out (In "Hello, world!")

contrived :: String
contrived = inVal (outVal dict) -- nested qualified import

Common features

No matter which approach you like, there are several cool features that both approaches share. For example, you can unpack unqualified names into the top-level global namespace. For Chris's approach you would insert the following top-level declaration:

L {..} = def

... and for my variation you would use:

String {..} = lazyByteString

Also, I will argue (to the death!) that both approaches are superior to type-class-based approaches. While I believe type-classes are okay for theoretically-grounded constructs (like Monad or Category), I believe that the dictionary approach is superior for banal interfaces like ListLike/ListModule or StringLike/StringModule since it is completely first-class.

Conclusions

This post is NOT intended to rip on Chris, but simply to improve on his original proposal. He had the two brilliant ideas of both using "type-classes" as modules and using the RecordWildCards extension to unpack names unqualified. I think the only mistake he made was unnecessarily filtering everything through the `Default` type-class and I only want to say that I think the pure dictionary approach is a strict improvement on his otherwise already brilliant idea.

15 comments:

SawadyJuly 19, 2012 at 12:38 PM
I'm thinking. A common porpouse of modules is to hide implementation info to the user, then they are not only interfaces.

For example:

--------------
module (Stack,push,pop,emptyS,isEmptyS) where

data Stack a = MkStack {stack :: [a]} -- we don't export the abstraction function

push x = MkStack . (x:) . stack
pop = MkStack . tail . stack
emptyS = MkStack []
isEmptyS = null . stack
--------------

And the user doesn't know that I implement a Stack using a List. I can impose invariants too, because I know that the user cannot get access to any implementation of functions provided by the interface. I can't do the same with these first-class modules that you are talking about, right? (I don't see how)
ReplyDelete
Replies
LennartJuly 20, 2012 at 12:01 AM
What you call "Chris's big insight" is well known Haskell folklore.
ReplyDelete
Replies
Scott KilpatrickAugust 20, 2012 at 7:09 AM
This can't really be considered a replacement for Haskell modules when it only encapsulates value components as opposed to the full shebang of values, data types, classes, and instances. How would I define, e.g., a Set type and hide the constructors so that clients are forced to use my provided functions to interact with it?

What this shows is that Haskell modules could really use a notion of interface -- analogous to your moduley data types -- to allow matching, parameterization, and perhaps even separate compilation.
ReplyDelete
Replies
UnknownAugust 15, 2016 at 3:13 AM
String and Text work with Char, not Word8. ByteString works with Word8, not Char. How does this design address that?
ReplyDelete
Replies

Add comment