Wednesday, May 2, 2012

Scrap your type classes

Edit: My opinion on type classes has mellowed since I wrote this post, but I still keep it around as a critique against the excesses of type classes.

What I'm about to propose is that all Haskell type class programming can (and should) be implemented purely at the value level using a simple and ordinary code transformation.

The trick is simple and I will begin by transforming the Monad type-class. Given any class:
class Monad m where
    return :: a -> m a
    (>>=)  :: m a -> (a -> m b) -> m b
You can delete the class and replace it with a corresponding data type:
{-# LANGUAGE Rank2Types #-}

-- MonadI = Monad "I"nstance
data MonadI m = MonadI {
    _return :: forall a . a -> m a,
    _bind   :: forall a b . m a -> (a -> m b) -> m b }
Then, given any instance for that class:
instance Monad Maybe where
    return = Just
    m >>= f = case m of
        Nothing -> Nothing
        Just x  -> f x
... delete that instance and replace it with a value of our data type containing the method definitions:
monad'Maybe :: MonadI Maybe
monad'Maybe = MonadI {
    _return = Just,
    _bind   = \m f -> case m of
        Nothing -> Nothing
        Just x  -> f x }
This value-level representation of a class instance has several important benefits.


No class constraints


Now class constraints transform into ordinary parameters:
-- Before
sequence :: (Monad m) => [m a] -> m [a]

-- After
sequence' :: MonadI m -> [m a] -> m [a]
sequence' i x = case x of
    []   -> return []
    m:ms -> m           >>= \x  ->
            sequence' i ms >>= \xs ->
            return (x:xs)
  where
    return = _return i
    (>>=)  = _bind   i

sequence' monad'Maybe [Just 3, Just 4]
= Just [3, 4]
This means that we can now skip type-level programming and implement everything within lambda calculus at the value level. Things that previously required elaborate extensions now only require ordinary Haskell functions.


No more extension hell


Let's say I wanted to implement an isomorphism type class using traditional type classes:
class Isomorphism a b where
    fw :: a -> b
    bw :: b -> a
The first problem is that I'd need to turn on MultiParamTypeClasses to write a class like this.

Now I try to write this instance:
instance Isomorphism ((), a) a where
    fw = snd
    bw = (,) ()
Oops! Now I need FlexibleInstances for the () in the type. That extension's not as controversial, though.

But what if I then do this:
instance (Isomorphism a b, Isomorphism b c)
 => Isomorphism a c where
    fw = fw . fw
    bw = bw . bw
This is completely unresolvable, even using UndecidableInstances. GHC will just barf and terminate after several rounds of recursion.

On the other hand, had I written it like:
data Isomorphism a b = Isomorphism {
    fw :: a -> b,
    bw :: b -> a }
... then I can trivially combine isomorphisms:
combine :: Isomorphism b c -> Isomorphism a b -> Isomorphism a c
combine (Isomorphism fw1 bw1) (Isomorphism fw2 bw2)
    = Isomorphism (fw1 . fw2) (bw2 . bw1)

Haskell is better at value-level programming


The astute reader will notice that the last definition suggests a category. Let's go all the way and rewrite the Category class at the value level and use it:
data CategoryI cat = CategoryI {
    _compose :: forall a b c . cat b c -> cat a b -> cat a c,
    _id      :: forall a . cat a a }

category'Function :: CategoryI (->)
category'Function = CategoryI {
    _compose = \f g x -> f (g x),
    _id      = \x -> x }

category'Isomorphism :: CategoryI Isomorphism
category'Isomorphism = CategoryI {
    _compose = let (.) = _compose category'Function
                in \(Isomorphism fw1 bw1)
                    (Isomorphism fw2 bw2) ->
                       Isomorphism (fw1 . fw2) (bw1 . bw2),
    _id      = let id = _id category'Function
                in Isomorphism id id }
Now we can just combine isomorphisms using ordinary composition:
iso1 :: Isomorphism ((a, b), c) (a, (b, c))
iso1 = Isomorphism {
    _fw = \((a, b), c) = (a, (b, c)),
    _bw = \(a, (b, c)) = ((a, b), c) }

iso2 :: Isomorphism ((), a) a
    _fw = \((), a) = a,
    _bw = \a -> ((), a) }

(.) = _compose category'Function
iso1 . iso2 :: Isomorphism (((), b), c) (b, c)
... instead of attempting a bunch of type-class hackery that doesn't work. More importantly, we can now use our more featureful value-level programming tools to do what was incredibly difficult to do at the type level.


Class maintenance


One big issue in Haskell is maintaining class APIs. However, when we implement classes at the value level, this problem completely disappears.

For example, let's say that I realize in retrospect that my Monad class needed to be split into two classes, one named Pointed to hold return and one named Monad that has Pointed as a superclass. If people use my Monad class extensively, then I'd have to break all their Monad instances if I split it into two separate classes because now they would have to spin off all of their return implementations into separate instances for Pointed.

Now, had I implemented it as a data type, it wouldn't even matter. I'd just write:
data PointedI m = PointedI { _pure :: forall a . a -> m a }

-- Pointed is a super-class of Monad
pointed'Super'Monad :: MonadI m -> PointedI m
pointed'Super'Monad i = PointedI (_return i}
Similarly, I can translate:
class (Pointed m) => Monad m where ...
... into:
monad'Pointed'Bind ::
    PointedI m -> (m a -> (a -> m b) -> m b) -> MonadI m
monad'Pointed'Bind i b = MonadI (_pure i) b
Now users can automatically derive Pointed instances from their old Monad instances, or they can choose to write a Pointed instance and then build a Monad instance on top of it.


Backwards compatibility


Similarly, let's say that I forgot to make Functor a superclass of Monad. What's incredibly painful for the Haskell community to solve at the type-level is utterly straightforward to fix after-the-fact at the value level:
data FunctorI f = FunctorI {
    _fmap :: forall a b . (a -> b) -> f a -> f b }

functor'Monad :: MonadI m -> FunctorI m
functor'Monad i = FunctorI { _fmap = \f x -> x >>= return . f }
  where
    (.)    = _compose category'Function
    (>>=)  = _bind   i
    return = _return i

No more newtypes


Don't you hate having to wrap things using newtypes to get the correct class instance? Well, now that's unnecessary:
data MonoidI m = MonoidI {
    _mempty  :: m,
    _mappend :: m -> m -> m }

monoidSum :: MonoidI Int
monoidSum = MonoidI {
    _mempty  = 0,
    _mappend = (+) }

monoidProduct :: MonoidI Int
monoidProduct = MonoidI {
    _mempty  = 1,
    _mappend = (*) }

mconcat :: MonoidI a -> [a] -> a
mconcat i = foldr (_mappend i) (_mempty i)

sum     = mconcat monoidSum
product = mconcat monoidProduct
Now we're actually writing in a true functional style where sum and product are true functions of the instance, rather than fake functions of a class constraint using awkward newtypes.


Value-level programming is safer


Type classes are used most often for operator overloading. The dark side to this that your overloaded function will type-check on anything that is an instance of that class, including things you may not have intended it to type-check on.

For example, let's say I'm trying to write the following code using the ever-so-permissive Binary class:
main = encodeFile "test.dat" (2, 3)
... but it's 3:00 in the morning and I make a mistake and instead type:
main = encodeFile "test.dat" (2, [3])
This type-checks and silently fails! However, had I explicitly passed the instance I wished to use, this would have raised a compile-time error:
binPair :: BinaryI a -> BinaryI b -> BinaryI (a, b)
binInt  :: BinaryI Int

-- Won't compile!
main = encodeFile (binPair binInt binInt) "test.dat" (2, [3])
You might say, "Well, I don't want to have to annotate the type I'm using. I want it done automatically." However, this is the exact same argument made for forgiving languages like Perl or PHP were people advocate that in ambiguous situations the language or library should attempt to silently guess what you intended to do in instead of complaining loudly. This is exactly the antithesis of a strongly typed language!

Also, in the above case you would have had to annotate it anyway, because Binary wouldn't have been able to infer the specific type of the numeric literals!
main = encodeFile "test.dat" (2 :: Int, 3, :: Int)
Or what if I wanted to implement two different ways to encode a list, one which was the naive encoding and one which used more efficient arrays for certain types:
-- Naive version
instance Binary a => Binary [a] where ...
-- Efficient array version
instance Binary [Int] where ...
Oops! OverlappingInstances! I'd have to wrap one of them in a newtype, which take just as much effort to do as just passing the value instance:
binList :: BinaryI a -> BinaryI [a]
binInt  :: BinaryI Int

main = encodeFile (binList binInt) "listInt.dat" [1..10]
If I was really clever, I could even write implement both instances using the same binList function and then have it select whether to encode a list or array based on the sub-instance passed to it! That's not even possible using type-classes.


No type annotations


Here's another example of an incredibly awkward use of typeclasses:
class Storable a where
    ...
    sizeOf :: a -> Int
Anybody who has ever had to use this knows how awkward it is when you don't have a value of type a to provide it, which is common. You have to do this:
sizeOf (undefined :: a)
That's just horrible, especially when the solution with value-level instances is so simple in comparison:
data StorableI a = Storable {
    ...
    _sizeOf :: Int }

storable'CInt = StorableI {
    ...
    _sizeOf = 4 }
Now we'd just call:
_sizeOf storable'CInt
... instead of using undefined as a hack.

In fact, with value-level instances, type annotations are never ever necessary. Instead of:
readInt :: String -> Int
readInt = read
... or:
read "4" :: Int
... we'd just use:
read read'Int "4"
In other words, the value-level instance is all the information the function needs, and it's guaranteed to be sound and catch incorrect instance errors at compile-time.


Powerful Approach


I wanted to demonstrate that this is a really industrial-strength replacement to type classes, so I took the mtl's StateT, ReaderT, and Identity and implemented them entirely in value-level instances. The code is provided in the Appendix of this post. This implementation allows you to straightforwardly translate:
test :: (MonadState a m, MonadReader a m) => m ()
test = ask >>= put
... into
test :: MonadStateI a m -> MonadReaderI a m -> m ()
test = \is ir -> let (>>=)   = _bind (_monad'Super'MonadState is)
                  in (_ask ir) >>= (_put is)
You can then instantiate test at the value level using any monad instances that implement the State and Reader capabilities and it generates the correct type and implementation:
example1 :: ReaderT a (StateT a Identity) ()
example1 = test
    (monadState'ReaderT $ monadState'StateT $ monad'Identity)
    (monadReader'ReaderT $ monad'StateT $ monad'Identity)

example2 :: StateT a (ReaderT a Identity) ()
example2 = test
    (monadState'StateT $ monad'ReaderT $ monad'Identity)
    (monadReader'StateT $ monadReader'ReaderT $ monad'Identity)

run1 = runIdentity $ runStateT (runReaderT example1 'A') 'B'
run2 = runIdentity $ runReaderT (runStateT example2 'B') 'A'
-- Both output ((), 'A')
Despite the incredible verbosity, it achieves two amazing things:
  • It's implemented with only a single extension: Rank2Types. No UndecidableInstances required.
  • No type signatures or type annotations are necessary. You can delete every single type signature in the file, which is completely self-contained, and the compiler infers every single type correctly. Try it!


More tricks


This is just scratching the surface. This post doesn't even really cover all the things that are only possible with value-level instances like:
  • Generate lenses for instances (example: Lens (Binary [Int]) (Binary Int))
  • Instances parametrized by run-time values
  • Infinite families of instances (i.e. Stream (MyClassI m))
In other words, what I'm trying to say is that value-level instances are right now above us in the "power spectrum" of Haskell programming and you don't really get a feel for how incredibly useful they are until you actually start using them.


Simplicity


Another feature about value-level instances is the conceptual simplicity and elegance. Before there is a type-class checker and a type-checker. Now there is just a type-checker. You don't really appreciate how great this is until you try it and start getting amazingly clear compiler errors. Programming without type-classes is very intuitive! Really, the hardest part about it is simply naming things!

Also, the fact that it's implementable purely using ordinary functional programming is a very big win. If anything, it would make the GHC compiler writer's jobs much easier by not requiring them to entertain any of the half-baked type-class extensions that people propose. This approach allows you to completely remove type-classes from the language. I'm just putting that out there.

Flaws


On that note, that brings me to the last section, where I will frankly discuss all the huge problems with it. The four biggest problems are:
  • No ecosystem for it. To make effective use of it, you'd need new versions of most Haskell libraries.
  • No do syntactic sugar. This one hurts.
  • Verbosity. Every instance has to be named and passed around.
  • Inertia. Programmers used to overloading will be reluctant to start specifying the instance they want.
The first issue is a huge problem and can only be solved if the community agrees this is actually a good idea. I'm only one person and that's about all my opinion counts for. All I can do is mention that more recent data types are already moving in this direction, with Lens (from data-lens) being the best example. Just imagine how impossible it would have been to implement Lens as a class:
class Lens a b where
    get :: a -> b
    set :: b -> a -> a
It fails horrendously, for the exact same reason the Isomorphism class crashes and burns. When implemented as a data type, it works completely flawlessly at the expense of extra verbosity. So if you liked Lens, chances are you'll like value-level instances in general.

The second issue of syntactic sugar can be solved by something like RebindableSyntax and having do notation use whatever (>>=) is in scope. You would then specify which MonadI instance you use for each do block:
let (>>=) = _bind m
 in do ...
... or you pass the MonadI instance as a parameter to the do block.

This is not ideal, unfortunately and ties into the third issue of verbosity. All I can say is that the only way you can understand that the verbosity is "worth it" is if you try it out and see how much more powerful and easier it is than type-class programming. Also, value-level instances admit the exact same tricks to clean up code as normal parameter passing. For example, you can use Reader (MonadI m) to avoid explicitly passing a monad instance around.

However, this still doesn't solve the problem of just coming up with names for the instances, which is uncomfortable until you get used to it and come up with a systematic nomenclature. This is a case where a more powerful name-spacing system would really help.

The last problem is the most insidious one, in my opinion, which is that we as Haskell programmers have been conditioned to believe that it is correct and normal to have operators change behavior silently when passed different arguments, which completely subverts type-safety. I'm going to conclude by saying that this is absolutely wrong and that the most important reason that you should adopt value-level instances is precisely because they are the type-safe approach to ad-hoc polymorphism.


Appendix


The following code implements StateT, ReaderT, Identity, MonadState, and MonadReader from the mtl, along with some example functions. The code is completely self-contained and can be loaded directly into ghci. Every function is annotated with a comment showing how the mtl implements the exact same class or instance so you have plenty of examples for how you would translate the type-class approach into the value-level instance approach.
{-# LANGUAGE Rank2Types #-}

newtype StateT  s m a = StateT   { runStateT   :: s -> m (a, s) }
newtype ReaderT r m a = ReaderT  { runReaderT  :: r -> m a      }
newtype Identity    a = Identity { runIdentity :: a             }

{- class Monad m where
       return :: a -> m a
       (>>=) :: m a -> (a -> m b) -> m b -}
data MonadI m = MonadI {
    _return :: forall a . a -> m a,
    _bind   :: forall a b . m a -> (a -> m b) -> m b }

{- class MonadTrans t where
       lift :: Monad m => m a -> t m a -}
data MonadTransI t = MonadTransI {
    _lift :: forall a m . MonadI m -> m a -> t m a }

{- class Monad m => MonadState s m | m -> s where
       get :: m s
       put :: s -> m ()
       state :: (s -> (a, s)) -> m a -}
data MonadStateI s m = MonadStateI {
    -- This next line is the secret sauce
    _monad'Super'MonadState :: MonadI m,
    _put :: s -> m (),
    _get :: m s,
    _state :: forall a . (s -> (a, s)) -> m a }

{- class Monad m => Monadreader r m | m -> r where
       ask    :: m r
       local  :: (r -> r) -> m a -> m a
       reader :: (r -> a) -> m a -}
data MonadReaderI r m = MonadReaderI {
    _monad'Super'MonadReader :: MonadI m,
    _ask    :: m r,
    _local  :: forall a . (r -> r) -> m a -> m a,
    _reader :: forall a . (r -> a) -> m a }

{- get :: (Monad m) => StateT s m s
   get = StateT $ \s -> return (s, s) -}
get :: MonadI m -> StateT s m s
get i = StateT $ \s -> (_return i) (s, s)

{- put :: (Monad m) => s -> StateT s m ()
   put s = StateT $ \_ -> return ((), s) -}
put :: MonadI m -> s -> StateT s m ()
put i s = StateT $ \_ -> (_return i) ((), s)

{- state :: (Monad m) => (s -> (a, s)) -> StateT s m a
   state f = StateT (return . f) -}
state :: MonadI m -> (s -> (a, s)) -> StateT s m a
state i f = StateT ((_return i) . f)

{- ask :: (Monad m) => ReaderT r m r
   ask = ReaderT return -}
ask :: MonadI m -> ReaderT r m r
ask i = ReaderT (_return i)

{- local :: (Monad m) =>
       (r -> r) -> ReaderT r m a -> ReaderT r m a
   local f m = ReaderT $ runReaderT m . f -}
local :: MonadI m -> (r -> r) -> ReaderT r m a -> ReaderT r m a
local _ f m = ReaderT $ runReaderT m . f

{- reader :: (Monad m) => (r -> a) -> ReaderT r m a
   reader f = ReaderT (return . f) -}
reader :: MonadI m -> (r -> a) -> ReaderT r m a
reader i f = ReaderT ((_return i) . f)

{- instance Monad (Identity) where
       return = Identity
       m >>= k = k $ runIdentity m -}
monad'Identity :: MonadI Identity
monad'Identity = MonadI {
    _return = Identity,
    _bind = \m k -> k $ runIdentity m }

{- instance (Monad m) => Monad (StateT s m) where
       return a = state $ \s -> (a, s)
       m >>= k  = StateT $ \s -> do
           (a, s') <- runStateT m s
           runStateT (k a) s' -}
monad'StateT :: MonadI m -> MonadI (StateT s m)
monad'StateT i =
    let (>>=) = _bind i
     in MonadI {
            _return = \a -> state i $ \s -> (a, s),
            _bind   = \m k -> StateT $ \s ->
                          runStateT m s >>= \(a, s') ->
                          runStateT (k a) s' }

{- instance (Monad m) => Monad (ReaderT s m) where
       return = lift . return
       m >>= k = ReaderT $ \r -> do
           a <- runReaderT m r
           runReaderT (k a) r -}
monad'ReaderT :: MonadI m -> MonadI (ReaderT s m )
monad'ReaderT i =
    let return = _return i
        (>>=) = _bind i
        lift = _lift monadTrans'ReaderT i
     in MonadI {
            _return = lift . (_return i),
            _bind = \m k -> ReaderT $ \r ->
                runReaderT m r >>= \a ->
                runReaderT (k a) r }

{- instance MonadTrans StateT where
       lift m = StateT $ \s -> do
           a <- m
           return (a, s) -}
monadTrans'StateT :: MonadTransI (StateT s)
monadTrans'StateT = MonadTransI {
    _lift = \i m ->
        let return = _return i
            (>>=)  = _bind   i
         in StateT $ \s ->
                m >>= \a ->
                return (a, s) }

{- instance MonadTrans ReaderT where
       lift m = ReaderT (const m) -}
monadTrans'ReaderT :: MonadTransI (ReaderT r)
monadTrans'ReaderT = MonadTransI {
    _lift = \_ m -> ReaderT (const m) }

{- instance (Monad m) => MonadState s (StateT s m) where
       get = get -- from Control.Monad.Trans.State
       put = put
       state = state -}
monadState'StateT :: MonadI m -> MonadStateI s (StateT s m)
monadState'StateT i = MonadStateI {
    _monad'Super'MonadState = monad'StateT i,
    _get   = get   i,
    _put   = put   i,
    _state = state i }

{- instance (MonadState s m) => MonadState s (ReaderT r m) where
       get   = lift get
       put   = lift . put
       state = lift . state -}
monadState'ReaderT ::
    MonadStateI s m -> MonadStateI s (ReaderT r m)
monadState'ReaderT i =
    let monad'm = _monad'Super'MonadState i
        lift = _lift monadTrans'ReaderT monad'm
     in MonadStateI {
            _monad'Super'MonadState = monad'ReaderT monad'm,
            _get   = lift $ _get   i,
            _put   = lift . _put   i,
            _state = lift . _state i }

{- instance Monad m => MonadReader r (ReaderT r m) where
       ask = ask
       local = local
       reader = reader -}
monadReader'ReaderT :: MonadI m -> MonadReaderI r (ReaderT r m )
monadReader'ReaderT i = MonadReaderI {
    _monad'Super'MonadReader = monad'ReaderT i,
    _ask    = ask    i,
    _local  = local  i,
    _reader = reader i }

{- instance (MonadReader r m) => MonadReader r (StateT s m) where
       ask = lift ask
       local = \f m -> StateT $ local f . runStateT m
       reader = lift . reader -}
monadReader'StateT ::
    MonadReaderI r m -> MonadReaderI r (StateT s m)
monadReader'StateT i =
    let monad'm = _monad'Super'MonadReader i
        lift = _lift monadTrans'StateT monad'm
     in MonadReaderI {
            _monad'Super'MonadReader = monad'StateT monad'm,
            _ask = lift $ _ask i,
            _local = \f m -> StateT $ (_local i f) . runStateT m,
            _reader = lift . (_reader i) }

{- test :: (MonadState a m, MonadReader a m) => m ()
   test = ask >>= put -}
test :: MonadStateI a m -> MonadReaderI a m -> m ()
test = \is ir -> let (>>=)   = _bind (_monad'Super'MonadState is)
                  in (_ask ir) >>= (_put is)

example1 :: ReaderT a (StateT a Identity) ()
example1 = test
    (monadState'ReaderT $ monadState'StateT $ monad'Identity)
    (monadReader'ReaderT $ monad'StateT $ monad'Identity)

example2 :: StateT a (ReaderT a Identity) ()
example2 = test
    (monadState'StateT $ monad'ReaderT $ monad'Identity)
    (monadReader'StateT $ monadReader'ReaderT $ monad'Identity)

run1, run2 :: ((), Char)
run1 = runIdentity $ runStateT (runReaderT example1 'A') 'B'
run2 = runIdentity $ runReaderT (runStateT example2 'B') 'A'

42 comments:

  1. Type-classes in Scala are types themselves, and instances are first class values. Therefore they provide best of both worlds - no explicit passing of dictionaries, but at the same time you get all the value level goodness you describe in this post. Some example code here[1]. See this paper[2] for more details.

    [1]: https://gist.github.com/2575397
    [2]: http://ropas.snu.ac.kr/~bruno/papers/TypeClasses.pdf

    ReplyDelete
    Replies
    1. I never knew that! Thanks for the examples. I think that's a very interesting approach.

      Delete
  2. Much of what you're saying is basically "Compilers can't do this right so let's do it by hand instead".

    To which the answer has always been "What do you mean can't? You just need a better compiler".

    I won't disagree that the original H98 type classes are a flawed design, witnessed by the forest of extensions that were born around them. That's to be expected from greenfield work.

    But does that necessarily mean type class-ish overloading is a dead idea? No! It means we need a better type class design, informed by our 20 years of experience with Haskell 98.

    First of all, let me note that I'm going to ignore your point about safety. I've never actually encountered wrong code that typechecks using an unintended instance when doing actual development, nor have I heard anyone mention this as a significant problem for them.

    After this, the first point to make is this: since the use of type classes (class and instance declaration being a different business) has always been understood as being transformed into value-level code by the compiler, any expressiveness loss over manual dictionary passing can be alleviated by offering an escape hatch into the desugared version (I'm surprised there actually hasn't been any investigation in this direction).

    Regarding the extensibility problem of the current design, we have an extension ready.


    Of course, this means even more extensions. It looks like you have a problem with extensions. I don't. I'm very happy that Haskell remains committed to being the next Haskell while staying backward compatible.

    I'd definitely rather setup a quick keybinding to insert my favorite extension pragmas at the top of my module than write code in which every use of
    * Arithmetic
    * Comparison and equality
    * Monadic bind
    * Conversion to string
    requires me to pass dictionaries around manually.

    ReplyDelete
    Replies
    1. I'm aware of the class alias extension proposal (I am a fan of it!). It was foremost in my mind when writing that section.

      I have encountered many problems with type safety, mainly because I tend to use an fmap/return-heavy style (i.e. fmap instead of map, return instead of singleton, etc.). That's probably bad style, but type safety shouldn't be a matter of coding style.

      What I really want to know is if correct extension wiring is a decidable problem or not. Do you know if that's the case? However, even if it's undecidable, I can understand your approach that we should let the compiler do the parts that are decidable, but then we could also leave the undecidable parts to humans.

      Delete
  3. Just to make sure: wouldn't this then require that every single possible combination of type classes would require its own new datatype?

    ReplyDelete
    Replies
    1. No. You just pass them as separate arguments (see, the mtl example code), or as a tuple if that's your thing.

      Delete
  4. Type classes where created to solve the problem of ad-hoc polymorphism (i.e. "How to make ad-hoc polymorphism less ad hoc" in http://homepages.inf.ed.ac.uk/wadler/topics/type-classes.html). Having different functions for different types and manually threading/selecting them was a problem to be solved.

    OTOH I agree that having the ability to access the dictionary as a value is interesting and worth exploring.

    ReplyDelete
  5. Excellent post!

    Btw, I've found that using the ImplictParams extension with your value-level typeclasses makes them much less verbose: https://gist.github.com/2578479

    ReplyDelete
  6. I didn't even realize that extension existed! Thanks for the tip. I'll check it out.

    ReplyDelete
  7. You're making a very good point, and a I this post clarifies me some things :).

    I think that it's important the choice between type-level and value-level programming. For instance, in OOP some languages forces the creation of classes to model, and other languages use prototypes and permit value-level modeling (I'm making a contrast from this post :) ).

    I'm thinking in a sentence that I read a long time ago:

    "A language must provide a means for classifying similar objects, and for adding new classes of objects on equal footing with the kernel classes of the system." From: http://www.cs.virginia.edu/~evans/cs655/readings/smalltalk.html

    In Haskell we have the data construct that permit making this classification of values, and this is why you can model Monads in a value-level way.

    But the same article says:

    "Factoring: Each independent component in a system would appear in only one place"

    This is another good point. In the value-level modeling I can change a received in a function, so I think this is a disadvantage to take into account.

    For instance:

    f aMonoid = ...
    where newMonoind = aMonoid { _mappend = (+) }

    In the type-level version I can't do that. I can't mix interface implementation with functions that use those interfaces. This then introduces some inherent complexity. One may like the idea to make the model flow-depedent (in Haskell we have referential transparency so maybe this is not the case), but complexity is added.

    ReplyDelete
    Replies
    1. That's an interesting point. I would actually argue that the "single location" principle is actually wrong! This leads to a single point of failure and a non-robust system. The best example of this is the one I gave in my post about not being able to fix Monad to automatically derive a functor instance from it. Every type-class that the Haskell eco-system builds on becomes a single point of failure. However, given a value-level representation, it's trivial to duplicate a value and fix it.

      As you mentioned, this leads to the problem of instance coherency, which other people raised on reddit. I don't know the answer to this, but I think any solution that relies on there being a single location is inherently flawed (and not functional).

      Delete
  8. Please see this:

    http://ropas.snu.ac.kr/~bruno/papers/TypeClasses.pdf

    The Scala doesn't have type classes, but can simulate them using explicit dictionaries and implicit parameters. This combines the strengths of explicit dictionary approach and type classes - the dictionaries are inferred, but one can pass an explicitly named dictionary if needed.

    ReplyDelete
    Replies
    1. Thanks for the link. That's a great paper, even for a guy like me who hasn't programmed in Scala.

      Delete
  9. You can actually recover the do notation syntax. The trick, which Ed Kmett showed me, is to have all expressions that return a monadic value, f a, instead return a function, Monad f -> f a. So, return :: a -> (Monad f -> f a), (>>=) :: (Monad f -> f a) -> (a -> (Monad f -> f b)) -> (Monad f -> f b). You can implement all the monadic combinators like this, and desugar do notation to them. The do block evaluates to a function that accepts the monad dictionary, so you can even conveniently write code that's polymorphic in the choice of monad, without having to thread the dictionary around manually.

    ReplyDelete
    Replies
    1. That's a nice trick! I should have thought of using the Reader monad.

      Delete
    2. That's really nice. I don't have type-classes in my little Haskell subset that compiles to JavaScript (I should probably think of a name for it), but I want explicit dictionary monads and polymorphic do-notation. This approach will allow me to do that!

      Delete
  10. This is actually how I first implemented polymorphism before I learned how to use type classes! One major drawback that you don't point out, however, is that it prevents you from deriving instances of e.g. Show---of course you'd call it something else under your system---because functions cannot be shown.

    ReplyDelete
  11. How can you represent data families using this approach?

    ReplyDelete
  12. Interesting. This is what I've recently called "Interface-Passing Style". See my ILC2012 article: http://common-lisp.net/~frideau/lil-ilc2012/lil-ilc2012.html
    http://common-lisp.net/~frideau/lil-ilc2012/lil-ilc2012.pdf

    Question is: can you do that in Haskell and still achieve Ad Hoc polymorphism? Maybe by using Type-Classes instead of mere records to define and refine your Interface objects?

    ReplyDelete
    Replies
    1. Yes, you can. The current way to do this is to reify a type class as a dictionary:

      -- From the 'constraints' package
      data Dict :: Constraint -> * where
      Dict :: a => Dict a

      The `ConstraintKinds` extension makes the above trick possible and it has opened a lot of doors for users that need to work around limitations of the type class system.

      Delete
  13. I think newtypes are still required for monad transformers, since there is no type-lambda in Haskell.

    ReplyDelete
    Replies
    1. Yes, that's right. The type variable have to be in the right places otherwise the dictionary you get will be unusable since it won't type-check.

      Delete
  14. Really nice post, thanks for doing this. Is the Rank2Types extension used so that the data definitions only need one type parameter?

    ReplyDelete
    Replies
    1. Yes. You could remove the extension by parametrizing them on the hidden types.

      I also later found out that PolymorphicComponents is a milder version of the same extension.

      Delete
  15. If I don't missed something you missed that type classes are primarily a solution to the http://en.wikipedia.org/wiki/Expression_problem . Passing functions explicit couples the used code and is therefore inappropriate to solve problems on this level.

    ReplyDelete
    Replies
    1. I'm not sure about this. What's the difference between coupling induced by relying on a constraint and the coupling induced from relying on an explicit dictionary argument?

      Delete
    2. I'm not sure either (what is mostly based on my limited understanding of haskell and how exactly type classes are emerging - e.g. are they "global"). I just came to think that type classes allow to "add" a type to any data type without changing its original definition. The "dependency" is on the common contract defined (externally) by the type class. As I understand it you make the type class a data type with appropriate functions instead. What does that mean for a "user" of that construct ? Can you make a type (e.g. Int) *be* another type ? Hence you extend the "nature" of a type external from the type definition itself. I assume your approach would add something I would describe as an "adapter" between the Int from the example and the new "type class". What allows for some use cases of type classes and does not for others as I see it.

      Delete
    3. The main difference between a type class and a dictionary of functions is how much work the compiler does with you. The trick you are describing is defining a value by its interfaces, such as:

      (HasFoo a, HasBar a) => a -> Int

      The equivalent dictionary-based approach would be:

      (HasFoo a, HasBar a) -> a -> (S

      In the former approach the compiler would fill in the necessary dictionaries and make sure that they were unique for that type.

      to make that concrete, let's imagine that our classes had the following definitions:

      class HasFoo a where foo :: a -> String

      class HasBar a where bar :: a -> Char

      Then our function is basically the same thing as:

      (a -> String, a -> Char) -> a -> Int

      So type-classes work because they avoid dealing with the concrete underlying data type by abstracting over it completely and replacing it with `a`, instead providing the relevant functions that `a` must support.

      Delete
    4. So one could say that (also) type constraints are transformed into data (here a tuple of functions) what is - as you have shown - much easier to handle. I don't think this is just true for haskell in particular. The type system most often is something the programmer can't program while we can program data and functions. Beside this functionality aspect I try to understand if there is a difference between this and type classes when it comes to "who is defining what where, what depends on the other, how can things be extended". I didn't get this picture clear. My goal here is to find a replacement for most of the type argument programming because it easily gets in the way of the programmer as you have shown for haskell specific constructs.

      Delete
    5. So there are disadvantages to getting rid of type classes completely. The compiler does less work for the programmer and you can't enforce the "one unique instance per type" principle.

      The main difference between the two approaches in terms of the expression problem is that using type classes means that the person defining the type class instance supplies the missing functions. Without type classes the responsibility of supplying the dictionaries shifts to the person calling the function.

      Delete
    6. It took me a while but now I got something clear and found what I was looking for. And I made one surprising and amusing conclusion: type classes are inherently an object oriented thought. The say that a type *has a* function (or more) while your proposal is saying that there *is a* known function (or more) what is far more "functional" notion. Thanks for the discussion.

      Delete
    7. That's right. Type classes associate the function with the type, whereas dictionary passing does not implicitly assume that there is only one suitable function for a given type.

      Sometimes this is good, like when you want to ensure that the same function is used consistently, such as the `Ord` constraint that a `Set` uses to order its elements. If you keep changing the comparison function you use each time you do an operation on the `Set` then you will break its internal invariants. In that case the type class is good because it enforces that the comparison function is unique for that type.

      However, sometimes it causes problems, too. For example, `Bool` has two dual instances for `Monoid`, both equally useful and equally valid:

      instance Monoid Bool where mempty = False and mappend = (&&)

      instance Monoid Bool where mempty = True and mappend = (||)

      So we end up distinguishing which one we mean using the `All` or `Any` newtypes from `Data.Monoid`.

      Delete
    8. Interesting. While haskell is a great source of good ideas it surely also has its odds. I think I found a solution (for my type system) that allows the global enforcement as a special case of the general concept that is case based but still the programmer does not need to manually manage the dictionaries - that will be part of the language. I'm not 100% sure but I think that it will also eliminate all the complex/impossible corner cases of type parameters as in java or type classes. But it is way to early. Currently the only running compiler is in my mind.

      Delete
    9. What you described is what the Agda language does. Agda allows you to both manually manage the dictionaries or let the compiler insert the implicitly for you. I think it also has ways to enforce uniqueness as a special case, too, but I'm not completely sure.

      Delete
    10. The logic aspect is also interesting for me. I will read the tutorial paper on Agda.

      Delete
  16. > ... or you pass the MonadI instance as a parameter to the do block.

    This is pretty similar to how F# does it. A computation expression, which is the ~equivalent of a do-block, is started with an object that has a Bind and a Return method, plus a bunch of other optional ones that mostly only make sense because F# isn't pure. http://msdn.microsoft.com/en-us/library/dd233182.aspx

    ReplyDelete
    Replies
    1. Neat! I never knew that. That's an interesting approach to use an object as a dictionary. It's certainly less clumsy than the Haskell equivalent unless you use Haskell's `RecordWildCard` extension to unpack all the methods.

      Delete
  17. Nice. What you are discussing seems to be embraced by GADT's. See the GHC manual 7.4.6

    ReplyDelete
    Replies
    1. Maybe, I only have a tenuous grasp of what you are explaining--and in the field in general. The nuances are so important here so I may be off base, but here's what I'm thinking. I came to your page through a google search when I was trying to understand that manual section the passage:

      "For example, one possible application is to reify dictionaries:
      data NumInst a where
      MkNumInst :: Num a => NumInst a

      intInst :: NumInst Int
      intInst = MkNumInst

      plus :: NumInst a -> a -> a -> a
      plus MkNumInst p q = p + q
      "
      The plus application here is similar to your sequence' example, except that the pattern match on MkNumInst implies Num a, (makes that dictionary available(?)). IOW, GADT's allow for this pattern match to make available (+) automatically based on the class constraint included in the MkNumInst definition. So its not a scrap of the type class system but instead somewhat of a fusion the system and the methodology you are writing about.

      Delete
    2. Oh yeah, that's right! `GADT`s are one way to package up type class instances into dictionaries that act like witnesses to that instance. If you're really interested in this line of inquiry I've heard that you can do even more powerful things along these lines using the `ConstraintKinds` extension, although I don't know the specific details.

      Delete