#### Prelude

Some time ago I asked a question on /r/haskell about what unique purpose GADTs served that other language features could not provide. Edward Kmett (as usual) gave the best answer:

To be pedantic, there should be no examples that cannot be encoded without GADTs. You can always transform a GADT into a finally [sic] tagless representation through a class if you have Rank N types.The "Finally Tagless" paper came up several times in that discussion, so I decided to read it and it was a very well-written and exciting paper. However, both Edward's answer and one key sentence from the paper damaged my understanding of GADTs for 7 months:

In Haskell, typecase can be implemented using either GADTs or type-class functional dependenciesThis combined with Edward's answer led me to believe that type-classes and functional dependencies were necessary to implement GADTs in the absence of language support for GADTs, but I recently discovered this is wrong! The only extension you need is

`Rank2Types`, and no type-classes or functional dependencies are required at all.

#### GADTs

For those unfamiliar with GADTs ("Generalized Algebraic Data Types"), they allow you to restrict the type variables of a constructor's result. For example, when you define a type like:

data Maybe a = Just a | Nothing... one of the things it does is define two constructors with the following type signatures:

Just :: a -> Maybe a Nothing :: Maybe aUsing the

`GADTs`extensions, you can declare data types like

`Maybe a`by instead supplying the type signature of the constructors:

data Maybe a where Just :: a -> Maybe a Nothing :: Maybe aHowever, this extension unlocks an additional feature, namely the ability to type-restrict the type variables of the constructor's final result. The most common example is a list with type-level length annotations.

data Z -- i.e. "Z"ero data S n -- i.e. "S"ucc -- A List of length 'n' holding values of type 'a' data List a n where Nil :: List a Z Cons :: a -> List a m -> List a (S m)The advantage of type-level annotations is that we can now use that annotation to enforce stronger invariants. For example, we can write a safe

`head`function that only accepts lists with at least one element:

head :: List a (S n) -> a head (Cons a _) = aThe type-level length annotation guarantees that you can't pass an empty list (i.e.

`Nil`) to

`head`, so it is now a safe and total function.

However, you cannot directly translate the above

`List`constructors into ordinary data types because with ordinary data types the final result of each constructor must be polymorphic in all type variables. The above definitions for

`Nil`and

`Cons`require restricting the length type variable returned by the constructor.

#### The Yoneda Lemma

Fortunately, the Yoneda lemma from category theory provides the necessary trick to convert a GADT to an ordinary data type. The Yoneda lemma translated into Haskell is actually more understandable than the equivalent category theory explanation (at least, to me). It simply says that if

`f`is a functor, then the following two types are isomorphic:

(forall b. (a -> b) -> f b) ~ f a... which means that we can define two functions

`fw`and

`bw`that can convert back and forth between those two types:

fw :: (Functor f) => (forall b . (a -> b) -> f b) -> f a fw f = f id bw :: (Functor f) => f a -> (forall b . (a -> b) -> f b) bw x f = fmap f xActually, these functions must meet one more requirement to be an isomorphism. They must satisfy the following laws:

fw . bw = id bw . fw = idOne of those is easy to prove, the other one is difficult (and is the meat of the Yoneda lemma in Haskell). Also, you can translate the Yoneda lemma into Haskell in other ways, because it's very general, but the above version suffices for this post.

#### The Trick

Now we will use the Yoneda isomorphism to transform the GADT constructors (with restricted results) into isomorphic ordinary constructors (with polymorphic results). Let's begin with the above

`List`data type.

The

`Nil`constructor has the signature:

Nil :: List a Z... but the Yoneda lemma says that if

`List a`is a functor, then the following constructor is isomorphic:

Nil :: (Z -> n) -> List a nSimilarly, the signature for the

`Cons`constructor:

Cons :: a -> List a m -> List a (S m)... can be transformed into:

Cons :: a -> List a m -> (S m -> n) -> List a nNow we have a data type where the constructors have polymorphic type variables in their output:

data List a n where Nil :: (Z -> n) -> List a n Cons :: a -> List a m -> (S m -> n) -> List a n... meaning that we can now transform it into an ordinary data type:

data List a n = Nil (Z -> n) | forall m. Cons a (List a m) (S m -> n)The above type signature only requires the

`Rank2Types`extension to write.

Notice how the above type converts the "phantom type" into a concrete value-level dependency. By restricting the permissible functions for

`(Z -> n)`and

`(S m -> n)`we can constrain the GADTs equivalent "phantom type". Interestingly, those two constraints (correct me if I'm wrong) seem to define an F-algebra.

Remember that the Yoneda isomorphism only holds if

`List a`is a functor, so the final step is to define the appropriate functor:

instance Functor (List a) where fmap f (Nil k) = Nil (f . k) fmap f (Cons a as k) = Cons a as (f . k)Or you can use the

`DeriveFunctor`extension and just write:

data List a n = ... deriving (Functor)... and it will derive the correct

`Functor`instance automatically for you.

Take care to note that this

`Functor`instance is not the same as the ordinary

`Functor`instance for lists. This one maps functions over the length type parameter

`n`, as opposed to the value type parameter

`a`.

#### Conversion

The equations proving the Yoneda isomorphism say that we should be able to convert our data type back into the equivalent GADT just by using the

`fw`function. Let's check it out:

>>> :t fw Nil fw Nil :: List a Z >>> :t fw (Cons 'A' (fw Nil)) fw (Cons 'A' (fw Nil)) :: List Char (S Z)We now have lists with type-level length annotations, but without GADTs. All we would have to do to recapitulate the GADT constructors would be to define the smart constructors:

nil = fw Nil cons a as = fw (Cons a as) >>> :t nil nil :: List a Z >>> :t cons 'A' nil cons 'A' nil :: List Char (S Z)There is no magic to what

`fw`is doing. Remember all it does is supply an

`id`, so we could have rewritten the above as:

>>> :t Nil id Nil id :: List a Z >>> :t Cons 'A' (Nil id) id Cons 'A' (Nil id) id :: List Char (S Z)To understand why this works, notice that the type signature of

`Nil`is:

Nil :: (Z -> n) -> List a nSo when we pass it an

`id`, the compiler infers that the first field of

`Nil`must have type

`(Z -> Z)`, therefore

`n`must be

`Z`:

Nil :: (Z -> Z) -> List a ZIf our new

`List`data type is truly isomorphic to the original, then we should be able to write a type-safe

`head`function, just like with the GADT. Let's first write out the type signature for

`head`:

head :: List a (S n) -> aThe argument has type

`List a (S n)`, which expands to one of two possible constructors:

List a (S n) = Nil (Z -> S n) | forall m. Cons a (List a m) -> (S m -> S n)The Cons constructor is always permissible, since we can always pass it an

`id`for its final field, to satisfy the type (which would constrain

`m`to

`n`). However, the existence of the

`Nil`constructor depends on the existence of a function of type

`(Z -> S n)`.

Fortunately, no such function exists, so we can guarantee that such a

`Nil`constructor can never be built.

`S n`has no constructors, so the only way we can build it is to start from a pre-existing value of type

`S n`, and this is what the

`id`function does.

#### Initiality

However, more generally Haskell possesses a major flaw that empty types truly aren't empty and are in fact inhabited by the "bottom" value (i.e.

`undefined`/

`_|_`). This means that I could define the following function:

showZ :: Z -> String showZ _ = "Hello, world"... and it would type-check even though

`Z`has no constructors that you could pass to

`showZ`. Unfortunately, it type-checks because the following is valid Haskell:

fakeZ :: Z fakeZ = fakeZ -- the definition of bottom >>> showZ fakeZ "Hello, world!"This leads to weird situations where you could do things like build lists with lengths of type

`String`

>>> :t Nil showZ Nil showZ :: List a StringSo the absence of a truly initial object in Haskell (i.e. a truly empty type) means that the above GADT transformation is not as safe as true GADTs. If it were, then we could implement all our type-level tricks at the value level, using constructor fields like our

`(Z -> n)`to recapitulate type-level proof requirements at the value-level.

This means the best we can do is define our

`head`function to be:

head :: List a (S n) -> a head (Cons a _ _) = a... and pray that the user does not define functions of types with empty constructors.

`operational` vs. `free`

Another example of this equivalence between these two representations of GADTs came up recently on reddit in the comments to another post of mine discussing free monads. There was a discussion between the relationship between free monads and the

`operational`package. However, the above transformation shows that the free monad is as powerful as the

`operational`approach.

To demonstrate this, using the

`free`package you would define the following base functor for a teletype:

data TeletypeF x = PutChar Char x | GetChar (Char -> x) deriving (Functor).. and then free monad for a teletype program would then be:

type Teletype = Free TeletypeFUsing the

`operational`package, you would instead write:

data TeletypeF x where PutChar :: Char -> TeletypeF () GetChar :: TeletypeF Char.. and the corresponding program would be:

type Teletype = Program TeletypeFHowever, the Yoneda transformation shows that the two data types are in fact equivalent. Transforming the operational GADT using the Yoneda lemma gives:

data TeletypeF x = PutChar :: Char -> (() -> x) -> TeletypeF x GetChar :: (Char -> x) -> TeletypeF x deriving (Functor)... which is exactly isomorphic to the signature for the corresponding Free monad.

#### Conclusions

GADTs are nothing more than the Yoneda lemma in disguise.

This implies that the absence of truly initial objects in Haskell is the only obstacle to moving all dependently-typed programming "down" to the value level. Haskell's type-level programming extensions can be seen as nothing more than a work-around for the lack of value-level initial objects, and they solve this issue by moving parts of the program to the type level where there are true initial objects. Thus, it perhaps is more appropriate to think of type-level programming as not being a step "above" value level programming in some sort of heirarchy (i.e. values → types → kinds → hyper-kinds) but rather simply existing on the same plane with different rules. This would imply Haskell programming is simply a hybrid approach of programming using two arenas (the value-level arena where initial objects don't exist and the type-level arena where they do exist). Instead of value-level and type-level programming being two different "levels" (which implies an ordering or hierarchy), maybe they are simply two unordered arenas of programming.

But isn't a difference between the two styles that in the Yoneda lemma style, you actually need a value-level datum for the list length, whereas in the GADT style this datum never has to exist at the value level?

ReplyDeleteThe "datums" are the functions transforming the embedded index type (such as Z for Nil) to the exposed type (i.e. the polymorphic n). These functions are proofs that the exposed type is reachable from the embedded type (ala Curry-Howard correspondence), and every type is guaranteed to be reachable from itself via the "id", function. Therefore, at a minimum the embedded index is always reachable from itself simply by supplying "id", even if the type is completely uninhabited. However, that only works because of the lack of initiality. If Haskell somehow got rid of bottom, then yes you would need to provide a concrete constructor to define the appropriate id for that type.

DeleteYou don't need MPTCs to implement the class-based fake GADTs from the finally tagless paper, just classes.

ReplyDeleteAs for why quantification is enough, classes are themselves a limited form of quantification. If you attempt to desugar a typeclass into its dictionary, you'll find that almost all of them have rank 2 types.

If you have rank 2 types, you can encode the classes from that paper as a data type, and this result should be completely unsurprising.

As for the approach you give here, negating an existential (as you'd get from a GADT) yields the universal quantifier you see here.

The way I've come to think of the operational monad approach vs the more direct implementation free monad or my Church encoded variant is that the operational monad is just Free (Coyoneda f), and since Coyoneda f ~ f, (modulo the extra bottom forced on you by haskell) the operational monad is isomorphic to Free f. Since Coyoneda f provides 'fmap fusion' for you, and the Free monad only uses the fmap of the underlying functor, Free (Coyoneda f) or the operational monad provide fusion for >>='s. (Also, when you fuse the Coyoneda constructor with the Free constructor from the Free monad, you can eliminate the place the extra bottom I'd mentioned hides.

data FreeCoyoneda f a = Pure a | forall b. Free (f b) (a -> FreeCoyoneda f a)

Thanks. I'd always suspected they were isomorphic and hadn't yet proven the other direction, but your formulation makes the isomorphism clear.

DeleteYes I have used also the free monad of coyoneda and have discussed on IRC, so it is like that although I think it should be

Deletedata FreeCoyoneda f a = Pure a | forall b. Free (f b) (b -> FreeCoyoneda f a)

I think I am the guy who also thought of that. I have also made up things like that list with the specified number of values in the type, although I have used an uninhabited type Zero and then Maybe for the successor. You can make it like the GADT described although using this representation also means that (->) is the type of an array with the specified number of elements!

There is a mistake I think:

ReplyDelete(forall a. (a -> b) -> f b) ~ f a

'forall a' should be 'forall b'

Thanks. I fixed it.

DeleteInterestingly, this is really the coercion encoding that GHC implements, only using functions to approximate coercions!

ReplyDeleteWhat's wrong with the showZ function? I could write it in Agda, too. There's nothing verboten about ignoring the argument. You need to change the definition of List if you want to exclude List Int String. Taking coercions as inspiration, though, you could ask for functions *both* ways (just as coercions are invertible):

data List a n where

Nil :: (Z -> n) -> (n -> Z) -> List a n

Cons :: a -> List a m -> (S m -> n) -> (n -> S m) -> List a n

I guess List a isn't a functor anymore, though.

Also, here's a total function from Z to S n:

f :: Z -> S n

f x = case x of _ -> undefined

The undefined there is dead code. (In GHC with EmptyCase you can just say “case x of {}”.) You don't want Z and S to be *empty* types, you want them to be *abstract* types.