Some time ago I asked a question on /r/haskell about what unique purpose GADTs served that other language features could not provide. Edward Kmett (as usual) gave the best answer:
To be pedantic, there should be no examples that cannot be encoded without GADTs. You can always transform a GADT into a finally [sic] tagless representation through a class if you have Rank N types.The "Finally Tagless" paper came up several times in that discussion, so I decided to read it and it was a very well-written and exciting paper. However, both Edward's answer and one key sentence from the paper damaged my understanding of GADTs for 7 months:
In Haskell, typecase can be implemented using either GADTs or type-class functional dependenciesThis combined with Edward's answer led me to believe that type-classes and functional dependencies were necessary to implement GADTs in the absence of language support for GADTs, but I recently discovered this is wrong! The only extension you need is Rank2Types, and no type-classes or functional dependencies are required at all.
For those unfamiliar with GADTs ("Generalized Algebraic Data Types"), they allow you to restrict the type variables of a constructor's result. For example, when you define a type like:
data Maybe a = Just a | Nothing... one of the things it does is define two constructors with the following type signatures:
Just :: a -> Maybe a Nothing :: Maybe aUsing the GADTs extensions, you can declare data types like Maybe a by instead supplying the type signature of the constructors:
data Maybe a where Just :: a -> Maybe a Nothing :: Maybe aHowever, this extension unlocks an additional feature, namely the ability to type-restrict the type variables of the constructor's final result. The most common example is a list with type-level length annotations.
data Z -- i.e. "Z"ero data S n -- i.e. "S"ucc -- A List of length 'n' holding values of type 'a' data List a n where Nil :: List a Z Cons :: a -> List a m -> List a (S m)The advantage of type-level annotations is that we can now use that annotation to enforce stronger invariants. For example, we can write a safe head function that only accepts lists with at least one element:
head :: List a (S n) -> a head (Cons a _) = aThe type-level length annotation guarantees that you can't pass an empty list (i.e. Nil) to head, so it is now a safe and total function.
However, you cannot directly translate the above List constructors into ordinary data types because with ordinary data types the final result of each constructor must be polymorphic in all type variables. The above definitions for Nil and Cons require restricting the length type variable returned by the constructor.
The Yoneda Lemma
Fortunately, the Yoneda lemma from category theory provides the necessary trick to convert a GADT to an ordinary data type. The Yoneda lemma translated into Haskell is actually more understandable than the equivalent category theory explanation (at least, to me). It simply says that if f is a functor, then the following two types are isomorphic:
(forall b. (a -> b) -> f b) ~ f a... which means that we can define two functions fw and bw that can convert back and forth between those two types:
fw :: (Functor f) => (forall b . (a -> b) -> f b) -> f a fw f = f id bw :: (Functor f) => f a -> (forall b . (a -> b) -> f b) bw x f = fmap f xActually, these functions must meet one more requirement to be an isomorphism. They must satisfy the following laws:
fw . bw = id bw . fw = idOne of those is easy to prove, the other one is difficult (and is the meat of the Yoneda lemma in Haskell). Also, you can translate the Yoneda lemma into Haskell in other ways, because it's very general, but the above version suffices for this post.
Now we will use the Yoneda isomorphism to transform the GADT constructors (with restricted results) into isomorphic ordinary constructors (with polymorphic results). Let's begin with the above List data type.
The Nil constructor has the signature:
Nil :: List a Z... but the Yoneda lemma says that if List a is a functor, then the following constructor is isomorphic:
Nil :: (Z -> n) -> List a nSimilarly, the signature for the Cons constructor:
Cons :: a -> List a m -> List a (S m)... can be transformed into:
Cons :: a -> List a m -> (S m -> n) -> List a nNow we have a data type where the constructors have polymorphic type variables in their output:
data List a n where Nil :: (Z -> n) -> List a n Cons :: a -> List a m -> (S m -> n) -> List a n... meaning that we can now transform it into an ordinary data type:
data List a n = Nil (Z -> n) | forall m. Cons a (List a m) (S m -> n)The above type signature only requires the Rank2Types extension to write.
Notice how the above type converts the "phantom type" into a concrete value-level dependency. By restricting the permissible functions for (Z -> n) and (S m -> n) we can constrain the GADTs equivalent "phantom type". Interestingly, those two constraints (correct me if I'm wrong) seem to define an F-algebra.
Remember that the Yoneda isomorphism only holds if List a is a functor, so the final step is to define the appropriate functor:
instance Functor (List a) where fmap f (Nil k) = Nil (f . k) fmap f (Cons a as k) = Cons a as (f . k)Or you can use the DeriveFunctor extension and just write:
data List a n = ... deriving (Functor)... and it will derive the correct Functor instance automatically for you.
Take care to note that this Functor instance is not the same as the ordinary Functor instance for lists. This one maps functions over the length type parameter n, as opposed to the value type parameter a.
The equations proving the Yoneda isomorphism say that we should be able to convert our data type back into the equivalent GADT just by using the fw function. Let's check it out:
>>> :t fw Nil fw Nil :: List a Z >>> :t fw (Cons 'A' (fw Nil)) fw (Cons 'A' (fw Nil)) :: List Char (S Z)We now have lists with type-level length annotations, but without GADTs. All we would have to do to recapitulate the GADT constructors would be to define the smart constructors:
nil = fw Nil cons a as = fw (Cons a as) >>> :t nil nil :: List a Z >>> :t cons 'A' nil cons 'A' nil :: List Char (S Z)There is no magic to what fw is doing. Remember all it does is supply an id, so we could have rewritten the above as:
>>> :t Nil id Nil id :: List a Z >>> :t Cons 'A' (Nil id) id Cons 'A' (Nil id) id :: List Char (S Z)To understand why this works, notice that the type signature of Nil is:
Nil :: (Z -> n) -> List a nSo when we pass it an id, the compiler infers that the first field of Nil must have type (Z -> Z), therefore n must be Z:
Nil :: (Z -> Z) -> List a ZIf our new List data type is truly isomorphic to the original, then we should be able to write a type-safe head function, just like with the GADT. Let's first write out the type signature for head:
head :: List a (S n) -> aThe argument has type List a (S n), which expands to one of two possible constructors:
List a (S n) = Nil (Z -> S n) | forall m. Cons a (List a m) -> (S m -> S n)The Cons constructor is always permissible, since we can always pass it an id for its final field, to satisfy the type (which would constrain m to n). However, the existence of the Nil constructor depends on the existence of a function of type (Z -> S n).
Fortunately, no such function exists, so we can guarantee that such a Nil constructor can never be built. S n has no constructors, so the only way we can build it is to start from a pre-existing value of type S n, and this is what the id function does.
However, more generally Haskell possesses a major flaw that empty types truly aren't empty and are in fact inhabited by the "bottom" value (i.e. undefined/_|_). This means that I could define the following function:
showZ :: Z -> String showZ _ = "Hello, world"... and it would type-check even though Z has no constructors that you could pass to showZ. Unfortunately, it type-checks because the following is valid Haskell:
fakeZ :: Z fakeZ = fakeZ -- the definition of bottom >>> showZ fakeZ "Hello, world!"This leads to weird situations where you could do things like build lists with lengths of type String
>>> :t Nil showZ Nil showZ :: List a StringSo the absence of a truly initial object in Haskell (i.e. a truly empty type) means that the above GADT transformation is not as safe as true GADTs. If it were, then we could implement all our type-level tricks at the value level, using constructor fields like our (Z -> n) to recapitulate type-level proof requirements at the value-level.
This means the best we can do is define our head function to be:
head :: List a (S n) -> a head (Cons a _ _) = a... and pray that the user does not define functions of types with empty constructors.
operational vs. free
Another example of this equivalence between these two representations of GADTs came up recently on reddit in the comments to another post of mine discussing free monads. There was a discussion between the relationship between free monads and the operational package. However, the above transformation shows that the free monad is as powerful as the operational approach.
To demonstrate this, using the free package you would define the following base functor for a teletype:
data TeletypeF x = PutChar Char x | GetChar (Char -> x) deriving (Functor).. and then free monad for a teletype program would then be:
type Teletype = Free TeletypeFUsing the operational package, you would instead write:
data TeletypeF x where PutChar :: Char -> TeletypeF () GetChar :: TeletypeF Char.. and the corresponding program would be:
type Teletype = Program TeletypeFHowever, the Yoneda transformation shows that the two data types are in fact equivalent. Transforming the operational GADT using the Yoneda lemma gives:
data TeletypeF x = PutChar :: Char -> (() -> x) -> TeletypeF x GetChar :: (Char -> x) -> TeletypeF x deriving (Functor)... which is exactly isomorphic to the signature for the corresponding Free monad.
GADTs are nothing more than the Yoneda lemma in disguise.
This implies that the absence of truly initial objects in Haskell is the only obstacle to moving all dependently-typed programming "down" to the value level. Haskell's type-level programming extensions can be seen as nothing more than a work-around for the lack of value-level initial objects, and they solve this issue by moving parts of the program to the type level where there are true initial objects. Thus, it perhaps is more appropriate to think of type-level programming as not being a step "above" value level programming in some sort of heirarchy (i.e. values → types → kinds → hyper-kinds) but rather simply existing on the same plane with different rules. This would imply Haskell programming is simply a hybrid approach of programming using two arenas (the value-level arena where initial objects don't exist and the type-level arena where they do exist). Instead of value-level and type-level programming being two different "levels" (which implies an ordering or hierarchy), maybe they are simply two unordered arenas of programming.