Thursday, April 8, 2021

How to replace Proxy with AllowAmbiguousTypes

no-proxy

This post is essentially a longer version of this Reddit post by Ashley Yakeley that provides more explanation and historical context.

Sometimes in Haskell you need to write a function that “dispatches” only on a type and not on a value of that type. Using the example from the above post, we might want to write a function that, given an input type, prints the name of that type.

Approach 1 - undefined

One naive approach would be to do this:

class TypeName a where
    typeName :: a -> String

instance TypeName Bool where
    typeName _ = "Bool"

instance TypeName Int where
    typeName _ = "Int"

… which we could use like this:

>>> typeName False
"Bool"
>>> typeName (0 :: Int)
"Int"

However, this approach does not work well because we must provide the typeName method with a concrete value of type a. Not only is this superfluous (we don’t care which value we supply) but in some cases we might not even be able to supply such a value.

For example, consider this instance:

import Data.Void (Void)

instance TypeName Void where
    typeName _ = "Void"

There is a perfectly valid name associated with this type, but we cannot retrieve the name without cheating because we cannot produce a (total) value of type Void. Instead, we have to use something like undefined:

>>> typeName (undefined :: Void)
"Void"

The base package uses this undefined-based approach. For example, Foreign.Storable provides a sizeOf function that works just like this:

class Storable a where
    -- | Computes the storage requirements (in bytes) of the argument. The value
    -- of the argument is not used.
    sizeOf :: a -> Int

… and to this day the idiomatic way to use sizeOf is to provide a fake value using undefined:

>>> sizeOf (undefined :: Bool)
4

This works because sizeOf never evaluates its argument. It’s technically safe, albeit not very appealing to depend on undefined.

Approach 2A - Proxy

The next evolution of this approach was to use the Proxy type (now part of base in the Data.Proxy module). As the documentation notes:

Historically, Proxy :: Proxy a is a safer alternative to the undefined :: a idiom.

I’m not exactly sure what the name Proxy was originally meant to convey, but I believe the intention was that a term (the Proxy constructor) stands in as a “proxy” for a type (specifically, the type argument to the Proxy type constructor).

We can amend our original example to use the Proxy type like this:

import Data.Proxy (Proxy(..))
import Data.Void (Void)

class TypeName a where
    typeName :: Proxy a -> String

instance TypeName Bool where
    typeName _ = "Bool"

instance TypeName Int where
    typeName _ = "Int"

instance TypeName Void where
    typeName _ = "Void"

… and now we can safely get the name of a type without providing a specific value of that type. Instead we always provide a Proxy constructor and give it a type annotation which “stores” the type we wish to use:

>>> typeName (Proxy :: Proxy Bool)
"Bool"
>>> typeName (Proxy :: Proxy Int)
"Int"
>>> typeName (Proxy :: Proxy Void)
"Void"

We can simplify that a little bit by enabling the TypeApplications language extension, which permits us to write this:

>>> :set -XTypeApplications
>>> typeName (Proxy @Bool)
"Bool"
>>> typeName (Proxy @Int)
"Int"
>>> typeName (Proxy @Void)
"Void"

… or this if we prefer:

>>> typeName @Bool Proxy
"Bool"
>>> typeName @Int Proxy
"Int"
>>> typeName @Void Proxy
"Void"

Approach 2B - proxy

A minor variation on the previous approach is to use proxy (with a lowercase “p”) in the typeclass definition:

import Data.Void (Void)

class TypeName a where
    typeName :: proxy a -> String
    --          ↑

instance TypeName Bool where
    typeName _ = "Bool"

instance TypeName Int where
    typeName _ = "Int"

instance TypeName Void where
    typeName _ = "Void"

Everything else works the same, but now neither the author nor the consumer of the typeclass needs to depend on the Data.Proxy module specifically. For example, the consumer could use any other type constructor just fine:

>>> typeName ([] :: [Int])  -- Technically legal, but weird
"Int"

… or (more likely) the consumer could define their own Proxy type to use instead of the one from Data.Proxy, which would also work fine:

>>> data Proxy a = Proxy
>>> typeName (Proxy :: Proxy Int)
"Int"

This trick helped back when Proxy was not a part of the base package. Even now that Proxy is in base you still see this trick when people author typeclass instances because it’s easier and there’s no downside.

Both of these Proxy-based solutions are definitely better than using undefined, but they are both still a bit unsatisfying because we have to supply a Proxy argument to specify the desired type. The ideal user experience should only require the type and the type alone as an input to our function.

Approach 3 - AllowAmbiguousTypes + TypeApplications

We previously noted that we could shorten the Proxy-based solution by using TypeApplications:

>>> typeName @Bool Proxy
"Bool"

Well, what if we could shorten things even further and just drop the Proxy, like this:

>>> typeName @Bool

Actually, we can! This brings us to a more recent approach (the one summarized in the linked Reddit post), which is to use AllowAmbiguousTypes + TypeApplications, like this:

{-# LANGUAGE AllowAmbiguousTypes #-}

import Data.Void (Void)

class TypeName a where
    typeName :: String

instance TypeName Bool where
    typeName = "Bool"

instance TypeName Int where
    typeName = "Int"

instance TypeName Void where
    typeName = "Void"

… which we can invoke like this:

>>> :set -XTypeApplications
>>> typeName @Bool
"Bool"
>>> typeName @Int
"Int"
>>> typeName @Void
"Void"

The use of TypeApplications is essential, since otherwise GHC would have no way to infer which typeclass instance we meant. Even a type annotation would not work:

>>> typeName :: String  -- Clearly, this type annotation is not very helpful

<interactive>:1:1: error:
Ambiguous type variable ‘a0’ arising from a use of ‘typeName’
      prevents the constraint ‘(TypeName a0)’ from being solved.
      Probable fix: use a type annotation to specify what ‘a0’ should be.
      These potential instances exist:
        instance [safe] TypeName Void -- Defined at Example.hs:14:10
        instance [safe] TypeName Bool -- Defined at Example.hs:8:10
        instance [safe] TypeName Int -- Defined at Example.hs:11:10
In the expression: typeName :: String
      In an equation for ‘it’: it = typeName :: String

Type applications work here because you can think of a polymorphic function as really having one extra function argument: the polymorphic type. I elaborate on this a bit in my post on Polymorphism for dummies, but the basic idea is that TypeApplications makes this extra function argument for the type explicit. This means that you can directly tell the compiler which type to use by “applying” the function to the right type instead of trying to indirectly persuade the compiler into using the the right type with a type annotation.

Conclusion

My personal preference is to use the last approach with AllowAmbiguousTypes and TypeApplications. Not only is it more concise, but it also appeals to my own coding aesthetic. Specifically, guiding compiler behavior using type-annotations feels more like logic programming to me and using explicit type abstractions and TypeApplications feels more like functional programming to me (and I tend to prefer functional programming over logic programming).

However, the Proxy-based approach requires no language extensions, so that approach might appeal to you if you prefer to use the simplest subset of the language possible.

No comments:

Post a Comment