The Foreign.Storable module provides a Storable class with two rather unusual methods:

class Storable a where
  sizeOf    :: a -> Int
  alignment :: a -> Int
  ...

What is unusual about sizeOf and alignment? To demonstrate, consider what happens if we look up the sizes of various Int8s:

λ> sizeOf (0 :: Int8)
1
λ> sizeOf (1 :: Int8)
1
λ> sizeOf (2 :: Int8)
1
λ> all (\c -> sizeOf c == 1) [minBound..maxBound :: Int8]
True

No matter what Int8 value we provide as an argument, sizeOf will always give the same answer: 1 byte. In fact, sizeOf completely ignores its argument! You could just as well pass an infinite loop to sizeOf, and it will return the same answer:

λ> sizeOf (let x = x in x :: Int8)
1

It’s not just the Storable Int8 instance that exhibits this quirk, either. Every single Storable instance in the base library implements sizeOf and alignment so that they ignore their arguments. From a certain perspective, sizeOf and alignment are “metadata” methods of Storable. They make use of the type of the argument to look up the appropriate instance, but nothing more.

Many Haskellers (understandably) find it quite weird that these methods never make use of their argument, and various competing designs have emerged which attempt to address this weirdness. If these competing designs had won out when the base library was first being developed, then we might have a Storable class that looks like this instead:

class Storable a where
  sizeOf    :: Proxy a -> Int
  alignment :: Proxy a -> Int

Or perhaps this:

class Storable a where
  sizeOf    :: proxy a -> Int
  alignment :: proxy a -> Int

Or even this:

class Storable a where
  sizeOf    :: Int
  alignment :: Int

I’ve often wondered if there is one design to rule them all, so I wrote up this blog post in an attempt to better understand the pros and cons of each approach. Will we discover which design is unambiguously the best one by the end? Let’s find out.

Dummy arguments

Requirements: Any version of GHC

I would be remiss not to include the original design that sparked this whole conversation in the first place. Any time you have a class of the form:

class C a where
  metaData :: a -> Blah

If metaData doesn’t ever inspect its argument, then it has a dummy argument.

Pros

Simple types. There’s not much hassle involved in defining metaData, since its type a -> Blah is about as simple as it gets. (Hey, I had to think of something to put in the Pros column.)

Cons

Awkward APIs. The type a -> Blah alone doesn’t communicate the fact that its argument is never used, which means you have to document this fact (or worse, have users discover this themselves through trial and error).

Some creativity may be required to use it. It’s not always clear what to pass to metaData as an argument. For instance, what if you have an instance of C Void? At that point, your only option is to pass something like undefined :: Void or let x = x in x :: Void as an argument, so you’d better hope that metaData never forces this argument!

Proxy arguments

Requirements: Any version of GHC

One of the major drawbacks of the dummy arguments approach is that you often have to invent some value to pass as an argument. What if we could take away this guesswork? This is what Proxy gives you:

data Proxy a = Proxy

Proxy is a data type with a single constructor. Moreover, its type parameter a is a phantom type, so one never needs to actually come up with an inhabitant of a in order to create a Proxy. In other words, any time you need a Proxy a, it’s as simple as typing “Proxy”.

Using Proxy, we can redesign our class like so:

class C a where
  metaData :: Proxy a -> Blah

Pros

No creativity required. Any time you need to use metaData, all you have to do is pass it a Proxy at the right type.

More descriptive types. Without any other knowledge of what metaData does, its type Proxy a -> Blah is a pretty clear indication that a is only used for its type.

Cons

Awkward instantiation. When you type metaData Proxy, more often than not GHC will complain that the type of Proxy is ambiguous. If you’re using just Haskell 98, then resolving this ambiguity might require an explicit type signature of the form (Proxy :: Proxy a), which is rather verbose. If you’re allowed to use GHC extensions, then you can make this much more tolerable with TypeApplications, since you can shorten this to just (Proxy @a).

To match or not to match. When implementing metaData, do you write this?

instance C Foo where
  metaData Proxy = ...

Or this?

instance C Foo where
  metaData _ = ...

This is a subtle distinction, but one that can be important, since it’s possible that someone out there is invoking metaData (undefined :: Proxy a), and explicitly matching on the Proxy constructor would cause an exception at runtime. This suggests that the most permissive way to implement metaData is the latter form (without the match on Proxy). This is something that you must remember to do, however, as GHC won’t warn you if you don’t.

You’re locked into using Proxy. A very minor drawback, but hard-coding the use of Proxy in the type of metaData means that you can’t use any other types whose last type parameter is a. (More on this later.)

Proxy# arguments

Requirements: GHC 7.8 or later

A minor variation on the previous design is to use the primitive Proxy# data type from GHC.Exts:

data Proxy# a -- Abstract
proxy# :: Proxy# a

Proxy# is very similar to Proxy, but with two important distinctions:

  1. Unlike Proxy, Proxy# is unlifted, so it cannot be inhabited by a thunk.
  2. Proxy# is a zero-width data type. That is, it provides a guarantee that it will not occupy any memory at runtime.

Pros

Proxy# inherits all of the pros of Proxy. Some advantages that Proxy# has over Proxy are:

No matching ambiguity. Unlike with Proxy, where you had a choice of whether to match on the Proxy constructor or not, there is no such issue with Proxy#, as the constructor for Proxy# is not exposed in the first place. There are very few things you can do with an argument of Proxy# a (besides pass it to other functions), and this is a good thing, since it doesn’t give you any way to shoot yourself in the foot.

Slightly more efficient. Since Proxy# has a compile-time guarantee about its memory usage, one could argue that using Proxy# is more space-efficient than Proxy. Then again, GHC has a very powerful simplifier that will likely optimize away the overhead of Proxy arguments, so it’s unlikely that Proxy will be a performance bottleneck in practice.

Cons

It’s magic. Using Proxy# requires users to remember to turn on the MagicHash extension in order to use, and failing to remember this will result in somewhat obscure parser errors. (This is a problem that arises less often after you develop the habit of instinctively enabling MagicHash before diving into GHC’s primitives, but it can be confusing for newcomers.)

Proxy# also inherits some of the cons of Proxy, such as:

Awkward instantiation. This time, you might need to remember to explicitly write out (proxy# :: Proxy# a), or proxy# @_ @a with type applications. (Yes, that @_ is required!)

You’re locked into using Proxy#. I keep mentioning this as a drawback, so I should follow up on my promise to explain this con in more detail…

proxy arguments

Requirements: Any version of GHC

Having Proxy (or Proxy#) arguments in class methods is all well and good, but it is a bit limiting in that it forces you to, well, instantiate the methods with Proxy. Sometimes, however, you might already have another value lying around with a type that mentions a. For example, you might have an M1 i c f a (if you’re working with GHC.Generics) or a Sing a (if you’re working with the singletons library) or a TypeRep a (if you’re working with Type.Reflection). In these situations, it might be convenient to specify what a is by simply passing one of these values that you already have instead of conjuring up a separate Proxy for this purpose.

Luckily, there is a very simple trick for making your class methods this permissive: just generalize Proxy to proxy!

class C a where
  metaData :: proxy a -> Blah

In other words, turn Proxy into a type variable. This way, proxy can be instantiated with Proxy, M1 i c f, Sing, TypeRep, or whatever your heart may desire.

Pros

Minimal creativity required, but maximal creativity permitted. The name “proxy” very strongly hints at how it is intended to be instantiated 90% of the time, so most people will know how to use it just from the type alone. At the same time, adventurous users who with to instantiate proxy with things besides Proxy have the freedom to do so.

No matching ambiguity. One nice side effect of making metaData parametric over proxy is that you can’t match it against a specific constructor anymore. Like values of type Proxy# a, the only thing you can really do with a value of type proxy a is pass it around.

Cons

If I had written this blog post even one year ago, I would have struggled to think of any real downsides to this approach over using Proxy. But recently, Oleg Grenrus clued me in to a very serious downside of using proxy:

Incompatible with GeneralizedNewtypeDeriving. It turns out that, unlike any of the other approaches I discuss in this blog post [1], this one does not work when combined with the GeneralizedNewtypeDeriving language extension. I was stunned when I realized this, and it took me a while to realize why this is the case. To set the scene, imagine that we have this code:

class C1 a where
  metaData1 :: Proxy a -> Blah
instance C1 Int where
  metaData1 _ = ...

class C2 a where
  metaData2 :: proxy a -> Blah
instance C2 Int where
  metaData2 _ = ...

newtype Age = MkAge Int
  deriving (C1, C2)

This derives instances for Age by reusing the instances of the underlying type, Int. As it turns out, GHC can successfully derive a C1 Age instance, but not a C2 Age instance, as it will fail with this error:

    • Couldn't match representation of type ‘proxy Int’
                               with that of ‘proxy Age’
        arising from the coercion of the method ‘metaData2’
          from type ‘forall (proxy :: * -> *). proxy Int -> Blah’
            to type ‘forall (proxy :: * -> *). proxy Age -> Blah’
      NB: We cannot know what roles the parameters to ‘proxy’ have;
        we must assume that the role is nominal
    • When deriving the instance for (C2 Age)

Uh-oh. Roles strike again. If you want to know more about roles and what this error message means in more detail, I encourage you to read my previous blog posts on the subject. For the purposes of this blog post, it suffices to say that GHC is unable to take something of type proxy Int and coerce it to (i.e., reuse as) something of type proxy Age because the type variable proxy is abstract, and GHC doesn’t know how to look underneath it.

This is a serious bummer, since GeneralizedNewtypeDeriving is used ubiquitously to generate instances for newtypes for cheap. Moreover, since DerivingVia also generates instances in a similar fashion, this also renders this technique incompatible with DerivingVia. Surely there must be a way around this limitation?

“Representational” proxy arguments

Requirements: GHC 8.6 or later

Yes, there is a way around this limitaiton! We’ll need some fancy types to accomplish it, however. The trick is to reuse this type synonym that I introduced in this blog post about QuantifiedConstraints:

type Representational1 m =
  forall a b. Coercible a b => Coercible (m a) (m b)

Essentially, this says that we can coerce between m a and m b under the assumption that we can coerce between a and b. (In the language of roles, this would be as if m was representationally roled in its argument.) With Representational1 in hand, we can tweak metaData slightly:

class C a where
  metaData :: Representational1 proxy
           => proxy a -> Blah

Giving proxy a Representational1 constraint gives us a way to tell GHC that it’s quite alright to look underneath proxy when determining if proxy Int can be coerced to proxy Age. Sure enough, if we try it out:

instance C Int where
  metaData _ = ...

newtype Age = MkAge Int
deriving instance C Age

Then GHC accepts it! Note that I’m deliberately using StandaloneDeriving in the code above instead of simply writing newtype Age = MkAge Int deriving C, since the latter won’t typecheck due to unfortunate technical reasons.

Perhaps you want your proxy type variable to act even more like Proxy. One thing to note about Proxy is that its type argument is phantom, so Proxy a can be coerced to Proxy b for any types a and b. If you wish to encode this as a quantified constraint, you can whip up the following type:

type Phantom1 m = forall a b. Coercible (m a) (m b)

If you change the context of metaData to be Phantom1 proxy, this will require that proxy be instantiated only with type constructors that have phantom types.

Pros

This inherits most of the pros of the proxy approach, minus the downside of being incompatible with GeneralizedNewtypeDeriving.

Cons

Does anyone actually use this? As far as I know, I invented this technique for the first time when writing this blog post, so I would be shocked if anyone was using this trick in the wild. That isn’t necessarily a sign that this technique is flawed, only that it’s a new development.

Slightly less creativity permitted. If proxy has no constraints, then users can instantiate it with whatever type constructor they wish. If proxy has a Representational1 constraint, however, then users are slightly more restricted in what they can do. They wouldn’t be able to instantiate proxy with Sing or TypeRep, for instance, since both of those type constructors are nominally roled in their argument. (That is, in order to coerce from Sing a to Sing b, then a and b must be exactly the same type.) Users will be even more limited if proxy has a Phantom1 constraint.

No explicit arguments at all

Requirements: GHC 8.0 or later

To round off my analysis, I want to mention one more approach that started to become viable in GHC 8.0. Unlike the other designs in this blog post, this design eschews the use of explicit arguments altogether:

class C a where
  metaData :: Blah

One thing to note is that GHC will reject the code above by default, since a is ambiguous in the type of metaData (forall a. C a => Blah). You can relax this restriction, however, with the AllowAmbiguousTypes extension. In order to actually use metaData, you’ll need to make use of another language extension, TypeApplications:

metaData @a

Besides this funkiness, this approach works just like the other ones in this post [2].

Pros

Brevity. This approach requires by far the least amount of typing. Instead of having to plumb around proxies to inform metaData what a is, metaData relies on liberal use of TypeApplications to do the heavy lifting.

Cons

Potential for confusing error messages. If you accidentally forget to provide metaData with its type argument @a, you can get extremely disorienting error messages like this one:

    • No instance for (C a0) arising from a use of ‘metaData’
    • In the expression: metaData

If you’re unaccustomed to the quirks of GHC’s error messages, it might require some head-scratching before you realize what the source of the problem is.

This isn’t a problem that’s exclusively limited to this approach, since GHC might complain similarly if you type metaData Proxy (without the a). But you if start to type “Proxy”, it’s somewhat easier to remember that you need to finish the thought by typing out “Proxy @a” in full. On the other hand, invisible type arguments can be easily forgotten altogether.


Final thoughts

That concludes my analysis of all the different ways one can define “metadata” methods in type classes. To recap, we have looked at the following designs:

  1. Dummy arguments
  2. Proxy arguments
  3. Proxy# arguments
  4. proxy
  5. “Representational” proxy arguments
  6. No explicit arguments at all

Out of these six choices, is there a clear winner? As much as I want to pick one, each of them have at least one downside that can prove to be annoying in practice. Perhaps this isn’t surprising, as all six methods accomplish the goal of specifying a type a in a rather indirect fashion. In an ideal world, I would be able to just say this:

metaData a

That is, I would like to be able to just pass the type a as a plain old argument to metaData without any additional fuss. While GHC certainly can’t do this today, this might be possible in a future version of GHC where terms and types can be intermixed like this. If we had the syntax for Dependent Haskell quantifiers from this GHC proposal, we could imagine metaData having this type:

metaData :: forall a -> C a => Blah

This is almost the same type as forall a. C a => Blah, except that a is an explicit argument to metaData. This type signature would eliminate many of the drawbacks of other approaches:

  • No proxies (or other strange types) need to be passed, just a.
  • There’s no way to pattern-match on a, since forall is an irrelevant quantifier (see the proposal for a more detailed explanation of what this means).
  • If you accidentally forget to supply a as an argument, you’ll get a decent error message, since a is an explicit argument.
  • It should be compatible with GeneralizedNewtypeDeriving.

This sounds great in theory, although imagining how to give metaData this type might prove to be challenging in practice. Even supposing that we had Dependent Haskell today, the syntax for type classes is quite rigid. If you have this:

class C a where
  metaData :: ...

Then regardless of what the rest of the type of metaData is, it will always begin with the following:

metaData :: forall a. C a => ...

Notice that forall a has a dot after it, not an arrow, which is what we really want. Unfortunately, GHC doesn’t offer any mechanism to change this. There has been one GHC proposal put forward which suggests a way to tweak this slightly, but it remains to be seen if the proposal would permit the sort of type signature I envision for metaData (or if the proposal will even be accepted at all).

OK, that’s enough speculation about Dependent Haskell. For now, I hope this blog post serves as a useful field guide to the various ways one can write methods like metaData and can help inform you if you ever need to craft one of your own.


  1. Actually, it turns out that the Proxy# approach doesn’t work well with GeneralizedNewtypeDeriving either, but that’s due to a GHC bug

  2. In case you’re curious, this approach is compatible with GeneralizedNewtypeDeriving, but only with GHC 8.8 or later, since that features an important bugfix