Proxy arguments in class methods: a comparative analysis
The Foreign.Storable
module provides a Storable
class with two rather unusual methods:
What is unusual about sizeOf
and alignment
? To demonstrate, consider what happens if we look up the sizes of various Int8
s:
No matter what Int8
value we provide as an argument, sizeOf
will always give the same answer: 1 byte. In fact, sizeOf
completely ignores its argument! You could just as well pass an infinite loop to sizeOf
, and it will return the same answer:
It’s not just the Storable Int8
instance that exhibits this quirk, either. Every single Storable
instance in the base
library implements sizeOf
and alignment
so that they ignore their arguments. From a certain perspective, sizeOf
and alignment
are “metadata” methods of Storable
. They make use of the type of the argument to look up the appropriate instance, but nothing more.
Many Haskellers (understandably) find it quite weird that these methods never make use of their argument, and various competing designs have emerged which attempt to address this weirdness. If these competing designs had won out when the base
library was first being developed, then we might have a Storable
class that looks like this instead:
Or perhaps this:
Or even this:
I’ve often wondered if there is one design to rule them all, so I wrote up this blog post in an attempt to better understand the pros and cons of each approach. Will we discover which design is unambiguously the best one by the end? Let’s find out.
Dummy arguments
Requirements: Any version of GHC
I would be remiss not to include the original design that sparked this whole conversation in the first place. Any time you have a class of the form:
If metaData
doesn’t ever inspect its argument, then it has a dummy argument.
Pros
Simple types. There’s not much hassle involved in defining metaData
, since its type a -> Blah
is about as simple as it gets. (Hey, I had to think of something to put in the Pros column.)
Cons
Awkward APIs. The type a -> Blah
alone doesn’t communicate the fact that its argument is never used, which means you have to document this fact (or worse, have users discover this themselves through trial and error).
Some creativity may be required to use it. It’s not always clear what to pass to metaData
as an argument. For instance, what if you have an instance of C Void
? At that point, your only option is to pass something like undefined :: Void
or let x = x in x :: Void
as an argument, so you’d better hope that metaData
never forces this argument!
Proxy
arguments
Requirements: Any version of GHC
One of the major drawbacks of the dummy arguments approach is that you often have to invent some value to pass as an argument. What if we could take away this guesswork? This is what Proxy
gives you:
Proxy
is a data type with a single constructor. Moreover, its type parameter a
is a phantom type, so one never needs to actually come up with an inhabitant of a
in order to create a Proxy
. In other words, any time you need a Proxy a
, it’s as simple as typing “Proxy
”.
Using Proxy
, we can redesign our class like so:
Pros
No creativity required. Any time you need to use metaData
, all you have to do is pass it a Proxy
at the right type.
More descriptive types. Without any other knowledge of what metaData
does, its type Proxy a -> Blah
is a pretty clear indication that a
is only used for its type.
Cons
Awkward instantiation. When you type metaData Proxy
, more often than not GHC will complain that the type of Proxy
is ambiguous. If you’re using just Haskell 98, then resolving this ambiguity might require an explicit type signature of the form (Proxy :: Proxy a)
, which is rather verbose. If you’re allowed to use GHC extensions, then you can make this much more tolerable with TypeApplications
, since you can shorten this to just (Proxy @a)
.
To match or not to match. When implementing metaData
, do you write this?
Or this?
This is a subtle distinction, but one that can be important, since it’s
possible that someone out there is invoking metaData (undefined :: Proxy a)
,
and explicitly matching on the Proxy
constructor would cause an exception at
runtime. This suggests that the most permissive way to implement metaData
is
the latter form (without the match on Proxy
). This is something that you
must remember to do, however, as GHC won’t warn you if you don’t.
You’re locked into using Proxy
. A very minor drawback, but hard-coding the
use of Proxy
in the type of metaData
means that you can’t use any other
types whose last type parameter is a
. (More on this later.)
Proxy#
arguments
Requirements: GHC 7.8 or later
A minor variation on the previous design is to use the primitive Proxy#
data
type from GHC.Exts
:
Proxy#
is very similar to Proxy
, but with two important distinctions:
- Unlike
Proxy
,Proxy#
is unlifted, so it cannot be inhabited by a thunk. Proxy#
is a zero-width data type. That is, it provides a guarantee that it will not occupy any memory at runtime.
Pros
Proxy#
inherits all of the pros
of Proxy
. Some advantages that Proxy#
has
over Proxy
are:
No matching ambiguity. Unlike with Proxy
, where you had a choice of
whether to match on the Proxy
constructor or not, there is no such issue
with Proxy#
, as the constructor for Proxy#
is not exposed in the first
place. There are very few things you can do with an argument of Proxy# a
(besides pass it to other functions), and this is a good thing, since it
doesn’t give you any way to shoot yourself in the foot.
Slightly more efficient. Since Proxy#
has a compile-time guarantee about
its memory usage, one could argue that using Proxy#
is more space-efficient
than Proxy
. Then again, GHC has a very powerful simplifier that will likely
optimize away the overhead of Proxy
arguments, so it’s unlikely that Proxy
will be a performance bottleneck in practice.
Cons
It’s magic. Using Proxy#
requires users to remember to turn on the
MagicHash
extension in order to use, and failing to remember this will result
in somewhat obscure parser errors. (This is a problem that arises less often
after you develop the habit of instinctively enabling MagicHash
before diving
into GHC’s primitives, but it can be confusing for newcomers.)
Proxy#
also inherits some of the cons
of Proxy
, such as:
Awkward instantiation. This time, you might need to remember to explicitly
write out (proxy# :: Proxy# a)
, or proxy# @_ @a
with type applications.
(Yes, that @_
is required!)
You’re locked into using Proxy#
. I keep mentioning this as a drawback, so
I should follow up on my promise to explain this con in more detail…
proxy
arguments
Requirements: Any version of GHC
Having Proxy
(or Proxy#
) arguments in class methods is all well and good,
but it is a bit limiting in that it forces you to, well, instantiate the methods
with Proxy
. Sometimes, however, you might already have another value lying
around with a type that mentions a
. For example, you might have an
M1 i c f a
(if you’re working with GHC.Generics
) or a
Sing a
(if you’re working with the singletons
library)
or a
TypeRep a
(if you’re working with Type.Reflection
).
In these situations, it might be convenient to specify what a
is by simply
passing one of these values that you already have instead of conjuring up a
separate Proxy
for this purpose.
Luckily, there is a very simple trick for making your class methods this permissive:
just generalize Proxy
to proxy
!
In other words, turn Proxy
into a type variable. This way, proxy
can be
instantiated with Proxy
, M1 i c f
, Sing
, TypeRep
, or whatever your heart
may desire.
Pros
Minimal creativity required, but maximal creativity permitted. The name “proxy
”
very strongly hints at how it is intended to be instantiated 90% of the time, so
most people will know how to use it just from the type alone. At the same time,
adventurous users who with to instantiate proxy
with things besides Proxy
have the freedom to do so.
No matching ambiguity. One nice side effect of making metaData
parametric
over proxy
is that you can’t match it against a specific constructor anymore.
Like values of type Proxy# a
, the only thing you can really do with a value of
type proxy a
is pass it around.
Cons
If I had written this blog post even one year ago, I would have struggled to
think of any real downsides to this approach over using Proxy
. But recently,
Oleg Grenrus clued me in to a very serious downside of using proxy
:
Incompatible with GeneralizedNewtypeDeriving
. It turns out that, unlike any of
the other approaches I discuss in this blog post [1], this one does not work when
combined with the
GeneralizedNewtypeDeriving
language extension. I was stunned when I realized
this, and it took me a while to realize why this is the case. To set the scene,
imagine that we have this code:
This derives instances for Age
by reusing the
instances of the underlying type, Int
.
As it turns out, GHC can successfully derive a C1 Age
instance, but not a
C2 Age
instance, as it will fail with this error:
Uh-oh. Roles strike again. If you want to know more about roles and what this error message
means in more detail, I encourage you to read my
previous
blog
posts
on the subject. For the purposes of this blog post, it suffices to say that GHC
is unable to take something of type proxy Int
and coerce it to (i.e., reuse as)
something of type proxy Age
because the type variable proxy
is abstract, and GHC
doesn’t know how to look underneath it.
This is a serious bummer, since GeneralizedNewtypeDeriving
is used ubiquitously
to generate instances for newtypes for cheap. Moreover, since
DerivingVia
also generates instances in a similar fashion, this also renders this technique
incompatible with DerivingVia
. Surely there must be a way around this limitation?
“Representational” proxy
arguments
Requirements: GHC 8.6 or later
Yes, there is a way around this limitaiton! We’ll need some fancy types to
accomplish it, however. The trick is to reuse this type synonym that I introduced in
this blog post
about QuantifiedConstraints
:
Essentially, this says that we can coerce between m a
and m b
under the
assumption that we can coerce between a
and b
. (In the language of roles,
this would be as if m
was representationally roled in its argument.) With
Representational1
in hand, we can tweak metaData
slightly:
Giving proxy
a Representational1
constraint gives us a way to tell GHC that
it’s quite alright to look underneath proxy
when determining if proxy Int
can
be coerced to proxy Age
. Sure enough, if we try it out:
Then GHC accepts it! Note that I’m deliberately using StandaloneDeriving
in the
code above instead of simply writing newtype Age = MkAge Int deriving C
, since the
latter won’t typecheck due to
unfortunate technical reasons.
Perhaps you want your proxy
type variable to act even more like Proxy
. One
thing to note about Proxy
is that its type argument is phantom, so Proxy a
can
be coerced to Proxy b
for any types a
and b
. If you wish to encode this as
a quantified constraint, you can whip up the following type:
If you change the context of metaData
to be Phantom1 proxy
, this will require
that proxy
be instantiated only with type constructors that have phantom types.
Pros
This inherits most of the pros of the proxy
approach, minus the downside
of being incompatible with GeneralizedNewtypeDeriving
.
Cons
Does anyone actually use this? As far as I know, I invented this technique for the first time when writing this blog post, so I would be shocked if anyone was using this trick in the wild. That isn’t necessarily a sign that this technique is flawed, only that it’s a new development.
Slightly less creativity permitted. If proxy
has no constraints, then users can
instantiate it with whatever type constructor they wish. If proxy
has a
Representational1
constraint, however, then users are slightly more restricted
in what they can do. They wouldn’t be able to instantiate proxy
with Sing
or
TypeRep
, for instance, since both of those type constructors are nominally roled
in their argument. (That is, in order to coerce from Sing a
to Sing b
, then
a
and b
must be exactly the same type.) Users will be even more limited if
proxy
has a Phantom1
constraint.
No explicit arguments at all
Requirements: GHC 8.0 or later
To round off my analysis, I want to mention one more approach that started to become viable in GHC 8.0. Unlike the other designs in this blog post, this design eschews the use of explicit arguments altogether:
One thing to note is that GHC will reject the code above by default, since a
is
ambiguous in the type of metaData
(forall a. C a => Blah
). You can relax this
restriction, however, with the AllowAmbiguousTypes
extension. In order to actually
use metaData
, you’ll need to make use of another language extension,
TypeApplications
:
Besides this funkiness, this approach works just like the other ones in this post [2].
Pros
Brevity. This approach requires by far the least amount of typing. Instead of
having to plumb around proxies to inform metaData
what a
is, metaData
relies
on liberal use of TypeApplications
to do the heavy lifting.
Cons
Potential for confusing error messages. If you accidentally forget to provide
metaData
with its type argument @a
, you can get extremely disorienting error
messages like this one:
If you’re unaccustomed to the quirks of GHC’s error messages, it might require some head-scratching before you realize what the source of the problem is.
This isn’t a problem that’s exclusively limited to this approach, since GHC might
complain similarly if you type metaData Proxy
(without the a
).
But you if start to type “Proxy
”, it’s somewhat
easier to remember that you need to finish the thought by typing out “Proxy @a
” in full.
On the other hand, invisible type arguments can be easily forgotten altogether.
Final thoughts
That concludes my analysis of all the different ways one can define “metadata” methods in type classes. To recap, we have looked at the following designs:
- Dummy arguments
Proxy
argumentsProxy#
argumentsproxy
- “Representational”
proxy
arguments - No explicit arguments at all
Out of these six choices, is there a clear winner? As much as I want to pick one,
each of them have at least one downside that can prove to be annoying in practice. Perhaps this
isn’t surprising, as all six methods accomplish the goal of specifying a type a
in a rather indirect fashion. In an ideal world, I would be able to just say this:
That is, I would like to be able to just pass the type a
as a plain old argument
to metaData
without any additional fuss. While GHC certainly can’t do this today,
this might be possible in a future version of GHC where terms and types can be
intermixed like this. If we had the syntax for Dependent Haskell quantifiers from
this GHC proposal,
we could imagine metaData
having this type:
This is almost the same type as forall a. C a => Blah
, except that a
is an
explicit argument to metaData
. This type signature would eliminate many of the
drawbacks of other approaches:
- No proxies (or other strange types) need to be passed, just
a
. - There’s no way to pattern-match on
a
, sinceforall
is an irrelevant quantifier (see the proposal for a more detailed explanation of what this means). - If you accidentally forget to supply
a
as an argument, you’ll get a decent error message, sincea
is an explicit argument. - It should be compatible with
GeneralizedNewtypeDeriving
.
This sounds great in theory, although imagining how to give metaData
this
type might prove to be challenging in practice. Even supposing that we had
Dependent Haskell today, the syntax for type classes is quite rigid. If you
have this:
Then regardless of what the rest of the type of metaData
is, it will always begin
with the following:
Notice that forall a
has a dot after it, not an arrow, which is what we really
want. Unfortunately, GHC doesn’t offer any mechanism to change this. There has been
one GHC proposal
put forward which suggests a way to tweak this slightly, but it
remains to be seen if the proposal would permit the sort of type signature I envision for
metaData
(or if the proposal will even be accepted at all).
OK, that’s enough speculation about Dependent Haskell. For now, I hope
this blog post serves as a useful field guide to the various ways one can write
methods like metaData
and can help inform you if you ever need to craft one of
your own.
-
Actually, it turns out that the
Proxy#
approach doesn’t work well withGeneralizedNewtypeDeriving
either, but that’s due to a GHC bug. ↩ -
In case you’re curious, this approach is compatible with
GeneralizedNewtypeDeriving
, but only with GHC 8.8 or later, since that features an important bugfix. ↩