This is the second part in my series of practical applications of the
QuantifiedConstraints extension. See
for part 1.
join has been a part of the
base library for a long time, and for good
reason. Not only is it useful in many monadic contexts, it’s also quite general.
For instance, one can re-implement
(>>=) itself using
join are equivalent in expressive power. Anything one
can do with
(>>=), one could just as well do with
join, and vice versa.
Many folks took note of this and proposed that
join be added as a class
Monad itself so that
join could have default
implementations in terms of the other. In fact, the original
(or AMP for short) proposed doing just that, in addition to making
Applicative a superclass of
join portion of the AMP was eventually dropped, not because of
community outcry, but because of unfortunate technical restrictions. In this blog
post, we will explore what these technical restrictions are in further detail, and
how the proposed
extension can lift these restrictions.
If you want to follow along with the code in this post at home, you can build a branch of GHC that implements
The day the
To put the following section into the right historical context: the year is 2014.
GHC 7.8 has just been released, boasted a variety of improvements such as typed
holes, closed type families, pattern synonyms, and an overhauled version of
GeneralizedNewtypeDeriving that fixed a latent type safety bug (more on this
point later). Team GHC is hard at work preparing a patch which will implement
the AMP for the upcoming 7.10 release.
One day, and unexpected
arises. After making
join a class method of
Monad, GHC starts complaining
when compiling the
haskeline library. The issue can be boiled down to the
Here is the perplexing error that GHC gives (I’ll use a contemporary version, GHC 8.2.2, for a slightly more comprehensible error message):
• Couldn't match representation of type ‘m (T m a)’ with that of ‘m (m a)’ arising from the coercion of the method ‘join’ from type ‘forall a. m (m a) -> m a’ to type ‘forall a. T m (T m a) -> T m a’ NB: We cannot know what roles the parameters to ‘m’ have; we must assume that the role is nominal • When deriving the instance for (Monad' (T m)) | | deriving Monad' | ^^^^^^
As mentioned earlier, GHC 7.8 introduced a new version of
GeneralizedNewtypeDeriving, and since the code snippet uses it prominently,
it became evident that
GeneralizedNewtypeDeriving was the culprit. However,
this was no common bug. This was a restriction that arose due to the design of
GeneralizedNewtypeDeriving itself, and no one could come up with a
Disheartened and defeated, Team GHC eventually decided to abandon the
portion of the AMP, as using
GeneralizedNewtypeDeriving was simply too useful
to give up. Needless to say, though, this left a sour taste in the mouths of many
people—Haskellers don’t take kindly to being told that something can’t be
done in Haskell!
In order to better understand why adding
Monad interacted so poorly
GeneralizedNewtypeDeriving, we must first take a detour into roles, the
mechanism which underlies GHC’s notion of type-safe coercions. Understanding
roles is a key step into seeing why
join is so tricky to derive.
The coming sections present an overview of how
Coercible, and roles work. If you are already comfortable with these, feel
free to skip to this section.
GeneralizedNewtypeDeriving gone wrong
What exactly was wrong with
GeneralizedNewtypeDeriving before GHC 7.8?
First, let’s consider a typical example:
GeneralizedNewtypeDeriving does here is to generate an
Age that does the appropriate wrapping and unwrapping
Figuring out where all of these constructor wrapping should go can be tricky, however, so to make things easier, GHC takes a shortcut. Before 7.8, this would be the code that would actually get generated:
This use of
unsafeCoerce may seem fishy, but it’s actually OK here. That’s
because at runtime, an
Age and an
Int have the exact same representation,
so it’s perfectly safe to coerce from an
Int to an
Age, or vice versa.
Moreover, any function that works over
Ints can also be coerced to a function
Ages, which explains why it’s safe to coerce from, say,
fromInteger :: Integer -> Int to
fromInteger :: Integer -> Age.
It turns out, however, that always using
GeneralizedNewtypeDeriving compromised type safety. (You might be thinking,
“Well duh!”, but this was not obvious at the time!) To see what can go wrong
with this approach, consider this extension to the example above, which
leverages type families:
Now consider what happens when you invoke
bad (MkAge 42). Since
Inspect Age = Int, we’d expect this to return an
Int. But since we
BadIdea instance for
Age, we are going to coerce from
bad implemention for
Int. But note that in the
bad implementation for
Int, we return a
bad (MkAge 42) returns
42 > 0 (a
unsafely coerced to an
Int, which are two types with very different runtime
representations. Undefined behavior (or segfaults) are just around the corner—
in other words, we have violated type safety.
Time to role up our sleeves
GeneralizedNewtypeDeriving was subtly broken, but where precisely did
things go kaput? Coercing from
Age was just fine, but coercing
Inspect Age to
Inspect Int caused pandemonium. This observation that
coercing between types could be safe in some contexts but unsafe in other
contexts in the key idea behind roles.
An abundance of equality
To explain what a role is, first consider the equality type
a ~ b would only hold if
the same type. But this is somewhat unsatisfying, since types like
Age are “equal” (in the sense that they have the same representation at
runtime), but you could not conclude that
Int ~ Age.
GHC 7.8 addressed this by parameterizing its notion of equality with a role. There are three different roles:
Nominal: Two types are nominally equal when they are the exact same type. (This is equivalent to the notion of equality GHC used before the introduction of roles.)
Ageis nominally equal to
Intis nominally equal to
Ageis not nominally equal to
Bool, or any other type.
Representational: Two types are representationally equal when they have the same representation at runtime. (This is a strict superset of nominal equality.)
Any newtype is representationally equal to its underlying type, so we have that
Ageis representationally equal to
Agewould not be representationally equal to
Phantom: This is the broadest notion of equality, as any two types are phantom-equal.
In GHC 7.8,
(~) became synonymous with nominal equality. GHC 7.8 also added
a type for representational equality called
Coercible. Moreover, a type-safe
unsafeCoerce was added, simply called
coerce will be key in fixing
GeneralizedNewtypeDeriving. If GHC cannot
Coericble a b, then we cannot be sure that it is safe to coerce
from a value of type
a to type
b, so it will be rejected with a type error.
What can be
The earlier definition I gave for representational equality (having the same
runtime representation) is a mite hand-wavey. For instance, we can intuitively
Inspect Int and
Inspect Age do not have the same runtime representation,
since they reduce to
Int, respectively, but how would we be able
to know that it’s not safe in general to
Inspect a and
Inspect b for distinct types
To give a clearer picture of what can and can’t be
coerced, I’ll give some
examples in terms of
Coercible. We already know that newtypes and their
underlying types are inter-
Note: I’m writing these
Coercibleexamples as instances, since that’s how Haskellers are used to thinking about constraints. Do note, however, that
Coercibleis not actually a type class, and you can’t actually define instances for it yourself. Rather, these example instances are intended to demonstrate facts about
Coerciblethat GHC is able to use in its constraint solver.
Coercible can do more that this: it can also look through other type
constructors! For instance, it can look through the function type:
Critically, however, it cannot look through type family constructors, so we would not be able to conclude that:
What separates the honest type constructor
(->) from the miscreant
Inspect? It all comes down to how each type uses its arguments.
In GHC, if you have a type
T a_1 ... a_n, then each parameter
a_i is given a
role, which indicates the broadest notion of equality that is permitted to be used
when determining if
T a_1 ... a_n is representationally equal to
T b_1 ... b_n.
As a concrete example, both of the parameters to
(->) have representational
roles. In terms of the
RoleAnnotations language extension, we can write this
In other words,
a1 -> a2 is representationally equal to
b1 -> b2 if
a1 is representationally equal to
b1 is representationally equal
b2 (i.e., if
(Coercible a1 a2, Coercible b1 b2) holds). Having
representationally roled parameters is a trait that many data types also share:
This makes sense, since all of the parameters to these data types have fields
of those type, so they matter at runtime. If we had a data type that had a
parameter that didn’t appear as the type of a field, such as in
Then that parameter would be phantom-roled:
In other words,
Proxy a is representationally equal to
Proxy b for any types
Coercible (Proxy a) (Proxy b) always holds). Again, this makes
intuitive sense, the parameter to
Proxy has no effect whatsoever on its
Now you might be wondering: can we ever get a parameter to be nominally roled? Indeed we can: type families are a prime example of this, as every parameter to a type family must be assigned a nominal role. For instance:
In other words,
Inspect a is representationally equal to
Inspect b if
a is nominally equal to
b (i.e., if
a ~ b). This is exactly the
behavior we desire, because it’s only safe to coerce
Inspect a if we
can be sure that the coercion does not change the type
There are other types in Haskell that demand nominal roles for its parameters, such as GADTs and type classes. The curious reader is encouraged to read the paper Safe Zero-cost Coercions for Haskell for further commentary on these design choices.
Role with the changes
Tying everything we’ve just learned together, we can now tweak
GeneralizedNewtypeDeriving to be type-safe. Instead of generating the following
code for a derived
BadIdea Age instance:
All we have to do is make one small change: just replace
That’s it! Now, the typechecker will be unable conclude that
Coercible (Inspect Int) (Inspect Age) holds, and thus the typechecker will
reject this code.
When roles go rogue
Now, let’s take a closer look at the code that gets generated when we derive
an instance of
join for the
T newtype from
We will get:
In order for this to typecheck, we need to be able to conclude that the following constraints hold:
Coercible (m (m a)) (T m (T m a))
Coercible (m a) (T m a)
The latter fact follows from the fact that
T m a is a newtype around
The former fact, however, is trickier. We can now unwrap the outer
get an obligation of
Coercible (m (m a)) (m (T m a)), but at this point, we
are in trouble. Recall this part of GHC’s error message:
• Couldn't match representation of type ‘m (T m a)’ with that of ‘m (m a)’ NB: We cannot know what roles the parameters to ‘m’ have; we must assume that the role is nominal
This gets at the heart of the matter:
m is a higher-kinded type variable
that takes an argument. Moreover,
m is abstract, so we cannot know in
m will be instantiated with something like
parameter is representationally roled) or an abstract type like
(whose parameter is nominally roled). Thus, to preserve type safety, we
must be conservative and assume that the parameter to
m is nominal.
But this is a serious problem.
m (m a) is only ever representationally equal
m (T m a) when
m a is nominally equal to
T m a, and that can never
happen! GHC therefore rejects this as ill roled.
What can we do about all this? It might appear that we’ve designed ourself into a corner with the way representational equality works. Indeed, there were several proposed extensions to the role system, such as higher-order roles, which would address this shortcoming. However, each idea had certain programs for which it would not work, so none of these suggestions were ever implemented.
In late 2014, it was
QuantifiedConstraints—a much older idea that had been
developed for different purposes—could be used to roll your own higher-order roles!
Recall that we got stuck when trying to solve
Coercible (m (m a)) (m (T m a)),
since didn’t know enough about
m. But what if we were allowed to say that
m a is representationally equal to
m b under the assumption that
a is representationally equal to
b? This is precisely the power that
QuantifiedConstraints gives you at the language level.
First, let’s bring back our earlier attempt at deriving
We want to be able to say that for any types
If we can conclude that
b are representationally equal…
…then we can conclude that
m a is representationally equal to
This is exactly the quantified constraint we need:
And now, this typechecks! Best of all, we didn’t need to change the underlying role system at all for this to work—instead, we took advantage of GHC’s constraint solver itself, which is quite flexible.
A consequence of this quantified constraint is that instantiating
N such that
type role N nominal will fail. This is a good thing—
N are precisely what we want to rule out, and nothing more!
GeneralizedNewtypeDeriving gone right?
We now have a blueprint for how to spruce up
address its current shortcomings. In particular, we can change it so that
when inferring constraints for derived instance, it infers quantified
Coercible constraints whenever the class has a method with a type signature
in which a higher-kinded type variable bound by the instance is applied to
This would do the job, but it would be a bit of a departure from the constraints that GHC typically infers. Would users be surprised to see error messages involving quantified constraints? That remains to be seen, but we now at least have the potential to fix the problem, which is a huge leap forward from the earlier status quo.
Best of all,
join is far from the only type class that can
QuantifiedConstraints to become derivable through
GeneralizedNewtypeDeriving. In fact,
Traversable could also be
made to be newtype-derivable with a little sprucing up! The exact details
of this sprucing will have to wait for the next post in this series, however.