# Improvements to deriving in GHC 8.2

We’re drawing closer to a release of GHC 8.2, which will feature a variety of enhancements to GHC’s `deriving`-related extensions. None of the improvements are particularly revolutionary, and for most code, you won’t notice a difference. But there are quite a few quality-of-life fixes that should make doing certain things with `deriving` a little less of a hassle.

## Deriving strategies

The largest change to `deriving` that debuts in GHC 8.2 is a new extension: `DerivingStrategies`. Before discussing what `DerivingStrategies` does, let me motivate the problem. Imagine you have this datatype:

Now suppose you want to derive another instance for `Foo`:

How should this derived instance be implemented? Well, if you had `GeneralizedNewtypeDeriving` enabled when compiling it, the derived instance would closely resemble this:

Alternatively, if you had `DeriveAnyClass` enabled, the derived instance would instead be:

causing the default implementation of `bar = show` to kick in.

But what happens if `GeneralizedNewtypeDeriving` and `DeriveAnyClass` are *both* enabled? Then a problem emerges: `deriving Bar` becomes ambiguous! One could reasonably pick either `GeneralizedNewtypeDeriving` or `DeriveAnyClass` to derive `Bar`, as shown above. And the choice matters, since the result of evaluating:

will be `'a'` if `GeneralizedNewtypeDeriving` is used, and `MkFoo 'a'` if `DeriveAnyClass` is used.
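To make the ambiguity concrete, here is a minimal sketch that writes out both candidate instances by hand. The exact shape of `Bar` and its `Char` instance are assumptions reconstructed from the prose (only `Foo`, `MkFoo`, `Bar`, and `bar = show` appear in the text), and the two newtypes `FooN` and `FooA` are hypothetical stand-ins for `Foo` so that both instances can live in one module:

```haskell
{-# LANGUAGE DefaultSignatures #-}

-- Assumed shape of the class: 'bar' falls back to 'show' by default.
class Bar a where
  bar :: a -> String
  default bar :: Show a => a -> String
  bar = show

-- Uses the default, so bar 'a' is the same as show 'a'.
instance Bar Char

-- Stand-in for Foo when Bar is derived via GeneralizedNewtypeDeriving.
newtype FooN = MkFooN Char deriving Show

-- Roughly what GeneralizedNewtypeDeriving gives: reuse the Char instance.
instance Bar FooN where
  bar (MkFooN c) = bar c

-- Stand-in for Foo when Bar is derived via DeriveAnyClass.
newtype FooA = MkFooA Char deriving Show

-- What DeriveAnyClass gives: an empty instance, so the default applies.
instance Bar FooA
```

With these definitions, `bar (MkFooN 'a')` yields the string `'a'`, whereas `bar (MkFooA 'a')` yields `MkFooA 'a'`.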

As it turns out, GHC handles such a scenario by making an arbitrary choice: it defaults to `DeriveAnyClass`.

This is a bit unfortunate, however, because it effectively prevents you from using `GeneralizedNewtypeDeriving` in any module where `DeriveAnyClass` is also enabled. Bummer.

The `DerivingStrategies` extension was created to solve precisely this type of ambiguity. Once the extension is enabled, it extends the syntax of `deriving` clauses and standalone `deriving` declarations slightly, allowing you to augment them with one of three keywords:

- `stock`
- `newtype`
- `anyclass`

These are the “strategies” referred to in `DerivingStrategies`. There are only three for now, although more could conceivably be introduced. Here is an example of each strategy:

### `stock`

`stock` is so named because it refers to the “stock” type classes that GHC simply knows how to derive on its own (credit goes to Joachim Breitner for suggesting the name). These include the derivable type classes mentioned in the Haskell Report:

- `Bounded`
- `Enum`
- `Ix`
- `Eq`
- `Ord`
- `Read`
- `Show`

They also include the classes that are only derivable in GHC through bespoke language extensions:

- `Functor` (via `DeriveFunctor`)
- `Foldable` (via `DeriveFoldable`)
- `Traversable` (via `DeriveTraversable`)
- `Data` and `Typeable` (via `DeriveDataTypeable`)
- `Generic` and `Generic1` (via `DeriveGeneric`)
- `Lift` (via `DeriveLift`)

So if you write:

This provides an additional guarantee that the derived instance will really be:

which can be useful for programmer sanity.
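As a small sketch of the `stock` strategy in use (the type `Color` is a hypothetical example, not taken from the text):

```haskell
{-# LANGUAGE DerivingStrategies #-}

-- 'stock' asks for GHC's built-in deriving mechanism for each listed class.
data Color = Red | Green | Blue
  deriving stock (Eq, Ord, Show, Enum, Bounded)
```

Because the strategy is spelled out, these instances are the ordinary structural ones no matter which other deriving extensions happen to be enabled in the module.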

### `newtype`

This strategy indicates that you absolutely want to use `GeneralizedNewtypeDeriving`. To reuse the earlier example, if you wrote:

Then you’ll know that `Bar` will be derived via `GeneralizedNewtypeDeriving`.

This code also demonstrates another feature of deriving strategies: multiple strategies can be used after a data declaration! This part:

tells GHC to derive `Show` with whatever strategy it sees fit (in this case, it defaults to `stock`), and to derive `Bar` specifically with `GeneralizedNewtypeDeriving`. You can also put more than one class after each strategy:
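A minimal sketch of the `newtype` strategy with several `deriving` clauses on one declaration. The definition of `Bar` and its `Char` instance are assumptions reconstructed from the prose; both `GeneralizedNewtypeDeriving` and `DeriveAnyClass` are switched on deliberately to show that the explicit strategy settles the question:

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Assumed shape of the class: 'bar' falls back to 'show' by default.
class Bar a where
  bar :: a -> String
  default bar :: Show a => a -> String
  bar = show

instance Bar Char

newtype Foo = MkFoo Char
  deriving Show              -- no strategy given: GHC picks one (stock here)
  deriving stock (Eq, Ord)   -- several classes under a single strategy
  deriving newtype Bar       -- reuse the Bar Char instance
```

Here `bar (MkFoo 'a')` goes through the `Char` instance and yields `'a'`, while `show (MkFoo 'a')` still mentions the constructor.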

### `anyclass`

Finally, the `anyclass` strategy corresponds to a use of `DeriveAnyClass`. So if you had written:

Then, you guessed it, it’ll derive `Bar` via `DeriveAnyClass`.
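A matching sketch for the `anyclass` strategy, under the same assumptions about the shape of `Bar` as before (the class definition is reconstructed from the prose, not copied from it):

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Assumed shape of the class: 'bar' falls back to 'show' by default.
class Bar a where
  bar :: a -> String
  default bar :: Show a => a -> String
  bar = show

newtype Foo = MkFoo Char
  deriving stock Show
  deriving anyclass Bar   -- empty instance, so the default 'bar = show' is used
```

This time `bar (MkFoo 'a')` yields `MkFoo 'a'`, since the default method shows the whole newtype.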

For more details on the innards of deriving strategies, see the corresponding GHC Commentary page.

As an additional fun fact, I was able to use deriving strategies to clean up some of the `base` library. In the `Foreign.C.Types` and `System.Posix.Types` modules, there are a lot of newtypes with slightly unusual `Read` and `Show` instances. They’re unusual in the sense that they ignore the newtypes’ constructors, which means that `deriving (Read, Show)` couldn’t be used to implement these instances. Instead, this ugly hack was used (using the `Show CIntPtr` instance as an example):

Yuck. Happily, this can be made much cleaner with deriving strategies!
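The idea can be sketched on a hypothetical newtype, `MyCInt`, which stands in for the C types mentioned above (this is an illustration of the technique, not the actual `base` source):

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Hypothetical wrapper around Int, in the spirit of the Foreign.C.Types newtypes.
newtype MyCInt = MkMyCInt Int
  deriving stock (Eq, Ord)
  -- Borrow Int's Read and Show instances, so the constructor is
  -- neither printed nor expected when parsing.
  deriving newtype (Read, Show)
```

With this, `show (MkMyCInt 5)` is `"5"` rather than `"MkMyCInt 5"`, and `read "5"` parses back to `MkMyCInt 5`.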

## `DeriveAnyClass` overhaul

Before GHC 8.2, `DeriveAnyClass` only worked for type classes whose argument is of kind `*` or `* -> *`. The reason for this seemingly arbitrary restriction is that GHC made a crude simplifying assumption. If you wrote something like:

Then GHC assumes one of the following cases:

1. `C`’s argument is of kind `*`. Then GHC will derive `C` as if it were stock-deriving `Eq`. That is, it will generate this instance:

2. `C`’s argument is of kind `* -> *`. Then GHC will derive `C` as if it were stock-deriving `Functor`. That is, it will generate this instance:

If neither case is true, then GHC errors.

This assumption made implementing `DeriveAnyClass` simpler, but it also made the extension far less general than it could be. What’s worse, even though `DeriveAnyClass` was co-opting the code for deriving `Eq` and `Functor` to get the instance contexts right (`C (f a)` and `C f`, respectively), it wasn’t even doing that part correctly! For example, consider this code:

This uses `GHC.Generics` to automatically figure out what the name of a data type is. For instance, here is an example of how you could use it:

So far, so good. But what if we attempted to derive the `TypeName` instance for `T a` using `DeriveAnyClass`?

You might think that GHC would come up with the same instance as the one we wrote manually above:

But prior to GHC 8.2, that wasn’t true! If you compiled this code with the `-ddump-deriv` flag to see the generated code that GHC derives, you’d discover that the actual instance was this:

Huh? This instance has a completely redundant `TypeName a` context! Even worse, `tName` no longer typechecks, since there’s no `TypeName` instance for `()`!

This behavior, while totally bonkers, was by design. Recall that `DeriveAnyClass` was using the same algorithm that GHC uses to stock-derive `Eq` instances. That is, because this code:

would generate this instance:

As a consequence, `DeriveAnyClass` followed the same pattern when deriving a `TypeName` instance for `T a`. Unfortunately, the approach for deriving `Eq` just doesn’t work for a type class like `TypeName`.

It was clear that `DeriveAnyClass` needed a new coat of paint, so GHC 8.2 will debut a new inference algorithm for `DeriveAnyClass`. Unlike, say, deriving `Eq`, which infers the context for its instances by examining the definition of the data type, `DeriveAnyClass` infers its context by examining the *type signatures of the class’s methods*. Continuing the `TypeName` example:

This will generate a `TypeName` instance like this:

GHC determines what `???` is by gathering constraints from the type signatures of `TypeName`’s methods and simplifying them as much as possible. In this example, GHC gathers the constraints:

GHC is immediately able to discharge all three of these constraints, so the context simplifies down to `()`, and the final instance that GHC generates is:

Which is exactly what we wanted. Hooray!
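Putting the pieces together, here is a sketch of how the `TypeName` example might look on GHC 8.2 or later. The class body is an assumption reconstructed from the prose (a default method that reads the datatype name off the `GHC.Generics` metadata); only the names `TypeName`, `T`, and `tName` come from the text, while `typeName` and `MkT` are hypothetical:

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Proxy (Proxy (..))
import GHC.Generics (D1, Datatype, Generic, Rep, datatypeName, from)

class TypeName a where
  typeName :: Proxy a -> String
  -- The default needs three constraints; these are what the new
  -- inference algorithm collects and tries to discharge.
  default typeName :: (Generic a, Rep a ~ D1 d f, Datatype d) => Proxy a -> String
  -- datatypeName never inspects its argument, so 'undefined' is safe here.
  typeName _ = datatypeName (from (undefined :: a))

data T a = MkT a
  deriving stock Generic
  deriving anyclass TypeName   -- inferred context is (), not TypeName a

tName :: String
tName = typeName (Proxy :: Proxy (T ()))
```

Since the derived instance carries no `TypeName a` constraint, `tName` typechecks even though `()` has no `TypeName` instance, and it evaluates to `"T"`.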

Better yet, this new design completely removes the requirement that the derived class’s argument must be of kind `*` or `* -> *`, so now `DeriveAnyClass` can be used in far more places than it could before.

I owe a great deal of gratitude to Simon Peyton Jones for patiently explaining the parts of the typechecker needed to implement this feature… and for fixing several mistakes in my initial implementation :)

## `GeneralizedNewtypeDeriving` and associated type families

Prior to GHC 8.2, it was impossible to use `GeneralizedNewtypeDeriving` to derive an instance of this type class:

Or rather, it was impossible for *any* class with associated type families. But this was rather unfortunate, as implementing `Marshal` instances for newtypes is predictable and laden with boilerplate:

So this definitely smells like something that `GeneralizedNewtypeDeriving` should be able to handle. Thankfully, starting with GHC 8.2, that is the case. You can now just write:

And it will generate an instance that is equivalent to the manually written one above.
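As a rough sketch of the feature, assume a `Marshal` class with one associated type family. The family name `Wire`, the method `marshal`, the `Int` instance, and the newtype `Wrapped` are all hypothetical choices made for illustration; only the class name `Marshal` comes from the text:

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE TypeFamilies #-}

-- Hypothetical class with an associated type family.
class Marshal a where
  type Wire a
  marshal :: a -> Wire a

instance Marshal Int where
  type Wire Int = String
  marshal = show

-- The newtype wraps a bare type variable, so the derived family
-- instance is simply: type Wire (Wrapped a) = Wire a
newtype Wrapped a = MkWrapped a
  deriving newtype Marshal
```

With these definitions, `marshal (MkWrapped (42 :: Int))` has type `String` and evaluates to `"42"`, exactly as the underlying `Int` instance would.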

There are a couple of things to watch out for when using this feature, however. One gotcha is that this only works for associated *type* families, not *data* families. It doesn’t make sense to combine associated data families with `GeneralizedNewtypeDeriving`, because if you tried deriving this:

Then what instance would be produced? GHC would have to generate something like this:

And it is not clear what GHC would fill in for `???`, as creating a data family instance here would require a fresh data constructor. That is to say, data family instances are *generative*, whereas type family instances are not.

Another minor annoyance to watch out for is if you try to derive an instance like this, where the newtype wraps a concrete type (instead of just a type variable, as in `Age` above):

This is only allowed if `UndecidableInstances` is enabled. Why? That’s because the derived instance would be this:

GHC’s typechecker isn’t smart enough to conclude that reducing `T MyInt` will ever terminate, so it conservatively requires `UndecidableInstances` to allow this. Of course, this requirement does rule out things that would legitimately send the typechecker into a loop. For instance, consider what would happen if you did this!

## Poly-kinded `GHC.Generics`

If you use `GHC.Generics`, you’re probably familiar with the `Generic1` class:

If you squint, you’ll notice that the kind of `Generic1` is actually less polymorphic than it could be. We can generalize the kind of `Generic1` to this:

In a similar vein, we can kind-generalize most of the datatypes in `GHC.Generics`:

(The exception being `Par1`, of course, since its type parameter is forced to be of kind `*`.)

Now we can derive `Generic1` instances for more data types than we could before. For example, Derek Elkins uses `GHC.Generics` to automatically define `Authenticated` instances for a data type that is parameterized over a type that uses `DataKinds` in this example.
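As a small illustration of what the generalized kind allows, here is a sketch that derives `Generic1` for a proxy-like type whose parameter is not of kind `*`. The type `P`, its constructor `MkP`, and the helper `hasNoFields` are hypothetical names introduced for this example:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}

import GHC.Generics (Generic1 (..), M1 (..), U1 (..))

-- A proxy-like type whose parameter may have any kind k.
data P (a :: k) = MkP
  deriving Generic1

-- Instantiate the parameter at kind Bool (via DataKinds) and inspect the
-- generic representation: a datatype wrapper, a constructor wrapper,
-- and no fields.
hasNoFields :: P 'True -> Bool
hasNoFields p = case from1 p of
  M1 (M1 U1) -> True
```

The point of the sketch is only that `from1` can be used at `P 'True`, where the type argument has kind `Bool` rather than `*`.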

## `DeriveFunctor` now implements `(<$)`

(This addition was not authored by me, but rather by David Feuer. Thanks, David!)

GHC’s `DeriveFunctor` extension grants you the power to easily implement a lawful `Functor` instance for a given datatype. For instance, `data Foo a = Foo a a deriving Functor` would generate the instance:
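Written out by hand, such an instance looks roughly like the following sketch (the variable names are arbitrary, and the `Eq` and `Show` instances are added only so the result can be inspected):

```haskell
data Foo a = Foo a a
  deriving (Eq, Show)

-- Apply the function to every field of type 'a'.
instance Functor Foo where
  fmap f (Foo a1 a2) = Foo (f a1) (f a2)
```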

However, there’s more to `Functor` than just `fmap`. Here’s the fully fleshed-out definition of the `Functor` type class:

`Functor` also has the somewhat lesser-known method `(<$)`, which replaces locations inside the input with the same value. Notice that in the derived `Functor` instance above, however, GHC didn’t implement `(<$)` manually, but relied on the default implementation (`fmap . const`). As it turns out, this default implementation can be very inefficient for certain data structures. Here’s an example from the `containers` library:

This produces the following `Functor` instance:

Using the default implementation of `(<$)` for `Tree`, we end up with this definition:

Alas, GHC is unable to optimize this any further, since `fmap` is defined recursively. (The curious reader is encouraged to read this for the full story of why this `(<$)` definition is difficult to optimize.) And this definition is quite unsatisfactory, since it will produce a `Tree` full of thunks of the form `((\_ -> x) y)`, which allocates far more (and leaks way more space) than it should need to.

Luckily, there’s a pretty simple fix: just be smarter about deriving `Functor` instances. In GHC 8.2 and later, `DeriveFunctor` will implement `(<$)` in addition to `fmap` to avoid the aforementioned space leaks. For comparison, here is how 8.2 would derive the `Functor Tree` instance above:
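The shape of such an instance can be sketched by hand on a simple binary tree. This `Tree` is a hypothetical stand-in, not the type from `containers`, and the instance is written manually to resemble what the new deriving code aims for rather than copied from GHC’s output:

```haskell
-- Hypothetical binary tree, used only for illustration.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Eq, Show)

instance Functor Tree where
  fmap _ Leaf = Leaf
  fmap f (Node l x r) = Node (fmap f l) (f x) (fmap f r)

  -- A dedicated (<$) stores the replacement value directly at each
  -- position instead of building an application of a constant function.
  _ <$ Leaf = Leaf
  z <$ Node l _ r = Node (z <$ l) z (z <$ r)
```

For example, `'x' <$ Node Leaf 1 (Node Leaf 2 Leaf)` gives `Node Leaf 'x' (Node Leaf 'x' Leaf)`, with the same `'x'` placed at every position.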

Much better!