What is a lawless type class, and should they be avoided? Are orphan instances inherently evil, or just misunderstood? Is global coherence a mistake or a fundamental requirement? In this post we argue that most of the arguments over these questions stem from the fact that type classes embody a number of disparate concepts, with each facet giving rise to perceived wisdom which does not necessarily map to other facets. We look at the various concepts which are combined in type classes, and posit how one might disentangle them.
I am large, I contain multitudes
Functions from Types to Values
Consider the following diagram:
Type ----> Type
| ▵
| |
| |
▿ |
Value ---> Value
Haskell has been steadily filling out the arrows here. From Value to Value we have regular functions. GADTs let us go from Value to Type.1 Type families give us functions from Type to Type. Classes give us the final arrow, that from Type to Value.
Two examples of this spring to mind: the Default
class from
data-default
gives us
precisely this — a single value matched to each type, whereas the
singletons
library uses this
functionality to effectively allow case analysis on types, through mapping types
to values which, as constructors of a GADT, may then allow us to recover the
type information.
Are there other ways we can go from types to values? The use of Typeable
allows us to do this, albeit we cannot in such a case prove that we have a
value for any given type (or, alternatively, we must provide a value for all
types), since we lack the constraint also provided by the class. But note that
Typeable
itself is a class, whose instances are automatically generated by the
compiler.
Seen from this perspective, a class should be seen as defining a function signature, and its collection of instances as collectively defining that function. It is necessarily global (as are all top-level function definitions — provided we have the function in scope, its body is uniquely defined!).
Modularity
What does modularity mean? In essence, just the ability to break something into pieces which may then be used together. Haskell has various flavours of modularity: its module system for one, and the natural composability of pure functions as an extremely powerful other. What we refer to here is something more akin to OCaml’s modules, or something like the classes of object-oriented languages — to wit, a class allows us to group a set of definitions (including type definitions, with associated type families) and refer to them together. In this way, each class instance acts as a module whose interface is given by the class definition.
We can use a type class purely in this fashion with what I call the “makeshift module” pattern:
class CatDB a where
type family Config a :: Type
type family Handle a :: Type
openDB :: Config a -> IO (Handle a)
getAll :: Handle a -> IO ([Cat])
data MockDB
instance CatDB MockDB where
type instance Config MockDB = ()
type instance Handle MockDB = ()
openDB = pure
getAll () = pure exampleCatList
In this pattern, the type to which the class is associated serves purely as a name for the module instance — it doesn’t even have any associated values. One may even use this to build OCaml-style functors (that is, modules themselves parametrised over other modules):
class CatDB (DB a) => CatApp a where
type family DB a :: Type
...
Implicit resolution
The third facet to type classes is their use of implicit resolution and global
scoping. When we ask for an instance of a class Foo
, we do not explicitly
provide such an instance but have it provided to us by the compiler, depending
on the type.
Do we have any other means for implicit resolution? Indeed we do; the
ImplicitParams
extension gives us name-directed implicit resolution of
parameters. Perhaps one should not be surprised to discover that implicit
parameters are themselves implemented using the following class, under the
covers:
class IP (x :: Symbol) a | x -> a where
ip :: a
{-# MINIMAL ip #-}
This class is imbued by GHC with some custom behaviour to allow local scoping.
From this perspective, we can see a class definition as akin to part of a function signature, identifying the type of an argument, and an instance as “filling in” that part of the signature with a required value.
Global scoping
Related to implicit resolution is the issue of global scoping. When trying to
satisfy a constraint Foo a
, GHC consults a global table of all type class
instances. It expects that (demands!) a unique instance be found2.
Global uniqueness is most often thought of in order to provide class
coherence. This enables a particular use of type classes, as best evidenced by
the use of the Ord
constraint in containers such as Set
:
fromList :: Ord a => [a] -> Set a
insert :: Ord a => a -> Set a -> Set a
In this snippet we rely on the fact that the means of ordering a given type a
must remain the same in the place where we call fromList
and the place where
we call insert
(or simply in successive calls to insert
). Were the instance
to change, due to local scoping, we could break the internal ordering and do
things like accidentally lose elements.
From this angle, then, a class can be seen as imbuing a type with extra semantics (or, depending on one’s perspective, encoding a semantics it might platonically possess).
Many languages have a concept of global scoping more generally. In Bash and older versions of Perl, for example, all variables are global (although Perl has packages, which do provide some namespace division). It’s hard to see why we might want such a thing in Haskell, but could we acquire it? Well, yes - the above implementation of implicit parameters shows us how:
class Global (x :: Symbol) a | x -> a where
global :: a
instance Global "foo" String where
global = "Hello world"
Constrained polymorphism
Perhaps it is unfair to leave this facet until last, since in essence it is both the driver beyond all of the preceding sections and the unifier bringing them all together. Why do we have classes in the first place? Because they allow us to do constrained polymorphism, to say neither “this is valid for this type” nor “this is valid for all types”, but somewhere in between. And indeed, this requirement underlies all of the above use cases: it gives the domain of a function from types to values (and hence guarantees that the function is not used partially), it provides a means to depend only on a module interface, it demands implicit resolution so that the constraint can be dispatched by the compiler, rather than having to be proved by the user, and… well, arguably global scoping isn’t required here, but it does make sense from certain perspectives - more on that below.
Do we have other means of constraining polymorphism in Haskell? Only one - we
can construct equality constraints of the form a ~ b
. Other languages,
however, do permit different types of constraint — the most obvious one being
subtype polymorphism in object-oriented languages. It is from this perspective
that global coherence makes the most sense, when we see a type as being given
not just by its structure, but also by the set of operations defined on it
(as in an object-oriented language, where a class will have not only fields but
methods, as well as superclasses etc.).
From this perspective, both classes and instance declarations can be seen as something like a Prolog “rule”, a deductive technique which the compiler may use in attempting to satisfy a constraint. At the extreme end, it’s the use of this deduction engine that lets us start to do things like type-level programming in Haskell.
The means of dialectical evolution
Let’s go back to our original questions, with this framework in mind. When we talk about a lawless class, what we really talk about is a lawless module (perhaps combined with the fact that these modules will be implicitly and globally resolved). When using a class as a function from types to values, it makes little sense to talk about laws (at least any more than one might be concerned by the laws of a regular function). Likewise when used as a means to pass arguments implicitly.
Similarly, orphan instances mostly pertain to the aspect of global scoping,
where we think of the class as adding semantics to the type. From this
perspective, it makes sense that those semantics are grouped with the type;
indeed, for the semantics of a type to change depending on what we have imported
seems like a big bug! From the perspective of classes-as-implicits, though, this
restriction seems strange. And where we are using classes purely as modules, it
seems irrelevant to talk of orphan instances, since the type to which the
instance is attached only serves as a name for the instance; that is to say, the
instance is all there is. And since the name may be fully qualified, orphan
instances would only be a concern if two people wanted to use the same name
(including package name, given PackageImports
!) for their module!
The case of orphan instances when viewing classes-as-functions is interesting,
since it comes down to allowing open functions4. Going back to our diagram
above, we can see that our arrows with Value
as the domain are closed - one
cannot have an open (regular) function or GADT — but the arrows going the other
way are open. This is somewhat necessarily true, since our prototypical kind -
Type
- is itself open, whereas a given type is closed. The restriction against
orphan instances then is basically one which helps ensure the function is
well-defined.
As for global uniqueness, it goes naturally with global scoping, but is a bit of a weird beast otherwise. Our other perspectives on classes don’t really call for it (one might argue that classes-as-functions would desire this, but in such a case it’s only really important that a function is well-defined at its use site). However, the usage of classes in Haskell does rely heavily on global uniqueness.3
Conclusion
Type classes are powerful tools with various facets to them; we’ve shown how they can be viewed in multiple ways to exploit these individually, and to what extent the various received wisdom around classes maps to these facets.
-
The interpretation of GADTs as arrows from Value to Type may not be entirely obvious; consider the following types
data Even data Odd data SParity a where SEven :: SParity Even SOdd :: SParity Odd
We can think of
↩SParity
as mapping each constructor nameSEven
orSOdd
to a type (Even
andOdd
). In this view, each constructor declaration reads like a branch in acase
expression. -
Modulo
↩Overlapping
and similar pragmas. -
Though see Winant 2018 for a discussion of how to differentiate cases requiring global uniqueness from those not.
↩ -
Open functions - that is, functions whose body need not be defined in one place but could be spread out amongst multiple modules. Consider a function
↩foo :: Int -> String
whose value of even numbers is defined inEvens.hs
and whose value on odd numbers is defines inOdds.hs
.