There’s some talk going on in the Swift Forums about whether or not it could be useful to add the notion of Higher-Kinded Types (from now on, HKTs) to Swift’s type system. This discussion has been going on for a while in the Swift community, and there seems to be a lack of practical examples to sustain the need for such an advanced feature. Unfortunately, HKTs are in the area of things that become obvious if you have them and use them every day, but tend to remain obscure if you have to work without them: that’s because HKTs can shift one’s way of thinking about programming, abstractions and computation in general.
Also, while functors or monads are HKTs, you don’t need higher-kinded typing to design and implement types that behave like them: in languages that have HKTs a Monad, for example, is basically an interface, like a Swift protocol, with some extra power that Swift, at the present moment, cannot deliver. But you can implement the Optional<Wrapped> type in Swift, and work with it successfully, without knowing, nor needing to know, that Optional<Wrapped> is a monad. I happen to think that, if Swift had the notion of HKT in its type system – for example, if we could express the concept of Monad generically (maybe with some kind of protocol) – then we’d be able to write practical, pragmatic, useful code that’s only unlocked by the power of HKTs.
In this article I’m going to explore that usage of monads for practical code, and the power unlocked by their composition: in doing so, I’m also going to give my personal introduction to monads, something that you can find by the dozen by googling “monad introduction”. There’s really no point in linking other introductions here, for I want to try and give my own, and they’re so easy to find; but I’ll refer to a couple of resources that I consider particularly valuable, and that helped me a lot at the time:
I’m also going to take the Monoid path to reach the Monad land, but in a different way; I’d like to do so to also try and explain why we use the word monad, which seems pointlessly abstract: while the concept of Monoid is inevitably so abstract that there really couldn’t be a more “descriptive” name for it (like trying to give a more descriptive name to the concept of “multiplication”: you just learn it and move on), Monad is sometimes referred to with more pragmatic-sounding names like “flatmappable” or “chainable”. While I certainly think that using less scary and more interface-like names for abstract concepts can be useful (for example, I’d probably give up the functor name, and use mappable), there are reasons why a monad is called that way and sounds like “monoid”, and I think that understanding those reasons can really reveal some deeper truths about why these abstractions are important.
Let’s start then. Brace yourselves.
Why Monoids?
I think that the concept of Monoid is the single most important idea that percolated from abstract algebra to computation: actually, abstract algebra is usually more concerned with groups, while monoids seem to be particularly useful in software development, at least in my experience. This article is about monads, and my goal is not really to talk about monoids, but the concepts are deeply related, and I think that an understanding of the properties and usefulness of monoids will help in truly understanding what monads are about.
Monoid encapsulates the idea of merging two things together into a single one, with the possibility that one of those things is irrelevant. For example, we concatenate strings: we take two strings, put them together, and generate a single string. This composition operation is binary, because it takes two and gives one. But if I’m concatenating strings and I want to ignore one of them, a simple way is to just use an empty string: "Hello " plus "World!" gives "Hello World!", but "Hello " plus "", or "" plus "World!", gives back the first or second string as if nothing happened. It turns out that String is a Monoid with respect to string concatenation, and the irrelevant, or neutral, instance (let’s call it empty), that is, the empty string, is ignored in the composition operation.
Another example is Array<Element>, in which again concatenation is the “putting two things together” operation, and an empty array is the neutral instance. Here’s another one: Bool is a monoid, if we consider the && (AND) operation, and the true instance. But Bool can also be a Monoid if we consider the || (OR) operation, and the false instance: what’s going on here?
Let’s first give a more formal definition of a Monoid protocol, something that we can fully do in Swift right now:
protocol Monoid {
func append(_ other: Self) -> Self
static var empty: Self { get }
}
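To make the String example concrete, here’s what a conformance could look like (a sketch of our own, since nothing like this Monoid protocol exists in the standard library):

extension String: Monoid {
    /// "putting two things together" is concatenation
    func append(_ other: String) -> String {
        return self + other
    }
    /// the empty string is the neutral instance
    static var empty: String {
        return ""
    }
}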
It’s common for implementations based on abstract algebra to employ an operator in place of the append method, like the “diamond” <> operator; I’m not a great fan of using operators in Swift because of slow compilation times (the type checker really can’t stand too many concatenated operators), and in general I like to have at least the choice to use a method instead of an operator: there’s a place for custom operators, but for now let’s go with methods instead.
Anyway, the append method takes another instance of Self, and combines it with the self instance, to get a new one: from two, we get one, so it represents a binary operation. Then, there’s the empty static property, which is the “neutral” fixed one. There are 3 requirements for a type to properly conform to the Monoid protocol (a small law-checking sketch follows the list):
- the append method must be total: no preconditions, no fatal errors, nothing weird going on; if the method is called, it must not crash;
- the append operation must be associative, that is, a.append(b).append(c) must be exactly the same as a.append(b.append(c)): however I put the parentheses, the final result is the same;
- the empty instance must be neutral with regard to the append operation, which means that a.append(empty) and empty.append(a) must be the same and exactly equal to just a.
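These laws aren’t checked by the compiler, so it’s up to us to verify them; here’s a small, hypothetical helper (the name is mine) that spot-checks associativity and neutrality for any Equatable monoid:

func obeysMonoidLaws<M: Monoid & Equatable>(_ a: M, _ b: M, _ c: M) -> Bool {
    /// associativity: how we place the parentheses doesn't matter
    let associative = a.append(b).append(c) == a.append(b.append(c))
    /// neutrality: empty is ignored on both sides
    let neutral = a.append(.empty) == a && M.empty.append(a) == a
    return associative && neutral
}

/// obeysMonoidLaws("Hello ", "World", "!") returns true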
Notice that append has a generic-sounding name because it is completely generic: the meaning (semantics) of append will depend on the concrete type conforming to Monoid. Thus, append for String and Array will mean “concatenation”, which is associative. But what about Bool? Before dealing with that, let’s try and understand why it makes sense to have this kind of protocol in the first place, with an example.
Let’s assume we have a current user session going within an app. The session object could be represented by a couple of structs like the following:
struct UserSession {
var token: String?
var identity: UserIdentity
}
struct UserIdentity {
var identifier: String
var updatedAt: Date
}
A session is identified by a token (initially, nil) that, when updated, is returned by a server on the next request; there’s also an “identity”, characterized by an identifier and an update date (that’s because we only ever care about the most recent identity). Both the token and the identity could be updated at any time, in response to a number of events.
To handle the update, we could design an object that we pass around as a mutable dependency, with ad-hoc methods for the various manipulations. This way we centralize the manipulation, but we need to pass around a dependency, and “expressing the update” means really “calling a method on some object”, which will probably require the definition of an interface, some boilerplate, and a “contract” that will complect, that is, bind together two contexts that shouldn’t necessarily be related.
Alternatively, we could express the kind of update we want to do with functions that operate on UserSession: this way we can declaratively produce these updates as functions of type (UserSession) -> UserSession, that can be handled in a different context, and that’s a huge win; but the function type doesn’t carry any meaning by itself, and we’ll need to rely on the function names to understand what’s going on; also, in the context where these functions are called, there’s no memory of what kind of update we’re going to perform.
These strategies have pros and cons, but are both characterized by a potentially crucial detail: the UserSession type in itself doesn’t express the business logic related to its update. We could add methods (mutating or not) for expressing this logic, but these would be redundant and essentially duplicate some logic already expressed in both strategies. An interesting alternative could be to express the update logic at the type level: we could represent the way to update all parts of a UserSession with specific types. Then, when merging instances of those types, each one of them will express specific semantics of update.
For example, we could design an Update<A> type for which the merge of two instances will favor the second of the two, while the empty instance will simply wrap nil:
struct Update<A>: Monoid {
let unwrap: A?
init(_ value: A?) {
self.unwrap = value
}
func append(_ other: Update<A>) -> Update<A> {
return Update<A>(other.unwrap ?? self.unwrap)
}
static var empty: Update<A> {
return Update<A>(nil)
}
}
Equipped with this new type, we can redefine UserSession as follows:
struct UserSession {
var token: Update<String>
var identity: UserIdentity
}
With this definition we’re saying that the token is an Optional<String> that will update by simply discarding the previous one if the next one is not nil.
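A quick sketch of the behavior we just described:

let token = Update<String>("abc123")
token.append(.empty).unwrap /// "abc123": empty is ignored
token.append(Update(nil)).unwrap /// "abc123": nil doesn't overwrite
token.append(Update("def456")).unwrap /// "def456": the second one wins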
For UserIdentity we can, as a first approximation, give a custom implementation of Monoid.
extension UserIdentity: Monoid {
func append(_ other: UserIdentity) -> UserIdentity {
if other.updatedAt > self.updatedAt {
return other
} else {
return self
}
}
static var empty: UserIdentity {
return UserIdentity.init(
identifier: "",
updatedAt: .distantPast)
}
}
In this case, the append method will favor the most recent UserIdentity, and the empty instance has distantPast as its date, so it will never be selected in an append operation.
Now that both the token and identity properties are monoids, we can trivially make the whole UserSession a Monoid, by simply updating both properties with the append method, and using both empty static properties:
extension UserSession: Monoid {
func append(_ other: UserSession) -> UserSession {
return UserSession.init(
token: self.token.append(other.token),
identity: self.identity.append(other.identity))
}
static var empty: UserSession {
return UserSession.init(
token: .empty,
identity: .empty)
}
}
Equipped with these definitions, something magical happens: any time we want to express an update to UserSession, we simply need to produce a new one. For example, if we’re updating just the token, we can produce a new session with a new token, and an empty user identity; if we’re updating the identity, we can just yield nil for the token, and we’ll be sure that the most recent identity will be kept in any case.
Interestingly, in an atomic process that contains multiple session updates, we can just collect all the updates (in order) and generate a single one in batch, with a concatenated method:
extension Sequence where Iterator.Element: Monoid {
func concatenated() -> Iterator.Element {
return reduce(.empty) { $0.append($1) }
}
}
Notice that this method doesn’t know anything about the fact that Iterator.Element is a UserSession: it just needs to know that it’s a Monoid, so it has empty and append.
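For example, a token refresh and a subsequent identity update, collected during the same process, could be merged like this (a sketch; the values are made up):

let tokenRefresh = UserSession(
    token: Update("abc123"),
    identity: .empty)
let identityUpdate = UserSession(
    token: .empty,
    identity: UserIdentity(identifier: "user42", updatedAt: Date()))

let batched = [tokenRefresh, identityUpdate].concatenated()
/// batched.token wraps "abc123", batched.identity is the new identity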
What about the Bool conundrum? Well, two different specific types will express the two ways of “composing” booleans; let’s call them And and Or:
struct And: Monoid {
let unwrap: Bool
init(_ value: Bool) {
self.unwrap = value
}
func append(_ other: And) -> And {
return And(self.unwrap && other.unwrap)
}
static var empty: And {
return And(true)
}
}
struct Or: Monoid {
let unwrap: Bool
init(_ value: Bool) {
self.unwrap = value
}
func append(_ other: Or) -> Or {
return Or(self.unwrap || other.unwrap)
}
static var empty: Or {
return Or(false)
}
}
This might seem pointlessly complicated when compared to simply using && and ||, and in this particular case it probably is, but I hope I’ve made a compelling argument for using types that express specific semantics of binary composition. And and Or allow us to embed boolean composition in the monoid framework, and by constructing types that embed binary, associative composition, a world of possibilities unfolds: what I’ve shown here is just the tip of the iceberg.
Like basic mathematical operations that eventually yield complex expressions, the mere act of merging two things together, resulting in a third thing of the same type, is a very basic operation that can unlock sophisticated abstractions constructed from simple tools. It all comes down to expressing our types in a way that favors combination; this way we don’t need to define many ad-hoc methods that could change over time and disrupt interfaces and contracts: expressing combination semantics at the type level is the ultimate encapsulation strategy.
From instances to types
Monoids are about putting together instances of types: if a type conforms to the Monoid protocol, it means that we can merge two of its instances into a single instance of the same type, and that there exists a very specific instance that will act neutrally towards this composition. Now, here’s the key question: can we do the same thing with types? What I mean is this: is there a way to express the composition of types in a monoid-like way? Notice that to be monoidal, the types should somehow share something (like instances of a monoid share the fact that they’re of the same type), and there should exist some kind of neutral type in respect to their composition. But what might composition even mean for types?
Well, an option could be using algebraic data types, like Product<A,B>, to compose two concrete types: Int and String would compose into Product<Int,String>, but this is not the kind of composition we’re talking about, because Int, String and Product<Int,String> are different, unrelated (well, not completely) types. This is not the monoidal composition we’re looking for.
What about generic types? At this point it’s probably better to give a new definition, that is, type constructor. A type constructor, in general, is a mapping from one or more types to a constructed, final type: for example, a unary type constructor takes a single type and returns a new single type, different in general from the first one, but typically related. In Swift, type constructors are represented by generic types. Think about it: Optional<Wrapped>, for example, is a unary type constructor, because if I give it a type, say String, it will yield a specific type Optional<String>, while if I give it another, different type, say Array<Int>, it will yield another, specific, different type Optional<Array<Int>>. It turns out that a type constructor, that is, a generic type in Swift, has a monoidal representation. A type constructor could also be binary, like Dictionary<Key,Value>, or in general n-ary, but we’re really only concerned with one of the generic parameters, fixing all the others: so, the monoidal representation will refer to only one of the type arguments (the generic parameters).
The “composition” we’re looking for here is the nesting of the same type constructor, like for example Optional<Optional<Wrapped>> or Array<Array<Element>>, or in general Generic<Generic<A>>: to be monoidal, this composition should yield a type identical to the initial, non-nested one; it should also define some Generic<A> neutral type; and it should respect the associativity and neutrality laws. To summarize, if a generic type has this kind of monoidal composition, it must define the following:
- an append mapping in the form of Generic<Generic<A>> -> Generic<A>;
- a neutral element Generic<A>;
- the append mapping should be associative, that is, the path Generic<Generic<Generic<A>>> -> Generic<Generic<A>> -> Generic<A> should be the same whether we merge the first and the second first, or the second and the third first;
- the neutral element, if nested into a Generic<_>, or if we nest a Generic<A> into it, should merge into a Generic<A> without any change.
If these four requirements hold, Generic<A> is a monad. Notice that we’re really requiring the same things that we required for monoids, even if this time we’re dealing with the nesting of generic types: if you think about it, you can see that it’s still really about putting two things together to get a single one, and a neutral element towards this composition. This is also why monad and monoid sound similar: a monad is a monoid lifted into the world of type constructors.
From a mathematical standpoint, monads are more general than that, because they transcend the concept of type or generics, but we’re interested in the representation of monads in Swift, and generic types provide that. Still, the monads we’re talking about are more closely related to the monads of category theory than “functors” are; that’s why I’d keep the monad word, but lose functor in favor of mappable.
Let’s try and write some code to equip Generic<A> with monad features: we’ll express the append operation with a joined method (because we’re joining a nested instance into a base one), and the neutral element with a pure method, that returns the “purest” instance possible for the type, the one that’s going to be neutral towards joining:
extension Generic {
func joined <T> () -> Generic<T> where A == Generic<T> {
/// some code
}
static func pure(_ value: A) -> Generic<A> {
/// some code
}
}
Unfortunately, in trying to give a practical representation in terms of Swift code of what a monad is about from an abstract standpoint (like in defining a Monad protocol), we hit a brick wall; we really can’t do that in Swift because, as you can see, the joined method for Generic<A> returns a Generic<T>, that is, the same type constructor, but with a different generic parameter: if we used Self, it would mean Generic<A> again, and we cannot write something like Self<T> in Swift. That’s because we don’t have HKTs in Swift, which is what this article is really about. The fact that we cannot generically talk about monads in Swift code has many consequences: for now, I’ll simply observe that things like concatenated for monoids are not expressible in Swift for monads, but in the rest of the article we’ll see a few practical examples of why this can be problematic.
It’s frequent to see monads expressed in terms of a flatMap or bind method, that for the generic type would have this kind of shape:
extension Generic {
func flatMap <T> (_ transform: (A) -> Generic<T>) -> Generic<T> {
/// some code
}
}
This representation is frequently more useful, but it’s completely equivalent to the joined method I presented before: to implement this, one can simply map and then joined. Or otherwise, one can implement flatMap, and then implement joined by passing the identity function to flatMap: as I said, the representations are equivalent. But mathematically, monads are defined in terms of the operation represented by joined, and I think the latter is more convenient to implement (because there’s no transformation with a new parameter) and more rigorous.
I can always map because a monad is always also a functor, which for us is simply something that can be mapped while respecting some rules. I don’t really want to talk about functors right now, but we’re going to need a tool, the map function, so let’s get it out of the way and define it:
extension Generic {
func map <T> (_ transform: (A) -> T) -> Generic<T> {
/// some code
}
}
map transforms the content of the type, thus changing the generic parameter.
Special effects
Up to this point I treated monads as a construction based on the nesting of a generic type, but I didn’t present any practical use case for it. In iOS development (or any other, really) with Swift, we sometimes end up with nested generics, for example Optional<Optional<Something>> or Array<Array<Something>>; at the level of the standard library, the former case is not really handled in any sensible way: either a double if/guard-let, or a double !, is required to extract the value nested inside; but the latter case is cleanly handled with a joined method, basically the same method that I defined for a monad, with the difference that it returns a more efficient FlattenBidirectionalCollection<[[Something]]> (that could be easily converted into an Array if needed). Efficiency on collections is a cornerstone of the Swift standard library, thus many types in it are designed to allow for fast conversions and iterations, but the Element of a FlattenBidirectionalCollection<[[Something]]> is Something, like in Array<Something>.
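For example:

let nested = [[1, 2], [3]]
let flattened = nested.joined() /// a lazy flattened view with elements 1, 2, 3
let array = Array(flattened) /// [1, 2, 3], if a concrete Array is needed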
For now we can observe that a joined method would make sense also for Optional<Optional<Something>>, with basically the same shape as Array<Array<Something>>: notice that the two implementations would be (and actually are) completely different, and they depend on the specific internal structure of the type. Now, are there more types for which this kind of operation would make sense? Let’s consider for example the following class:
protocol HasCalories {
var calories: Int { get }
}
class Animal<Food> where Food: HasCalories {
func consume(_ food: Food) {
/// some code
}
var consumedCalories: Int {
/// some code
}
}
The Animal class is parameterized over the Food eaten, and Food can be anything that HasCalories. If we nest the generics we get Animal<Animal<SomeFood>>, that is, an animal that eats animals that eat SomeFood: if we joined this, we would get Animal<SomeFood>, but what does it even mean? Is this animal eating SomeFood, or other animals? The joined operation is not something that you would do for every generic type. Here’s another example: we could easily write a joined function on Dictionary<Key,Dictionary<Key,SomeValue>> to get Dictionary<Key,SomeValue>, but it wouldn’t make any sense, because we would merge all the subdictionaries – in case of identical keys, some values would have precedence, so we’d lose the other ones – and we would probably lose some of the original keys.
It turns out that there are specific categories of generic types for which the “joining” operation makes sense, and it’s actually useful: types that represent effects. To my knowledge, there’s no shared, widely adopted and authoritative source on the concept of effect in programming: the idea eventually becomes clear in one’s mind, but I wasn’t able to find a straightforward, clear-cut definition. An effect in programming, and computation in general, is any consequence of an operation that doesn’t yield a direct informational value. For example, a simple division of Doubles seems a straightforward operation, with nothing weird going on, but if we implement the function we discover that reality is not that simple:
func divide(_ dividend: Double, by divisor: Double) -> Double {
guard divisor != 0 else {
fatalError("Cannot divide by zero")
}
return dividend/divisor
}
As you can see, this function crashes if we pass 0 as the divisor: this means that the function is not pure, it’s partial, and it doesn’t just produce a straightforward informational value (the result of the division) even if the signature says so. There’s an effect going on here, and I could ignore this and risk crashing, or I could leverage the Swift type system to somehow encapsulate the precondition, and go back to a pure function. Of course Swift already has a way to deal with this, the Optional<Wrapped> type:
func divide(_ dividend: Double, by divisor: Double) -> Double? {
guard divisor != 0 else {
return nil
}
return dividend/divisor
}
Now the function signature says it all: the operation could fail, so I’m returning an Optional<Double>. This is a pure, straightforward function, that expresses its effect (that is, partiality) clearly.
What if I want to divide twice? Something like divide(divide(42, by: 28), by: 21) wouldn’t compile, because the outer divide (the one that’s executed second) requires a Double dividend, but I’m providing an Optional<Double>. The solution is simple, and Swift already has a way to deal with this in the standard library:
divide(42, by: 28)
.flatMap { divide($0, by: 21) }
flatMap chains the two effects together. With a joined method, I could do the same more explicitly:
divide(42, by: 28) /// yields Optional<Double>
.map { divide($0, by: 21) } /// yields Optional<Optional<Double>>
.joined() /// yields Optional<Double>
As you can see, flatMap is generally more convenient to use, and that’s why you usually find something like that in standard libraries: but we shouldn’t forget where the concept comes from, that is, nesting multiple effects into a single one.
An implementation of joined for Optional<Wrapped> looks like this:
extension Optional {
func joined <T> () -> Optional<T> where Wrapped == Optional<T> {
switch self {
case .none:
return .none
case let .some(value):
return value
}
}
}
Notice that to make this work we need to enter into the Optional, navigate its structure, and produce the result. This code works specifically for Optional<Wrapped>: the implementation of joined will ultimately depend on the effect involved.
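With this in place, the double optional from the division example flattens safely, and the flatMap/joined equivalence can be verified directly (chained is a hypothetical name, chosen to avoid clashing with the standard library’s flatMap):

let nested: Double?? = divide(42, by: 28).map { divide($0, by: 21) }
let flat: Double? = nested.joined()

extension Optional {
    /// flatMap rebuilt from map + joined, just to show the equivalence
    func chained<T>(_ transform: (Wrapped) -> T?) -> T? {
        return self.map(transform).joined()
    }
}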
Array<Element> is also a particular kind of effect: it models non-deterministic computation, where the result of some process can be one of many (even none), and in moving the computation forward, we should take into account all the possible values. When flattening an array of arrays, we need to navigate it, thus taking into account the effect that it represents thanks to a specific knowledge of its structure. Of course arrays are not generally used for this reason, but they could model this kind of computation if needed, so they can represent this effect.
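A sketch of that reading: each array is the set of all possible outcomes, and flatMap moves the computation forward by considering every one of them:

let coin = [0, 1] /// two possible outcomes of a toss
let twoTosses = coin.flatMap { first in
    coin.map { second in (first, second) }
}
/// twoTosses contains all four possible pairs:
/// (0, 0), (0, 1), (1, 0), (1, 1)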
An effect cannot be unwrapped or executed easily, at least in principle: for example, force-unwrapping an optional value can result in a crash. The monadic definition allows us to represent the execution of effects in a safe and encapsulated way, thanks to the nesting of generics, and their flattening. The fact that we’re remaining in the monadic context means that we can execute the effects safely, as a pure operation. For example, we’ve already seen that an operation like (Optional<Int>) -> Int is not pure, because either we pull numbers out of a hat, or we need to force unwrap; but if instead of Int we had Optional<Int> again, the operation (Optional<Optional<Int>>) -> Optional<Int> could be performed safely, and it would be pure, so it’s a completely different situation, even if it looks like the previous one: the fact that Optional<Wrapped> is a monad makes this possible.
The M word is here to stay
Effects are wild beasts, and using them makes understanding programs really hard, but we literally cannot do anything useful without effects, not even a simple division. It might feel like I’m repeating myself over and over again, but I really want to stress this point: monads help with effects, because they allow us to execute them safely within a particular context, and this is the case because joined is a pure operation. Here’s a simple example, that will show a more direct relation between monads and effects, and will allow for the introduction of another monad: the Effect<Value> type.
The Effect<Value> type, also called IO, represents an effectful computation that produces an instance of type Value. It basically encapsulates a function of type () -> Value: because Value can be any type, and we have no information about it, it’s completely impossible for this function to be pure. A function of type () -> Int could be pure, if for example it always returned the number 42, but that’s just because we know the characteristics of the returned value’s type: if the returned type is completely generic, there’s no way a function that takes nothing – actually, (), which is more like a singleton, traditionally but quite erroneously called Void – can purely return something. But then, if I wanted to write a function that returns a value of type Value taken from a globally mutable context (like a global variable), I could do it with a function of type () -> Value by referring to the mutable context from within the body of the function. Another case could be executing some side effect on an external context (like persisting data), in which case I could use an even simpler function of type () -> () (thus, an Effect<()>).
Here’s a very basic, very obvious implementation of Effect<Value>:
struct Effect<Value> {
private let call: () -> Value
init(_ call: @escaping () -> Value) {
self.call = call
}
func execute() -> Value {
return call()
}
}
Of course calling execute on the effect will not be pure, and it’s something we’d like to do in a protected context. But Effect<Value> has a monad representation, that can be implemented in the following way:
extension Effect {
static func pure(_ value: Value) -> Effect {
return Effect.init { value }
}
func joined <A> () -> Effect<A> where Value == Effect<A> {
return Effect<A>.init {
self.execute().execute()
}
}
}
Within the joined function we don’t call the execute method, so no effect is executed: we literally just return a new effect, and the computation is pure. But joined will simplify a nested effect, with the macroscopic result of an execution, because it goes from Effect<Effect<Value>> to Effect<Value>.
At this point I think we understand fairly well why the abstract structure called “monad” can be useful in effectful computation; monads show, in mathematics, the equivalence of a particular nested structure to a non-nested one (given that some laws are respected). Then we map this nesting onto generic types et voilà: we get effectful computations that, within a certain context, magically become pure. I’m not sure I can explain the idea any better than that: this is what a monad is about, this is why it sounds like monoid, and this is why I think that monad is an appropriate word for representing this concept: it’s like “multiplication”, you just learn it and move on. So let’s move on.
How can I hold all these monads?
In this piece I’m not particularly interested in listing a bunch of useful monads in current use: there aren’t many, and they mostly reflect Eugenio Moggi’s original work on using monads to represent effectful computations. You can find many articles about useful monads on the web, here’s one for example. Let me just produce another one here, that’s going to be useful for a practical example: the State<Environment,Value> monad:
struct State<Environment,Value> {
private let call: (Environment) -> (Environment, Value)
init(_ call: @escaping (Environment) -> (Environment, Value)) {
self.call = call
}
func run(_ environment: Environment) -> (Environment, Value) {
return call(environment)
}
}
State<Environment,Value> encapsulates a function of type (Environment) -> (Environment, Value): this shape of function is useful if we want to produce some value given a certain context (the Environment parameter) and also want to modify the context in some way. Now, State<Environment,Value> is a monad, and we can provide the two defining functions for it:
extension State {
static func pure(_ value: Value) -> State {
return State.init { e in (e, value) }
}
func joined <A> () -> State<Environment, A> where Value == State<Environment, A> {
return State<Environment, A>.init { m in
let (m1,sub) = self.run(m)
let (m2,p) = sub.run(m1)
return (m2,p)
}
}
}
Notice that State<Environment,Value> is monadic in the Value parameter: that’s the parameter passed to pure, and the nesting solved by joined is for a State<Environment,State<Environment,Value>> type, where the Environment must be the same, and the nesting is about putting another State<Environment,Value> in place of the Value of the first.
For example, we could have a list of objects of type User as the environment, and we’d want to represent a query on that list that retrieves a user and deletes it from the list, thus modifying the list itself. We’re tempted to represent this operation with State<[User],User>, but we’re forced to allow for the possibility that a User for a particular query might not be there, so a more appropriate representation would be State<[User],Optional<User>>. This is interesting because it’s a kind of nesting of monads, but a different one: we’re nesting the optional monad within the state monad. This cannot be joined, and this is one of the reasons why you often read that monads don’t compose. But if we treated State<[User],Optional<User>> as a whole monad in itself (like we manufactured a new one), then it turns out that we could somehow join this: its nested representation would be something like State<[User],Optional<State<[User],Optional<User>>>>, which is hard to read, but kind of simple to understand: suppose that we wanted to perform a query on [User] that required information from another User; we do the first query, get the User (maybe, for it could be nil), then perform the second one with some information from the first. But of course we’d like to represent this with a single instance of State<[User],Optional<User>> that we can simply run. Is there a way to write a joined function that works this way? Of course there is, and let’s call it joinedT to avoid confusion:
extension State {
func joinedT <A> () -> State<Environment, Optional<A>> where Value == Optional<State<Environment, Optional<A>>> {
return self
.map { value -> State<Environment, Optional<A>> in
switch value {
case let .some(innerValue):
return innerValue
case .none:
return .pure(nil)
}
}
.joined()
}
}
We also needed to add a map function to State to make it work, but it’s not a big deal; as already said, every monad is also a functor (or mappable, if you prefer):
extension State {
func map <A> (_ transform: @escaping (Value) -> A) -> State<Environment, A> {
return State<Environment, A>.init { e in
let (newE, p) = self.run(e)
return (newE, transform(p))
}
}
}
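Here’s how the User query example could look with these tools (a sketch: User, its fields and popUser are made up for illustration):

struct User {
    let id: Int
    let invitedBy: Int? /// the id of the inviting user, if any
}

/// retrieves the user with the given id and deletes it from the list
func popUser(id: Int) -> State<[User], Optional<User>> {
    return State { users in
        guard let index = users.firstIndex(where: { $0.id == id }) else {
            return (users, nil)
        }
        var rest = users
        let found = rest.remove(at: index)
        return (rest, found)
    }
}

/// pop a user, then pop whoever invited them, as a single runnable query
let popInviter: State<[User], Optional<User>> = popUser(id: 42)
    .map { user in user?.invitedBy.map { popUser(id: $0) } }
    .joinedT()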
Now, there are a few interesting things about the joinedT function; here are the two most relevant for me:
- we took advantage of the knowledge of the particular structure of Optional<Wrapped> to make it work (we’re switching over its cases);
- the only methods that we call on the root instance of the state monad are map, pure and joined.
The first point means that to do this particular operation we need to know the structure of the internal type, and this is kind of expected; but the second point is crucial: we don’t care about the fact that the external type is a state monad, we only care about the fact that it’s a monad. Here’s the same implementation of joinedT for Effect<Optional<Value>> (providing map for Effect<Value>):
extension Effect {
func map <A> (_ transform: @escaping (Value) -> A) -> Effect<A> {
return Effect<A>.init {
transform(self.execute())
}
}
func joinedT <A> () -> Effect<Optional<A>> where Value == Optional<Effect<Optional<A>>> {
return self
.map { value -> Effect<Optional<A>> in
switch value {
case let .some(innerValue):
return innerValue
case .none:
return .pure(nil)
}
}
.joined()
}
}
The implementation of joinedT is identical, save for the different types in the function signature. Again, we only care about the fact that Effect<Value> is a monad.
Here’s the implementation of joinedT for Array<Optional<Element>> (providing a pure function for Array):
extension Array {
static func pure(_ value: Element) -> [Element] {
return [value]
}
func joinedT <A> () -> FlattenBidirectionalCollection<[[Optional<A>]]> where Element == Optional<Array<Optional<A>>> {
return self
.map { value -> Array<Optional<A>> in
switch value {
case let .some(innerValue):
return innerValue
case .none:
return .pure(nil)
}
}
.joined()
}
}
The type is a little different here, because joined on an array returns a FlattenBidirectionalCollection, but it could be made to conform to the others with a different overload of joined. We’re still only calling map, pure and joined on the base type, which simply requires that the base type be a monad.
To mix things up a bit, here’s the implementation of joinedT for Effect<Array<Value>>:
extension Effect {
func joinedT <A> () -> Effect<Array<A>> where Value == Array<Effect<Array<A>>> {
return self
.map { value -> Effect<Array<A>> in
value.reduce(.pure([])) { accumulation, element in
accumulation
.map { acc in element
.map { ele in
acc + ele
}
}
.joined()
}
}
.joined()
}
}
The implementation is different because now the internal type is an array, and we need to take into account its particular structure: but on the base type, Effect<Value>, we’re still only calling map, pure and joined.
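A usage sketch, assuming a hypothetical fetchLines function that produces one effect per id:

/// hypothetical: an effect that produces the log lines for an id
func fetchLines(for id: Int) -> Effect<Array<String>> {
    return Effect { ["line for \(id)"] }
}

let batch = Effect<Array<Effect<Array<String>>>> {
    [1, 2, 3].map { fetchLines(for: $0) }
}
let all: Effect<Array<String>> = batch.joinedT()
/// executing `all` runs the three effects and concatenates their results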
What about pure? Does it compose? Yes, and trivially:
extension Effect {
static func pureT <A> (_ value: A) -> Effect<Array<A>> where Value == Array<Effect<Array<A>>> {
return .pure(.pure(value))
}
}
We just call pure within pure. We can actually do the same with map, producing mapT, also trivially:
extension Effect {
func mapT <A,B> (_ transform: @escaping (A) -> B) -> Effect<Optional<B>> where Value == Optional<A> {
return map { $0.map(transform) }
}
}
I could go on indefinitely, but I think I’ve made my point: it’s true that monads don’t strictly compose, but we can operate on different nested monads like we operated on simple monads, with functions that work exactly like the simple ones. As we saw, the only requirement at the implementation level is that the base type has the pure, map and joined methods. This sounds like a protocol extension to me!
If we had a Monad protocol, that had an associatedtype representing the monadic parameter (let’s call it ParameterType), and declared the pure, map and joined methods, we could extend the protocol with a constraint on the ParameterType to be some other type, and we would take into consideration the particular structure of the latter, considering that we’ll simply need to call pure, map and joined on the base monad.
Unfortunately, we cannot do this in Swift, and this is the heart of the matter.
Can we go higher?
As an exercise, try and write a Monad protocol: you can easily add the pure function, but when you get to joined, something strange happens. The return type of joined must somehow be Self, but it’s a little different from the actual Self, because it should have a different associated type:
protocol Monad {
associatedtype ParameterType
static func pure(_ value: ParameterType) -> Self
/// this won't compile: Self<A> is a meaningless term
func joined <A> () -> Self<A> where ParameterType == Self<A>
}
We would like to express the idea that the return type must be of the same concrete type as the instance we call joined on, but with a different generic parameter (or, in other words, the associatedtype should be different). But we can’t do this!
The kind of a type represents the shape of its constructor; for example, a normal generic type like Optional<Wrapped> has a kind represented by the expression * -> *: this means that we can construct an optional type by providing some concrete type (the input *) and thus forming another concrete type (the output *). In the case of Optional<Wrapped>, if we provided Int as the input type we would get Optional<Int>, a concrete type, that’s not generic anymore, as the output type. So, a particular generic type like Optional<Wrapped> has kind * -> *. In general, a protocol with an associatedtype represents a type of kind * -> *, because it takes some type (the one associated to it) and produces a concrete type represented by Self. What’s the kind of a Monad? It’s actually * -> * -> *, which means that there’s another input type involved in the process of defining the final concrete type: this is the base type, which could be any type with a generic parameter. That’s what HKT means: the kind * -> * -> * is higher than * -> *. I guess the case might be clearer if we considered a possible example, in pseudo-Swift:
/// pseudo-Swift
protocol Monad {
basetype Self<_>
associatedtype ParameterType
static func pure(_ value: ParameterType) -> Self<ParameterType>
func joined <A> () -> Self<A> where ParameterType == Self<A>
}
The declaration basetype Self<_> means that the type has a generic parameter, and we want to abstract over the Self<_> type constructor, by directly referring to it. A possible extension for Optional<Wrapped> would be like the following:
/// pseudo-Swift
extension Optional: Monad {
typealias Self<_> = Optional<_>
typealias ParameterType = Wrapped
static func pure(_ value: Wrapped) -> Optional<Wrapped> {
/// some code
}
func joined <A> () -> Optional<A> where Wrapped == Optional<A> {
/// some code
}
}
This is just an example – I’m not proposing this particular syntax – but I think it shows what it would practically mean to represent a type constructor of kind * -> * -> *.
It’s not possible to represent this kind of abstraction in Swift, and it’s a shame: as we saw with the nested monads example, we could leverage the semantic power given by combining monads without having to specify the base type, by simply extending the Monad protocol with different implementations of pureT and joinedT based on the concrete associated type, like for example in the following (the last pseudo-Swift bit for now):
/// pseudo-Swift
extension Monad {
static func pureT <A> (_ value: A) -> Self<Optional<A>> where ParameterType == Optional<Self<Optional<A>>> {
return .pure(.pure(value))
}
func joinedT <A> () -> Self<Optional<A>> where ParameterType == Optional<Self<Optional<A>>> {
return self
.map { value -> Self<Optional<A>> in
switch value {
case let .some(innerValue):
return innerValue
case .none:
return .pure(nil)
}
}
.joined()
}
}
This code would be valid for any monad with a nested optional.
I’d like to give another practical example to show the power of combining monads, by introducing another couple of useful ones (minimal sketches of both follow their descriptions):
Future<Value>: it’s used to model the effect related to producing a value at some point in the future or, in other words, manipulating a value that’s not yet available; where Effect<Value> is executed with a simple execute method that returns a value immediately, the Future<Value> effect is executed by passing a handler function of type (Value) -> (), that will be called once the value is available: in this sense, Future<Value> models the concept of continuation, or asynchronous calls in general.
Result<Failure,Value>: it’s used to model a computation that can fail, with some additional information attached to the failure, represented by the Failure type; it works similarly to Optional<Wrapped>, but we get an instance of Failure instead of just nil; Swift natively represents this with the concept of throwing function, but Result<Failure,Value> is much more powerful for a number of reasons:
- it’s a nominal type, so it can have methods and produce valid instances;
- it carries information about the type of the error produced, while still capable of being made adequately ergonomic with an AnyError container, the possibility to map the Failure part, and many compositional utilities, for example when dealing with many failable operations at the same time;
- it’s intrinsically more composable: for example, it can be passed as the input to a function, with no friction.
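Neither of these two types has appeared in code so far, so here’s a minimal sketch of both, just to fix their shapes (real implementations would carry much more machinery):

struct Future<Value> {
    private let run: ((Value) -> ()) -> ()
    init(_ run: @escaping ((Value) -> ()) -> ()) {
        self.run = run
    }
    /// executed by passing a handler, called when the value is available
    func execute(_ handler: @escaping (Value) -> ()) {
        run(handler)
    }
}

enum Result<Failure, Value> {
    case failure(Failure)
    case success(Value)
}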
Thus, Future<Value> is a natural fit for asynchronous calls, like HTTP requests, which can usually fail to produce a valid Value, providing some error instead: this seems the perfect case for nesting a monad into another, because we can represent the possibility of failure with the Result<Failure,Value> monad, finally producing a Future<Result<Failure,Value>> type. For this type, everything we said about pureT and joinedT is still valid, and this case could be even more relevant, because it’s frequent that we need to chain together multiple asynchronous calls, where the latter depends on the result of the former.
Again, I’m not interested in talking about the single monads: they’re useful per se, and we don’t need the notion of HKT to use them; but their composition is exponentially more powerful, and without HKTs the only option is to generate the pureT and joinedT functions for every possible combination of monads; and if we add a new monad to our code base at a certain point in time, we need to generate a lot of new code just for that, a solution that obviously doesn’t scale. I could produce more examples, but there are many other types of effects that can be worked with within a monadic context, some probably not yet discovered: without a sense of HKTs in the language these entities remain isolated, and cannot be composed without tedious repetition of code; loads of code generation can help, but it’s usually not the solution to the problem.
A deep nest
The problem is not just horizontal (more monad combinations) but also vertical (deeper nesting). For example, what if our asynchronous call produced a list of values, and from each value we want to make another asynchronous call that produces a list of values, finally bundling all the calls together into a single list? We start from a type like Future<Result<Failure,Array<StartingValue>>> and we’ve got a function of type (StartingValue) -> Future<Result<Failure,Array<EndingValue>>>, where there’s one more level of nesting. We’d like to go from some monster like Future<Result<Failure,Array<Future<Result<Failure,Array<Value>>>>>> to the joined version Future<Result<Failure,Array<Value>>>, and it turns out that if we can define pureT and joinedT, we’ll also be able to define pureTT and joinedTT. The former is simple:
extension Future {
static func pureTT <E, A> (_ value: A) -> Future<Result<E, Array<A>>> where Value == Result<E, Array<A>> {
return .pureT(.pure(value))
}
}
Notice that on the base type we only call pureT, which means that we only need the base type to be a monad to make this work, because we were able to extend our hypothetical Monad protocol with the pureT function. joinedTT is where the true magic happens: if you recall joinedT on Effect<Array<Value>>, its implementation didn’t care about the fact that the base type was an Effect, and only used pure, map and joined on it, while taking into account the particular structure of Array, that is, the internal type; now, in the case of Future<Result<E, Array<Value>>>, again the most internal type is an Array, and again we were able to define pureT, mapT and joinedT on Future<Result<E, Value>>, so in theory this implementation of joinedTT should be like joinedT for Effect<Array<Value>>, where we simply swap pure, map and joined with pureT, mapT and joinedT. Impressively, that’s exactly the case:
extension Future {
func joinedTT <E, A> () -> Future<Result<E, Array<A>>> where Value == Result<E, Array<Future<Result<E, Array<A>>>>> {
return self
.mapT { value -> Future<Result<E, Array<A>>> in
value.reduce(.pureT([])) { accumulation, element in
accumulation
.mapT { acc in
element.mapT { ele in
acc + ele
}
}
.joinedT()
}
}
.joinedT()
}
}
The types change, but the shape of the whole thing is the same. That would also work for joinedTTT, joinedTTTT and so on: the point is, at any level of nesting for a type like Base<Nested1<Nested2<Nested3<...>>>>, to make the _T functions work we require that Base is a monad, whatever monad it is, and we need to know the internal structure of Nested1, Nested2 and so on, because we need to be able to write joinedT, joinedTT and so on. Notice that we’re not abstracting over the types in the middle, but we should, otherwise we’d still need to generate a lot of code, even with HKTs. Now, there is actually a way to do that, which would allow us to abstract over even the most internal type: it would require being able to define a type as Traversable, something that again requires HKTs. But I don’t want to add too many eggs to this basket: I think I’ve already made a clear case for the usefulness of higher-kinded typing when using monads.
Conclusion
The goal of this article was twofold: introducing monads from a slightly different perspective than usual, and showing a case for the addition of HKTs to Swift (or any language with generics, really). Unfortunately, this endeavor requires a lot of words and code to be even vaguely worthy of consideration, and this is a big issue with many functional programming concepts: once you live with them, they become the norm and actually shape your mind in approaching problems in a certain way, but when you’ve just started to (or you want to be convinced to) consider them, there seems to be no real “perfect” strategy, because:
- the inductive approach, based on practical examples, doesn’t work because examples, even if well structured and abundant, are never really enough, due to their very nature: there are always alternatives, often considered equivalent even if they’re not necessarily equally valid, and specific cases are often too specific, and hard to expand into more generic ones that may include each particular problem;
- the deductive approach, based on theory, is often too obscure and alienating for most programmers because, at least as far as I was able to observe in my experience, the programming field is usually not very fond of the theoretical way in general, and of approaches based on analyzing problems purely in the abstract, only considering generic models.
I guess it’s a truism that the right path is in the middle: honestly I’m not sure about that, but what I’m pretty sure about is that when approaching functional programming for the first time, there’s a certain leap of faith involved, a trust that things will eventually make sense.
But there’s one thing I can say for sure about the whole functional programming approach, and what you would eventually find there: functional programming is critically concerned with being as precise and correct as possible. Failing (crashing) is not considered a viable option in almost all cases, and the few ones where an assertion is inevitable are treated as exceptional cases (it sounds like “exception”, in fact) that must be very, very carefully controlled and isolated, and controlling effects, for example through monads, plays a huge role there. I’m not saying that functional programming is the only approach that’s concerned with precision and correctness, but it certainly is one of the most. I hope I was able to shed some light on why, once upon a time, someone even considered applying mathematical concepts like monads to programming, and on why these structures are useful in solving practical programming problems: now it’s up to the reader to consider taking this leap.
HKTs are a major feature of a language, and even if the Apple core Swift team has shown a tepid interest in adding them to Swift, this is not going to happen any time soon (if ever), both because of the complexity of the feature, and because there’s simply a lot of much more important stuff to be done first. Are we left dead in the water, then? Well, mostly, but not completely: code generation can help, for example, when composing monads, but it doesn’t really scale (I had to stop at _TT in FunctionalKit because the code size and the compilation time would have been unbearable with 3 levels of nesting) and it must be redone every time a new monadic type is added. But there’s a different option, that is, emulating HKTs.
With a few tricks, we are actually able to write a container type parametrized over a value and a type constructor (actually, its tag). We can then define our monad methods in terms of this container, from which we can extract the original “containers” (like Array, Optional, Future) with the correct parameter applied. This technique is based on an article by Jeremy Yallop and Leo White (“Lightweight higher-kinded polymorphism”), which is an interesting read, but I think that exploring the many implementations of it, discovered by just googling “simulating higher kinded types”, can be enough to get a general idea.
Instead of explaining the whole thing again in Swift, I’ll refer to a couple of resources:
The main idea here is to wrap types into a container for which we can actually define the various map, joined et cetera in an abstract way via a proper protocol, and then extract the base types with a forced unwrap that we’re sure is going to work thanks to the constraints posed on the types. The idea seems solid, but I’ve yet to see Swift production code that uses HKTs defined that way, and I haven’t explored the concept enough myself to have formed a complete picture of its feasibility in my day-to-day work: still, it looks definitely promising.
In the next article I’m going to talk about Profunctor Optics, something that heavily needs HKTs to be defined – code generation is not going to help there – and I intend to explore HKT emulation for that particular application.
Until next time.