Monads (forget about bind)
Monads are a simple concept, but is made unreasonably hard to learn partly by tutorials featuring weird analogies and partly by Haskell-specific cruft. In this post, I promise not to use any analogies, and we'll get rid of all the cruft by inserting this directive in the top of our Haskell file:
{-# LANGUAGE NoImplicitPrelude, RebindableSyntax #-}
There, we got rid of the standard library and are left with a blank slate. We'll need the RebindableSyntax later on for using do notation with our own monads. For example purposes, we'll keep a few things from the prelude:
import Prelude (Maybe(Just, Nothing),
String, Int, fromInteger, (+), (++))
We can now forget about applicative functors and other funny sounding concepts, and focus on defining monads as they would look given none of the cruft. Here it is:
class Monad m where
fmap :: (a -> b) -> (m a -> m b)
join :: m (m a) -> m a
return :: a -> m a
The fmap
function is a generalized version of List.map
that works for all monad instances. The return
function simply wraps a plain value into an instance of the monad. The join
function peels off a layer of the monad instance, and this is usually where most of the monad logic is implemented.
You'll notice there's no >>=
operator. That's because it's not central to monads, but rather just a combination of join
and fmap
. In fact, here's the definition:
m >>= f = join (fmap f m)
Notice that it's not part of the Monad
type class. It's implemented as a stand-alone function for all monads at once, right there. You'll never have to implement it again. The do
syntax is syntactic sugar for using this operator, eg:
do
a <- x
b <- y
return (a + b)
Is translated into:
x >>= \a ->
y >>= \b ->
return (a + b)
It would be strange to try to understand >>=
without first understanding join
and fmap
. Yet, that's how monads are usually introduced! But I digress, let's look at an example.
The Maybe instance of monads
Let's implement the Monad Maybe
instance:
instance Monad Maybe where
return
takes a plain value wraps it:
return a = Just a
join
takes a Maybe (Maybe a)
and peels off a layer to get you Maybe a
:
join Nothing = Nothing
join (Just a) = a
fmap
applies a function to the contained value (if any):
fmap f Nothing = Nothing
fmap f (Just a) = Just (f a)
This instance is often used to process multiple values that may be Nothing, and returning Nothingif any of the values are Nothing. As a very simple example, here's a function that adds two numbers if they're not Nothing:
addMaybe :: Maybe Int -> Maybe Int -> Maybe Int
addMaybe maybe1 maybe2 = do
a <- maybe1
b <- maybe2
return (a + b)
Eg. addMaybe Nothing (Just 42)
is Nothing
because one of the arguments is Nothing, whereas addMaybe (Just 1) (Just 2)
is Just 3
because both arguments are Just.
The List instance of monads
With this definition of monads, implementing the list monad instance is also quite simple:
instance Monad [] where
return a = [a]
join [] = []
join (x:xs) = x ++ join xs
fmap f [] = []
fmap f (x:xs) = f x : fmap f xs
The list instance is often used to generate the Cartesian product and then apply a function to it. Here's an example:
do
x <- [10, 20, 30]
y <- [1, 2, 3]
return (x + y)
It computes [11, 12, 13, 21, 22, 23, 31, 32, 33]
.
The IO instance of monads
Purely functional programming languages like Haskell have a problem. How can one write to the file system and otherwise interact with the world, if there are no side effects?
You might come up with the following solution: Simply have the main
function take in the world, do stuff to the world like writing to files, and then return the modified world! If it worked that way, you could copy a file like this:
main :: World -> World
main world1 =
let (text, world2) = readFromFile world1 "original.txt" in
let world3 = writeToFile world2 "copy.txt" text in
world3
That is, every time we want to perform I/O, we take in the old world and return a new world, which can in turn be used in the next call. It's a bit verbose, but that's not the real problem. The real problem is, what happens if you use world1
more than once? The first time you use world1
and get a new world world2
back, world1
is no longer valid. In order to prevent people from using world1
twice, we can hide functions like readFromFile :: World -> (String, World)
away inside a data type. Thus, we create a data type that represents all such functions and call it IO
:
data IO a = IO (World -> (a, World))
In order to avoid accidentally reusing spent world, we need to stop manually passing around World
values. Fortunately, monads can help us combine IO operations without doing that:
instance Monad IO where
return a = IO (\world -> (a, world))
join (IO f1) = IO (\world1 ->
let (IO f2, world2) = f1 world1 in
f2 world2
)
fmap f2 (IO f1) = IO (\world1 ->
let (a, world2) = f1 world1 in
(f2 a, world2)
)
In return
, we simply return the plain value paired with the unmodified world. In join
, we return a function that performs the outer layer of the IO (IO a)
effect first, and then perform the effect of the inner layer. In fmap
we just apply the supplied function to the result of performing the effect.
Finally, we only give access to construct IO
s given a function World -> (a, World)
to a select few operations (readFromFile
, writeToFile
, etc.) in the standard library, and then we carefully and manually verify that those functions don't ever attempt to reuse a spent world. Together, these operations define what kind of I/O you can do in the language. Every other I/O library builds on top of this. The main
function is thus:
main :: IO ()
And we can use the do
notation to make our I/O look like an imperative language:
main = do
text <- readFromFile "original.txt"
writeToFile "copy.txt" text
Quite a bit more readable too!
Back to the real world
We threw away the standard library at the beginning, but of course you'll want to use it in real programs. In the standard library, since bind (>>=
) is used so much in Haskell, the Monad type class in the standard library requires you to implement >>=
directly. If you prefer implementing join instead, you can just define your own join'
function, and then use m >>= f = join' (fmap f m)
as the definition of bind.
The fmap
is useful for things beyond monads, so it's actually part of another type class called Functor
, which is a superclass of Monad
. Monad
recently gained another superclass called Applicative
. Once you have implemented the Monad
type class with >>=
and return
, you can simply use the generic definitions for Functor
and Applicative
found in this list monad example:
instance Monad [] where
return a = [a]
[] >>= f = []
(x:xs) >>= f = f x ++ (xs >>= f)
-- The instances below work for any monad (just replace [] with your own type).
instance Functor [] where
fmap = liftM
instance Applicative [] where
pure = return
(<*>) = ap
The Monad
type class from the standard library also has a fail
method, which you can use to change what happens when a pattern on the left of a <-
fails in do
notation. This doesn't really have anything to do with monads, and you should usually not provide your own implementation of fail
.
When implementing your own monads, make sure they adhere to the Monad laws.
Eventually, you'll run into monad transformers. If this article gathers enough interest, I'm planning to do a follow up on those.
Thank you Michael & Ramon for your input on this article. To everybody else, if you have comments, catch me on Twitter! I'm @Continuational