Applying Mathematics to Programming

Haskell takes a lot of inspiration from mathematics. In mathematics, ideas like equality, addition, concatenation, etc. have been studied in detail, and so it is often useful to apply the insights mathematicians have learned when programming such functions.

Equality

You can write functions for testing the equality of two values in any programming language. For example, C++ lets you define your own equality operator on some (user-defined) type T like this:

template<class T>
bool operator==(T x, T y);

Here, T is any C++ type, and for user-defined types the programmer must write their own ==. For example:

struct Color {
    uint32_t r, g, b;
};

bool operator==(Color x, Color y) {
    return x.r == y.r && x.g == y.g && x.b == y.b;
}

C++ puts no restrictions on what operator== must do other than requiring that it take Color values as input and return a bool as output. A programmer is free to write a nonsensical equality function like this:

bool operator==(Color x, Color y) {
    return x.r + y.b + x.g == 104;    // Huh?
}

In mathematics, equality functions are, at least, expected to satisfy the rules for an equivalence relation. So in C++, a == function is considered mathematically sensible if the following rules all hold (x, y, and z are assumed to be values of the same type):

  1. Reflexivity: For all x, x == x.
  2. Symmetry: For all x and y, if x == y then y == x.
  3. Transitivity: For all x, y, and z, if x == y and y == z, then x == z.

The nonsensical == function above doesn’t satisfy all these rules. In particular, it is not symmetric, e.g. (100, 3, 5) == (10, 1, 1) is true (100 + 1 + 3 is 104), but (10, 1, 1) == (100, 3, 5) is false (10 + 5 + 1 is 16).

In many cases, especially simple ones, it is obvious that the equivalence relation rules hold for an equality function, and so we often don’t explicitly check them. However, if you ever implement an unusual or tricky equality function, it is worth checking that these rules hold!
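In Haskell, such a spot-check can be written directly. Here is a sketch (this Color type and the helper names are made up for illustration) that defines a hand-written Eq instance and then tests the three rules on a few sample values; checking a handful of values is not a proof, but it catches many mistakes:

```haskell
-- A hypothetical Haskell version of Color with a hand-written Eq instance.
data Color = Color Int Int Int

instance Eq Color where
    Color r1 g1 b1 == Color r2 g2 b2 =
        r1 == r2 && g1 == g2 && b1 == b2

-- Spot-checks for the three equivalence relation rules.
reflexive :: Color -> Bool
reflexive x = x == x

symmetric :: Color -> Color -> Bool
symmetric x y = (x == y) == (y == x)

-- "if x == y and y == z, then x == z", written as an implication
transitive :: Color -> Color -> Color -> Bool
transitive x y z = not (x == y && y == z) || x == z

main :: IO ()
main = print (and [reflexive c, symmetric c d, transitive c c d])
  where
    c = Color 100 5 3
    d = Color 10 1 20
```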

Other basic relations, such as < and <=, also have standard mathematical rules defining exactly what makes them sensible.

In addition to being an equivalence relation, there are other things we usually require of a good == function:

  • After x == y is called, neither x nor y has been modified. You can enforce this in C++ by passing x and y by value.

  • == should have no side-effects, i.e. nothing outside of == should be modified. For example, this could be bad:

    int a = 0;  // global variable
    
    bool operator==(Color x, Color y) {
        ++a;
        return x.r == y.r && x.g == y.g && x.b == y.b;
    }
    

    There could be cases where this is okay, though. For instance, if a counts how many times == is called, it provides useful profiling information.

    Haskell doesn’t allow regular functions to have side-effects, and so this is not a problem for such functions. Side-effects are only permitted in special situations, e.g. inside a do environment.

  • x == y should always return the same value every time we call it. For instance, this would not be a good ==:

    bool operator==(Color x, Color y) {
        return (rand_bit() == 0);  // rand_bit() randomly returns 0 or 1
    }
    

    Again, the lack of side-effects in Haskell solves this problem. A function like rand_bit() cannot be written as a regular Haskell function.

  • If x == y, then for any function f that can take a value of x’s type as input, f(x) == f(y). Intuitively, this means that if x and y are the same, then calling f on either of them should give the same answer. This is perhaps so obvious that you might not think it even needs to be mentioned, but nothing about defining == forces this fact to be true.

  • Both x and y have the same type T, i.e. they are from the same set of values. C++ sometimes lets you compare different types, e.g. 3 == 3.0 evaluates to true, but 3 is an int and 3.0 is a double. C++ is not directly comparing 3 and 3.0; instead, it automatically converts the 3 to 3.0, and then evaluates the expression 3.0 == 3.0.

    Note that the definition of an equivalence relation implies that the values being compared are from the same set, which implies they ought to be the same type.

  • x == y doesn’t ever get stuck in an infinite loop. In other words, x == y either returns true or false, or perhaps throws an error. You could argue that eventually returning a value is a requirement for any correct function, but it is worth pointing out because infinite loops are not a problem for ordinary mathematical functions.

  • And last, but not least, == should usually be implemented efficiently in both time and memory. In mathematics, time/memory is not usually a concern for function evaluation, but it is a major concern in programming.
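The congruence point above (f(x) == f(y) whenever x == y) is perhaps the easiest rule to break by accident. As a sketch, consider fractions stored unreduced, with equality defined by cross-multiplication; the Frac type and the names here are hypothetical:

```haskell
-- Hypothetical unreduced fractions: Frac 1 2 and Frac 2 4 both mean 1/2.
data Frac = Frac Int Int deriving Show

-- Equality by cross-multiplication: a/b == c/d iff a*d == c*b.
-- This is a perfectly good equivalence relation (for nonzero denominators).
instance Eq Frac where
    Frac a b == Frac c d = a * d == c * b

-- But numerator "leaks" the representation: it can tell equal values apart.
numerator :: Frac -> Int
numerator (Frac a _) = a

main :: IO ()
main = do
    print (Frac 1 2 == Frac 2 4)                          -- True
    print (numerator (Frac 1 2) == numerator (Frac 2 4))  -- False: congruence fails
```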

This is a lot to take into account when writing a == function! The complexity of writing a function like == is one of the reasons some people give for not allowing programmers to create their own operators: it is too easy to get them wrong in subtle ways and cause difficult-to-fix bugs.

By the way, the default == for double in C++ is not an equivalence relation. It is not reflexive, i.e. there are some double values that don’t equal themselves. For instance:

double x = 0.0 / 0.0;
if (x == x) {
    cout << "x equals x\n";
} else {
    cout << "x does not equal x (!)\n";
}

This compiles, runs, and prints “x does not equal x (!)”, because 0.0 / 0.0 evaluates to NaN (“not a number”), and NaN never equals anything, including itself. In practice, this might not be a huge problem because only a few unusual values of double cause such problems.
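Haskell’s Double has exactly the same quirk, since it follows the same IEEE 754 floating-point rules: 0 / 0 is NaN, and NaN is not equal to itself:

```haskell
main :: IO ()
main = do
    let x = 0 / 0 :: Double   -- NaN ("not a number")
    print (x == x)            -- False: Eq for Double is not reflexive
    print (x /= x)            -- True
```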

A Python Monoid

One of the many interesting things about Python is that it lets you use + and * to create strings. For example:

>>> "cat" + "dog"
'catdog'
>>> "dog" + "cat"
'dogcat'
>>> 3 * 'dog'
'dogdogdog'
>>> 'cat' * 3
'catcatcat'
>>> 3 * ('cat' + 'dog')
'catdogcatdogcatdog'
>>> '' + 'cat'
'cat'
>>> 'cat' + ''
'cat'

When used with strings, + is the concatenation operator. Notice that, in general, if s and t are strings, then s + t is not equal to t + s. Also, if s is a string, then n * s and s * n are only legal if n is an integer, e.g. an expression like 'a' * 'n' is an error.

It turns out that + for string concatenation satisfies the rules of an algebraic structure called a monoid. Namely:

  1. Associativity: For any strings s, t, and u, (s + t) + u = s + (t + u).
  2. Identity element: For any string s, s + '' = '' + s = s, where '' is the empty string.

A consequence of these rules is that an expression like 3 * 'dog' is unambiguous: it equals 'dog' + 'dog' + 'dog', and by associativity it doesn’t matter how the additions are grouped.

Many sets of values satisfy the rules for monoids. Haskell has the following pre-defined Monoid class:

class Monoid m where
  mempty :: m
  mappend :: m -> m -> m
  mconcat :: [m] -> m
  mconcat = foldr mappend mempty  -- default implementation of mconcat

For example, the list monoid in Haskell is defined like this:

instance Monoid [a] where
  mempty = []
  mappend x y = x ++ y

Recall that strings in Haskell are just lists of characters, so this works for strings.

Notice that Haskell leaves it entirely up to the programmer to ensure that the two monoid rules hold. Haskell does not check this for you (how could it?).
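As a quick comparison with the Python examples earlier, here is a sketch using the Prelude’s monoid functions on strings; the values in the comments follow from the list instance above:

```haskell
main :: IO ()
main = do
    print (mappend "cat" "dog")            -- "catdog"
    print (mconcat ["cat", "dog", "cat"])  -- "catdogcat"
    print (mappend "cat" mempty)           -- "cat": the identity rule
    print (mconcat (replicate 3 "dog"))    -- "dogdogdog", like Python's 3 * 'dog'
```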

Review: The List map Function

Recall that the expression map f lst applies f to every element of lst. It’s a standard Haskell function, and can be used like this:

> map (+1) [1,2,3]
[2,3,4]

> map (>1) [1,2,3]
[False,True,True]

> map (== 'p') "apple"
[False,True,True,False,False]

Its type signature is this:

map :: (a -> b) -> [a] -> [b]

Take a moment to make sure you really understand this type signature. The first input is a function of type a -> b, i.e. a function that converts a value of type a to a value of type b. The second input is a list of a values. Applying the function to each element produces values of type b, and so the output of map is a list of b values.

Sometimes it is useful to pass in only the first argument to map. For example:

> twice :: Int -> Int
> twice n = 2 * n
>
> doubler = map twice  -- doubler :: [Int] -> [Int]

doubler is an example of a lifted function. map takes twice as input, and returns a function that applies twice to every element of a list. We can say that map lifts twice into the world of lists.
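For example, assuming the twice function above, lifted functions can be used like this (shouter is a made-up name for toUpper lifted to strings):

```haskell
import Data.Char (toUpper)

twice :: Int -> Int
twice n = 2 * n

-- Lifted functions: map applied to only its first argument.
doubler :: [Int] -> [Int]
doubler = map twice

shouter :: String -> String   -- toUpper :: Char -> Char, lifted to strings
shouter = map toUpper

main :: IO ()
main = do
    print (doubler [1, 2, 3])  -- [2,4,6]
    print (shouter "hello")    -- "HELLO"
```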

Generalizing map

The essential idea of map f lst is to apply a function f to every element of lst, modifying the individual elements while keeping the list structure. In contrast, foldl and foldr don’t promise to return a list, e.g. a fold could reduce a list to a number, a boolean, or any other type of value.

What would a map function on a tree structure do? Presumably it would apply a function f to every element in the tree, preserving the structure of the tree.

Even more generally, suppose you have a container of values structured in some way. If you map a function g onto this container, then we would expect that every value in it would have g applied to it, and the structure of the container wouldn’t change.

This is the general notion of mapping that a functor encapsulates. In Haskell, a functor can be represented by a type class like this:

class Functor f where               -- Functor is in the standard Prelude
    fmap :: (a -> b) -> f a -> f b
    (<$) :: a        -> f b -> f a  -- replace every value with a constant
    (<$) = fmap . const             -- default implementation

Haskell defines the operator <$> as an infix synonym for fmap, meaning instead of fmap f x you could write f <$> x.
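For instance, using the Prelude’s list and Maybe functor instances (both discussed below), fmap and <$> are interchangeable:

```haskell
main :: IO ()
main = do
    print (fmap (*2) [1, 2, 3])     -- [2,4,6]: for lists, fmap is just map
    print ((*2) <$> [1, 2, 3])      -- [2,4,6]: same thing, infix style
    print (reverse <$> Just "cat")  -- Just "tac": Maybe is also a functor
```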

In Functor f, the f is a type constructor that takes one input, e.g. f could be the Maybe type constructor. The function fmap is the generalized mapping function. Notice that the first input to fmap is a function of type a -> b. But the second input is of type f a, not [a] as with a regular list-based map. f a is a generalization of [a]. For lists, f would in fact be equal to [], but that’s just one instance of a functor:

instance Functor [] where  -- Functor instance for built-in lists
    fmap = map

Consider this user-defined list data type:

> data List a = Nil | Cons a (List a)
>   deriving Show

We could define a functor instance for it as follows:

> instance Functor List where
>     fmap g Nil               = Nil
>     fmap g (Cons first rest) = Cons (g first) (fmap g rest)

Now we can write code like this:

> fmap (+1) (Cons 1 (Cons 2 (Cons 3 Nil)))
Cons 2 (Cons 3 (Cons 4 Nil))

Trees are another example of a type where a functor makes sense:

> -- binary search tree (BST)
> data BST a = EmptyBST | Node a (BST a) (BST a)
>      deriving (Show, Read, Eq)
>
> instance Functor BST where
>     fmap g EmptyBST            = EmptyBST
>     fmap g (Node x left right) = (Node (g x)
>                                        (fmap g left)
>                                        (fmap g right))

Now we can write code that uses fmap to apply a function to every value in a BST:

> fmap (+1) (Node 1 (Node 2 EmptyBST EmptyBST) (Node 3 EmptyBST EmptyBST))
Node 2 (Node 3 EmptyBST EmptyBST) (Node 4 EmptyBST EmptyBST)

A different sort of example of where a functor can be used is the Maybe type:

data Maybe a = Nothing | Just a   -- defined in the prelude

A Maybe value can be thought of as a container that holds either Nothing, or a value of type a. So we could map over it like this:

instance Functor Maybe where  -- already defined in the prelude
    fmap f (Just x) = Just (f x)
    fmap f Nothing = Nothing

This says that mapping over Nothing gives you Nothing, and mapping over Just x gives you Just (f x).

For example:

> fmap (+1) (Just 2)
Just 3

> fmap (+1) Nothing
Nothing

Rules for Functors

In addition to satisfying the fmap signature, a functor is usually expected to obey these two rules:

fmap id       ==  id               -- identity
fmap (f . g)  ==  fmap f . fmap g  -- composition

A functor that doesn’t follow these two rules can end up doing some unexpected things. Note that Haskell doesn’t automatically check that these two rules are satisfied — it’s up to the programmer to make sure they are followed.
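Haskell won’t prove the rules for us, but we can at least spot-check them on sample values; here both checks use Prelude instances:

```haskell
-- Spot-checks of the two functor rules on sample values (not a proof!).
main :: IO ()
main = do
    -- identity rule, for the list functor
    print (fmap id [1, 2, 3] == id [1, 2, 3])
    -- composition rule, for the Maybe functor
    print (fmap ((+1) . (*2)) (Just 5) == (fmap (+1) . fmap (*2)) (Just 5))
```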

Applicative Functors

An applicative functor can be seen as a further generalization of a functor:

class Functor f => Applicative f where
  pure  :: a -> f a
  infixl 4 <*>, *>, <*
  (<*>) :: f (a -> b) -> f a -> f b

  (*>) :: f a -> f b -> f b
  a1 *> a2 = (id <$ a1) <*> a2

  (<*) :: f a -> f b -> f a
  (<*) = liftA2 const

The two key functions are pure and <*>; the *> and <* functions have default implementations written in terms of the other functions.

We give this here only to mention its existence as an abstraction that comes between a functor and a monad.
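Still, a small sketch with the standard Maybe instance gives the flavour: pure wraps a value, and <*> applies a wrapped function to a wrapped argument:

```haskell
main :: IO ()
main = do
    print (pure (+1) <*> Just 2)            -- Just 3
    print (pure (+) <*> Just 2 <*> Just 3)  -- Just 5
    print (Just (+1) <*> Nothing)           -- Nothing
```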

Monads

Monads get special attention in Haskell because Haskell uses them to handle input/output actions. Here is the basic interface:

class Applicative m => Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b   -- bind function

  (>>)   :: m a -> m b -> m b
  m >> n = m >>= \_ -> n

  fail   :: String -> m a

For example, here is the definition of the Maybe monad:

instance Monad Maybe where
  return = Just
  (Just x) >>= g = g x
  Nothing  >>= _ = Nothing

These are provided just as examples without explanation — we don’t have the time to go into any details.
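Even so, a taste of how >>= is used: here is a sketch with a hypothetical safeDiv function, where chaining with >>= short-circuits on Nothing:

```haskell
-- Hypothetical example: division that fails gracefully on zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

main :: IO ()
main = do
    print (Just 10 >>= \x -> safeDiv x 2)  -- Just 5
    print (Just 10 >>= \x -> safeDiv x 0)  -- Nothing
    print (Nothing >>= \x -> safeDiv x 2)  -- Nothing: bind short-circuits
```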

Monads turn out to be a useful way to deal with impure functions. In Haskell, a function is considered impure if it doesn’t always return the same value when given the same input. For example, suppose you have a function called read() that reads the next byte from a file. Every time you call it you usually get a different byte, so it is an impure function. Haskell only allows impure functions inside a monad, which means that functions outside of a monad are always pure.

do Notation

In practice, it is necessary to have some understanding of monads in Haskell because they underlie Haskell’s approach to I/O. Any time in Haskell you read/write a file, print to the screen, generate a random number, and so on, you need to use a monad.

Consider this example, which prints Hello on the screen:

> putStrLn "Hello"
Hello

What would be a sensible return type for putStrLn? In C++ or Java it would be void. Here is what Haskell does:

> :type putStrLn
putStrLn :: String -> IO ()

The return type IO () is the type of an I/O action. In Haskell, actions are things with side-effects. For example, reading from the keyboard and writing to a file are side-effects, and so functions that do them return an action type.

Actions are handled inside monads, and typically Haskell programmers use the do notation to simplify their use. For example:

main = do
    putStrLn "Hello, what's your name?"
    name <- getLine
    putStrLn ("Hello " ++ name ++ ", how is it going?")

Since putStrLn returns IO (), we need to put it in a do environment if we want to sequence more than one action in a row.
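In fact, do notation is just syntactic sugar for the monad operators >>= and >>. The program above could equivalently be written with those operators directly:

```haskell
main :: IO ()
main =
    putStrLn "Hello, what's your name?" >>
    (getLine >>= \name ->
        putStrLn ("Hello " ++ name ++ ", how is it going?"))
```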

The line name <- getLine gets a string from the console and assigns it to the variable name. The type of getLine is important:

> :type getLine
getLine :: IO String

In other words, getLine returns an IO String action. Inside a do environment you can, in effect, use getLine like an ordinary function that returns a string. But outside a do environment, you can’t get at the string getLine produces: you can only get the IO String, i.e. a container that holds the string.

The <- notation is special syntax that can only be used inside a do environment; it runs an action and binds the action’s result to a name.

Here’s another example of a do environment that shows you can use regular Haskell features, such as let, within it:

import Data.Char

main = do
    putStrLn "What's your first name?"
    fname <- getLine
    putStrLn "What's your last name?"
    lname <- getLine
    let firstName = map toUpper fname
        lastName = map toUpper lname
    putStrLn $ "hey " ++ firstName ++ " " ++ lastName ++ ", how are you?"

It is important to realize that getLine does not return a string, but instead returns an IO String. However, inside a do environment, <- lets us treat it as if it returned a string.