Basic Haskell¶
The following is based in part on https://www.haskell.org/tutorial/, other Haskell web resources, and the 2014 book Thinking Functionally in Haskell by Richard Bird.
Introduction¶
Haskell is a purely functional programming language, where all expressions yield values, all values have types, and side-effects are carefully controlled. Like Lisp, Haskell takes inspiration from the lambda calculus. Compared to Lisp, there are some major differences:
Haskell syntax is generally more human reader-friendly than Lisp — the designers of the language made readability a priority. For instance, arithmetic expressions are written in regular infix notation, e.g.
2 * 3 + 4
instead of(+ (* 2 3) 4)
.Haskell is statically type-checked, meaning that type errors can be found before the program runs (i.e. at compile-time) by examining the source code. In contrast, Lisp is usually dynamically typed, meaning type errors may not be discovered until run-time.
Haskell uses type inference, and so in many cases it is unnecessary to explicitly state the types of functions or variables. Instead, Haskell can often infer the correct types.
Haskell uses an evaluation strategy known as lazy evaluation. The basic idea is that an expression is not evaluated until it is needed. This often ends up returning the same result as the regular non-lazy evaluation used by most other languages, but it does have a number of consequences, such as:
Potentially improved performance — expressions that are never evaluated run fast.
On the downside, it can be hard to predict the performance of lazy evaluation ahead of time. This could be a problem in real-time systems where it is required that a calculation finish in a specific amount of time.
The possibility of infinite data structures. Interestingly, these have practical uses and can simplify some kinds of code.
Some control-flow abstractions can be written using ordinary functions. In Lisp, you can do this with macros, but macros don’t work the same way as ordinary functions.
Haskell is purely functional, and does not have loops or assignment statements as found in most other programming languages. Haskell encourages the use of functions, and higher-order functions, like
map
,filter
, andfold
.While Haskell is not object-oriented, it has good support for abstraction, e.g. type classes and algebraic data types.
Side-effects are strictly controlled in Haskell, and can only appear in restricted parts of your program known as do-environments.
For programmers coming from traditional languages, Haskell can seem overwhelming at first because it has so many novel features. Plus it has mainly been developed by academics instead of practitioners, and so a lot of the discussion around Haskell is based on mathematics and theory.
Pure Functions¶
An important idea in Haskell is the notion of a pure function. A Haskell function is said to be pure if it satisfies these two properties:
It always returns the same result for the same arguments.
For example, suppose
f 5
returns the value 2 (the notationf 5
is a Haskell function call meaning functionf
is applied to the argument 5 — no brackets are needed). Iff
is pure, you can be sure thatf 5
will always return 2 no matter when it is called, or how many times you call it.What a pure function
f
returns cannot depend upon anything except the input tof
. A pure function cannot depend upon any global variables or other state that might change from call to call.It has no side-effects.
Side-effects are things like printing to the screen, reading/writing files, or communicating to other computers on the Internet. Most interactions with the outside world — nearly everything that makes computers useful! — involve side-effects.
Haskell encourages the use of pure functions, since they are usually easier to reason about than non-pure functions. Pure functions are very similar to the functions used in mathematics.
But the big problem with pure function is that purity makes it impossible to
write many vital operations. For instance, a read_char()
function that
returns the next character from a file on disk can’t be pure because it
doesn’t return the same character every time it’s called. Other impure
functions include random number generators (e.g. rand()
in C++ is impure
because it returns different numbers each time it’s called), or input
functions that get information from the user (e.g. Python’s input(prompt)
function is impure because it returns whatever string the user types).
So pure functions can’t be the only kind of function in a useful programming language. Lisp, like almost all programming languages, deals with this problem by simply allowing pure and non-pure functions to intermingle, leaving it up to the programmer to deal with the consequences.
Haskell, however, takes a very different approach to purity. Impure functions
are only permitted inside do
-environments. The use and design of these
environments is one of the most significant features of Haskell, and we’ll
have more to say about them later.
For now, though, we will learn about Haskell by exploring pure functions.
Basic Features of GHCI¶
Like Lisp, we often interact with Haskell through its interpreter. We’ll uses “ghci” in these notes, e.g.:
GHCi, version 8.0.2: http://www.haskell.org/ghc/ :? for help
Prelude> 1+2*3
7
Prelude> (1+2)*3
9
Right away you can see that Haskell uses regular infix notation for arithmetic expressions.
Here’s an example of a Haskell function declaration:
> inc :: Integer -> Integer -- type signature
> inc n = n + 1
You should type the above lines into a text file named, say, test.hs
, and
then load it into the interpreter like this:
Prelude> :load "test.hs"
[1 of 1] Compiling Main ( test.hs, interpreted )
Ok, modules loaded: Main.
The name of this function is inc
. The first line is it’s type
signature, which specifies that it takes one input, n
, of type
Integer
, and returns one value, also of type Integer
. The token ::
is read “has type”.
Source code comments in Haskell start with --
, and continue to the end of
the line.
To call inc
, write its function name followed by its input. For example:
> inc 5
6
> inc 8
9
We’ll often want to know an expression’s type, and for that we’ll use the
:type
operator (:t
for short). For example:
> :type inc
inc :: Integer -> Integer
> :type inc 4
inc 4 :: Integer
Some simple expressions can have types that are more complex than you might expect. For example:
> :type 5
5 :: Num a => a
> :type 5.3
5.3 :: Fractional a => a
For the moment we will ignore these complexities. But the basic idea is that,
for the number 5, a
is any type of value that implements all the
requirements of Num
.
Here are a couple more examples of functions:
-- calculates the length of the hypotenuse of a right-triangle
-- with leg lengths a and b
hyp a b = sqrt (a*a + b*b)
-- calculates the distance between points (x, y) and (a, b)
dist x y a b = hyp (x - a) (y - b)
Notice that these two functions don’t include a type signature. That’s okay in this case: Haskell is able to automatically infer them:
> :type hyp
hyp :: Floating a => a -> a -> a
> :type dist
dist :: Floating a => a -> a -> a -> a -> a
In practice, automatic type inference can help make your programs shorter and easier to read, and yet still benefit from static type checking.
That said, throughout these notes we will almost always write type signatures for functions. Not only is that a good way to learn about the syntax and semantics of type signatures, but they are often useful documentation for functions that help when designing and implementing programs.
Negative Numbers¶
Haskell has a small problem with negative numbers:
> inc 1
2
> inc -1
<interactive>:12:5:
No instance for (Num (Integer -> Integer))
arising from a use of `-'
Possible fix:
add an instance declaration for (Num (Integer -> Integer))
In the expression: inc - 1
In an equation for `it': it = inc - 1
To deal with this, you often need to write negative numbers inside brackets:
> inc (-1)
0
Sometimes you need the brackets, sometimes you don’t:
> -2 -- ok
-2
> -2 + 1 -- ok
-1
> 1 + -2 -- oops: -2 doesn't work here!
<interactive>:16:1:
Precedence parsing error
cannot mix `+' [infixl 6] and prefix `-' [infixl 6] in the same infix expression
> 1 + (-2) -- need () around -2
-1
The problem is due to a parsing ambiguity with -
. In Haskell, the
expression x - 1
could mean “subtract 1 from variable x
”, or it could
mean “apply function x
to -1”. The brackets are needed to avoid this
ambiguity in some cases.
Languages like C don’t have this problem because the expression x - 1
only has one meaning: subtract 1 from x
. In C, function calls always
use ()
, and so if the expression were a function call it would have been
written x(-1)
.
Calling Functions¶
Consider these two expressions:
> inc 3 -- ok
4
> inc inc 3 -- oops: error!
<interactive>:21:1:
Couldn't match expected type `a0 -> t0' with actual type `Integer'
The function `inc' is applied to two arguments,
but its type `Integer -> Integer' has only one
In the expression: inc inc 3
In an equation for `it': it = inc inc 3
<interactive>:21:5:
Couldn't match expected type `Integer'
with actual type `Integer -> Integer'
In the first argument of `inc', namely `inc'
In the expression: inc inc 3
In an equation for `it': it = inc inc 3
The expression inc inc 3
is an error because Haskell first evaluates
inc inc
. This is a type error: inc
requires a parameter of type
Integer
, but we’ve passed it inc
which has type Integer ->
Integer
.
One solution is to use brackets:
> inc (inc 3)
5
Another solution is to use the $
operator:
> inc $ inc 3
3
$
is the low-priority application operator, and is often used to reduce
the number of brackets in expressions. Once you get used to it, it can be
quite handy. However, many newcomers to Haskell find it confusing, so we will
use it infrequently.
Lists and Polymorphic Types¶
The Haskell type Integer
is an example of a concrete type that mirrors
the mathematical notion of an integer. However, Haskell functions are often
written using polymorphic types, where the exact types are not specified.
We’ll introduce Haskell’s polymorphic types and lists types at the same time,
as they go naturally go together. In Haskell, lists are written using
square-brackets, and elements are separated by commas. For example, the list
['a', 'b', 'c']
has the type [Char]
, i.e. it’s a list of Char
values.
The elements on a list must all be of the same type. For example, you could have a list of functions:
> :type [inc, inc, inc]
[inc, inc, inc] :: [Integer -> Integer]
The following list is not allowed because it has two different types of values:
> [1, 2, 'a']
<interactive>:54:2:
No instance for (Num Char) arising from the literal `1'
Possible fix: add an instance declaration for (Num Char)
In the expression: 1
In the expression: [1, 2, 'a']
In an equation for `it': it = [1, 2, 'a']
Thus, in Haskell, if L
is a list, then you know for certain that all the
elements of L
have the same type.
Now lets look at a list function. len
uses recursion to calculate the
length of a list containing elements of any type a
:
> len :: [a] -> Integer -- type signature: [a] is a list of any type
> len [] = 0 -- base case
> len (x:xs) = 1 + len xs -- recursive case
len
consists of two equations. The first equation defines the base case:
the empty list []
has length 0. The second equation applies the recursive
case to a non-empty list.
When len [1, 2, 3]
is evaluated, Haskell checks the equations of len
in the order they are written. So first it tries to match len [] = 0
. This
fails, because []
doesn’t match [1, 2, 3]
. Then it moves on to the
second equation, which it does match.
In its second equation, len
uses pattern matching to extract the head
and rest of the list. When you evaluate len [1, 2, 3]
, [1, 2, 3]
gets
matched by x:xs
, and x
is bound to 1 and xs
is bound to [2,
3]
. The expression x:xs
is a nice syntax for getting the first element,
and the rest of the elements, of a list.
The variable names in x:xs
are just a convention. The x
refers to the
first element of the list, while xs
is the plural of x
because it
refers to the rest of the list elements. You could, if you prefer, write
first:rest
, a:b
, or even car:cdr
.
Importantly, len
does not depend on any specific details of the type
a
, and so it works with any list whose elements are all of type a
,
i.e. it’s input is a value of type [a]
. For example:
> len [1, 2, 3] -- [1, 2, 3] :: [Integer]
3
> len ['a', 'b', 'c'] -- ['a', 'b', 'c'] :: [Char]
3
> len [[1], [2], [3]] -- [[1], [2], [3]] :: [[Integer]]
3
> len [len, len, len] -- [len, len, len] :: [[a] -> Integer]
3
Haskell also provides standard functions head
and tail
that work like
this:
> head [1, 2, 3]
1
> tail [1, 2, 3]
[2,3]
Note that it’s an error to call head
or tail
on an empty list:
> head []
*** Exception: Prelude.head: empty list
> tail []
*** Exception: Prelude.tail: empty list
What are the type signatures for head
and tail
? For head
, the
input is any list, and the result is the first element of the list. A general
list has the type [a]
, where a
is any type. So, the type signature of
head
must be [a] -> a
, i.e. it takes a list of values of type a
,
and returns a value of type a
.
Similarly, tail
takes a list of values of type a
as input, and returns
a list of values of type a
as output. So its type signature is [a] ->
[a]
.
You can verify these are the correct signatures using :type
:
> :type head
head :: [a] -> a
> :type tail
tail :: [a] -> [a]
You might think that the function tail
has the type a -> a
, i.e. it
takes a value of type a
as input and gives an output of type a
. But
that’s too general. It’s an error, for example, to evaluate tail 5
because tail
requires a list type as input.
Similarly, you might think that tail
has the type [Integer] ->
[Integer]
. But that type is too specific: it says that tail
would only
work with Integer
lists.
Haskell uses type inference at compile time to do this type checking, and is careful to get just the right type, i.e. not too general and too specific. This turns out to be a non-trivial task, and Haskell uses the Hindley-Milner type system to figure out the correct types for expressions.
A Few List Functions¶
Haskell provides a number of helper functions for lists. For example, you can create a range of numbers like this:
> [1..5]
[1,2,3,4,5]
> [4..9]
[4,5,6,7,8,9]
You can also have an increment, e.g.:
> [2,4..10] [2,4,6,8,10]
> [0,5..20] [0,5,10,15,20]
> [2,1..(-5)] [2,1,0,-1,-2,-3,-4,-5]
The notation is a bit unusual: the first two numbers are the first two numbers
of the list, and the number after ..
is the last number in the list. The
numbers in between increase (or decrease) by the difference between the first
two numbers.
++
concatenates lists:
> [1,2,3] ++ [4..8]
[1,2,3,4,5,6,7,8]
> [1,2,3] ++ [-1,-2,-3]
[1,2,3,-1,-2,-3]
> [1] ++ [2,3] ++ [4,5,6]
[1,2,3,4,5,6]
> [1..3] ++ [2..6]
[1,2,3,2,3,4,5,6]
++
is an example of an infix operator, and when you use an infix
operator on its own (e.g. with :type
) you must put it in round brackets:
> :type ++ -- oops!
<interactive>:1:1: parse error on input `++'
> :type (++) -- brackets required
(++) :: [a] -> [a] -> [a]
Strings¶
Haskell strings are lists of characters, i.e. a string has the type
[Char]
:
> :type "apple"
"apple" :: [Char]
Haskell pre-defines the type String
like this:
type String = [Char] --- String is a synonym for [Char]
Since strings are lists, general list functions work with them. For example:
> head "apple"
'a'
> tail "apple"
"pple"
> len "apple"
5
> "apple" ++ "s"
"apples"
> ['r', 'a', 'c', 'e'] ++ "car"
"racecar"
> reverse "orchestra"
"artsehcro"
Curried Functions¶
Consider the following function:
> add :: Integer -> Integer -> Integer -- type signature
> add x y = x + y
add
sums two integers, x
and y
. It’s type signature is a bit
strange: it says that add
takes two Integer
values as input, and
returns an Integer
as output.
The type Integer -> Integer -> Integer
is the same as Integer ->
(Integer -> Integer)
. The bracketed version highlights an important Haskell
feature: if you call add
with one Integer
(instead of two) then it
returns a value of type Integer -> Integer
, i.e. a function that takes one
Integer
and returns an Integer
.
For example, add 3
evaluates to a function. You could use it like this:
> inc3 = add 3
Now inc3
is a function that takes one input and returns that input
incremented by 3:
> inc3 4
7
inc3
takes an Integer
and returns an Integer
:
> :type inc3
inc3 :: Integer -> Integer
Notice that we did not need to write a type signature for inc3
—
Haskell was able to infer it automatically.
Curried functions can be useful in practice. For example, Haskell has a
map
function: map f lst
applies function f
to every element of
lst
. If we wanted to add 1 to every element of a list of integers, we
could do it like this:
> map (add 1) [1,2,3]
[2,3,4]
This also works with the built-in +
:
> map (+ 1) [1,2,3]
[2,3,4]
Here’s another example. Recall that ++
concatenates two lists:
> [1, 2] ++ [6, 8, 6]
[1,2,6,8,6]
Here’s its type:
> :type (++)
(++) :: [a] -> [a] -> [a]
The type [a] -> [a] -> [a]
is equivalent to [a] -> ([a] -> [a])
, and
so if we were to pass ++
a single input, it would return a value of type
[a] -> [a]
, i.e. a function that takes a list of values of type a
as
input, and returns a list of values of type a
as ouput.
++
is an infix operator, which means that it is normally put between
its inputs. Alternatively, you can use ++
(and many other infix operators)
in prefix style by putting brackets around it:
> (++) "ab" "cd"
"abcd"
So we could write this function, which adds “- ” to the start of a string:
> dash = (++) "- "
For instance:
> dash "apples"
"- apples"
> map dash ["apples", "oranges", "grapes"]
["- apples","- oranges","- grapes"]
Or you could use currying to defined this function:
> dashed = map dash
Then:
> dashed ["apples", "oranges", "grapes"]
["- apples","- oranges","- grapes"]
The definition of dashed
is interesting: it is only 4 tokens long. It
doesn’t use any if-statements, loops, recursion, variables, or lambdas. It is
short, easy to understand (at least once you get used to Haskell!), consists
of previously made functions, and is statically type-checked.
Example: Curried Decrement¶
A problem with currying is that you can only curry arguments in the order they
occur. So, for example, we can implement inc
like this:
> inc = (+) 1
> inc 3
4
But decrement is not so straightforward:
> dec = (-) 1
> dec 3
-2
The problem is that dec 3
is equivalent to (-) 1 3
, which is indeed
-2. But the order of the inputs to (-)
are in the wrong order, and so what
we want is something liked this:
minus_flipped x y = y - x
Then we can do this:
> minus_flipped x y = y - x
> dec = minus_flipped 1
> dec 3
2
This is a common enough problem with currying that Haskell provides a helper
function called flip
to make this easier to implement. flip
takes a
2-input function f
as input, an returns a new 2-input function that is the
same as f
except the order of the inputs is switched.
For instance:
> (-) 5 2 -- 5 - 2
3
> (flip (-)) 5 2 -- 2 - 5
-3
> flip (-) 5 2 -- 2 - 5
-3
> (/) 10 2 -- 10 / 2
5.0
> flip (/) 10 2 -- 2 / 10
0.2
Now we can write dec
in one line like this:
> dec = flip (-) 1
> dec 5
4
flip
is pre-defined in Haskell, but it turns out to have a short
implementation:
flip :: (a -> b -> c) -> b -> a -> c
flip f x y = f y x
The type signature tells us that flip
has 3 inputs:
- the first input is a function that takes a value of type
a
and then a value of typeb
, and returns a value of typec
- the second input is a value of type
b
- the third input is a value of type
a
- the return value of the function is a value of type
c
If you pass flip
just a function, e.g. flip (-)
, then it returns a
function of type b -> a -> c
. Note that the types of the parameters have
been switched: the b
value comes first and the a
value comes second in
the flipped function.