.. highlight:: haskell

Basic Haskell
=============

.. note:: :doc:`This is a literate Haskell page: you can load it directly into
   ghci by following these steps <haskell_literate>`.


Introduction
------------

The following is based in part on these documents is based on
https://www.haskell.org/tutorial/, and other Haskell_ web resources.

Haskell_ is a **purely functional** programming language, where all
expressions yield values, and all values have **types**. Haskell_ is similar
in spirit to Lisp_, in that both provide an implementation of the `lambda
calculus <https://en.wikipedia.org/wiki/Lambda_calculus>`_. However, there are
some major differences:

- Haskell_ syntax is generally more human reader-friendly than Lisp_ --- the
  designers of the language made readability a priority. For instance,
  arithmetic expressions are written in regular infix notation, e.g. ``2 * 3 +
  4`` instead of ``(+ (* 2 3) 4)``.

- Haskell_ is **statically type-checked**, meaning that type errors can be
  found *before* the program runs (i.e. at compile-time). In contrast, Lisp_
  type errors are usually only discovered at run-time when a program fails.

- In addition to being statically typed, Haskell_ uses **type inference**, so
  that in many cases it is not necessary to explicitly state types. Instead,
  Haskell_ can often *infer* correct types.

- Haskell_ uses an evaluation strategy known as `lazy evaluation
  <https://en.wikipedia.org/wiki/Lazy_evaluation>`_. The basic idea is that an
  expression is not evaluated until it is needed. Often this strategy ends up
  being the same as regular non-lazy evaluation (that occurs in most other
  languages), but it does have a number of consequences, such as:

  - Potentially improved performance --- expressions that are never
    evaluated run fast.

    On the downside, it can be hard to predict the performance of lazy
    evaluation ahead of time. This could be a problem in, for instance, a
    real-time system where it is required that a calculation finish in a
    specific amount of time.

  - The possibility of infinite data structures. Interestingly, this has some
    practical uses and can simplify some kinds of code.

  - Control-flow abstractions can be written using ordinary functions. In
    Lisp_, you can do this with macros, which don't work the same way as
    ordinary functions.

- Haskell_ is purely functional, and does not have loops or assignment
  statements as found in most mainstream programming languages. Haskell_
  encourages the use of functions, and higher-order functions, like ``map``,
  ``filter``, and ``fold``.

- While Haskell_ is not object-oriented, it has good support for abstraction,
  e.g. type classes and algebraic data types.

For programmers coming from traditional languages, Haskell_ can seem
overwhelming at first because it has so many novel features. Plus it has
mainly been developed by academics instead of practitioners, and so a lot of
the discussion around Haskell_ is based on mathematics and theory.


Basic Features
--------------

Like Lisp_, we often interact with Haskell_ through its interpreter. We'll
uses "ghci" in these notes. e.g.::

    GHCi, version 7.6.3: http://www.haskell.org/ghc/  :? for help
    Loading package ghc-prim ... linking ... done.
    Loading package integer-gmp ... linking ... done.
    Loading package base ... linking ... done.
    
    Prelude> 1 + 2 * 3
    7
    Prelude> (1 + 2) * 3
    9
    Prelude>

Right away you can see that Haskell_ uses regular infix notation for
arithmetic expressions.

Here's a simple example of a Haskell_ function declaration::

> inc :: Integer -> Integer   -- type signature
> inc n = n + 1

The name of this function is ``inc``, and it takes one input, ``n``, of type
``Integer``, and returns one value, also of type ``Integer``. The token ``::``
can be read "has type".

Source code comments in Haskell_ start with ``--``, and continue to the end of
the line.

To call ``inc``, write its function name followed by its input. For example::

    > inc 5
    6
    > inc 8
    9

We'll often want to know an expression's type, and for that we'll use the
``:type`` operator (which can be abbreviated ``:t``). For example::

    > :type inc
    inc :: Integer -> Integer

    > :type inc 4
    inc 4 :: Integer

Some expressions will have more complex types than you might expect. For
example::

    > :type 5
    5 :: Num a => a

    > :type 5.3
    5.3 :: Fractional a => a

For the moment we will ignore these complexities. But the basic idea is that
``a`` is any type of value that implements all the requirements of a ``Num``.

Here are a couple more sample functions::

    -- calculate the length of the hypotenuse of a right-triangle 
    -- with leg lengths a and b
    hyp a b = sqrt (a*a + b*b)

    -- calculate the distance between points (x, y) and (a, b)
    dist x y a b = hyp (x - a) (y - b)

Usually you type functions into a file and load the file with ``:load`` (or
just ``:l``). If you want to type a function at the command line you must use
``let``, e.g.::

    > let hyp a b = sqrt (a*a + b*b)

    > let dist x y a b = hyp (x-a) (y-b)

    > dist 1 0 2 0
    1.0


Negative Numbers
----------------

Haskell_ has a small problem with negative numbers::

    > inc -1

    <interactive>:12:5:
        No instance for (Num (Integer -> Integer))
          arising from a use of `-'
        Possible fix:
          add an instance declaration for (Num (Integer -> Integer))
        In the expression: inc - 1
        In an equation for `it': it = inc - 1

Unfortunately, you often need to write negative numbers inside brackets to get
the right results::

    > inc (-1)
    0

This is a general problem with negative numbers, e.g.::

    > -2      -- ok
    -2
    > -2 + 1  -- ok
    -1
    > 1 + -2  -- oops: -2 doesn't work here!

    <interactive>:16:1:
        Precedence parsing error
            cannot mix `+' [infixl 6] and prefix `-' [infixl 6] in the same infix expression

    > 1 + (-2)   -- need () around -2
    -1

The problem is caused by a parsing ambiguity that Haskell_ has with ``-``. In
Haskell_, the expression ``x-1`` could mean "subtract 1 from variable ``x``",
or it could mean "apply function x to -1". The brackets are needed to avoid
this ambiguity in some cases.

Languages like C_ don't have this problem because the expression ``x-1`` only
has one meaning: subtract 1 from ``x``. In C_, function calls *always* use
``()``, and so the expression ``x(-1)`` is what you must write to call ``x``
when it's a function.


Calling Functions
-----------------

Consider these two expressions::

    > inc 3       -- ok
    4
    > inc inc 3   -- oops: error!

    <interactive>:21:1:
        Couldn't match expected type `a0 -> t0' with actual type `Integer'
        The function `inc' is applied to two arguments,
        but its type `Integer -> Integer' has only one
        In the expression: inc inc 3
        In an equation for `it': it = inc inc 3

    <interactive>:21:5:
        Couldn't match expected type `Integer'
                    with actual type `Integer -> Integer'
        In the first argument of `inc', namely `inc'
        In the expression: inc inc 3
        In an equation for `it': it = inc inc 3

The expression ``inc inc 3`` is an error because Haskell_ first evaluates
``inc inc``. This is a type error: ``inc`` requires a parameter of type
``Integer``, but we've passed it ``inc`` which has type ``Integer ->
Integer``.

One solution is to use brackets::

    > inc (inc 3)
    5

Another solution is to use the ``$`` operator::

    > inc $ inc 3
    3

``$`` is the low-priority application operator, and is often used to reduce
the number of brackets in expressions.


Polymorphic Types
-----------------

The Haskell_ type ``Integer`` is a concrete type that mirrors the mathematical
notion of an integer. However, Haskell_ functions are often written using
**polymorphic types**, where the exact types are not specified.

To understand Haskell's polymorphic types, lets first look at lists. In
Haskell_, lists are written using square-brackets, and elements are
separated by commas. For example, the list ``['a', 'b', 'c']`` has the type
``[Char]``, i.e. it's a list of ``Char`` values.

The elements on a list must all be of the same type. You can have a list of
any type, so, for example, you could have a list of functions::

    > :type [inc, inc, inc]
    [inc, inc, inc] :: [Integer -> Integer]

But this list is *not* allowed because it has two different types of values::

    > [1, 2, 'a']

    <interactive>:54:2:
        No instance for (Num Char) arising from the literal `1'
        Possible fix: add an instance declaration for (Num Char)
        In the expression: 1
        In the expression: [1, 2, 'a']
        In an equation for `it': it = [1, 2, 'a']

Thus, in Haskell_, if ``L`` is a list, then you know for certain that all the
elements of ``L`` have the same type.

Now lets look at a list function. ``len`` uses recursion to calculate the
length of a list::

> len :: [a] -> Integer      -- type signature: [a] is a list of any type
> len []     =  0            -- base case
> len (x:xs) =  1 + len xs   -- recursive case

``len`` consists of two *equations*. The first equation defines the base case:
``[]`` is the empty list, and it has length 0. The second equation applies the
recursive case to any non-empty list. It uses **pattern matching** to extract
the head and rest of the list.

The expression ``x:xs`` is a nice syntax for getting the first element, and
the rest of the elements, of a list. For example, when you evaluate ``len [1,
2, 3]``, ``[1, 2, 3]`` gets matched by ``x:xs``, and ``x`` is bound to 1 and
``xs`` is bound to ``[2, 3]``. In Lisp_, we would have to use a ``let``
environment with ``car`` and ``cdr`` functions to do the same thing.

The names in the expression ``x:xs`` are just a convention. The ``x`` refers
to the first element of the list, while ``xs`` is the plural of ``x`` because
it refers to the rest of the list elements. You could, if you prefer, write
``first:rest``, ``a:b``, or even ``car:cdr``.

Haskell_ also provides standard functions ``head`` and ``tail`` that work like
this::

    > head [1, 2, 3]
    1
    > tail [1, 2, 3]
    [2,3]

Note that it's an error to call ``head` or ``tail`` on an empty list::

    > head []
    *** Exception: Prelude.head: empty list
    > tail []
    *** Exception: Prelude.tail: empty list

What are the type signatures for ``head`` and ``tail``? For ``head``, the
input is any list, and the result is the first element of the list. A general
list has the type ``[a]``, where ``a`` is any type. So, the type signature of
``head`` must be ``[a] -> a``, i.e. it takes a list of values of type ``a``,
and returns a value of type ``a``.

Similarly, ``tail`` takes a list of values of type ``a`` as input, and returns
a list of values of type ``a``. So its type signature is ``[a] -> [a]``.

You can verify these type signatures using ``:type``::

    > :type head
    head :: [a] -> a

    > :type tail
    tail :: [a] -> [a]

You might think that the function ``tail`` has the type ``a -> a``, i.e. it
takes a value of type ``a`` as input and gives an output of type ``a``. But
that's *too general*. It's an error, for example, to evaluate ``tail 5``
because ``tail`` requires a *list* type as input.

Similarly, you might think that ``tail`` has the type ``[Integer] ->
[Integer]``. But that type is *too specific*: it says that ``tail`` would only
work with ``Integer`` lists.

Haskell_ uses **type inference** at **compile time** to do this type checking,
and is careful to make sure that its gets just the right type, i.e. not too
general and too specific. Haskell_ uses the `Hindley-Milner type system
<https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system>`_ to figure
out the correct types for expressions.

When ``len [1, 2, 3]`` is evaluated, Haskell_ checks the equations of ``len``
in the order they are written. So first it tries to match ``len [] = 0``. This
fails, because ``[]`` doesn't match ``[1, 2, 3]``. Then it moves on to the
second equation, which it *does* match.

``len`` works with any type of list, e.g.::

    > len [1, 2, 3]          -- [1, 2, 3] :: [Integer]
    3
    > len ['a', 'b', 'c']    -- ['a', 'b', 'c'] :: [Char]
    3
    > len [[1], [2], [3]]    -- [[1], [2], [3]] :: [[Integer]]
    3
    > len [len, len, len]    -- [len, len, len] :: [[a] -> Integer]
    3

What's interesting about this that these expressions are type-checked, but we
don't have to explicitly say their types. Instead, Haskell_ infers them.


Some Useful List Functions
--------------------------

Haskell_ provides a number of helper functions for lists. For example, you can
create a range of numbers like this::

    > [1..5]
    [1,2,3,4,5]
    > [4..9]
    [4,5,6,7,8,9]

You can also have increment, e.g.:

    > [2,4..10]
    [2,4,6,8,10]
    > [0,5..20]
    [0,5,10,15,20]

``++`` concatenates lists::

    > [1,2,3] ++ [4..8]
    [1,2,3,4,5,6,7,8]
    > [1,2,3] ++ [-1,-2,-3]
    [1,2,3,-1,-2,-3]
    > [1] ++ [2,3] ++ [4,5,6]
    [1,2,3,4,5,6]
    > [1..3] ++ [2..6]
    [1,2,3,2,3,4,5,6]

``++`` is an example of an *infix operator*, and in sometimes you must put it
in round brackets::

    > :type ++    -- oops!
    <interactive>:1:1: parse error on input `++'

    > :type (++)  -- brackets required
    (++) :: [a] -> [a] -> [a]


Strings
-------

Haskell_ strings are lists of characters, i.e. a string has the type
``[Char]``::

    > :type "apple"
    "apple" :: [Char]

Haskell_ pre-defines the type ``String`` like this::

    type String = [Char]   --- String is a synonym for [Char]

Since strings are just lists of characters, all list functions work with
strings. For example::

    > head "apple"
    'a'

    > tail "apple"
    "pple"

    > len "apple"
    5

    > "apple" ++ "s"
    "apples"

    > ['r', 'a', 'c', 'e'] ++ "car"
    "racecar"


Curried Functions
-----------------

Consider the following function::

> add :: Integer -> Integer -> Integer  -- type signature
> add x y =  x + y

``add`` sums two integers, ``x`` and ``y``. It's type signature is a bit
strange: it says that ``add`` takes two ``Integer`` values as input, and
returns an ``Integer`` as output.

The type ``Integer -> Integer -> Integer`` is the same as ``Integer ->
(Integer -> Integer)``. The bracketed version highlights a basic feature of
Haskell_: if you call ``add`` with a single ``Integer`` (instead of two) then
it returns a value of type ``Integer -> Integer``, i.e. a function that takes
one ``Integer`` and returns an ``Integer``.

For example, ``add 3`` evaluates to a function. You could use it like this::

> inc3 = add 3

Now ``inc3`` is a function that takes one input and returns that input
incremented by 3::

    > inc3 4
    7

``inc3`` takes an ``Integer`` and returns an ``Integer``::

    > :type inc3
    inc3 :: Integer -> Integer

Notice that we did not need to write a type signature for ``inc3`` ---
Haskell_ was able to *infer* it automatically.

Curried functions can be useful in practice. For example, Haskell_ has a
``map`` function: ``map f lst`` applies function ``f`` to every element of
``lst``. If we wanted to add 1 to every element of a list of integers, we
could do it like this::

    > map (add 1) [1,2,3]
    [2,3,4]

Here's another example. Recall that ``++`` concatenates two lists::

    > [1, 2] ++ [6, 8, 6]
    [1,2,6,8,6]

Here's its type::

    > :type (++)
    (++) :: [a] -> [a] -> [a]

The type ``[a] -> [a] -> [a]`` is equivalent to ``[a] -> ([a] -> [a])``, and
so if we were to pass ``++`` a single input, it would return a value of type
``[a] -> [a]``, i.e. a function that takes any list of ``a`` values and
returns another list of ``a`` values.

``++`` is an **infix operator**, which means that it is normally put *between*
its inputs. Alternatively, you can use ``++`` (and many other infix operators)
in prefix style by putting brackets around it::

    > (++) "ab" "cd"
    "abcd"

So we could write this function, which adds "- " to the start of a string::

> dash = (++) "- "

For instance::

    > dash "apples"
    "- apples"

    > map dash ["apples", "oranges", "grapes"]
    ["- apples","- oranges","- grapes"]

Or you could use currying to defined this function::

> dashed = map dash

Then::

    > dashed ["apples", "oranges", "grapes"]
    ["- apples","- oranges","- grapes"]

Think about the approach taken here: without using any if-statements, loops,
recursion, or variables, we've created a useful type-checked function in a few
lines of simple code.