.. highlight:: haskell Basic Haskell ============= .. note:: :doc:`This is a literate Haskell page: you can load it directly into ghci by following these steps `. Introduction ------------ The following is based in part on these documents is based on https://www.haskell.org/tutorial/, and other Haskell_ web resources. Haskell_ is a **purely functional** programming language, where all expressions yield values, and all values have **types**. Haskell_ is similar in spirit to Lisp_, in that both provide an implementation of the `lambda calculus `_. However, there are some major differences: - Haskell_ syntax is generally more human reader-friendly than Lisp_ --- the designers of the language made readability a priority. For instance, arithmetic expressions are written in regular infix notation, e.g. ``2 * 3 + 4`` instead of ``(+ (* 2 3) 4)``. - Haskell_ is **statically type-checked**, meaning that type errors can be found *before* the program runs (i.e. at compile-time). In contrast, Lisp_ type errors are usually only discovered at run-time when a program fails. - In addition to being statically typed, Haskell_ uses **type inference**, so that in many cases it is not necessary to explicitly state types. Instead, Haskell_ can often *infer* correct types. - Haskell_ uses an evaluation strategy known as `lazy evaluation `_. The basic idea is that an expression is not evaluated until it is needed. Often this strategy ends up being the same as regular non-lazy evaluation (that occurs in most other languages), but it does have a number of consequences, such as: - Potentially improved performance --- expressions that are never evaluated run fast. On the downside, it can be hard to predict the performance of lazy evaluation ahead of time. This could be a problem in, for instance, a real-time system where it is required that a calculation finish in a specific amount of time. - The possibility of infinite data structures. Interestingly, this has some practical uses and can simplify some kinds of code. - Control-flow abstractions can be written using ordinary functions. In Lisp_, you can do this with macros, which don't work the same way as ordinary functions. - Haskell_ is purely functional, and does not have loops or assignment statements as found in most mainstream programming languages. Haskell_ encourages the use of functions, and higher-order functions, like ``map``, ``filter``, and ``fold``. - While Haskell_ is not object-oriented, it has good support for abstraction, e.g. type classes and algebraic data types. For programmers coming from traditional languages, Haskell_ can seem overwhelming at first because it has so many novel features. Plus it has mainly been developed by academics instead of practitioners, and so a lot of the discussion around Haskell_ is based on mathematics and theory. Basic Features -------------- Like Lisp_, we often interact with Haskell_ through its interpreter. We'll uses "ghci" in these notes. e.g.:: GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Prelude> 1 + 2 * 3 7 Prelude> (1 + 2) * 3 9 Prelude> Right away you can see that Haskell_ uses regular infix notation for arithmetic expressions. Here's a simple example of a Haskell_ function declaration:: > inc :: Integer -> Integer -- type signature > inc n = n + 1 The name of this function is ``inc``, and it takes one input, ``n``, of type ``Integer``, and returns one value, also of type ``Integer``. The token ``::`` can be read "has type". Source code comments in Haskell_ start with ``--``, and continue to the end of the line. To call ``inc``, write its function name followed by its input. For example:: > inc 5 6 > inc 8 9 We'll often want to know an expression's type, and for that we'll use the ``:type`` operator (which can be abbreviated ``:t``). For example:: > :type inc inc :: Integer -> Integer > :type inc 4 inc 4 :: Integer Some expressions will have more complex types than you might expect. For example:: > :type 5 5 :: Num a => a > :type 5.3 5.3 :: Fractional a => a For the moment we will ignore these complexities. But the basic idea is that ``a`` is any type of value that implements all the requirements of a ``Num``. Here are a couple more sample functions:: -- calculate the length of the hypotenuse of a right-triangle -- with leg lengths a and b hyp a b = sqrt (a*a + b*b) -- calculate the distance between points (x, y) and (a, b) dist x y a b = hyp (x - a) (y - b) Usually you type functions into a file and load the file with ``:load`` (or just ``:l``). If you want to type a function at the command line you must use ``let``, e.g.:: > let hyp a b = sqrt (a*a + b*b) > let dist x y a b = hyp (x-a) (y-b) > dist 1 0 2 0 1.0 Negative Numbers ---------------- Haskell_ has a small problem with negative numbers:: > inc -1 :12:5: No instance for (Num (Integer -> Integer)) arising from a use of `-' Possible fix: add an instance declaration for (Num (Integer -> Integer)) In the expression: inc - 1 In an equation for `it': it = inc - 1 Unfortunately, you often need to write negative numbers inside brackets to get the right results:: > inc (-1) 0 This is a general problem with negative numbers, e.g.:: > -2 -- ok -2 > -2 + 1 -- ok -1 > 1 + -2 -- oops: -2 doesn't work here! :16:1: Precedence parsing error cannot mix `+' [infixl 6] and prefix `-' [infixl 6] in the same infix expression > 1 + (-2) -- need () around -2 -1 The problem is caused by a parsing ambiguity that Haskell_ has with ``-``. In Haskell_, the expression ``x-1`` could mean "subtract 1 from variable ``x``", or it could mean "apply function x to -1". The brackets are needed to avoid this ambiguity in some cases. Languages like C_ don't have this problem because the expression ``x-1`` only has one meaning: subtract 1 from ``x``. In C_, function calls *always* use ``()``, and so the expression ``x(-1)`` is what you must write to call ``x`` when it's a function. Calling Functions ----------------- Consider these two expressions:: > inc 3 -- ok 4 > inc inc 3 -- oops: error! :21:1: Couldn't match expected type `a0 -> t0' with actual type `Integer' The function `inc' is applied to two arguments, but its type `Integer -> Integer' has only one In the expression: inc inc 3 In an equation for `it': it = inc inc 3 :21:5: Couldn't match expected type `Integer' with actual type `Integer -> Integer' In the first argument of `inc', namely `inc' In the expression: inc inc 3 In an equation for `it': it = inc inc 3 The expression ``inc inc 3`` is an error because Haskell_ first evaluates ``inc inc``. This is a type error: ``inc`` requires a parameter of type ``Integer``, but we've passed it ``inc`` which has type ``Integer -> Integer``. One solution is to use brackets:: > inc (inc 3) 5 Another solution is to use the ``$`` operator:: > inc $ inc 3 3 ``$`` is the low-priority application operator, and is often used to reduce the number of brackets in expressions. Polymorphic Types ----------------- The Haskell_ type ``Integer`` is a concrete type that mirrors the mathematical notion of an integer. However, Haskell_ functions are often written using **polymorphic types**, where the exact types are not specified. To understand Haskell's polymorphic types, lets first look at lists. In Haskell_, lists are written using square-brackets, and elements are separated by commas. For example, the list ``['a', 'b', 'c']`` has the type ``[Char]``, i.e. it's a list of ``Char`` values. The elements on a list must all be of the same type. You can have a list of any type, so, for example, you could have a list of functions:: > :type [inc, inc, inc] [inc, inc, inc] :: [Integer -> Integer] But this list is *not* allowed because it has two different types of values:: > [1, 2, 'a'] :54:2: No instance for (Num Char) arising from the literal `1' Possible fix: add an instance declaration for (Num Char) In the expression: 1 In the expression: [1, 2, 'a'] In an equation for `it': it = [1, 2, 'a'] Thus, in Haskell_, if ``L`` is a list, then you know for certain that all the elements of ``L`` have the same type. Now lets look at a list function. ``len`` uses recursion to calculate the length of a list:: > len :: [a] -> Integer -- type signature: [a] is a list of any type > len [] = 0 -- base case > len (x:xs) = 1 + len xs -- recursive case ``len`` consists of two *equations*. The first equation defines the base case: ``[]`` is the empty list, and it has length 0. The second equation applies the recursive case to any non-empty list. It uses **pattern matching** to extract the head and rest of the list. The expression ``x:xs`` is a nice syntax for getting the first element, and the rest of the elements, of a list. For example, when you evaluate ``len [1, 2, 3]``, ``[1, 2, 3]`` gets matched by ``x:xs``, and ``x`` is bound to 1 and ``xs`` is bound to ``[2, 3]``. In Lisp_, we would have to use a ``let`` environment with ``car`` and ``cdr`` functions to do the same thing. The names in the expression ``x:xs`` are just a convention. The ``x`` refers to the first element of the list, while ``xs`` is the plural of ``x`` because it refers to the rest of the list elements. You could, if you prefer, write ``first:rest``, ``a:b``, or even ``car:cdr``. Haskell_ also provides standard functions ``head`` and ``tail`` that work like this:: > head [1, 2, 3] 1 > tail [1, 2, 3] [2,3] Note that it's an error to call ``head` or ``tail`` on an empty list:: > head [] *** Exception: Prelude.head: empty list > tail [] *** Exception: Prelude.tail: empty list What are the type signatures for ``head`` and ``tail``? For ``head``, the input is any list, and the result is the first element of the list. A general list has the type ``[a]``, where ``a`` is any type. So, the type signature of ``head`` must be ``[a] -> a``, i.e. it takes a list of values of type ``a``, and returns a value of type ``a``. Similarly, ``tail`` takes a list of values of type ``a`` as input, and returns a list of values of type ``a``. So its type signature is ``[a] -> [a]``. You can verify these type signatures using ``:type``:: > :type head head :: [a] -> a > :type tail tail :: [a] -> [a] You might think that the function ``tail`` has the type ``a -> a``, i.e. it takes a value of type ``a`` as input and gives an output of type ``a``. But that's *too general*. It's an error, for example, to evaluate ``tail 5`` because ``tail`` requires a *list* type as input. Similarly, you might think that ``tail`` has the type ``[Integer] -> [Integer]``. But that type is *too specific*: it says that ``tail`` would only work with ``Integer`` lists. Haskell_ uses **type inference** at **compile time** to do this type checking, and is careful to make sure that its gets just the right type, i.e. not too general and too specific. Haskell_ uses the `Hindley-Milner type system `_ to figure out the correct types for expressions. When ``len [1, 2, 3]`` is evaluated, Haskell_ checks the equations of ``len`` in the order they are written. So first it tries to match ``len [] = 0``. This fails, because ``[]`` doesn't match ``[1, 2, 3]``. Then it moves on to the second equation, which it *does* match. ``len`` works with any type of list, e.g.:: > len [1, 2, 3] -- [1, 2, 3] :: [Integer] 3 > len ['a', 'b', 'c'] -- ['a', 'b', 'c'] :: [Char] 3 > len [[1], [2], [3]] -- [[1], [2], [3]] :: [[Integer]] 3 > len [len, len, len] -- [len, len, len] :: [[a] -> Integer] 3 What's interesting about this that these expressions are type-checked, but we don't have to explicitly say their types. Instead, Haskell_ infers them. Some Useful List Functions -------------------------- Haskell_ provides a number of helper functions for lists. For example, you can create a range of numbers like this:: > [1..5] [1,2,3,4,5] > [4..9] [4,5,6,7,8,9] You can also have increment, e.g.: > [2,4..10] [2,4,6,8,10] > [0,5..20] [0,5,10,15,20] ``++`` concatenates lists:: > [1,2,3] ++ [4..8] [1,2,3,4,5,6,7,8] > [1,2,3] ++ [-1,-2,-3] [1,2,3,-1,-2,-3] > [1] ++ [2,3] ++ [4,5,6] [1,2,3,4,5,6] > [1..3] ++ [2..6] [1,2,3,2,3,4,5,6] ``++`` is an example of an *infix operator*, and in sometimes you must put it in round brackets:: > :type ++ -- oops! :1:1: parse error on input `++' > :type (++) -- brackets required (++) :: [a] -> [a] -> [a] Strings ------- Haskell_ strings are lists of characters, i.e. a string has the type ``[Char]``:: > :type "apple" "apple" :: [Char] Haskell_ pre-defines the type ``String`` like this:: type String = [Char] --- String is a synonym for [Char] Since strings are just lists of characters, all list functions work with strings. For example:: > head "apple" 'a' > tail "apple" "pple" > len "apple" 5 > "apple" ++ "s" "apples" > ['r', 'a', 'c', 'e'] ++ "car" "racecar" Curried Functions ----------------- Consider the following function:: > add :: Integer -> Integer -> Integer -- type signature > add x y = x + y ``add`` sums two integers, ``x`` and ``y``. It's type signature is a bit strange: it says that ``add`` takes two ``Integer`` values as input, and returns an ``Integer`` as output. The type ``Integer -> Integer -> Integer`` is the same as ``Integer -> (Integer -> Integer)``. The bracketed version highlights a basic feature of Haskell_: if you call ``add`` with a single ``Integer`` (instead of two) then it returns a value of type ``Integer -> Integer``, i.e. a function that takes one ``Integer`` and returns an ``Integer``. For example, ``add 3`` evaluates to a function. You could use it like this:: > inc3 = add 3 Now ``inc3`` is a function that takes one input and returns that input incremented by 3:: > inc3 4 7 ``inc3`` takes an ``Integer`` and returns an ``Integer``:: > :type inc3 inc3 :: Integer -> Integer Notice that we did not need to write a type signature for ``inc3`` --- Haskell_ was able to *infer* it automatically. Curried functions can be useful in practice. For example, Haskell_ has a ``map`` function: ``map f lst`` applies function ``f`` to every element of ``lst``. If we wanted to add 1 to every element of a list of integers, we could do it like this:: > map (add 1) [1,2,3] [2,3,4] Here's another example. Recall that ``++`` concatenates two lists:: > [1, 2] ++ [6, 8, 6] [1,2,6,8,6] Here's its type:: > :type (++) (++) :: [a] -> [a] -> [a] The type ``[a] -> [a] -> [a]`` is equivalent to ``[a] -> ([a] -> [a])``, and so if we were to pass ``++`` a single input, it would return a value of type ``[a] -> [a]``, i.e. a function that takes any list of ``a`` values and returns another list of ``a`` values. ``++`` is an **infix operator**, which means that it is normally put *between* its inputs. Alternatively, you can use ``++`` (and many other infix operators) in prefix style by putting brackets around it:: > (++) "ab" "cd" "abcd" So we could write this function, which adds "- " to the start of a string:: > dash = (++) "- " For instance:: > dash "apples" "- apples" > map dash ["apples", "oranges", "grapes"] ["- apples","- oranges","- grapes"] Or you could use currying to defined this function:: > dashed = map dash Then:: > dashed ["apples", "oranges", "grapes"] ["- apples","- oranges","- grapes"] Think about the approach taken here: without using any if-statements, loops, recursion, or variables, we've created a useful type-checked function in a few lines of simple code.