More About Clojure
==================

Clojure_ is a powerful language packed with many interesting ideas. Even if
you never end up using Clojure_ in practice, it's instructive to examine some
of its many interesting feature.


apply and eval
--------------

The ``apply`` and ``eval`` functions are used to evaluate lists. Here is the
usual way of evaluating expressions::

	=> (* 1 2 3)
	6
	=> (+ 1 2 3)
	6
	=> (- 1 2 3)
	-4

An equivalent way is to use ``apply`` like this::

	=> (apply * '(1 2 3))
	6
	=> (apply + '(1 2 3))
	6
	=> (apply - '(1 2 3))
	-4

As you can see, ``(apply f seq)`` takes a function ``f`` and evaluates it
using the items in ``seq`` as its input. It's as if ``f`` is "cons-ed"  onto
``seq``, and then that expression is evaluated.

The ``eval`` function takes an entire list an evaluates it::

	=> (eval '(* 1 2 3))
	6
	=> (eval '(+ 1 2 3))
	6
	=> (eval '(- 1 2 3))
	-4

``eval`` is a very powerful function: it is essentially Clojure_ implemented
in Clojure_! For example, you can easily implement your own version of
``apply`` using ``eval``::

	(defn my-apply [f seq]
		(eval (cons f seq))
	)


Rest Args
---------

Sometimes it's convenient to pass an unspecified number of inputs to a
function. For example, it's nice to have a ``sum`` function that works like
this::

	=> (sum 1 2 3)
	6
	=> (sum 6 7 8 9)
	30
	=> (sum 6)
	6
	=> (sum)
	0

This cuts down on the number parentheses and ``'`` marks, which makes the code
easier to read.

In Clojure_, you can implement ``sum`` using **rest arguments**::

	(defn sum [& nums]
		(apply + nums)
	)


Here, ``nums`` is the sequence of all the arguments passed to ``sum`` --- the
``&`` tells Clojure_ that 0 or more inputs are expected.


Argument Destructuring
----------------------

Consider the following function that adds two 2-dimensional points::

	(defn add-points1 [p1 p2]
		(let [x1 (first p1)
			  y1 (second p1)
			  x2 (first p2)
			  y2 (second p2) 
			]
			(list (+ x1 x2) (+ y1 y2))
		)
	)

	=> (add-points1 '(1 2) '(3 4))
    (4 6)

Most of the function is devoted to extracting that data out of ``p1`` and
``p2``. The code isn't difficult, but it is tedious to write, and thus error-
prone (are you sure you got all the 1s and 2s right?).

So Clojure_ provides a nice feature called **destructuring** that can help you
skip a lot of this sort of data extraction::

	(defn add-points2 [[x1 y1] [x2 y2]]    ;; with destructuring
		(list (+ x1 x2) (+ y1 y2))
	)

This is shorter and clearer. Clojure_ automatically assigns the correspond
values to ``x1``, ``y1``, ``x2``, and ``y2``.

However, it's still up to you, the programmer, to pass in correctly formatted
data::

	=> (add-points2 '(1 2 3) '(3 4 5))
	(4 6)
	=> (add-points2 '(1) '(3 4 5))
	java.lang.NullPointerException (NO_SOURCE_FILE:0)


Multiple Function Signatures
----------------------------

Sometimes you want a function to take different numbers, or kinds, of inputs.
For example::

	=> (make-point)
	(0 0)
	=> (make-point 8)
	(8 8)
	=> (make-point 4 5)
	(4 5)

You can implement ``make-point`` like this::

	(defn make-point
		([] '(0 0))
		([x] (list x x))
		([x y] (list x y))
	)

Each line of ``make-point`` is a different function signature with a different
function body.

Here's another example from http://clojure.org/reference/special_forms::

	(defn mymax
	  ([x] x)
	  ([x y] (if (> x y) x y))
	  ([x y & more]
	   (reduce mymax (mymax x y) more)))


Functions that Return Functions
-------------------------------

In Lisp, functions are values that be passed to functions, returned from
functions, and assigned to variables.  For example::

    (defn make-adder [n] 
        (fn [x] (+ n x))
    )

``make-adder`` takes a number, ``n``, as input and returns a function that
adds ``n`` to the value given to it. For example::

    => (def add4 (make-adder 4))   ;; note the use of def instead of defn

    => (add4 2)
    6
    => (add4 -3)
    1

You can even do things like this::

    => ((make-adder 4) 5)
    9

When ``((make-adder 4) 5)`` is evaluated, first ``(make-adder 4)`` is
evaluated, which returns a function, and then this function is applied to 5.

Here's another handy function that returns a function::

    (defn negate [pred] 
        (fn [x] (not (pred x)))
    )

This takes a predicate, i.e. a function that takes one input and returns a
boolean value, as input, and returns new predicate that is its negation. For
example::

    => (def my-odd (negate even?))
    #'user/my-odd
    user=> (my-odd 5)
    true
    user=> (my-odd 6)
    false

Or::

    => (keep-if (negate even?) '(1 2 3 4 5 6 7))
    (1 3 5 7)

This removes all the even numbers from a list. If you instead used ``(negate
odd?)``, it would remove all the odd numbers.


Memoization
-----------

Consider the following classic recursive function::

    (defn fib [n]
        (cond
            (= n 0) 1
            (= n 1) 1
            :else (+ (fib (- n 2)) (fib (- n 1)))
        )
    )

``(fib n)`` recursively calculates the nth Fibonacci number using the 
traditional recurrence, i.e. :math:`F_0 = 1`, :math:`F_1 = 1`, and 
:math:`F_n = F_{n-1} + F_{n-2}`.

On the plus side, this function is very close to the mathematical definition,
and so it is quick to write, short, and easy to check for correctness.

On the negative side, it is extremely inefficient because almost every call to
``fib`` makes two more calls to ``fib``, and this results in a huge amount of
redundant calculation. Try calculating, say, ``(fib 35)``, to see how slow it
is.

One way to improve this function is to rewrite it using a loop. Clojure_
provides a number of looping constructs that you could use to do this.

But there is another interesting and general-purpose approach to speeding-up
this function that is much easier to implement (once you know about it!).

First, lets rewrite the definition of ``fib`` to use plain ``def`` instead of
``defn``::

    (def fib
        (fn [n]
            (cond
                (= n 0) 1
                (= n 1) 1
                :else (+ (fib (- n 2)) (fib (- n 1)))
            )
        )
    ) 

Now, we can speed up this version of the function by using the built-in
``memoize`` function::

    (def fib
        (memoize
            (fn [n]
                (cond
                    (= n 0) 1
                    (= n 1) 1
                    :else (+ (fib (- n 2)) (fib (- n 1)))
                )
            )
        )
    )

``(memoize f)`` returns a **memoized** version of ``f``. In this case, the
result is a dramatic speed-up of ``fib``: I can calculate ``(fib 91)``
apparently instantaneously (values greater than 91 cause overflow).

Memoization does two basic things. It adds:

- A hash table to ``fib`` where it stores any values that it calculates.

- An if-statement to the beginning of the function so that when ``(fib n)`` is
  called, it first checks to see if that value of ``n`` has already been
  calculated. If so, it returns that value and does not call the body of the
  function. If ``n`` is not in the table, then it is calculated using the body
  of the function, and this result is stored in the table and then returned as
  the value of the function.

Essentially, memoization caches values. This can speed up almost any function
that makes lots of function calls (such as ``fib``). It's the same sort of
trick that high-traffic websites use to increase performance, but on the level
of a function instead of an entire website.

The downside of memoization is that the storage table takes up extra space.
For the ``fib`` function, the memoization table is never going to be very big
because it can't calculate values greater than ``(fib 91)``.

This is a traditional trade-off: by using more memory, we get more speed.


Aside: Memoization in Python
----------------------------

Let's look at how memoization can be implemented in Python. Python's syntax
makes the details a little clearer, plus it provides a convenient usage syntax
called a decorator.

For concreteness, lets memoize this function by hand::

    def fib(n):
        if n == 0:
            return 1
        elif n == 1:
            return 1
        else:
            return fib(n-2) + fib(n-1)

Memoization works by adding a table that stores (``n``, ``f(n)``) pairs. The
easiest way to implement such a table in Python is to use a dictionary::

    cache = {}  # an initially empty dictionary

Next, we modify the code in ``fib`` to check to see if the desired value has
already been calculated::

    cache = {}
    def fib2(n):
        if n == 0:
            return 1
        elif n == 1:
            return 1
        else:
            if not n in cache:
                cache[n] = fib2(n-2) + fib2(n-1)
            return cache[n]

This is *much* faster than ``fib``! 

What's interesting about this is that adding the memoizing does not require
modifying the body of ``fib``. This the key to writing a general-purpose
memoizer that works with any function.

In Python, the best way to write a memoizer is to use a class like this::

    class memoize(object):
        def __init__(self, fn):
            self.fn = fn
            self.cache = {}
        def __call__(self, *args):
            if not args in self.cache:
                self.cache[args] = self.fn(*args)
            return self.cache[args]

Don't worry about the exact syntactic details of Python. Just notice that the
code that was used to implement memoization was generalized and put into a
class.

Now we can easily memoize a function by using decorator syntax, e.g.::

    @memoize    # decorator
    def fib(n):
        if n == 0:
            return 1
        elif n == 1:
            return 1
        else:
            return fib(n-2) + fib(n-1)

The decorator acts as if we had written ``fib = memoize(fib)``. 


Closures
--------

Why is Clojure_ called Clojure_? It's a play on the term **closure**, and the
"j" is for Java_.

So what is a closure? It's a combination of two things: a function, and an
environment of (variable, value) pairs. The function is allowed to use
variables from this environment.

For example, consider this code::

	(defn f [x] (+ x 1))

``f`` is the name of a function, not a closure. But things change when we
define it like this::

    (defn make-adder [n] 
        (fn [x] (+ n x))   ;; n's value comes from outside the lambda function
    )

    (def g (make-adder 1))  ;; note the use of def

``g`` names a closure that includes both a function and a binding for the
variable ``n``. If you have just the function ``(fn [x] (+ n x))``, then you
can't actually evaluate it because ``n`` is not specified. But, of course, in
this example, ``n`` is bound to the value 1. The function plus this binding of
``n`` is a closure.

More concretely, compare this::

	=> (def g1 (let [n 1] (fn [x] (+ n x))))

	=> (g1 5)
	6

	=> (g1 9)
	10

To this::

	=> (def g2 (fn [x] (+ n x)))
	user=> CompilerException java.lang.RuntimeException: Unable to resolve symbol: n in this context, compiling:(/tmp/form-init597735295358479020.clj:1:17) 

Clojure_ doesn't let us define ``g2`` because the variable ``n`` is free, i.e.
it is not bound to any value. And so how could Clojure_ possibly evaluate
``(g2 5)`` or ``(g2 9)``? It can't!

``g1`` is a closure. You can see that it's not *just* a function, but a
function along with some variable bindings that it needs.

Any programming language that let you to define a function inside another
function must deal with the issue of what do about variables the inner
function uses that are not defined inside it. Closures are a natural and
useful solution to this problem. Languages that have lambda functions --- such
as Go_, Java_, C++, Python, JavaScript, etc. --- also support closures.

In Go_, for example, you can do this::

	package main

	import "fmt"

	type intFunc func(int) int

	func makeAdder(n int) intFunc {
		result := func(x int) int { return x + n }
		return result
	}

	func main() {
		add2 := makeAdder(2)   // add2 is a closure

		fmt.Println(add2(5))
	}

``add2`` is a closure. The ``n`` inside the result function is bound to the
value of ``n`` passed to ``makeAdder``, and it stays bound for the life of the
``result`` function. So even after ``makeAdder`` finishes executing, the ``n``
in the ``result`` is still there and can be used as long as ``add2`` exists.

As an aside, notice how the *typing* information that Go_ requires makes the
code longer, and more cluttered, than the same code in Clojure_. Clojure_ is
dynamically typed, and so doesn't check for type information until run time.


Composing Functions
-------------------

Another interesting feature of Lisp_ is the ability to easily compose
functions. Recall how function composition works in mathematics. Given
:math:`f(x) = x^2` and :math:`g(x) = 2x + 1`, the composition of
:math:`f` and :math:`g`, denoted :math:`f \circ g`, can be calculated as
:math:`f(g(x)) = g(x)^2 = (2x + 1)^2 = 4x^2 + 4x + 1`.

In Clojure_, we can write these functions like this::

    (defn f [x] (* x x))
    (defn g [x] (+ (* 2 x) 1))

For example::

    user=> (f 2)
    4
    user=> (g 2)
    5
    user=> (f (g 2))
    25

One way to compose functions is to write a function called ``compose`` like
this::

    (defn compose [f g] (fn [x] (f (g x))))

``compose`` takes two functions, ``f`` and ``g`` as input, and returns the 
function ``(fn [x] (f (g x)))`` as output.

Now we can write the composition of ``f`` and ``g`` like this::

    (def fog (compose f g))

(The built-in Clojure_ function ``comp`` does the same thing.)

And it works::

    user=> (fog 2)
    25

The ``twice`` function takes a function, ``f``, as input and returns ``f``
composed with itself, i.e. :math:`f \circ f`::

    (defn twice [f] (compose f f))

For instance::

    (def garnish (twice (fn [x] (cons 'cheese x))))

    => (garnish '(1 2 3))
    (cheese cheese 1 2 3)

We can generalize this as follows::

    (defn compose-n [f n]
        (cond
            (= n 1) f
            :else (compose f (compose-n f (- n 1))) 
        )
    )

In a sense, this returns a new function that is ``f`` to the power of ``n``,
where function composition is being used instead of multiplication.


Partial Function Application
----------------------------

Composition is a way of creating a new function from a pair of other
functions. Another interesting way to create a function from a function is to
use **partial evaluation**. For example::

    (defn add [x y] (+ x y))
    (def add5 (partial add 5))  ;; note the use of def (not defn)

    => (add5 6)
    11

The ``add`` function takes two inputs, ``x`` and ``y``, while ``add5`` takes
only one input. 

The built-in function ``partial`` takes a function ``f`` with n inputs as the
its first argument, followed by fewer than n arguments. It then returns a new
function that is the same as the passed-in function, except the parameters
passed to ``partial`` are bound to the corresponding formal parameters of
``f``.

Here's another example::

    (def hatter (partial cons 'hat))  ;; note the use of def (not defn)

    => (hatter '(a b c))
    (hat a b c)

And one more::

    (def cheese-it (partial compose-n (fn [x] (cons 'cheese x))))

    => ((cheese-it 5) '(1 2 3))
    (cheese cheese cheese cheese cheese 1 2 3)


Example: Making Selector Functions
----------------------------------

As an example of functional programming, lets see how we might go about
creating a function that makes selector functions. Here are some examples of
selector functions (``first`` and ``second`` are already defined in
Clojure_)::

    (defn third [seq]
        (first (rest (rest seq)))
    )

    (defn fourth [seq]
        (first (rest (rest (rest seq))))
    )

For example::

    user=> (third '(1 2 3 4 5))
    3
    user=> (fourth '(1 2 3 4 5))
    4

It's not hard to see the pattern for how these functions are defined: to get
the nth element of a sequence, call ``rest`` on it n-1 times (thus removing
the first n-1 elements), and then call ``first``.

With some thought, you can come up with a function like this::

    (defn make-selector1 [n]
        (compose first (compose-n rest (- n 1)))
    )

``make-selector1`` returns a function that can be used as a selector. So we 
could use it like this::

    (defn third [seq]
        (make-selector1 3)
    )

    (defn fourth [seq]
        (make-selector1 4)
    )

    (defn fifteenth [seq]
        (make-selector1 15)
    )

Another interesting way to create a selector-making function is to used the
built-in compose function in Clojure_ called ``comp``. It's like the
``compose`` function we defined, but it lets you compose more than two
functions at once, e.g.::

    user=> (defn add1 [x] (+ 1 x))
    user=> (def t (comp add1 add1 add1))
    user=> (t 5)
    8

This saves us from typing some extra parentheses (always a nice thing in
Lisp_!), e.g.::

    (defn third [seq]
        ((comp first rest rest) seq)
    )

    (defn fourth [seq]
        ((comp first rest rest rest) seq)
    )

The pattern of how these functions work is now even clearer. Using the built-
in function ``replicate``, we could do this::

    user=> (replicate 5 rest)
    (#<core$rest clojure.core$rest@18dec5c> #<core$rest clojure.core$rest@18dec5c> 
     #<core$rest clojure.core$rest@18dec5c> #<core$rest clojure.core$rest@18dec5c> 
     #<core$rest clojure.core$rest@18dec5c>)

``(replicate n x)`` returns a list containing n copies of x. The example here
shows a list containing 5 copies of the ``rest`` function.

With this, and the ``apply`` function (explained below), we can write this
function::

    (defn make-selector2 [n]
        (apply comp 
               (cons first (replicate (- n 1) rest))
        )
    )

    (def third (make-selector2 3))
    (def fourth (make-selector2 4))
    (def fifteenth (make-selector2 15))

Look at the body of ``make-selector2``. The expression ``(cons first
(replicate (- n 1) rest))`` creates a list whose first element is the function
``first``, and whose remaining n-1 elements are the function ``rest``.

What we want to do is compose all those functions together using ``comp``. You
might first guess that you could do it like this::

    (cons comp 
          (cons first (replicate (- n 1) rest))
    )

But this returns a list whose first element is the function ``comp``, and
whose remaining elements are as above (``first``, followed by many copies of
``rest``). It is just a list, and does not get evaluated.

So instead of ``cons``, Clojure_ provides the built-in ``apply`` function that
takes a function and a list of parameters as input, and returns the result of
applying the function to those parameters. For example::

    => (defn plus [x y] (+ x y))
    => (apply plus '(2 3))
    5

That's just what we need here because the arguments to ``comp`` are
constructed as a list::

    (apply comp 
           (cons first (replicate (- n 1) rest))
    )


Macros
------

Macros are one of the most interesting and powerful features in Lisp_.
Intuitively, you can think of them as programs that make other *programs*.
This is in contrast to functions, which are programs that make *values*.
Macros let you do things like create our own def-like environments, and our
own control structures --- things that are usually impossible to do languages
like Java or C++.

C and C++ also have macros, but they're less-powerful than Clojure_ macros.
C/C++ macros are just string-based search-and-replace done before compilation,
and they are not an intrinsic part of the language. You could even replace the
default C/C++ macro processor with a different one, if you like. Clojure_
macros, however, are a fundamental feature of the language that let you decide
*when* an expression is evaluated. They work on Clojure_ expressions, not just
strings.

To see why you might want to make a macro, consider this example::

	=> (def x 3)
	#'user/x
	=> x
	3

``def`` binds the value 3 to ``x``. But here's an interesting question: is
``def`` a function? No, it turns out that it's not. If ``def`` were a
function, then you'd get an error saying ``x`` is undefined, just like this::

	;; assume x is not defined
	=> (+ 2 x)
	java.lang.Exception: Unable to resolve symbol: x in this context (NO_SOURCE_FILE:1)

That's because the rule for calling ``(f x y)`` in Clojure_ is to first
evaluate ``x`` and ``y``, and then pass those results to ``f``. So ``(def x
3)`` wouldn't work if it were a function because Clojure would try to evaluate
``x``. But ``x`` doesn't exist yet.

Clojure_ calls expressions like ``(def x 3)`` **special forms** to distinguish
them from regular functions. Other special forms are ``defn``, ``and``,
``let``, ``cond``, etc. In most languages, special forms are hard-coded into
the language, and you can't add your own. But in Lisp, macros let you do just
that (among other things!).

Here's another example. Suppose you want to create your if-then-else control
structure. You can't do that with a function::

	;; BAD!: can't create a control structure with a function
	;; a and b are both evaluated before they are passed to if-then-else,
	;; but we only ever want to evaluate one of then
	(defn if-then-else [condition a b]
		(if condition a b)
	)

	user=> (if-then-else true (println "yes") (println "no"))
	yes
	no
	nil

	user=> (if-then-else (= 1 2) (println "yes") (println "no"))
	yes
	no
	nil

The problem is that the if-part and the else-part are always *both* evaluated
when ``if-then-else`` is called. That's wrong. For an if-else statement, we
only ever want to evaluate one of the ``a`` or ``b``, and *never* both.

The problem is that the function parameters are evaluated *before* being
passed to the function. But with expressions like ``def`` and ``if-then-else``, we want the parameters to be evaluated *after* they are passed in.

This is what macros do: they let us control when parameters are evaluated. The
essential difference between a macro and a function is that a macro is passed
its parameters *unevaluated*, while functions are passed their parameters
evaluated. So we can use macros to write our own special forms, e.g. def-like
environments, or control structures like ``if-then-else``.

This macro correctly implements ``if-then-else``::

	(defmacro if-then-else [condition a b]
		(list 'if condition a b)
	)

The trick with a macro is that its body creates an expression that, when
evaluated, does what we want it to do. Macros run in a two-step process: they
first create an expression, and then they evaluate the expression. The
creation step is called **expansion**, and it is this step that functions
don't do.

Use the ``macroexpand`` function to see what a macro expands to::

	=> (macroexpand '(if-then-else false (println "Hi!") (println "Bye!")))
	(if false (println "Hi!") (println "Bye!"))

The result here shows exactly what Clojure_ will evaluate when you call ``(if-
then-else false (println "Hi!") (println "Bye!"))``.

In ``if-then-else``, ``a`` and ``b`` are *not* evaluated when the macro is
expanded: they are just put onto a list as-is. When the macro is evaluated, so
too are ``a`` and ``b``: they are evaluated *after* they are passed to 
``if-then-else``.

Clojure_ does not have a ``nand`` operator, so lets write a macro for it. The
expression ``(nand a b)`` should be true just when ``a`` and ``b`` are not
both true. In other words, ``(nand a b)`` always has the same value as ``(not
(and a b))``::

	(defmacro nand [a b]
		(list 'not (list 'and a b))
	)

	r=> (nand true false)
	true

	=> (nand (< 1 2) (< 3 4))
	false

It seems to work. Use ``macroexpand`` to check that the expression it
evaluates is correct::

	=> (macroexpand '(nand true false))
	(not (and true false))

This tells us that when we evaluate ``(nand true false)``, it will first
expand into ``(not (and true false))``, and then this expanded expression will
be evaluated. Similarly::

	=> (macroexpand '(nand (> 1 2) (> 3 4)))
	(not (and (> 1 2) (> 3 4)))

``macroexpand`` is essential for debugging macros. It shows you *exactly* what
expression gets evaluated.

Notice that we used ``list`` twice inside ``nand``. This version would **not**
work::

	(defmacro nand-bad1 [a b]
		(list 'not '(and a b))
	)

	=> (nand-bad1 true false)
	java.lang.Exception: Unable to resolve symbol: a in this context (NO_SOURCE_FILE:106)

	=> (macroexpand '(nand-bad1 true false))
	(not (and a b))

``macroexpand`` shows why this is wrong. When ``(nand-bad1 true false)`` is
evaluated, it's expanded to ``(not (and a b))``, and then ``(not (and a b))``
is evaluated. The error is that ``a`` is not defined here. Indeed, we don't
want ``a`` here, we want the expression ``a`` refers to, which is ``true`` in
this case.


A def-like Macro
~~~~~~~~~~~~~~~~

Here's an example of a def-like macro. Suppose we're writing a program that
involves point objects defined like this::

	(defn make-point-obj [x y]
	    (list 'point (list 'x x) (list 'y y))
	)

	=> (make-point-obj 4 9)
	(point (x 4) (y 9))

	=> (make-point-obj -6 18)
	(point (x -6) (y 18))

You can use ``def`` to assign a point to a global variable, e.g.::

	=> (def origin (make-point-obj 0 0))
	#'user/origin
	
	=> origin
	(point (x 0) (y 0))

But you could also write a special macro for defining points like this::

	(defmacro defpoint [vble a b]
	    (list 'def vble (list 'make-point-obj a b))
	)

Now you can write code like this::

	=> (defpoint origin 100 50)
	#'user/origin

	=> origin
	(point (x 100) (y 50))

Writing ``(defpoint origin 100 50)`` is a little more readable than ``(def
origin (make-point-obj 0 0))``.

You could also allow for the possibility of defining a point without yet
knowing exactly what value you want to give it::

	(defmacro defpoint
	    ([v]
	        (list 'def v (list 'make-point-obj 0 0))
	    )
	    ([v a b]
	        (list 'def v (list 'make-point-obj a b))
	    )
	)

	=> (defpoint q)
	#'user/q

	=> q
	(point (x 0) (y 0))


Macro Pitfalls
--------------

Macros are trickier to use than functions because of their extra expansion
step. They can often suffer from subtle bugs if not implemented carefully.
We'll discuss two basic problems you can run into: **variable capture**, and
**double evaluation**. Keep in mind that these are *not* the only problems
macros can suffer from --- there are other kinds of problems with macros that
can arise!


Variable Capture
~~~~~~~~~~~~~~~~

Macros can, if they need to, define variables. For instance, consider this
macro::

	(defmacro add1-bad [n]
	    (list
	        'let ['incr 1]
	        (list '+ n 'incr)
	    )
	)

	=> (add1-bad 5)
	6
	
	=> (let [a 2] (add1-bad a))
	3

It's a contrived example, but it shows a subtle problem::

	=> (let [incr 2] (add1-bad incr))
	2

For some strange reason, the programmer decided use a variable named ``incr``
in this expression, and this breaks the macro: 2 is the wrong result! It
should be 3. The problem here is that, unknown to the programmer, ``add1-bad``
uses ``incr`` as an internal variable. So the macro expands to this::

	(let [incr 1] (+ incr incr))

Clearly, the problem is that the ``incr`` variable that gets passed into
``add1-bad`` has been *captured* by the local ``incr`` in the ``add1-bad``
``let`` expression.

The way to fix this problem is to make sure that the local variables in a
macro *never* have the same name as any variable that might be passed into it.
However, for any name we change ``incr`` to, it's possible that the programmer
could pass that name in!

The standard Lisp_ solution to the problem is to generate a unique variable
name using the ``gensym`` function::

	=> (gensym)
	G__2728
	=> (gensym)
	G__2731
	=> (gensym)
	G__2734

Every time ``(gensym)`` is evaluated, it returns a new name that it has not
returned before.

Now we can replace the variable ``incr`` with a name generated by ``gensym``::

	(defmacro add1-good1 [n]
	    (let [vbl (gensym)]       ;; note that this outer let is not quoted
	        (list
	            'let [vbl 1]      ;; vbl is not quoted
	            (list '+ n vbl)
	        )
	    )
	)

	=> (let [incr 1] (add1-good1 incr))
	2

	=> (let [a 1] (add1-good1 a))
	2

This trick is so common in macros, that Clojure_ provides some syntactic help
to simplify it a bit::

	(defmacro add1-good2 [n]
	    (list
	        'let ['incr# 1]
	        (list '+ n 'incr#)
	    )
	)

Putting a ``#`` at the end of a symbol is called an **auto-gensysm**. The name
``incr#`` is guaranteed to be unique.


Double Evaluation
~~~~~~~~~~~~~~~~~

Consider this function and macro::

	(defn dbg [x]
	    (println x)
	    x
	)

	=> (dbg 4)
	4
	4


	(defmacro square [n]
	    (list '* n n)
	)

	=> (square 4)
	16

The ``dbg`` function first prints whatever you pass to it, and then returns it
as a value. As its name suggests, it's meant for debugging.

Using ``dbg``, we can see that the ``square`` macro suffers from *double
evaluation*, i.e. whatever input you pass it gets evaluated twice (because
``n`` appears two times in the macro)::

	=> (square (dbg 4))
	4
	4
	16

	=> (macroexpand '(square (dbg 4)))
	(* (dbg 4) (dbg 4))

The correct answer (16) is returned, but you might expect that 4 should be
printed only once. It certainly appears that ``(dbg 4)`` is only evaluated
once, but if you read the ``square`` macro you can see that . anything you
pass to it gets evaluated *twice*. Not only is this inefficient, it can cause
serious problems for any expressions with side-effects, such as printing to
the screen, writing to a file, modifying a global variable, etc.

To avoid double evaluation, you can use ``let`` to make sure the evaluation
only happens once::

	(defmacro square-fixed [n]
	    (list
	        'let ['a# n]      ;; n is only evaluated once here 
	        (list '* 'a# 'a#)
	    )
	)

	=> (square-fixed (dbg 4))
	4
	16
	user=> (macroexpand '(square-fixed (dbg 4)))
	(let* [a (dbg 4)] (* 4 4))


When Should You Use Macros?
---------------------------

Some programmers believe that macros are a disaster. They say that they're
hard to write, hard to understand, and hard to debug. You shouldn't use
programming language features that are both difficult and error-prone, they
say. The *lack* of macros in a language is a *feature*!

But some programmers think that the benefits of macros outweigh these costs.
Yes, they say, macros are harder to write, understand, and debug as compared
to regular functions, but they are also more powerful than regular functions.
When use with care, they can make your programs much easier to read and
understand, and more fun to use. Plus they can do things you can't possibly do
with functions, like create new control structures.

A general rule of thumb is to use macros sparingly. If a function will do,
then use a function. Use macros only when functions won't do the job, or when
they provide a clear and obvious benefit, such as significantly improved
syntax. 

Finally, we should mention that if you use macros a lot, using so many
``list`` and ``'`` tokens soon becomes tiresome, and this leads to errors. So
Clojure_ provides some extra syntax designed specifically for helping with
macros. For example, a better way to write the ``if-then-else`` macro is like
this::

	=> (defmacro if-then-else [condition a b] `(if ~condition ~a ~b))
	#'user/if-then-else

We've used a **backquote** instead of a regular (forward) quote in this
function. A backquote in Clojure_ is like a regular quote, but it lets us use
``~`` inside the list to *turn off quoting*. So ``if`` is a symbol that stand
for itself, while ``~condition``, ``~a`` and ``~b`` are replaced by what those
variables refer to. 

Although we haven't used it here, the backquote also lets use the ``~@``
operator for splicing lists, e.g.::

    => (def vars '(x y z))

	=> `(a b ~@vars c)
	(user/a user/b x y z user/c)

You can also see here that the backquote has prepended ``user/`` in front of
``a``, ``b``, and ``c``. ``user/`` refers to the namespace that these
variables are in, and this helps to prevent unintended variable capture.

Admittedly, the benefit of using a backquote in ``if-then-else`` is small. But
if you write a lot of macros, especially larger and more complex ones,
backquotes can improve readability quite a bit.


Making Your cons, first, and rest
---------------------------------

One of the many interesting things about Lisp_ is that it's possible to write
your own versions of the standard ``cons``, ``first``, and ``rest`` functions
as follows::

    (defn mycons [x y] (fn [m] (m x y)))

    (defn myfirst [z] (z (fn [p q] p)))
    
    (defn myrest [z] (z (fn [p q] q)))

The is known as the Church encoding (after Alonzo Church). It essentially
defines a list data type in Clojure_ using only functions. For instance::

    => (def lst (mycons 1 (mycons 2 (mycons 3 ()))))
    #'user/lst
    => lst
    #object[user$mycons$fn__1212 0x5c2ec4 "user$mycons$fn__1212@5c2ec4"]
    => (myfirst lst)
    1
    => (myfirst (myrest lst))
    2

The trick is that the list ``(3)`` can be represented by the expression
``(mycons 3 ())``. This is equivalent to the function ``(fn [m] (m 3 ()))``,
where ``3`` and ``()`` are stored within the function like a pair of values.
When ``myfirst`` is called with this function as the input, ``m`` gets
replaced by a function that takes two inputs, ``p`` and ``q``, and returns
``p``. ``myrest`` is similar, except its function returns ``q`` instead.

This is not an efficient way to represent lists in practice, but it is useful
in theory. It shows that functions like ``cons``, ``first``, and ``rest`` need
not primitives, but instead can be defined in terms of functions.

It's also possible to represent integers and arithmetic in this purely
functional way. Again, this is not efficient, but it is `quite important in
theory <https://en.wikipedia.org/wiki/Lambda_calculus>`_.


Clojure Lists
-------------

Lists in Clojure_ are implemented as traditional singly-linked lists. Each
item in a list is represented by a `cons cell
<https://en.wikipedia.org/wiki/Cons>`_, i.e. a record that contains two
pointers, one the first element of the list, and one to the rest of the list.

Because of this implementation, ``cons``, ``first``, and ``rest`` are all
:math:`O(1)` functions. Functions like ``append`` or ``last`` are
:math:`O(n)`.

Interestingly, and usefully, Clojure_ lists are normally **immutable**,
meaning that once you create a list you can't change it. A consequence of this
is that lists can easily share parts, e.g.::

    (def A '(1 2 3 4))
    (def B (rest A))

Here, ``B`` is essentially a pointer to the
<https://en.wikipedia.org/wiki/Cons>`_ for 2. No copy of the elements in ``A``
needs to be made --- ``B`` and ``A`` share a sublist.

Another useful consequence is list immutability is that concurrent programming
with them tends to be easier because processes never need to worry about lists
changing. Concurrent processes can only read a list, never write it.


Clojure Vectors
---------------

We've been using lists as the main data type in Clojure_. But Clojure_ also
supports vectors and maps, and in this section we will take a brief look the
unusual way that Clojure_ implements its vectors.

All programming languages have some kind of array, or vector, for storing a
sequence of items. For example, in Java_, you could create an array like
this::

    int[] nums = {4, 5, 6, 7};
    System.out.println(nums[3]);  // prints 7

Internally, Java_ implements ``nums`` as a *contiguous* sequence of ``int``\
s. Thus, accessing an element by index takes :math:`O(1)` time, no matter how
long the array is.

In general, this is how most programming languages implement arrays or
vectors. One other detail is whether or not the size of the array can change
after it's created. In languages like Java_, C/C++, and Go_, once you create
an array its length **never** changes. We call this a fixed-size, or static
array.

But in other languages, such as sequences in Python_, or vectors in C++, are
dynamic, i.e. their size can grow or shrink as needed. For instance, this is
common in Python_::

    >>> nums = []                 # Python code
    >>> nums.append(5)
    >>> nums.append(6)
    >>> nums.append(7)
    >>> nums
    [5, 6, 7]

Here, ``nums`` grows in size as needed. Internally, Python_ implements
``nums`` as a contiguous sequence of objects, and ``append`` has some rules
for deciding how much memory the underlying array should take up. Dynamic
arrays like this are responsible for managing their own memory, which can be
very convenient for many sorts of applications. But the downside is that
``nums`` might use more memory than it needs to store its current numbers in
order to make *future* calls to ``append`` faster. This is a common trade-off:
use more memory to make a program faster.

With that it mind, let's look at Clojure_ vectors, i.e. it's version of
arrays. Clojure_ vector literals use square brackets, e.g.::

    user=> (def nums [4 5 6 7])
    #'user/nums
    user=> nums
    [4 5 6 7]
    user=> (nums 3)    ;; access the 3rd element of nums

Notice that no extra syntax is used to access elements. To get the number at
location ``i``, you can just write ``(nums i)``.

A basic fact about Clojure_ vectors are that they are *persistent*, which
means that any time you modify or update them, you are creating a new vector.
Think about this for a moment, and you'll realize that if Clojure_ vectors
were implemented as a contiguous sequence as in Java_ or C/C++, then every
time you change a vector you'd need to make a copy of it. That would be
prohibitively expensive in both time and memory.

So Clojure_ implements its vectors in an usual way: they're a kind of tree.
While we don't have time to go into the details here (see `this blog post
<http://hypirion.com/musings/understanding-persistent-vector-pt-1>`_ if you
are interested), the upshot is that Clojure_ vectors have nearly the same good
performance characteristics as C/C++-style arrays, e.g. *almost* :math:`O(1)`
time for appending an element to a vector, and *almost* :math:`O(1)` time for
modifying an element (assuming you know its index position).

We say *almost* :math:`O(1)` because, with a tree, it's actually :math:`O(log
n)`. But, Clojure_ knows that vector performance is important, and so uses a
number of optimizations and tricks to get the performance to be pretty nearly
:math:`O(1)` in most practical uses.

Also, Clojure_ vector use more memory than a C/C++-style array because of the
pointers it uses to keep track of all the nodes.

The reason Clojure_ uses this implementation of vectors is to have vectors
that are persistent, i.e. that don't change. This is very useful for
concurrent/parallel programming, where one thread of control might modify a
vector being used by a different thread of control. In C/C++, the array must
be locked, or somehow specially managed, to avoid race conditions. But in
Clojure_, an array never changes, and so you don't need to worry about some
other thread over-writing a shared vector.

Clojure_ isn't the only language with an usual implementation of arrays. `Lua
<http://www.lua.org/>`_, for instance, implements arrays (and lists!) as a
table of key as (key, value) pairs. Tables like this could be implemented as
hash tables, or trees.


Extra: Running a Clojure App with Leiningen
-------------------------------------------

Now lets look at a very neat tool that Clojure_ provides in its large set of
library functions: **unification**. Unification is an algorithm that takes two
lists as input, either of which may contain variables, and, if possible,
returns a list of assignments to the variables that makes the two list the
same. This turns out to be a surprisingly powerful and useful tool.

Clojure_ provides a unification library called `core.unify
<https://github.com/clojure/core.unify/>`_. The easiest way to get access to
it is to first install the `Leiningen package management tool
<http://leiningen.org/>`_, and then to create a new Clojure_ "app" at the
command-line like this::

    $ lein new app prop-eval

This creates a new folder called ``prop-eval`` that contains various
configuration files, including the file ``project.clj``. Inside
``project.clj`` is where we put the requirement to use ``core.unify``::

    (defproject app "1.0.0-SNAPSHOT"
      :description "FIXME: write description"
      :dependencies [[org.clojure/clojure "1.4.0"]
                     [org.clojure/core.unify "0.5.5"]   ;; core.unify dependency
                    ])

The dependency information ``[org.clojure/core.unify "0.5.5"]`` was cut-and-
pasted directly from the `core.unify
<https://github.com/clojure/core.unify/>`_ page.

Note also that the version of Clojure_ is listed as a dependency here. This
makes it easy to use whatever version of Clojure_ you want.

To use ``core.unfiy``, we now run Clojure_ through `Leiningen
<http://leiningen.org/>`_. Run ``lein repl`` in the same folder as
``project.clj``::

    $ lein repl
    ... text indicating libraries are being downloaded ...

    REPL started; server listening on localhost port 9347
    user=> 

This works just like ordinary Clojure_, but now the dependencies in
``project.clj`` are satisfied, meaning we can import ``core.unify`` like
this::

    user=> (use 'clojure.core.unify)
    nil


Unification with core.unify
---------------------------

Assuming you have set up your Clojure_ app as in the previous section, we can
now write code either in the REPL launched by ``lein repl``, or inside the
file ``prop-eval/src/app/core.clj``.

Here's how ``unify`` works::

    user=> (use 'clojure.core.unify)
    nil

    user=> (unify '?x 5)
    {?x 5}

The input to ``unify`` is always two expressions. In this case, the first
expression is ``'?x``, and the second expression is ``5``. The ``?x`` in is a
variable --- ``unify`` treats all symbols starting with ``?`` as variables.

What ``unify`` does is try to find an assignment of values to the variables in
the two expressions that make the expressions identical. In this case the
answer is easy: setting ``?x`` to 5 unifies the expressions.

If ``unify`` is able to unify the expressions, then it returns a Clojure_ map
showing the assignment of values to variables that make the expressions the
same.

Here's another example::

    user=> (unify '(?x and ?y) '(f and t))
    {?y t, ?x f}

This says that if you replace ``?y`` with ``t`` and ``?x`` with ``f``, the two
expressions passed to ``unify`` will be identical.

In this example, two variables unify with each other::

    user=> (unify '(?x 3 ?y) '(2 ?z ?w))
    {?y ?w, ?z 3, ?x 2}

The result says that ``?y`` and ``?w`` should have the same value.

Sometimes expressions can't be unified. For example::

    user=> (unify '(?x) '(a b))
    nil

No matter what you assign to ``?x``, it cannot be made to be equal to ``'(a
b)``. So we say the unification has failed.

Unification turns out to be very useful for pattern matching. For example,
consider the  ``is-and?`` function from above::

    ;; (expr and expr)
    (defn is-and? [e] 
       (and (list? e) (= 3 (count e)) (= 'and (second e))))

We could re-write it using unification like this::

    (defn is-and? [e] 
      (not (nil? (unify e '(?x and ?y))))
    )

This can be improved a little bit by writing a helper function called
``match?``::

    (defn match? [e1 e2]
        (not (nil? (unify e1 e2)))
    )

Now ``is-and?`` becomes this::

    (defn is-and? [e] (match? e '(?x and ?y)))

This is clearer and simpler than the original ``is-and?``. It's now easy to
write the other ``is-`` functions in the same style::

    (defn is-and? [e] (match? e '(?x and ?y)))
    (defn is-or? [e] (match? e '(?x or ?y)))
    (defn is-nand? [e] (match? e '(?x nand ?y)))
    (defn is-not? [e] (match? e '(not ?x)))

As this shows, unification can be quite useful in practice when you want to do
pattern matching.