More About Scheme
==================

apply and eval
--------------

The ``apply`` and ``eval`` functions are used to evaluate lists. Here is the
usual way of evaluating expressions::

    => (* 1 2 3)
    6
    => (+ 1 2 3)
    6
    => (- 1 2 3)
    -4

An equivalent way is to use ``apply`` like this::

    => (apply * '(1 2 3))
    6
    => (apply + '(1 2 3))
    6
    => (apply - '(1 2 3))
    -4

As you can see, ``(apply f seq)`` takes a function ``f`` and evaluates it
using the items in ``seq`` as its input. It's as if ``f`` is "cons-ed"  onto
``seq``, and then that expression is evaluated.

The ``eval`` function takes an entire list an evaluates it::

    => (eval '(* 1 2 3) user-initial-environment)
    6
    => (eval '(+ 1 2 3) user-initial-environment)
    6
    => (eval '(- 1 2 3) user-initial-environment)
    -4

The second parameter to ``eval`` is the *environment* in which the expression
will be evaluated, i.e. all the bindings that the expression has access to.
The built-in environment variable ``user-initial-environment`` is the
environment used by the interpreter.

``eval`` is a very powerful function: it is essentially Scheme_ implemented in
Scheme_! For instance, you can make your own version of apply like this::

    => (my-apply '+ '(1 2 3))

    ;Value: 6

In practice, it is usually only useful for evaluating expressions
generated "on the fly".

.. note: It turns out that it is possible to implement the ``eval`` function
   in Scheme_ itself using only a few basic Scheme_ functions. Such a function
   is called a **meta-circular interpreter**.


The Scope of Names
------------------

A variable's scope is the range of statements over which it is visible. A
**local variable** is a variable whose scope is restricted to a block of code.
A **nonlocal variable** is visible outside of the block in which it was
declared. Global variables are a kind of nonlocal variable that can be used
anywhere in a program.

Many modern languages are **statically scoped** (**lexically scoped**). This
means that the scope of a variable can be determined at compile-time *before*
the program runs just by examining the source code. Static scoping helps
programmers to read source code and figure out its type without running it.

Consider this Scheme_ code::

    (define x 1)
    (define f (lambda (x) (g 2)))
    (define g (lambda (y) (+ x y)))  ;; which x does this x refer to?

Scheme_ is statically scoped, and so if you evaluate ``(f 5)`` the answer is
3. If Scheme_ were instead dynamically scoped, the answer would be 7.

It is useful to trace through what is going on here in some detail. Before
``(f 5)`` is called, ``x`` is bound to 1 by the first ``define``. When ``(f
5)`` is called, the 5 is bound to ``x`` in the lambda expression for ``f``.
Then ``(g 2)`` is called, and the 2 is bound ``y`` in the lambda expression
for ``g``. So at this point, there are three bound variables: ``x`` bound to
1, ``x`` bound to 5, and ``y`` bound to 2. In ``g``\ s expression ``(+ x y)``,
what value should be used for ``x``? Should it be 1, or should it be 2?
Scheme_ is statically scoped, and so it decides on bindings *before* the code
runs, which means it must use the ``x`` bound to 1. However, in a dynamically
scoped language (notably many versions of Lisp_ before Scheme_), the most
recently bound value of ``x`` is used. If Scheme_ were dynamically scoped,
then ``(f 5)`` would print 7.

Here's another example of static scoping, this time from JavaScript_::

    function big() {
        function sub1() {
            var x = 7;   // hides the x defined in big
            sub2();
        }

        function sub2() {
            var y = x;      // which x does this refer to?
        }
        var x = 3;
        sub1();
    }

Javascript_ is statically scoped, and so the ``x`` in ``sub2`` is the ``x``
with the value 3 that is defined in ``big``. If Javascript_ were dynamically
scoped, then then ``x`` would refer to the most recently bound ``x`` at
runtime, i.e. the ``x`` bound to 7.

Dynamic scoping is an alternative to static scoping that has largely fallen
out of favor. Most examples of dynamic scoping occur in older languages, such
as APL, SNOBOL, and Lisp_ (at least early versions). Perl lets you optionally
declare variables that follow dynamic scoping rules.

The idea of dynamic scoping is that the meaning of a variable depends upon the
value of the most recent variable with the same name in the current function
call stack (as opposed to the enclosing block of source code).

Here is one more example that shows the difference between static and dynamic
scoping::

    const int b = 5;    

    int foo()
    {
       int a = b + 5;  // What is b?
       return a;
    }
     
    int bar()
    {
       int b = 2;
       return foo();
    }
     
    int main()
    {
       foo(); // returns 10 for static scoping; 10 for dynamic scoping
       bar(); // returns 10 for static scoping; 7 for dynamic scoping
    }

In general, dynamic scoping is unpopular because it makes it harder to reason
about the meaning of programs from their source code alone. Under dynamic
scoping, you can't always tell for sure what a variable refers to until the
code runs, because the order in which functions are called matters.

Another problem with dynamic scoping is that it exposes the local variables of
a function to other functions, thus allowing the possibility that they could
be modified.

One good thing about dynamic scoping is that it is easier to implement than
static scoping.


Closures
--------

A closure combines two things: a function, and an environment of (variable,
value) pairs. The function is allowed to use variables from this environment.

For example, consider this code::

    (define f
        (lambda (n)
            (+ n 1)
        )
    )

    => (f 5)
    1

``f`` is a function, but it's *not* a closure. But this is different::

    (define make-adder
        (lambda (n)
            (lambda (x) (+ n x))  ;; n is outside of the lambda function
        )
    )

    (define g (make-adder 1))

    => (g 5)
    1

``g`` is a closure that includes both a function and a binding for the
variable ``n``. By itself, the function ``(lambda (x) (+ n x))`` can't be
evaluated because ``n`` is free. But, in ``g``, ``n`` is bound to the value 1.
The function plus this binding of ``n`` is a closure.

More concretely, you can think of a closure as being similar to this::

    (define g1
        (let ((n 1))
            (lambda (x)
                (+ n x)
            )
        )
    )

    => (g1 5)
    6

`You can see that ``g1`` not *just* a function, but a function along with some
variable bindings that it needs.

Any programming language that lets you to define functions inside functions
must deal with the issue of what do about variables the inner function uses
that are not defined inside it. Closures are a natural and useful solution to
this problem.

In Go_, for example, you can do this::

    package main

    import "fmt"

    type intFunc func(int) int

    func makeAdder(n int) intFunc {
        result := func(x int) int { return x + n }
        return result
    }

    func main() {
        add2 := makeAdder(2)   // add2 is a closure

        fmt.Println(add2(5))
    }

``add2`` is a closure. The ``n`` inside the result function is bound to the
value of ``n`` passed to ``makeAdder``, and it stays bound for the life of the
``result`` function. So even after ``makeAdder`` finishes executing, the ``n``
in the ``result`` is still there and can be used as long as ``add2`` exists.

As an aside, notice how the *typing* information that Go_ requires makes the
code longer, and more cluttered, than the same code in Scheme_. Scheme_ is
dynamically typed, and so *doesn't* check for type information at compile
time. Instead, it just runs the code and assumes the types are correct. If
they aren't, then there will be a run-time error.


Composing Functions
-------------------

Another interesting feature of Scheme_ is the ability to easily compose
functions. Recall how function composition works in mathematics. Given
:math:`f(x) = x^2` and :math:`g(x) = 2x + 1`, the composition of
:math:`f` and :math:`g`, denoted :math:`f \circ g`, can be calculated as
:math:`f(g(x)) = g(x)^2 = (2x + 1)^2 = 4x^2 + 4x + 1`.

In Scheme_, we can write these functions like this::

    (define f
        (lambda (x)
            (* x x)
        )
    )

    (define g
        (lambda (x)
            (+ (* 2 x) 1)
        )
    )

    => (f 2)
    4
    => (g 2)
    5
    => (f (g 2))
    25

One way to compose functions is to write a function called ``compose`` like
this::

    (define compose
        (lambda (f g)
            (lambda (x)
                (f (g x))
            )
        )
    )

``compose`` takes two functions, ``f`` and ``g`` as input, and returns the
function ``(lambda (x) (f (g x)))`` as output.

Now we can write the composition of ``f`` and ``g`` like this::

    (define fg (compose f g))

    => (fg 2)
    25

The ``twice`` function takes a function, ``f``, as input and returns ``f``
composed with itself, i.e. :math:`f \circ f`::

    (define twice
        (lambda (f)
            (compose f f)
        )
    )

For instance::

    (define garnish 
        (twice (lambda (x) (cons 'cheese x)))
    )

    => (garnish '(1 2 3))
    (cheese cheese 1 2 3)

We can generalize this as follows::

    (define compose-n
        (lambda (f n)
            (if (= n 1)
                f
                (compose f (compose-n f (- n 1)))
            )
        )
    )

    => ((compose-n 1+ 5) 1)    

    ;Value: 6

    1 ]=> (define triple-cherry (compose-n (lambda (lst) (cons 'cherry lst)) 3))

    ;Value: triple-cherry

    1 ]=> (triple-cherry '(vanilla))

    ;Value 16: (cherry cherry cherry vanilla)

``compose-n`` returns a new function that we could refer to as ``f`` to the
power of ``n``, where function composition is being used instead of
multiplication.


Curried Functions
-----------------

Here are two different ways to write the addition function::

    (define add_a
        (lambda (x y)
            (+ x y)
        )
    )

    => (add_a 3 4)
    7

    (define add_b
        (lambda (x)
            (lambda (y)
                (+ x y)
            )
        )
    )

    => ((add_b 3) 4)
    7

``add_a`` takes 2 inputs, while ``add_b`` takes only one input and returns a
function that takes the second input (thus is must be called differently than
``add_a``).

The nice thing about ``add_b`` is that if we give it only a single input
``n``, the we get a function that adds ``n`` to a number::

    (define add5 (add_b 5))

``add_b`` is an example of a **curried** function, i.e. it is a function that
takes multiple inputs by processing one input at time, and returning a
function that processes the rest of the inputs. In practice, it is a sometimes
useful trick for creating new functions out of old ones.

In Scheme_, most functions are not written in a curried style because of the
awkward calling syntax, i.e. ``((add_b 3) 4)`` is not as nice as ``(add_a 3
4)``. But it is possible to write a function that converts a non-curried
function into a curried one::

    ;; given a 2-parameter curried function, curry2 returns a curried version
    (define curry2
        (lambda (f)
          (lambda (x)
            (lambda (y)
              (f x y)))))

``curry2`` assumes ``f`` is a function that takes exactly 2 inputs. We could
use it to create a new function like this::

    (define add5 ((curry2 +) 5))

    => (add5 3)
    8

``+`` is a pre-defined 2-argument function, and ``(curry2 +)`` is this::

  (lambda (x)
    (lambda (y)
      (+ x y)))

This is a curried version of ``+``. Thus ``((curry2 +) 5)`` is this::

    (lambda (y)
      (+ 5 y))

Here's a another example. Recall that ``(filter pred lst)`` returns a new list
containing just the elements of ``lst`` for which the predicate function
``pred`` returns true on::

    (define keep-odds ((curry2 filter) odd?))

    => (keep-odds '(1 2 3 4 5))
    (1 3 5)

Notice that the definition of ``keep-odds`` does not have ``lambda``
explicitly in it anywhere. That's because ``((curry2 filter) odd?))`` returns
a function which is ready to accept the arguments it needs; there is no need
to add any extra parameters to it.

It is also possible to write an ``uncurry2`` function that takes a 2-argument
curried function as input and returns a non-curried version of it::

    ;; converts a 2-argument curried function to a non-curried function
    (define uncurry2
        (lambda (f)
          (lambda (x y)
            ((f x) y))))


Pattern Matching with Unification
---------------------------------

If you do a lot of Scheme_ programming, you will likely discover that you
write a lot of code that splits lists up into pieces using functions like
``car``, ``cdr``, ``cadr``, ``cddr``, and so on. While this code isn't hard to
write, it does get pretty tedious.

An alternative to writing this sort of code by hand is to use pattern matching
to more easily split up the elements of a list. Here, we will use the simple
pattern matching code in the file :download:`unify.scm
<_downloads/unify.scm>`. The code here implements **unification** through the
use of the function ``(unify expr1 expr2)``. It is easiest to understand what
this is through some examples.

:download:`unify.scm <_downloads/unify.scm>` provides one main function called
``unify`` that tries that can be called like this::

    => (unify 'a 'b)
    #f

This returns ``#f`` (false) because ``'a`` and ``'b`` are not the same, i.e.
they don't unify. However::

    => (unify 'a 'a)
    ()

``'a`` and ``'a`` do unify, because they are the same. ``(unify 'a 'a)``
returns the empty list, ``'()``, which might seem a bit odd. It will make more
sense after seeing some more examples.

Symbols that begin with a ``?`` are variables from the point of view of
``unify``, and they are can be bound to other expressions. For example::

    => (unify '?x 'a)
    ((?x . a))

The return value is ``((?x . a))``, which is list of pairs (i.e. an
association list), where each pair is a variable and its value. ``unify``
promises that these if you replace the variables in the input expressions with
their corresponds values, then the two expressions will be in the same.

In other words, ``(unify expr1 expr2)`` returns a list of variable/value pairs
that, when the variables in ``expr1`` and ``expr2`` are replaced by their
values, the resulting expressions will be the same.

More examples should make this clear::

    => (unify '(?x b c) '(a b c))
    ((?x . a))

    => (unify '(?x b c) '(a b ?y))
    ((?y . c) (?x . a))

    => (unify '(?x ?x ?x) '(?y 2 ?y))
    ((?y . 2) (?x . ?y))

    => (unify '(?left ?op ?right) '(5 + x))
    ((?right . x) (?op . +) (?left . 5))

    => (unify '(?left ?op ?right) '((2 * a) - (16 / 4)))
    ((?right 16 / 4) (?op . -) (?left 2 * a))

If there are no assignments of values to the variables in an expression that
will make them equal, then ``#f`` is returned::

    => (unify '(?s 1) '(2 ?s))
    #f

    => (unify '(?s ?t) '(?s ?t ?t))
    #f

If you try to unify two expressions that are the same and have no variables,
or the variables can be any value, ``unify`` returns an empty list::

    => (unify '(2 (a b)) '(2 (a b)))
    ()

    => (unify '(?x ?x) '(?x ?x))
    ()

The practical value of unification is that it's often a handy way to chop of
lists into smaller parts, thus resulting in code that is shorter and easier to
read.

For example, suppose we want to check that the top-level of a list is a
propositional expression::

    (define make-matcher
        (lambda (e1)
            (lambda (e2)
                (if (unify e1 e2)
                    #t
                    #f
                )
            )
        )
    )

    (define is-and? (make-matcher '(?x and ?y)))
    (define is-or? (make-matcher '(?x or ?y)))
    (define is-not? (make-matcher '(not ?x)))

The ``is-and?``, ``is-or?``, and ``is-not?`` functions are quite short and
easy to understand, which is always a good thing in source code.


How Scheme Implements Lists
---------------------------

Lists in Scheme_ are implemented as traditional singly-linked lists. Each item
in a list is represented by a `cons cell
<https://en.wikipedia.org/wiki/Cons>`_, i.e. an object that contains two
pointers, one to the first element of the list, and one to the rest of the
list. The first pointer is called the **car**, and the second pointer is
called the **cdr**.

A new cons cells is created using ``cons``, e.g.::

    => (cons 2 3)
    (2 . 3)

In Scheme_, ``(2 . 3)`` is a **pair**, where 2 is the car of the pair, and 3
is the cdr of the pair. In a **proper list**, the cdr of every cons cell must
point to a list, and the last cdr must be the empty list. For example, the
list ``(1 2 3)`` is short-hand for the pair ``(1 . (2 . (3 . ())))``. The car
of this pair points to 1, and the cdr points to the pair ``(2 . (3 . ()))``.

The ``pair?`` function tests if an object is a pair, e.g.::

    => (pair? (cons 2 3))
    #t

    => (pair? '(1 2))
    #t

    => (pair? '())
    #f

    => (pair? 'pear)
    #f

    => (pair? '(1 . (2 . (3 . ()))))
    #t

See :download:`cons cell box diagrams <_downloads/consCellNotes.pdf>` for
examples of how to draw diagrams of pairs and lists.