More About Scheme ================== apply and eval -------------- The ``apply`` and ``eval`` functions are used to evaluate lists. Here is the usual way of evaluating expressions:: => (* 1 2 3) 6 => (+ 1 2 3) 6 => (- 1 2 3) -4 An equivalent way is to use ``apply`` like this:: => (apply * '(1 2 3)) 6 => (apply + '(1 2 3)) 6 => (apply - '(1 2 3)) -4 As you can see, ``(apply f seq)`` takes a function ``f`` and evaluates it using the items in ``seq`` as its input. It's as if ``f`` is "cons-ed" onto ``seq``, and then that expression is evaluated. The ``eval`` function takes an entire list an evaluates it:: => (eval '(* 1 2 3) user-initial-environment) 6 => (eval '(+ 1 2 3) user-initial-environment) 6 => (eval '(- 1 2 3) user-initial-environment) -4 The second parameter to ``eval`` is the *environment* in which the expression will be evaluated, i.e. all the bindings that the expression has access to. The built-in environment variable ``user-initial-environment`` is the environment used by the interpreter. ``eval`` is a very powerful function: it is essentially Scheme_ implemented in Scheme_! For instance, you can make your own version of apply like this:: => (my-apply '+ '(1 2 3)) ;Value: 6 In practice, it is usually only useful for evaluating expressions generated "on the fly". .. note: It turns out that it is possible to implement the ``eval`` function in Scheme_ itself using only a few basic Scheme_ functions. Such a function is called a **meta-circular interpreter**. The Scope of Names ------------------ A variable's scope is the range of statements over which it is visible. A **local variable** is a variable whose scope is restricted to a block of code. A **nonlocal variable** is visible outside of the block in which it was declared. Global variables are a kind of nonlocal variable that can be used anywhere in a program. Many modern languages are **statically scoped** (**lexically scoped**). This means that the scope of a variable can be determined at compile-time *before* the program runs just by examining the source code. Static scoping helps programmers to read source code and figure out its type without running it. Consider this Scheme_ code:: (define x 1) (define f (lambda (x) (g 2))) (define g (lambda (y) (+ x y))) ;; which x does this x refer to? Scheme_ is statically scoped, and so if you evaluate ``(f 5)`` the answer is 3. If Scheme_ were instead dynamically scoped, the answer would be 7. It is useful to trace through what is going on here in some detail. Before ``(f 5)`` is called, ``x`` is bound to 1 by the first ``define``. When ``(f 5)`` is called, the 5 is bound to ``x`` in the lambda expression for ``f``. Then ``(g 2)`` is called, and the 2 is bound ``y`` in the lambda expression for ``g``. So at this point, there are three bound variables: ``x`` bound to 1, ``x`` bound to 5, and ``y`` bound to 2. In ``g``\ s expression ``(+ x y)``, what value should be used for ``x``? Should it be 1, or should it be 2? Scheme_ is statically scoped, and so it decides on bindings *before* the code runs, which means it must use the ``x`` bound to 1. However, in a dynamically scoped language (notably many versions of Lisp_ before Scheme_), the most recently bound value of ``x`` is used. If Scheme_ were dynamically scoped, then ``(f 5)`` would print 7. Here's another example of static scoping, this time from JavaScript_:: function big() { function sub1() { var x = 7; // hides the x defined in big sub2(); } function sub2() { var y = x; // which x does this refer to? } var x = 3; sub1(); } Javascript_ is statically scoped, and so the ``x`` in ``sub2`` is the ``x`` with the value 3 that is defined in ``big``. If Javascript_ were dynamically scoped, then then ``x`` would refer to the most recently bound ``x`` at runtime, i.e. the ``x`` bound to 7. Dynamic scoping is an alternative to static scoping that has largely fallen out of favor. Most examples of dynamic scoping occur in older languages, such as APL, SNOBOL, and Lisp_ (at least early versions). Perl lets you optionally declare variables that follow dynamic scoping rules. The idea of dynamic scoping is that the meaning of a variable depends upon the value of the most recent variable with the same name in the current function call stack (as opposed to the enclosing block of source code). Here is one more example that shows the difference between static and dynamic scoping:: const int b = 5; int foo() { int a = b + 5; // What is b? return a; } int bar() { int b = 2; return foo(); } int main() { foo(); // returns 10 for static scoping; 10 for dynamic scoping bar(); // returns 10 for static scoping; 7 for dynamic scoping } In general, dynamic scoping is unpopular because it makes it harder to reason about the meaning of programs from their source code alone. Under dynamic scoping, you can't always tell for sure what a variable refers to until the code runs, because the order in which functions are called matters. Another problem with dynamic scoping is that it exposes the local variables of a function to other functions, thus allowing the possibility that they could be modified. One good thing about dynamic scoping is that it is easier to implement than static scoping. Closures -------- A closure combines two things: a function, and an environment of (variable, value) pairs. The function is allowed to use variables from this environment. For example, consider this code:: (define f (lambda (n) (+ n 1) ) ) => (f 5) 1 ``f`` is a function, but it's *not* a closure. But this is different:: (define make-adder (lambda (n) (lambda (x) (+ n x)) ;; n is outside of the lambda function ) ) (define g (make-adder 1)) => (g 5) 1 ``g`` is a closure that includes both a function and a binding for the variable ``n``. By itself, the function ``(lambda (x) (+ n x))`` can't be evaluated because ``n`` is free. But, in ``g``, ``n`` is bound to the value 1. The function plus this binding of ``n`` is a closure. More concretely, you can think of a closure as being similar to this:: (define g1 (let ((n 1)) (lambda (x) (+ n x) ) ) ) => (g1 5) 6 `You can see that ``g1`` not *just* a function, but a function along with some variable bindings that it needs. Any programming language that lets you to define functions inside functions must deal with the issue of what do about variables the inner function uses that are not defined inside it. Closures are a natural and useful solution to this problem. In Go_, for example, you can do this:: package main import "fmt" type intFunc func(int) int func makeAdder(n int) intFunc { result := func(x int) int { return x + n } return result } func main() { add2 := makeAdder(2) // add2 is a closure fmt.Println(add2(5)) } ``add2`` is a closure. The ``n`` inside the result function is bound to the value of ``n`` passed to ``makeAdder``, and it stays bound for the life of the ``result`` function. So even after ``makeAdder`` finishes executing, the ``n`` in the ``result`` is still there and can be used as long as ``add2`` exists. As an aside, notice how the *typing* information that Go_ requires makes the code longer, and more cluttered, than the same code in Scheme_. Scheme_ is dynamically typed, and so *doesn't* check for type information at compile time. Instead, it just runs the code and assumes the types are correct. If they aren't, then there will be a run-time error. Composing Functions ------------------- Another interesting feature of Scheme_ is the ability to easily compose functions. Recall how function composition works in mathematics. Given :math:`f(x) = x^2` and :math:`g(x) = 2x + 1`, the composition of :math:`f` and :math:`g`, denoted :math:`f \circ g`, can be calculated as :math:`f(g(x)) = g(x)^2 = (2x + 1)^2 = 4x^2 + 4x + 1`. In Scheme_, we can write these functions like this:: (define f (lambda (x) (* x x) ) ) (define g (lambda (x) (+ (* 2 x) 1) ) ) => (f 2) 4 => (g 2) 5 => (f (g 2)) 25 One way to compose functions is to write a function called ``compose`` like this:: (define compose (lambda (f g) (lambda (x) (f (g x)) ) ) ) ``compose`` takes two functions, ``f`` and ``g`` as input, and returns the function ``(lambda (x) (f (g x)))`` as output. Now we can write the composition of ``f`` and ``g`` like this:: (define fg (compose f g)) => (fg 2) 25 The ``twice`` function takes a function, ``f``, as input and returns ``f`` composed with itself, i.e. :math:`f \circ f`:: (define twice (lambda (f) (compose f f) ) ) For instance:: (define garnish (twice (lambda (x) (cons 'cheese x))) ) => (garnish '(1 2 3)) (cheese cheese 1 2 3) We can generalize this as follows:: (define compose-n (lambda (f n) (if (= n 1) f (compose f (compose-n f (- n 1))) ) ) ) => ((compose-n 1+ 5) 1) ;Value: 6 1 ]=> (define triple-cherry (compose-n (lambda (lst) (cons 'cherry lst)) 3)) ;Value: triple-cherry 1 ]=> (triple-cherry '(vanilla)) ;Value 16: (cherry cherry cherry vanilla) ``compose-n`` returns a new function that we could refer to as ``f`` to the power of ``n``, where function composition is being used instead of multiplication. Curried Functions ----------------- Here are two different ways to write the addition function:: (define add_a (lambda (x y) (+ x y) ) ) => (add_a 3 4) 7 (define add_b (lambda (x) (lambda (y) (+ x y) ) ) ) => ((add_b 3) 4) 7 ``add_a`` takes 2 inputs, while ``add_b`` takes only one input and returns a function that takes the second input (thus is must be called differently than ``add_a``). The nice thing about ``add_b`` is that if we give it only a single input ``n``, the we get a function that adds ``n`` to a number:: (define add5 (add_b 5)) ``add_b`` is an example of a **curried** function, i.e. it is a function that takes multiple inputs by processing one input at time, and returning a function that processes the rest of the inputs. In practice, it is a sometimes useful trick for creating new functions out of old ones. In Scheme_, most functions are not written in a curried style because of the awkward calling syntax, i.e. ``((add_b 3) 4)`` is not as nice as ``(add_a 3 4)``. But it is possible to write a function that converts a non-curried function into a curried one:: ;; given a 2-parameter curried function, curry2 returns a curried version (define curry2 (lambda (f) (lambda (x) (lambda (y) (f x y))))) ``curry2`` assumes ``f`` is a function that takes exactly 2 inputs. We could use it to create a new function like this:: (define add5 ((curry2 +) 5)) => (add5 3) 8 ``+`` is a pre-defined 2-argument function, and ``(curry2 +)`` is this:: (lambda (x) (lambda (y) (+ x y))) This is a curried version of ``+``. Thus ``((curry2 +) 5)`` is this:: (lambda (y) (+ 5 y)) Here's a another example. Recall that ``(filter pred lst)`` returns a new list containing just the elements of ``lst`` for which the predicate function ``pred`` returns true on:: (define keep-odds ((curry2 filter) odd?)) => (keep-odds '(1 2 3 4 5)) (1 3 5) Notice that the definition of ``keep-odds`` does not have ``lambda`` explicitly in it anywhere. That's because ``((curry2 filter) odd?))`` returns a function which is ready to accept the arguments it needs; there is no need to add any extra parameters to it. It is also possible to write an ``uncurry2`` function that takes a 2-argument curried function as input and returns a non-curried version of it:: ;; converts a 2-argument curried function to a non-curried function (define uncurry2 (lambda (f) (lambda (x y) ((f x) y)))) Pattern Matching with Unification --------------------------------- If you do a lot of Scheme_ programming, you will likely discover that you write a lot of code that splits lists up into pieces using functions like ``car``, ``cdr``, ``cadr``, ``cddr``, and so on. While this code isn't hard to write, it does get pretty tedious. An alternative to writing this sort of code by hand is to use pattern matching to more easily split up the elements of a list. Here, we will use the simple pattern matching code in the file :download:`unify.scm <_downloads/unify.scm>`. The code here implements **unification** through the use of the function ``(unify expr1 expr2)``. It is easiest to understand what this is through some examples. :download:`unify.scm <_downloads/unify.scm>` provides one main function called ``unify`` that tries that can be called like this:: => (unify 'a 'b) #f This returns ``#f`` (false) because ``'a`` and ``'b`` are not the same, i.e. they don't unify. However:: => (unify 'a 'a) () ``'a`` and ``'a`` do unify, because they are the same. ``(unify 'a 'a)`` returns the empty list, ``'()``, which might seem a bit odd. It will make more sense after seeing some more examples. Symbols that begin with a ``?`` are variables from the point of view of ``unify``, and they are can be bound to other expressions. For example:: => (unify '?x 'a) ((?x . a)) The return value is ``((?x . a))``, which is list of pairs (i.e. an association list), where each pair is a variable and its value. ``unify`` promises that these if you replace the variables in the input expressions with their corresponds values, then the two expressions will be in the same. In other words, ``(unify expr1 expr2)`` returns a list of variable/value pairs that, when the variables in ``expr1`` and ``expr2`` are replaced by their values, the resulting expressions will be the same. More examples should make this clear:: => (unify '(?x b c) '(a b c)) ((?x . a)) => (unify '(?x b c) '(a b ?y)) ((?y . c) (?x . a)) => (unify '(?x ?x ?x) '(?y 2 ?y)) ((?y . 2) (?x . ?y)) => (unify '(?left ?op ?right) '(5 + x)) ((?right . x) (?op . +) (?left . 5)) => (unify '(?left ?op ?right) '((2 * a) - (16 / 4))) ((?right 16 / 4) (?op . -) (?left 2 * a)) If there are no assignments of values to the variables in an expression that will make them equal, then ``#f`` is returned:: => (unify '(?s 1) '(2 ?s)) #f => (unify '(?s ?t) '(?s ?t ?t)) #f If you try to unify two expressions that are the same and have no variables, or the variables can be any value, ``unify`` returns an empty list:: => (unify '(2 (a b)) '(2 (a b))) () => (unify '(?x ?x) '(?x ?x)) () The practical value of unification is that it's often a handy way to chop of lists into smaller parts, thus resulting in code that is shorter and easier to read. For example, suppose we want to check that the top-level of a list is a propositional expression:: (define make-matcher (lambda (e1) (lambda (e2) (if (unify e1 e2) #t #f ) ) ) ) (define is-and? (make-matcher '(?x and ?y))) (define is-or? (make-matcher '(?x or ?y))) (define is-not? (make-matcher '(not ?x))) The ``is-and?``, ``is-or?``, and ``is-not?`` functions are quite short and easy to understand, which is always a good thing in source code. How Scheme Implements Lists --------------------------- Lists in Scheme_ are implemented as traditional singly-linked lists. Each item in a list is represented by a `cons cell `_, i.e. an object that contains two pointers, one to the first element of the list, and one to the rest of the list. The first pointer is called the **car**, and the second pointer is called the **cdr**. A new cons cells is created using ``cons``, e.g.:: => (cons 2 3) (2 . 3) In Scheme_, ``(2 . 3)`` is a **pair**, where 2 is the car of the pair, and 3 is the cdr of the pair. In a **proper list**, the cdr of every cons cell must point to a list, and the last cdr must be the empty list. For example, the list ``(1 2 3)`` is short-hand for the pair ``(1 . (2 . (3 . ())))``. The car of this pair points to 1, and the cdr points to the pair ``(2 . (3 . ()))``. The ``pair?`` function tests if an object is a pair, e.g.:: => (pair? (cons 2 3)) #t => (pair? '(1 2)) #t => (pair? '()) #f => (pair? 'pear) #f => (pair? '(1 . (2 . (3 . ())))) #t See :download:`cons cell box diagrams <_downloads/consCellNotes.pdf>` for examples of how to draw diagrams of pairs and lists.