More About Scheme

apply and eval

The apply and eval functions are used to evaluate lists. Here is the usual way of evaluating expressions:

=> (* 1 2 3)
6
=> (+ 1 2 3)
6
=> (- 1 2 3)
-4

An equivalent way is to use apply like this:

=> (apply * '(1 2 3))
6
=> (apply + '(1 2 3))
6
=> (apply - '(1 2 3))
-4

As you can see, (apply f seq) takes a function f and evaluates it using the items in seq as its input. It’s as if f is “cons-ed” onto seq, and then that expression is evaluated.

The eval function takes an entire list an evaluates it:

=> (eval '(* 1 2 3) user-initial-environment)
6
=> (eval '(+ 1 2 3) user-initial-environment)
6
=> (eval '(- 1 2 3) user-initial-environment)
-4

The second parameter to eval is the environment in which the expression will be evaluated, i.e. all the bindings that the expression has access to. The built-in environment variable user-initial-environment is the environment used by the interpreter.

eval is a very powerful function: it is essentially Scheme implemented in Scheme! For instance, you can make your own version of apply like this:

=> (my-apply '+ '(1 2 3))

;Value: 6

In practice, it is usually only useful for evaluating expressions generated “on the fly”.

The Scope of Names

A variable’s scope is the range of statements over which it is visible. A local variable is a variable whose scope is restricted to a block of code. A nonlocal variable is visible outside of the block in which it was declared. Global variables are a kind of nonlocal variable that can be used anywhere in a program.

Many modern languages are statically scoped (lexically scoped). This means that the scope of a variable can be determined at compile-time before the program runs just by examining the source code. Static scoping helps programmers to read source code and figure out its type without running it.

Consider this Scheme code:

(define x 1)
(define f (lambda (x) (g 2)))
(define g (lambda (y) (+ x y)))  ;; which x does this x refer to?

Scheme is statically scoped, and so if you evaluate (f 5) the answer is 3. If Scheme were instead dynamically scoped, the answer would be 7.

It is useful to trace through what is going on here in some detail. Before (f 5) is called, x is bound to 1 by the first define. When (f 5) is called, the 5 is bound to x in the lambda expression for f. Then (g 2) is called, and the 2 is bound y in the lambda expression for g. So at this point, there are three bound variables: x bound to 1, x bound to 5, and y bound to 2. In gs expression (+ x y), what value should be used for x? Should it be 1, or should it be 2? Scheme is statically scoped, and so it decides on bindings before the code runs, which means it must use the x bound to 1. However, in a dynamically scoped language (notably many versions of Lisp before Scheme), the most recently bound value of x is used. If Scheme were dynamically scoped, then (f 5) would print 7.

Here’s another example of static scoping, this time from JavaScript:

function big() {
    function sub1() {
        var x = 7;   // hides the x defined in big
        sub2();
    }

    function sub2() {
        var y = x;      // which x does this refer to?
    }
    var x = 3;
    sub1();
}

Javascript is statically scoped, and so the x in sub2 is the x with the value 3 that is defined in big. If Javascript were dynamically scoped, then then x would refer to the most recently bound x at runtime, i.e. the x bound to 7.

Dynamic scoping is an alternative to static scoping that has largely fallen out of favor. Most examples of dynamic scoping occur in older languages, such as APL, SNOBOL, and Lisp (at least early versions). Perl lets you optionally declare variables that follow dynamic scoping rules.

The idea of dynamic scoping is that the meaning of a variable depends upon the value of the most recent variable with the same name in the current function call stack (as opposed to the enclosing block of source code).

Here is one more example that shows the difference between static and dynamic scoping:

const int b = 5;

int foo()
{
   int a = b + 5;  // What is b?
   return a;
}

int bar()
{
   int b = 2;
   return foo();
}

int main()
{
   foo(); // returns 10 for static scoping; 10 for dynamic scoping
   bar(); // returns 10 for static scoping; 7 for dynamic scoping
}

In general, dynamic scoping is unpopular because it makes it harder to reason about the meaning of programs from their source code alone. Under dynamic scoping, you can’t always tell for sure what a variable refers to until the code runs, because the order in which functions are called matters.

Another problem with dynamic scoping is that it exposes the local variables of a function to other functions, thus allowing the possibility that they could be modified.

One good thing about dynamic scoping is that it is easier to implement than static scoping.

Closures

A closure combines two things: a function, and an environment of (variable, value) pairs. The function is allowed to use variables from this environment.

For example, consider this code:

(define f
    (lambda (n)
        (+ n 1)
    )
)

=> (f 5)
1

f is a function, but it’s not a closure. But this is different:

(define make-adder
    (lambda (n)
        (lambda (x) (+ n x))  ;; n is outside of the lambda function
    )
)

(define g (make-adder 1))

=> (g 5)
1

g is a closure that includes both a function and a binding for the variable n. By itself, the function (lambda (x) (+ n x)) can’t be evaluated because n is free. But, in g, n is bound to the value 1. The function plus this binding of n is a closure.

More concretely, you can think of a closure as being similar to this:

(define g1
    (let ((n 1))
        (lambda (x)
            (+ n x)
        )
    )
)

=> (g1 5)
6

You can see that ``g1` not just a function, but a function along with some variable bindings that it needs.

Any programming language that lets you to define functions inside functions must deal with the issue of what do about variables the inner function uses that are not defined inside it. Closures are a natural and useful solution to this problem.

In Go, for example, you can do this:

package main

import "fmt"

type intFunc func(int) int

func makeAdder(n int) intFunc {
    result := func(x int) int { return x + n }
    return result
}

func main() {
    add2 := makeAdder(2)   // add2 is a closure

    fmt.Println(add2(5))
}

add2 is a closure. The n inside the result function is bound to the value of n passed to makeAdder, and it stays bound for the life of the result function. So even after makeAdder finishes executing, the n in the result is still there and can be used as long as add2 exists.

As an aside, notice how the typing information that Go requires makes the code longer, and more cluttered, than the same code in Scheme. Scheme is dynamically typed, and so doesn’t check for type information at compile time. Instead, it just runs the code and assumes the types are correct. If they aren’t, then there will be a run-time error.

Composing Functions

Another interesting feature of Scheme is the ability to easily compose functions. Recall how function composition works in mathematics. Given \(f(x) = x^2\) and \(g(x) = 2x + 1\), the composition of \(f\) and \(g\), denoted \(f \circ g\), can be calculated as \(f(g(x)) = g(x)^2 = (2x + 1)^2 = 4x^2 + 4x + 1\).

In Scheme, we can write these functions like this:

(define f
    (lambda (x)
        (* x x)
    )
)

(define g
    (lambda (x)
        (+ (* 2 x) 1)
    )
)

=> (f 2)
4
=> (g 2)
5
=> (f (g 2))
25

One way to compose functions is to write a function called compose like this:

(define compose
    (lambda (f g)
        (lambda (x)
            (f (g x))
        )
    )
)

compose takes two functions, f and g as input, and returns the function (lambda (x) (f (g x))) as output.

Now we can write the composition of f and g like this:

(define fg (compose f g))

=> (fg 2)
25

The twice function takes a function, f, as input and returns f composed with itself, i.e. \(f \circ f\):

(define twice
    (lambda (f)
        (compose f f)
    )
)

For instance:

(define garnish
    (twice (lambda (x) (cons 'cheese x)))
)

=> (garnish '(1 2 3))
(cheese cheese 1 2 3)

We can generalize this as follows:

(define compose-n
    (lambda (f n)
        (if (= n 1)
            f
            (compose f (compose-n f (- n 1)))
        )
    )
)

=> ((compose-n 1+ 5) 1)

;Value: 6

1 ]=> (define triple-cherry (compose-n (lambda (lst) (cons 'cherry lst)) 3))

;Value: triple-cherry

1 ]=> (triple-cherry '(vanilla))

;Value 16: (cherry cherry cherry vanilla)

compose-n returns a new function that we could refer to as f to the power of n, where function composition is being used instead of multiplication.

Curried Functions

Here are two different ways to write the addition function:

(define add_a
    (lambda (x y)
        (+ x y)
    )
)

=> (add_a 3 4)
7

(define add_b
    (lambda (x)
        (lambda (y)
            (+ x y)
        )
    )
)

=> ((add_b 3) 4)
7

add_a takes 2 inputs, while add_b takes only one input and returns a function that takes the second input (thus is must be called differently than add_a).

The nice thing about add_b is that if we give it only a single input n, the we get a function that adds n to a number:

(define add5 (add_b 5))

add_b is an example of a curried function, i.e. it is a function that takes multiple inputs by processing one input at time, and returning a function that processes the rest of the inputs. In practice, it is a sometimes useful trick for creating new functions out of old ones.

In Scheme, most functions are not written in a curried style because of the awkward calling syntax, i.e. ((add_b 3) 4) is not as nice as (add_a 3 4). But it is possible to write a function that converts a non-curried function into a curried one:

;; given a 2-parameter curried function, curry2 returns a curried version
(define curry2
    (lambda (f)
      (lambda (x)
        (lambda (y)
          (f x y)))))

curry2 assumes f is a function that takes exactly 2 inputs. We could use it to create a new function like this:

(define add5 ((curry2 +) 5))

=> (add5 3)
8

+ is a pre-defined 2-argument function, and (curry2 +) is this:

(lambda (x)
  (lambda (y)
    (+ x y)))

This is a curried version of +. Thus ((curry2 +) 5) is this:

(lambda (y)
  (+ 5 y))

Here’s a another example. Recall that (filter pred lst) returns a new list containing just the elements of lst for which the predicate function pred returns true on:

(define keep-odds ((curry2 filter) odd?))

=> (keep-odds '(1 2 3 4 5))
(1 3 5)

Notice that the definition of keep-odds does not have lambda explicitly in it anywhere. That’s because ((curry2 filter) odd?)) returns a function which is ready to accept the arguments it needs; there is no need to add any extra parameters to it.

It is also possible to write an uncurry2 function that takes a 2-argument curried function as input and returns a non-curried version of it:

;; converts a 2-argument curried function to a non-curried function
(define uncurry2
    (lambda (f)
      (lambda (x y)
        ((f x) y))))

Pattern Matching with Unification

If you do a lot of Scheme programming, you will likely discover that you write a lot of code that splits lists up into pieces using functions like car, cdr, cadr, cddr, and so on. While this code isn’t hard to write, it does get pretty tedious.

An alternative to writing this sort of code by hand is to use pattern matching to more easily split up the elements of a list. Here, we will use the simple pattern matching code in the file unify.scm. The code here implements unification through the use of the function (unify expr1 expr2). It is easiest to understand what this is through some examples.

unify.scm provides one main function called unify that tries that can be called like this:

=> (unify 'a 'b)
#f

This returns #f (false) because 'a and 'b are not the same, i.e. they don’t unify. However:

=> (unify 'a 'a)
()

'a and 'a do unify, because they are the same. (unify 'a 'a) returns the empty list, '(), which might seem a bit odd. It will make more sense after seeing some more examples.

Symbols that begin with a ? are variables from the point of view of unify, and they are can be bound to other expressions. For example:

=> (unify '?x 'a)
((?x . a))

The return value is ((?x . a)), which is list of pairs (i.e. an association list), where each pair is a variable and its value. unify promises that these if you replace the variables in the input expressions with their corresponds values, then the two expressions will be in the same.

In other words, (unify expr1 expr2) returns a list of variable/value pairs that, when the variables in expr1 and expr2 are replaced by their values, the resulting expressions will be the same.

More examples should make this clear:

=> (unify '(?x b c) '(a b c))
((?x . a))

=> (unify '(?x b c) '(a b ?y))
((?y . c) (?x . a))

=> (unify '(?x ?x ?x) '(?y 2 ?y))
((?y . 2) (?x . ?y))

=> (unify '(?left ?op ?right) '(5 + x))
((?right . x) (?op . +) (?left . 5))

=> (unify '(?left ?op ?right) '((2 * a) - (16 / 4)))
((?right 16 / 4) (?op . -) (?left 2 * a))

If there are no assignments of values to the variables in an expression that will make them equal, then #f is returned:

=> (unify '(?s 1) '(2 ?s))
#f

=> (unify '(?s ?t) '(?s ?t ?t))
#f

If you try to unify two expressions that are the same and have no variables, or the variables can be any value, unify returns an empty list:

=> (unify '(2 (a b)) '(2 (a b)))
()

=> (unify '(?x ?x) '(?x ?x))
()

The practical value of unification is that it’s often a handy way to chop of lists into smaller parts, thus resulting in code that is shorter and easier to read.

For example, suppose we want to check that the top-level of a list is a propositional expression:

(define make-matcher
    (lambda (e1)
        (lambda (e2)
            (if (unify e1 e2)
                #t
                #f
            )
        )
    )
)

(define is-and? (make-matcher '(?x and ?y)))
(define is-or? (make-matcher '(?x or ?y)))
(define is-not? (make-matcher '(not ?x)))

The is-and?, is-or?, and is-not? functions are quite short and easy to understand, which is always a good thing in source code.

How Scheme Implements Lists

Lists in Scheme are implemented as traditional singly-linked lists. Each item in a list is represented by a cons cell, i.e. an object that contains two pointers, one to the first element of the list, and one to the rest of the list. The first pointer is called the car, and the second pointer is called the cdr.

A new cons cells is created using cons, e.g.:

=> (cons 2 3)
(2 . 3)

In Scheme, (2 . 3) is a pair, where 2 is the car of the pair, and 3 is the cdr of the pair. In a proper list, the cdr of every cons cell must point to a list, and the last cdr must be the empty list. For example, the list (1 2 3) is short-hand for the pair (1 . (2 . (3 . ()))). The car of this pair points to 1, and the cdr points to the pair (2 . (3 . ())).

The pair? function tests if an object is a pair, e.g.:

=> (pair? (cons 2 3))
#t

=> (pair? '(1 2))
#t

=> (pair? '())
#f

=> (pair? 'pear)
#f

=> (pair? '(1 . (2 . (3 . ()))))
#t

See cons cell box diagrams for examples of how to draw diagrams of pairs and lists.