Binding, Scope, and Functions

Binding Time of Names

names in programming languages are bound to values, and this binding can occur at many different times

e.g. the name for was bound to the idea of a loop in Go at language design-time

e.g. compiler-specific extensions of a language (such as special I/O functions), occur at language implementation time

e.g. double x = 5.3;, x is bound to 5.3 at program-writing time

e.g. in x + y the compiler may be able to determine that + means string concatenation (because x and y are strings); this is an example of compile-time binding, also known as static binding

e.g. names that refer to the code in other, separately compiled modules, are bound at link time

e.g. in a.print(), the code associated with print might not be known until run-time; this is an example of run-time binding, also known as dynamic binding

Binding Time of Names

the distinction between compile-time and run-time binding is a key to a lot of object-oriented programming

consider this example of Go code:

// printer.go

package main

import (
    "fmt"
)

type Printer interface {
    print()
}

////////////////////////////////////////////////////////////////

type Point struct {
    x, y float64
}

func (p Point) print() {
    fmt.Printf("(%v, %v)\n", p.x, p.y)
}

func (p Point) println() {   // not in Printer interface
    p.print()
    fmt.Println()
}

////////////////////////////////////////////////////////////////

type Person struct {
    name string
    age  int
}

func (p Person) print() {
    fmt.Printf("%v, age %v\n", p.name, p.age)
}

func (p Person) println() {  // not in Printer interface
    p.print()
    fmt.Println()
}

func main() {
    var x Printer

    fmt.Println(x) // prints: <nil>

    s := ""
    fmt.Scanf("%s", &s)
    if len(s)%2 == 0 {
        x = Point{3, 4}
    } else {
        x = Person{"Bob", 20}
    }

    x.print() // which print function is called?

    x.println()
} // main

there’s no way to tell at compile time which of the two possible print() methods is called by x.print()

we need to know the type of the value in x, i.e. whether it is a Point or a Person

and in this case it’s impossible to know this until the program runs

even though we can’t be sure which print() function is called, we can be sure there’s no type error because whatever the value in x we know it must implement the Printer interface, and so the value in x must have a print() method

another thing to notice is that only methods listed in the Printer interface can be called on x

in this example, both Person and Point have a println method, but x.println is an error because println is not in Printer

we can see that, in this program, calling x.println() would be fine, but Go decides what to do only by looking at the Printer interface; if we were to later add a new type that implemented a print() method but not a println() method, then x.println() would an error

Three Basic Kinds of Allocation

static

stack

dynamic (heap)

Static Allocation

global variables, constants

special data structures created by the compiler, e.g. symbol table with the names of variables (for debugging)

size of the static allocation region never changes

must be known at compile time

cannot support recursion

Stack Allocation

stack data structure

when a function is called, parameters and any local variables are pushed onto the stack in a “stack frame”

return address is also pushed so program knows where to continue execution after the function ends

when the function ends, the corresponding stack frame is popped (and a return value can be put on the stack)

size of the stack shrinks and grows over time

allows for local variables that are automatically allocated when the function starts, and automatically de-allocated when the function ends

can handle recursive calls

problem: stack-based memory only exists while a function is active, and it cannot handle data that you want to exist between function calls

Dynamic (Heap) Memory

a heap is a region of memory where blocks of memory can be be allocated and de-allocated as needed (this is not the heap data structure used for implementing priority queues!)

often, an explicit command is needed to allocate heap memory, e.g.

string* p = new string; // C++

allocated memory stays allocated until the program ends, or it is explicitly de-allocated

two basic de-allocation techniques: manual and automatic

manual de-allocation is used in C/C++, e.g.:

delete p;   // the memory p points to is de-allocated

automatic de-allocation is called garbage collection, i.e. blocks are de- allocated automatically at run-time by a special process that keeps track of which blocks of heap memory are still accessible

Dynamic (Heap) Memory

in practice, manual de-allocation of heap memory is useful when you need to write high-performance programs that carefully manage their memory (e.g. think operating systems)

but experience shows that manual de-allocation is extremely difficult to do 100% correctly

in C/C++, if you de-allocate heap memory that is pointed to by a pointer, then the pointer refers to memory that is no longer valid, and this is called a dangling pointer

using the memory the pointer refers to can corrupt your programs memory and cause strange bugs that are difficult to discover

if you don’t de-allocate heap memory after you are finished using it, then your program uses more memory then it needs

this might be fine if the computer has lots of memory

but, over time, allocated but unused blocks can build up, using more and more memory

this is called a memory leak

over time, a memory leak could use up so much memory that your program’s performance suffers, or even causes your program to crash (because there is no more memory to allocate)

Dynamic (Heap) Memory

garbage collection is a solution to dangling pointers and memory leaks

however, it does have a run-time cost, and so it might not be appropriate for high-performance applications, or in where there is not a lot of memory available

most modern language provide garbage collection

that is, the trade-of between performance and memory safety is in favor of memory safety

i.e. most computers are relatively efficient, and most applications can tolerate small garbage collection delays, and so can afford the extra safety and convenience of automated heap de-allocation

Dynamic (Heap) Memory

one problem with heap allocation is that it cause the heap to become fragmented, i.e. it becomes a mix of allocated and unallocated sequences of bytes, of different lengths

we don’t know when allocations/de-allocations will occur, and we don’t know their size

it’s possible that there could be enough total memory for a particular allocation, but no contiguous block of memory is big enough to satisfy it

there are various algorithms for how, exactly, to structure the heap to help minimize fragmentation

for example, free heap blocks might be arranged by size in powers of 2, and when an allocation request is made the smallest power of 2 block that fits the allocation is chosen

Escape Analysis in Go

consider this Go code:

type Point struct {
    x, y float64
}

func getOrigin() *Point {
    p := Point{0, 0}
    return &p
}

func main() {
    p := getOrigin()
    fmt.Println(p)
}

notice that getOrigin has a local variable p, and a pointer to its value is returned

the question is: where is the value that p refers to stored?

if the value of p were on the stack, then it would be automatically de- allocated when getOrigin ends, and so the returned pointer would be invalid, i.e. it would be a dangling pointer that refers to memory that is not a Point

because of this, Go does not store the value of p on the stack in this case, and instead stores it on the heap

by storing the value of p on the heap, it can exist even after getOrigin ends and the returned pointer is valid

Go does escape analysis to determine if a value “escapes” from a function; values that escape are allocated on the heap, while values that don’t escape can be allocated on the stack

in contrast, C++ puts all local variables on the stack; the equivalent function is an error:

// C++

struct Point {
    float x, y'
};

Point* getOrigin() {
    Point p{0, 0};
    return &p;  // oops: returning a dangling pointer
}

this functions is incorrect in C++ because all local variables in C++ are stored on the stack, and so are de-allocated when the function ends

Scope

scope is the region of a program in which a binding is active

most programming languages use static scope (or lexical scope), which means that the scope of a name is limited to a block of the source code that can be determined by looking at the source (and not having to run the program)

so in a statically scoped language, the scope of a name can be determined before running the program, e.g. by the compiler

a global name is a name whose scope is the entire program; a local name is a name whose scope is not the entire program (e.g. limited to a function, module, or other block of code)

Scope: Nested Blocks

many languages allows blocks of code to be nested within each other, and use the closest nested scope rule for determining what a name refers to

consider this Go example:

package main

import (
    "fmt"
)

var x int = 1

func main() {
    fmt.Println(x) // 1
    var x int = 2
    fmt.Println(x) // 2
    for x := 3; x <= 5; x++ {
        fmt.Println(x) // 3, 4, 5
    }
}

the comments show what is printed; we say that the inner x variables shadow the outer ones

here’s another Go example that shows that variables defined in an outer scope are available in nested scopes:

package main

import (
    "fmt"
)

var a int = 1

func main() {
    var b int = 2
    for c := 3; c <= 5; c++ {
        d := a + b + c
        fmt.Println(d) // 6, 7, 8
    }
}

in Go, the order of global variable declarations doesn’t usually matter, e.g. this code prints the same thing:

func main() {
    var b int = 2
    for c := 3; c <= 5; c++ {
        d := a + b + c
        fmt.Println(d) // 6, 7, 8
    }
}

var a int = 1

however, the order of local declaration matters, e.g. this code doesn’t compile because b is used before it is declared:

var a int = 1

func main() {
    for c := 3; c <= 5; c++ {
        d := a + b + c  // oops: b is used before it has been declared
        fmt.Println(d)
    }
    var b int = 2
}

Dynamic Scope

some languages, such as early versions of Lisp, use dynamic scope instead of static scope

with dynamic scoping, a name refers to the most recent occurrence of a variable on the call stack (as opposed to the inner-most textual block with the that name)

in general, with dynamic scoping it’s not possible to determine what value a variable actually refers to without running the program

this makes debugging harder, and can also result in surprising behaviour

here’s some code (based on JavaScript) that works differently using the rules of lexical scoping compared to the rules of dynamic scoping:

function big() {
    function sub1() {
        var x = 7;   // hides the x defined in big
        sub2();
    }

    function sub2() {
        var y = x;      // which x does this refer to?
    }
    var x = 3;
    sub1();
}

Javascript is statically scoped, and so the x in sub2 is the x with the value 3 that is defined in big

if Javascript were instead dynamically scoped, then then x would refer to the most recently bound x at runtime, i.e. the x bound to 7

here’s another example that shows the difference between static scoping and dynamic scoping:

const int b = 5;

int foo()
{
   int a = b + 5;  // What is b?
   return a;
}

int bar()
{
   int b = 2;
   return foo();
}

int main()
{
   foo(); // returns 10 for static scoping; 10 for dynamic scoping
   bar(); // returns 10 for static scoping; 7 for dynamic scoping
}

Functions within Functions

Go lets you define functions within functions, e.g.:

package main

import (
    "fmt"
    "math"
)

func questions(name string) (string, string, string) {
    ask := func(prompt string) string {
        ans := ""
        fmt.Print(name + ", " + prompt + " ")
        fmt.Scanf("%s", &ans)
        return ans
    } // ask

    color := ask("what's your favourite color?")
    shape := ask("what's your favourite shape?")
    pet := ask("what's the name of your first pet?")
    return color, shape, pet
}

func main() {
    c, s, p := questions("Dave")
    fmt.Println(c, s, p)
    // or you could do the same thing in one line:
    // fmt.Println(questions("Dave"))
}

ask refers to a local defined function

notice that ask has its own local variable, ans

ask also refers to the variable name which comes from the surrounding function (questions)

Passing a Function to a Function

you can also pass a function to a function, as in this example:

package main

import (
    "fmt"
    "math"
)

func derivative(f func(float64) float64, a, h float64) float64 {
    return (f(a+h) - f(a)) / h
}

func main() {
    fmt.Println(derivative(math.Sqrt, 2, 0.001))
    fmt.Println(derivative(math.Sin, 2, 0.001))
    f := func(x float64) float64 {
        return x*x*x - x
    }
    fmt.Println(derivative(f, 2, 0.001))
    fmt.Println(derivative(func(x float64) float64 {
        return 1 / x
    }, 2, 0.001))
}

notice the type of the f parameter to the derivative function:

func(float64) float64   // type of f

this is the same as Go’s regular function-header notation, except the function and parameters have no name

also note the last Println statement in main: it uses an anonymous function, also called a lambda function, directly in the call to derivative

Passing a Function to a Function

here’s another function takes a function as input:

func iterate(n int, start float64, f func(float64) float64) float64 {
    result := start
    for i := 0; i < n; i++ {
        result = f(result)
    }
    return result
}

func main() {
    fmt.Println(iterate(5, 2, math.Sqrt))
    fmt.Println(iterate(10, 2, math.Sqrt))
    fmt.Println(iterate(15, 2, math.Sqrt))
}

for example, iterate(3, 7, math.Sqrt) returns the value of math.Sqrt(math.Sqrt(math.Sqrt(6)))

Passing a Function to a Function

in this example, a function is applied to every element of a slice:

func intMap(f func(int) int, seq []int) {
    for i := range seq {
        seq[i] = f(seq[i])
    }
}

func main() {
    lst := []int{1, 2, 3, 4}
    fmt.Println(lst)
    intMap(func(x int) int { return 2 * x }, lst)
    fmt.Println(lst)
    intMap(func(x int) int { return x - 1 }, lst)
    fmt.Println(lst)
}

this is an example of a mapping function, and it is one of the most useful functions in function programming

Go does not make it easy to write such functions, though

in addition to the unwieldy syntax for calling functions, you cannot write a Go function with an unspecified type

that is, you cannot write a function like this in Go, where T is any type:

func map(f func(T) T, seq []T) {   // T is not a type, so not allowed in Go!
    for i := range seq {
        seq[i] = f(seq[i])
    }
}

the idea here is that T is any type, and that the f takes an input of type T and returns an output also of type T; and seq is also a slice of type T

you might try writing it using the empty interface, interface{}, like this:

func anyMap(f func(interface{}) interface{}, seq []interface{}) {
    for i := range seq {
        seq[i] = f(seq[i])
    }
}

this compiles, but code like this doesn’t compile:

lst := []int{1, 2, 3, 4}
anyMap(func(x int) int { return 2 * x }, lst)  // compiler error

this error message is printed:

cannot use func literal (type func(int) int)
as type func(interface {}) interface {} in argument to anyMap

to see why this doesn’t work, consider this line:

seq[i] = f(seq[i])

seq[i] could be a value of any type, and f returns a value of any type

so it could be that seq[i] is of type string, and f is of type float64

but you cannot assign a float64 to a string slice

what is needed is some extra constraint that says f takes and returns a value of the same type T, and seq[i] is of the same type T

Go does not allow for such constraints on types, and this one of the most common criticisms of Go

languages such as C++ and Java support this sort of thing by way of generics, and so in those language you could write a usable version of anyMap that works correctly (and is also statically typed checked at compile-time)

Returning a Function from a Function

returning a function from a function introduces a powerful new concept: closures

consider this function:

func makeAdder1() func() int {
    i := 0
    return func() int {
        i += 1 // i is defined outside of this function
        return i
    }
}

it returns a value of type func() int, i.e. a function that takes no inputs and returns one int

look carefully at what it returns:

func() int {
    i += 1 // i is defined outside of this function
    return i
}

this looks like a function, but there’s a problem: i is not declared within it

i is a local variable in the function that surrounds it (makeAdder1)

that means its scope, and lifetime, is restricted to makeAdder1

so what happens in code like this?:

add := makeAdder1()
fmt.Println(add())

when add() is called, what is the i inside of add referring to?

does it even compile?

Returning a Function from a Function: Closures

it turns out that this works pretty much as expected

but Go does some clever things behind the scenes to make it work

the Go compiler can tell that i escapes from makeAdder1

because of this, Go does not allocate i on the stack, but instead allocates it dynamically so that the returned function can still refer to i

the variable i and the returned function are wrapped up together into a single data structure called a closure

in Go, closures act just like functions, but they are not just functions: a closure is a function plus an associated environment of variables that the function can refer to

so we say that makeAdder1 returns a closure, and the associated environment of this closure contains the int variable i

importantly, i will exist as long as its associated function exists, so the function will always be able to read/write i

Returning a Function from a Function: Closures

closures can be confusing at first since they are so similar to regular functions

they even have the same type signature as a regular function

but you usually identify a closure by looking at the body of a function and seeing if it contains any non-global variables that are defined outside of the function itself

it may also not be immediately clear how powerful a feature closures are