Binding, Scope, and Functions¶
Binding Time of Names¶
names in programming languages are bound to values, and this binding can occur at many different times
e.g. the name for
was bound to the idea of a loop in Go at language
design-time
e.g. compiler-specific extensions of a language (such as special I/O functions), occur at language implementation time
e.g. double x = 5.3;
, x
is bound to 5.3 at program-writing time
e.g. in x + y
the compiler may be able to determine that +
means
string concatenation (because x
and y
are strings); this is an
example of compile-time binding, also known as static binding
e.g. names that refer to the code in other, separately compiled modules, are bound at link time
e.g. in a.print()
, the code associated with print
might not be known
until run-time; this is an example of run-time binding, also known as
dynamic binding
Binding Time of Names¶
the distinction between compile-time and run-time binding is a key to a lot of object-oriented programming
consider this example of Go code:
// printer.go
package main
import (
"fmt"
)
type Printer interface {
print()
}
////////////////////////////////////////////////////////////////
type Point struct {
x, y float64
}
func (p Point) print() {
fmt.Printf("(%v, %v)\n", p.x, p.y)
}
func (p Point) println() { // not in Printer interface
p.print()
fmt.Println()
}
////////////////////////////////////////////////////////////////
type Person struct {
name string
age int
}
func (p Person) print() {
fmt.Printf("%v, age %v\n", p.name, p.age)
}
func (p Person) println() { // not in Printer interface
p.print()
fmt.Println()
}
func main() {
var x Printer
fmt.Println(x) // prints: <nil>
s := ""
fmt.Scanf("%s", &s)
if len(s)%2 == 0 {
x = Point{3, 4}
} else {
x = Person{"Bob", 20}
}
x.print() // which print function is called?
x.println()
} // main
there’s no way to tell at compile time which of the two possible print()
methods is called by x.print()
we need to know the type of the value in x
, i.e. whether it is a Point
or a Person
and in this case it’s impossible to know this until the program runs
even though we can’t be sure which print()
function is called, we can be
sure there’s no type error because whatever the value in x
we know it must
implement the Printer
interface, and so the value in x
must have a
print()
method
another thing to notice is that only methods listed in the Printer
interface can be called on x
in this example, both Person
and Point
have a println
method, but
x.println
is an error because println
is not in Printer
we can see that, in this program, calling x.println()
would be fine, but
Go decides what to do only by looking at the Printer
interface; if we
were to later add a new type that implemented a print()
method but not a
println()
method, then x.println()
would an error
Static Allocation¶
global variables, constants
special data structures created by the compiler, e.g. symbol table with the names of variables (for debugging)
size of the static allocation region never changes
must be known at compile time
cannot support recursion
Stack Allocation¶
stack data structure
when a function is called, parameters and any local variables are pushed onto the stack in a “stack frame”
return address is also pushed so program knows where to continue execution after the function ends
when the function ends, the corresponding stack frame is popped (and a return value can be put on the stack)
size of the stack shrinks and grows over time
allows for local variables that are automatically allocated when the function starts, and automatically de-allocated when the function ends
can handle recursive calls
problem: stack-based memory only exists while a function is active, and it cannot handle data that you want to exist between function calls
Dynamic (Heap) Memory¶
a heap is a region of memory where blocks of memory can be be allocated and de-allocated as needed (this is not the heap data structure used for implementing priority queues!)
often, an explicit command is needed to allocate heap memory, e.g.
string* p = new string; // C++
allocated memory stays allocated until the program ends, or it is explicitly de-allocated
two basic de-allocation techniques: manual and automatic
manual de-allocation is used in C/C++, e.g.:
delete p; // the memory p points to is de-allocated
automatic de-allocation is called garbage collection, i.e. blocks are de- allocated automatically at run-time by a special process that keeps track of which blocks of heap memory are still accessible
Dynamic (Heap) Memory¶
in practice, manual de-allocation of heap memory is useful when you need to write high-performance programs that carefully manage their memory (e.g. think operating systems)
but experience shows that manual de-allocation is extremely difficult to do 100% correctly
in C/C++, if you de-allocate heap memory that is pointed to by a pointer, then the pointer refers to memory that is no longer valid, and this is called a dangling pointer
using the memory the pointer refers to can corrupt your programs memory and cause strange bugs that are difficult to discover
if you don’t de-allocate heap memory after you are finished using it, then your program uses more memory then it needs
this might be fine if the computer has lots of memory
but, over time, allocated but unused blocks can build up, using more and more memory
this is called a memory leak
over time, a memory leak could use up so much memory that your program’s performance suffers, or even causes your program to crash (because there is no more memory to allocate)
Dynamic (Heap) Memory¶
garbage collection is a solution to dangling pointers and memory leaks
however, it does have a run-time cost, and so it might not be appropriate for high-performance applications, or in where there is not a lot of memory available
most modern language provide garbage collection
that is, the trade-of between performance and memory safety is in favor of memory safety
i.e. most computers are relatively efficient, and most applications can tolerate small garbage collection delays, and so can afford the extra safety and convenience of automated heap de-allocation
Dynamic (Heap) Memory¶
one problem with heap allocation is that it cause the heap to become fragmented, i.e. it becomes a mix of allocated and unallocated sequences of bytes, of different lengths
we don’t know when allocations/de-allocations will occur, and we don’t know their size
it’s possible that there could be enough total memory for a particular allocation, but no contiguous block of memory is big enough to satisfy it
there are various algorithms for how, exactly, to structure the heap to help minimize fragmentation
for example, free heap blocks might be arranged by size in powers of 2, and when an allocation request is made the smallest power of 2 block that fits the allocation is chosen
Escape Analysis in Go¶
consider this Go code:
type Point struct {
x, y float64
}
func getOrigin() *Point {
p := Point{0, 0}
return &p
}
func main() {
p := getOrigin()
fmt.Println(p)
}
notice that getOrigin
has a local variable p
, and a pointer to its
value is returned
the question is: where is the value that p
refers to stored?
if the value of p
were on the stack, then it would be automatically de-
allocated when getOrigin
ends, and so the returned pointer would be
invalid, i.e. it would be a dangling pointer that refers to memory that is not
a Point
because of this, Go does not store the value of p
on the stack in this
case, and instead stores it on the heap
by storing the value of p
on the heap, it can exist even after
getOrigin
ends and the returned pointer is valid
Go does escape analysis to determine if a value “escapes” from a function; values that escape are allocated on the heap, while values that don’t escape can be allocated on the stack
in contrast, C++ puts all local variables on the stack; the equivalent function is an error:
// C++
struct Point {
float x, y'
};
Point* getOrigin() {
Point p{0, 0};
return &p; // oops: returning a dangling pointer
}
this functions is incorrect in C++ because all local variables in C++ are stored on the stack, and so are de-allocated when the function ends
Scope¶
scope is the region of a program in which a binding is active
most programming languages use static scope (or lexical scope), which means that the scope of a name is limited to a block of the source code that can be determined by looking at the source (and not having to run the program)
so in a statically scoped language, the scope of a name can be determined before running the program, e.g. by the compiler
a global name is a name whose scope is the entire program; a local name is a name whose scope is not the entire program (e.g. limited to a function, module, or other block of code)
Scope: Nested Blocks¶
many languages allows blocks of code to be nested within each other, and use the closest nested scope rule for determining what a name refers to
consider this Go example:
package main
import (
"fmt"
)
var x int = 1
func main() {
fmt.Println(x) // 1
var x int = 2
fmt.Println(x) // 2
for x := 3; x <= 5; x++ {
fmt.Println(x) // 3, 4, 5
}
}
the comments show what is printed; we say that the inner x
variables
shadow the outer ones
here’s another Go example that shows that variables defined in an outer scope are available in nested scopes:
package main
import (
"fmt"
)
var a int = 1
func main() {
var b int = 2
for c := 3; c <= 5; c++ {
d := a + b + c
fmt.Println(d) // 6, 7, 8
}
}
in Go, the order of global variable declarations doesn’t usually matter, e.g. this code prints the same thing:
func main() {
var b int = 2
for c := 3; c <= 5; c++ {
d := a + b + c
fmt.Println(d) // 6, 7, 8
}
}
var a int = 1
however, the order of local declaration matters, e.g. this code doesn’t
compile because b
is used before it is declared:
var a int = 1
func main() {
for c := 3; c <= 5; c++ {
d := a + b + c // oops: b is used before it has been declared
fmt.Println(d)
}
var b int = 2
}
Dynamic Scope¶
some languages, such as early versions of Lisp, use dynamic scope instead of static scope
with dynamic scoping, a name refers to the most recent occurrence of a variable on the call stack (as opposed to the inner-most textual block with the that name)
in general, with dynamic scoping it’s not possible to determine what value a variable actually refers to without running the program
this makes debugging harder, and can also result in surprising behaviour
here’s some code (based on JavaScript) that works differently using the rules of lexical scoping compared to the rules of dynamic scoping:
function big() {
function sub1() {
var x = 7; // hides the x defined in big
sub2();
}
function sub2() {
var y = x; // which x does this refer to?
}
var x = 3;
sub1();
}
Javascript is statically scoped, and so the x
in sub2
is the x
with the value 3 that is defined in big
if Javascript were instead dynamically scoped, then then x
would refer to
the most recently bound x
at runtime, i.e. the x
bound to 7
here’s another example that shows the difference between static scoping and dynamic scoping:
const int b = 5;
int foo()
{
int a = b + 5; // What is b?
return a;
}
int bar()
{
int b = 2;
return foo();
}
int main()
{
foo(); // returns 10 for static scoping; 10 for dynamic scoping
bar(); // returns 10 for static scoping; 7 for dynamic scoping
}
Functions within Functions¶
Go lets you define functions within functions, e.g.:
package main
import (
"fmt"
"math"
)
func questions(name string) (string, string, string) {
ask := func(prompt string) string {
ans := ""
fmt.Print(name + ", " + prompt + " ")
fmt.Scanf("%s", &ans)
return ans
} // ask
color := ask("what's your favourite color?")
shape := ask("what's your favourite shape?")
pet := ask("what's the name of your first pet?")
return color, shape, pet
}
func main() {
c, s, p := questions("Dave")
fmt.Println(c, s, p)
// or you could do the same thing in one line:
// fmt.Println(questions("Dave"))
}
ask
refers to a local defined function
notice that ask
has its own local variable, ans
ask
also refers to the variable name
which comes from the surrounding
function (questions
)
Passing a Function to a Function¶
you can also pass a function to a function, as in this example:
package main
import (
"fmt"
"math"
)
func derivative(f func(float64) float64, a, h float64) float64 {
return (f(a+h) - f(a)) / h
}
func main() {
fmt.Println(derivative(math.Sqrt, 2, 0.001))
fmt.Println(derivative(math.Sin, 2, 0.001))
f := func(x float64) float64 {
return x*x*x - x
}
fmt.Println(derivative(f, 2, 0.001))
fmt.Println(derivative(func(x float64) float64 {
return 1 / x
}, 2, 0.001))
}
notice the type of the f
parameter to the derivative
function:
func(float64) float64 // type of f
this is the same as Go’s regular function-header notation, except the function and parameters have no name
also note the last Println
statement in main
: it uses an anonymous
function, also called a lambda function, directly in the call to
derivative
Passing a Function to a Function¶
here’s another function takes a function as input:
func iterate(n int, start float64, f func(float64) float64) float64 {
result := start
for i := 0; i < n; i++ {
result = f(result)
}
return result
}
func main() {
fmt.Println(iterate(5, 2, math.Sqrt))
fmt.Println(iterate(10, 2, math.Sqrt))
fmt.Println(iterate(15, 2, math.Sqrt))
}
for example, iterate(3, 7, math.Sqrt)
returns the value of
math.Sqrt(math.Sqrt(math.Sqrt(6)))
Passing a Function to a Function¶
in this example, a function is applied to every element of a slice:
func intMap(f func(int) int, seq []int) {
for i := range seq {
seq[i] = f(seq[i])
}
}
func main() {
lst := []int{1, 2, 3, 4}
fmt.Println(lst)
intMap(func(x int) int { return 2 * x }, lst)
fmt.Println(lst)
intMap(func(x int) int { return x - 1 }, lst)
fmt.Println(lst)
}
this is an example of a mapping function, and it is one of the most useful functions in function programming
Go does not make it easy to write such functions, though
in addition to the unwieldy syntax for calling functions, you cannot write a Go function with an unspecified type
that is, you cannot write a function like this in Go, where T
is any
type:
func map(f func(T) T, seq []T) { // T is not a type, so not allowed in Go!
for i := range seq {
seq[i] = f(seq[i])
}
}
the idea here is that T
is any type, and that the f
takes an input of
type T
and returns an output also of type T
; and seq
is also a
slice of type T
you might try writing it using the empty interface, interface{}
, like this:
func anyMap(f func(interface{}) interface{}, seq []interface{}) {
for i := range seq {
seq[i] = f(seq[i])
}
}
this compiles, but code like this doesn’t compile:
lst := []int{1, 2, 3, 4}
anyMap(func(x int) int { return 2 * x }, lst) // compiler error
this error message is printed:
cannot use func literal (type func(int) int)
as type func(interface {}) interface {} in argument to anyMap
to see why this doesn’t work, consider this line:
seq[i] = f(seq[i])
seq[i]
could be a value of any type, and f
returns a value of any type
so it could be that seq[i]
is of type string
, and f
is of type
float64
but you cannot assign a float64
to a string
slice
what is needed is some extra constraint that says f
takes and returns a
value of the same type T
, and seq[i]
is of the same type T
Go does not allow for such constraints on types, and this one of the most common criticisms of Go
languages such as C++ and Java support this sort of thing by way of
generics, and so in those language you could write a usable version of
anyMap
that works correctly (and is also statically typed checked at
compile-time)
Returning a Function from a Function¶
returning a function from a function introduces a powerful new concept: closures
consider this function:
func makeAdder1() func() int {
i := 0
return func() int {
i += 1 // i is defined outside of this function
return i
}
}
it returns a value of type func() int
, i.e. a function that takes no
inputs and returns one int
look carefully at what it returns:
func() int {
i += 1 // i is defined outside of this function
return i
}
this looks like a function, but there’s a problem: i
is not declared
within it
i
is a local variable in the function that surrounds it (makeAdder1
)
that means its scope, and lifetime, is restricted to makeAdder1
so what happens in code like this?:
add := makeAdder1()
fmt.Println(add())
when add()
is called, what is the i
inside of add
referring to?
does it even compile?
Returning a Function from a Function: Closures¶
it turns out that this works pretty much as expected
but Go does some clever things behind the scenes to make it work
the Go compiler can tell that i
escapes from makeAdder1
because of this, Go does not allocate i
on the stack, but instead
allocates it dynamically so that the returned function can still refer to
i
the variable i
and the returned function are wrapped up together into a
single data structure called a closure
in Go, closures act just like functions, but they are not just functions: a closure is a function plus an associated environment of variables that the function can refer to
so we say that makeAdder1
returns a closure, and the associated
environment of this closure contains the int
variable i
importantly, i
will exist as long as its associated function exists, so
the function will always be able to read/write i
Returning a Function from a Function: Closures¶
closures can be confusing at first since they are so similar to regular functions
they even have the same type signature as a regular function
but you usually identify a closure by looking at the body of a function and seeing if it contains any non-global variables that are defined outside of the function itself
it may also not be immediately clear how powerful a feature closures are