.. highlight:: go Binding, Scope, and Functions ============================= Binding Time of Names --------------------- names in programming languages are bound to values, and this binding can occur at many different times e.g. the name ``for`` was bound to the idea of a loop in Go_ at language design-time e.g. compiler-specific extensions of a language (such as special I/O functions), occur at language implementation time e.g. ``double x = 5.3;``, ``x`` is bound to 5.3 at program-writing time e.g. in ``x + y`` the compiler may be able to determine that ``+`` means string concatenation (because ``x`` and ``y`` are strings); this is an example of **compile-time binding**, also known as **static binding** e.g. names that refer to the code in other, separately compiled modules, are bound at link time e.g. in ``a.print()``, the code associated with ``print`` might not be known until run-time; this is an example of **run-time binding**, also known as **dynamic binding** Binding Time of Names --------------------- the distinction between compile-time and run-time binding is a key to a lot of object-oriented programming consider this example of Go_ code:: // printer.go package main import ( "fmt" ) type Printer interface { print() } //////////////////////////////////////////////////////////////// type Point struct { x, y float64 } func (p Point) print() { fmt.Printf("(%v, %v)\n", p.x, p.y) } func (p Point) println() { // not in Printer interface p.print() fmt.Println() } //////////////////////////////////////////////////////////////// type Person struct { name string age int } func (p Person) print() { fmt.Printf("%v, age %v\n", p.name, p.age) } func (p Person) println() { // not in Printer interface p.print() fmt.Println() } func main() { var x Printer fmt.Println(x) // prints: s := "" fmt.Scanf("%s", &s) if len(s)%2 == 0 { x = Point{3, 4} } else { x = Person{"Bob", 20} } x.print() // which print function is called? x.println() } // main there's no way to tell *at compile time* which of the two possible ``print()`` methods is called by ``x.print()`` we need to know the type of the value in ``x``, i.e. whether it is a ``Point`` or a ``Person`` and in this case it's impossible to know this until the program runs even though we can't be sure which ``print()`` function is called, we can be sure there's no type error because whatever the value in ``x`` we know it must implement the ``Printer`` interface, and so the value in ``x`` must have a ``print()`` method another thing to notice is that only methods listed in the ``Printer`` interface can be called on ``x`` in this example, both ``Person`` and ``Point`` have a ``println`` method, but ``x.println`` is an error because ``println`` is not in ``Printer`` we can see that, in this program, calling ``x.println()`` would be fine, but Go_ decides what to do only by looking at the ``Printer`` interface; if we were to later add a new type that implemented a ``print()`` method but not a ``println()`` method, then ``x.println()`` would an error Three Basic Kinds of Allocation ------------------------------- static stack dynamic (heap) Static Allocation ----------------- global variables, constants special data structures created by the compiler, e.g. symbol table with the names of variables (for debugging) size of the static allocation region never changes must be known at compile time cannot support recursion Stack Allocation ---------------- stack data structure when a function is called, parameters and any local variables are pushed onto the stack in a "stack frame" return address is also pushed so program knows where to continue execution after the function ends when the function ends, the corresponding stack frame is popped (and a return value can be put on the stack) size of the stack shrinks and grows over time allows for local variables that are automatically allocated when the function starts, and automatically de-allocated when the function ends can handle recursive calls problem: stack-based memory only exists while a function is active, and it cannot handle data that you want to exist between function calls Dynamic (Heap) Memory --------------------- a **heap** is a region of memory where blocks of memory can be be allocated and de-allocated as needed (this is *not* the heap data structure used for implementing priority queues!) often, an explicit command is needed to allocate heap memory, e.g. string* p = new string; // C++ allocated memory stays allocated until the program ends, or it is explicitly de-allocated two basic de-allocation techniques: manual and automatic manual de-allocation is used in C/C++, e.g.:: delete p; // the memory p points to is de-allocated automatic de-allocation is called **garbage collection**, i.e. blocks are de- allocated automatically at run-time by a special process that keeps track of which blocks of heap memory are still accessible Dynamic (Heap) Memory --------------------- in practice, manual de-allocation of heap memory is useful when you need to write high-performance programs that carefully manage their memory (e.g. think operating systems) but experience shows that manual de-allocation is extremely difficult to do 100% correctly in C/C++, if you de-allocate heap memory that is pointed to by a pointer, then the pointer refers to memory that is no longer valid, and this is called a **dangling pointer** using the memory the pointer refers to can corrupt your programs memory and cause strange bugs that are difficult to discover if you don't de-allocate heap memory after you are finished using it, then your program uses more memory then it needs this might be fine if the computer has lots of memory but, over time, allocated but unused blocks can build up, using more and more memory this is called a **memory leak** over time, a **memory leak** could use up so much memory that your program's performance suffers, or even causes your program to crash (because there is no more memory to allocate) Dynamic (Heap) Memory --------------------- garbage collection is a solution to dangling pointers and memory leaks however, it does have a run-time cost, and so it might not be appropriate for high-performance applications, or in where there is not a lot of memory available most modern language provide garbage collection that is, the trade-of between performance and memory safety is in favor of memory safety i.e. most computers are relatively efficient, and most applications can tolerate small garbage collection delays, and so can afford the extra safety and convenience of automated heap de-allocation Dynamic (Heap) Memory --------------------- one problem with heap allocation is that it cause the heap to become fragmented, i.e. it becomes a mix of allocated and unallocated sequences of bytes, of different lengths we don't know when allocations/de-allocations will occur, and we don't know their size it's possible that there could be enough total memory for a particular allocation, but no contiguous block of memory is big enough to satisfy it there are various algorithms for how, exactly, to structure the heap to help minimize fragmentation for example, free heap blocks might be arranged by size in powers of 2, and when an allocation request is made the smallest power of 2 block that fits the allocation is chosen Escape Analysis in Go --------------------- consider this Go_ code:: type Point struct { x, y float64 } func getOrigin() *Point { p := Point{0, 0} return &p } func main() { p := getOrigin() fmt.Println(p) } notice that ``getOrigin`` has a local variable ``p``, and a pointer to its value is returned the question is: where is the *value* that ``p`` refers to stored? if the value of ``p`` were on the stack, then it would be automatically de- allocated when ``getOrigin`` ends, and so the returned pointer would be invalid, i.e. it would be a dangling pointer that refers to memory that is not a ``Point`` because of this, Go_ does not store the value of ``p`` on the stack in this case, and instead stores it on the heap by storing the value of ``p`` on the heap, it can exist even after ``getOrigin`` ends and the returned pointer is valid Go_ does **escape analysis** to determine if a value "escapes" from a function; values that escape are allocated on the heap, while values that don't escape can be allocated on the stack in contrast, C++ puts all local variables on the stack; the equivalent function is an error:: // C++ struct Point { float x, y' }; Point* getOrigin() { Point p{0, 0}; return &p; // oops: returning a dangling pointer } this functions is incorrect in C++ because all local variables in C++ are stored on the stack, and so are de-allocated when the function ends Scope ----- **scope** is the region of a program in which a binding is active most programming languages use **static scope** (or **lexical scope**), which means that the scope of a name is limited to a block of the source code that can be determined by looking at the source (and not having to run the program) so in a statically scoped language, the scope of a name can be determined before running the program, e.g. by the compiler a **global name** is a name whose scope is the entire program; a **local name** is a name whose scope is not the entire program (e.g. limited to a function, module, or other block of code) Scope: Nested Blocks -------------------- many languages allows blocks of code to be nested within each other, and use the **closest nested scope** rule for determining what a name refers to consider this Go example:: package main import ( "fmt" ) var x int = 1 func main() { fmt.Println(x) // 1 var x int = 2 fmt.Println(x) // 2 for x := 3; x <= 5; x++ { fmt.Println(x) // 3, 4, 5 } } the comments show what is printed; we say that the inner ``x`` variables **shadow** the outer ones here's another Go example that shows that variables defined in an outer scope are available in nested scopes:: package main import ( "fmt" ) var a int = 1 func main() { var b int = 2 for c := 3; c <= 5; c++ { d := a + b + c fmt.Println(d) // 6, 7, 8 } } in Go, the order of *global* variable declarations doesn't usually matter, e.g. this code prints the same thing:: func main() { var b int = 2 for c := 3; c <= 5; c++ { d := a + b + c fmt.Println(d) // 6, 7, 8 } } var a int = 1 however, the order of *local* declaration matters, e.g. this code doesn't compile because ``b`` is used before it is declared:: var a int = 1 func main() { for c := 3; c <= 5; c++ { d := a + b + c // oops: b is used before it has been declared fmt.Println(d) } var b int = 2 } Dynamic Scope ------------- some languages, such as early versions of Lisp, use **dynamic scope** instead of static scope with dynamic scoping, a name refers to the most recent occurrence of a variable *on the call stack* (as opposed to the inner-most textual block with the that name) in general, with dynamic scoping it's not possible to determine what value a variable actually refers to without running the program this makes debugging harder, and can also result in surprising behaviour here's some code (based on JavaScript_) that works differently using the rules of lexical scoping compared to the rules of dynamic scoping:: function big() { function sub1() { var x = 7; // hides the x defined in big sub2(); } function sub2() { var y = x; // which x does this refer to? } var x = 3; sub1(); } Javascript_ is statically scoped, and so the ``x`` in ``sub2`` is the ``x`` with the value 3 that is defined in ``big`` if Javascript_ were instead dynamically scoped, then then ``x`` would refer to the most recently bound ``x`` at runtime, i.e. the ``x`` bound to 7 here's another example that shows the difference between static scoping and dynamic scoping:: const int b = 5; int foo() { int a = b + 5; // What is b? return a; } int bar() { int b = 2; return foo(); } int main() { foo(); // returns 10 for static scoping; 10 for dynamic scoping bar(); // returns 10 for static scoping; 7 for dynamic scoping } Functions within Functions -------------------------- Go lets you define functions within functions, e.g.:: package main import ( "fmt" "math" ) func questions(name string) (string, string, string) { ask := func(prompt string) string { ans := "" fmt.Print(name + ", " + prompt + " ") fmt.Scanf("%s", &ans) return ans } // ask color := ask("what's your favourite color?") shape := ask("what's your favourite shape?") pet := ask("what's the name of your first pet?") return color, shape, pet } func main() { c, s, p := questions("Dave") fmt.Println(c, s, p) // or you could do the same thing in one line: // fmt.Println(questions("Dave")) } ``ask`` refers to a local defined function notice that ``ask`` has its own local variable, ``ans`` ``ask`` also refers to the variable ``name`` which comes from the surrounding function (``questions``) Passing a Function to a Function -------------------------------- you can also pass a function to a function, as in this example:: package main import ( "fmt" "math" ) func derivative(f func(float64) float64, a, h float64) float64 { return (f(a+h) - f(a)) / h } func main() { fmt.Println(derivative(math.Sqrt, 2, 0.001)) fmt.Println(derivative(math.Sin, 2, 0.001)) f := func(x float64) float64 { return x*x*x - x } fmt.Println(derivative(f, 2, 0.001)) fmt.Println(derivative(func(x float64) float64 { return 1 / x }, 2, 0.001)) } notice the type of the ``f`` parameter to the ``derivative`` function:: func(float64) float64 // type of f this is the same as Go's regular function-header notation, except the function and parameters have no name also note the last ``Println`` statement in ``main``: it uses an **anonymous function**, also called a **lambda function**, directly in the call to ``derivative`` Passing a Function to a Function -------------------------------- here's another function takes a function as input:: func iterate(n int, start float64, f func(float64) float64) float64 { result := start for i := 0; i < n; i++ { result = f(result) } return result } func main() { fmt.Println(iterate(5, 2, math.Sqrt)) fmt.Println(iterate(10, 2, math.Sqrt)) fmt.Println(iterate(15, 2, math.Sqrt)) } for example, ``iterate(3, 7, math.Sqrt)`` returns the value of ``math.Sqrt(math.Sqrt(math.Sqrt(6)))`` Passing a Function to a Function -------------------------------- in this example, a function is applied to every element of a slice:: func intMap(f func(int) int, seq []int) { for i := range seq { seq[i] = f(seq[i]) } } func main() { lst := []int{1, 2, 3, 4} fmt.Println(lst) intMap(func(x int) int { return 2 * x }, lst) fmt.Println(lst) intMap(func(x int) int { return x - 1 }, lst) fmt.Println(lst) } this is an example of a *mapping function*, and it is one of the most useful functions in function programming Go does not make it easy to write such functions, though in addition to the unwieldy syntax for calling functions, you **cannot** write a Go function with an unspecified type that is, you **cannot** write a function like this in Go_, where ``T`` is any type:: func map(f func(T) T, seq []T) { // T is not a type, so not allowed in Go! for i := range seq { seq[i] = f(seq[i]) } } the idea here is that ``T`` is any type, and that the ``f`` takes an input of type ``T`` and returns an output also of type ``T``; and ``seq`` is also a slice of type ``T`` you might try writing it using the empty interface, ``interface{}``, like this:: func anyMap(f func(interface{}) interface{}, seq []interface{}) { for i := range seq { seq[i] = f(seq[i]) } } this compiles, but code like this doesn't compile:: lst := []int{1, 2, 3, 4} anyMap(func(x int) int { return 2 * x }, lst) // compiler error this error message is printed:: cannot use func literal (type func(int) int) as type func(interface {}) interface {} in argument to anyMap to see why this doesn't work, consider this line:: seq[i] = f(seq[i]) ``seq[i]`` could be a value of any type, and ``f`` returns a value of any type so it could be that ``seq[i]`` is of type ``string``, and ``f`` is of type ``float64`` but you cannot assign a ``float64`` to a ``string`` slice what is needed is some extra constraint that says ``f`` takes and returns a value of the **same** type ``T``, and ``seq[i]`` is of the **same** type ``T`` Go_ does not allow for such constraints on types, and this one of the most common criticisms of Go_ languages such as C++ and Java support this sort of thing by way of **generics**, and so in those language you *could* write a usable version of ``anyMap`` that works correctly (and is also statically typed checked at compile-time) Returning a Function from a Function ------------------------------------ returning a function from a function introduces a powerful new concept: **closures** consider this function:: func makeAdder1() func() int { i := 0 return func() int { i += 1 // i is defined outside of this function return i } } it returns a value of type ``func() int``, i.e. a function that takes no inputs and returns one ``int`` look carefully at what it returns:: func() int { i += 1 // i is defined outside of this function return i } this *looks* like a function, but there's a problem: ``i`` is not declared within it ``i`` is a local variable in the function that surrounds it (``makeAdder1``) that means its scope, and lifetime, is restricted to ``makeAdder1`` so what happens in code like this?:: add := makeAdder1() fmt.Println(add()) when ``add()`` is called, what is the ``i`` inside of ``add`` referring to? does it even compile? Returning a Function from a Function: Closures ---------------------------------------------- it turns out that this works pretty much as expected but Go_ does some clever things behind the scenes to make it work the Go_ compiler can tell that ``i`` *escapes* from ``makeAdder1`` because of this, Go_ does not allocate ``i`` on the stack, but instead allocates it dynamically so that the returned function can still refer to ``i`` the variable ``i`` and the returned function are wrapped up together into a single data structure called a **closure** in Go_, closures act just like functions, but they are not just functions: a closure is a function *plus* an associated environment of variables that the function can refer to so we say that ``makeAdder1`` returns a closure, and the associated environment of this closure contains the ``int`` variable ``i`` importantly, ``i`` will exist as long as its associated function exists, so the function will always be able to read/write ``i`` Returning a Function from a Function: Closures ---------------------------------------------- closures can be confusing at first since they are so similar to regular functions they even have the same type signature as a regular function but you usually identify a closure by looking at the body of a function and seeing if it contains any non-global variables that are defined outside of the function itself it may also not be immediately clear how powerful a feature closures are