Basic Functions

Subprograms are an important part of most programming languages. Typically, they are implemented as functions, procedures (i.e. functions that don’t return a value but instead), or methods (i.e. functions that operate on an object in OOP).

Consider a simple function in a C-style language:

int f(int a, int b) {
    int result = 2 * (a + b);
    return result;
}

The only way to execute the code in this function is to call it, e.g. f(1, 2). When f is finished executing, it returns a copy of the value of result.

Note that we can create local variables inside the function. Local variables are only visible inside functions, and are automatically deleted when the function ends.

In this example, the only way for f to end is when it executes return.

Most programming languages use a call-stack to store parameters, local variables, and return values. So when a function returns a value, it actually pushes it to the top of the call stack.

A function ends when it executes return, and it is possible for a function to have multiple returns, e.g:

int sign(int x) {
    if (x == 0) {
        return 0;
    } else if (x < 0) {
        return -1;
    } else {
        return 1;
    }
}

We could write it so that it uses a single return like this:

int sign(int x) {
    int result = 1;
    if (x == 0) {
        result = 0;
    } else if (x < 0) {
        result = -1;
    }
    return result;
}

Some programmers prefer having a single return statement because it makes it clear when the function ends. This is especially useful in long, complex functions. However, for short functions, many programmers find this to be more trouble than it is worth and so use multiple returns.

Even though a C-style function can contain multiple return statements, a return statement can only return a single value. For example, there is no way in traditional C-style languages to write a function that returns two ints. Instead, you would need to create a struct or object to hold the two ints, and then return that. Some languages, such as Go and Python, allow for functions to return multiple objects in the form of a tuple, e.g.:

func origin() (int, int) {  // Go
    return 0, 0
}

This seems to be quite convenient in practice.

Formal and Actual Parameters

Consider again function f:

int f(int a, int b) {
    int result = 2 * (a + b);
    return result;
}

The header, or signature, of f is this line:

int f(int a, int b)

This tells us quite a lot of useful information about the function: it’s name, return type, and parameter list. Using this information we can, for instance, deduce that f(f(2), f(3)) is a valid way to call f.

The parameters listed in a functions header are often called its formal parameters. When you call a function, the values you pass are called the actual parameters, or arguments. These get bound to the corresponding formal parameters.

For instance, in the call f(2, x), 2 and x are the actual parameters, and 2 gets bound to the formal parameter a, while x gets bound to the formal parameter b.

Keyword Parameters

The parameters in function f are positional parameters because each actual parameter is bound to the formal parameter that is in the corresponding position, i.e. the first actual parameter is bound to the first formal parameter, the second actual parameter is bound to the second formal parameter, and so on.

This works well for many functions with a few parameters, but, especially in functions with many parameters, it can be confusing.

Consider this Python function:

def copy(a, b):
    # ... copies a into b ...

You would call copy like this: copy(x, y).

The issue here is that some programmers might reasonably believe that b is being copied into a, and so might write the parameters in the wrong order.

But using keywords parameters we could write copy like this:

def copy(copy_from=a, copy_to=b):
    # ... copies a into b ...

Now you can all it like this copy(copy_from=x, copy_to=y), or, equivalently, like this copy(copy_to=y, copy_from=x). The order of the parameters doesn’t matter here: it is the keyword names that specify the bindings.

Default Parameters

Languages such as C++ and Python let you give a parameter a default value. For example:

def calc_tax(rate = 0.12):
   # ...

If you call calc_tax with a parameter, e.g. calc_tax(0.14), than rate gets the value 0.14. If you call it without a parameter, then rate gets the default value of 0.12.

Variable Number of Parameters

Many languages provide some way to pass in an unknown number of parameters. For example, consider this Go function:

func sum(nums ...int) int {
    result := 0
    for _, n := range nums {
        result += n
    }
    return result
}

func main() {
    fmt.Println(sum(2))
    fmt.Println(sum(2, 4))
    fmt.Println(sum(2, 4, 5))
}

Using ... in the formal parameter list causes nums to be treated like an array of values. Thus, you can pass in as many values a you like.

Some languages, such as Ruby and Python, have much more elaborate mechanisms for handling variable parameters. The text gives some examples if you are curious.

Nested Functions

Some languages, particularly those from the functional programming tradition, let you define a function within a function. You might want to do this for reasons similar to why you would want to declare a local variable: a locally defined function is only usable within the function itself.

For example, in Python:

def f(n):
    def addn(x): return n + x
    print addn(4)

Calling f(3) would print 7.

In many C-style languages, you cannot declare a function inside a function. For instance, in C, you would need to write the above code like this:

int addn(int n, int x) {
    return n + x
}

void f(int n) {
    printf("%d\n", addn(4));
}

Notice that addn has an extra parameter in this case. Since addn is outside of the scope of f, this is the only way it can get a copy of n.

Some C-based languages, such as C++ and Java, do not allow you to declare named functions within a function, but do permit so-called lambda functions (i.e. functions with no name) to be declared inside named functions.

Parameter Passing

When we talk about passing parameters to functions, it is useful to distinguish three basic kinds of passing:

  • in mode is when a parameter is used pass a value into the function
  • out mode is when a parameter is used to pass a value out of a function (this is different than the return value)
  • inout mode is when a parameter is used to pass a value both in and out of a function (i.e. a combination of the first two modes)

These general categories of parameter passing can be implemented in a variety of different ways.

Pass By Value

In pass by value, which is an in mode way to pass a parameter, a copy of the actual parameter is assigned to the corresponding formal parameter. The formal parameter then acts like a local parameter in the function.

For example, in Java-like pseudocode:

void setToZero(int a) {   // Java
    a = 1;
}

// ...

int x = 5;
setToZero(x);
print(x);  // prints 5

Variable x is not modified because the a in setToZero is a copy of the value in x.

An obvious benefit of pass by value is that it does not allow the passed-in value to be modified. This can prevent errors where a function unintentionally modifies its parameters.

But this can also be a problem. As inn the example, pass by value cannot be used to modify the contents of a variable because the function only has access to a copy of the passed-in variables value (and no access to the variable itself). Indeed, pass by value works even if you pass in a constant value that is not a variable.

Another potential problem with pass by value is that copying a value takes time and memory, and so for large objects, like arrays or strings, this could be inefficient.

Go uses pass by value for all its parameter passing. However, it avoids expensive copies of arrays by using slices, which are small objects that refer to a subsequence of an array.

Pass By Result

Pass by result is a kind of out mode parameter passing where the function puts a value in the parameter that can then be used by the calling program. More specifically, the formal parameter would be initialized like a local variable, and then, after the function ends, the last value in the formal parameter is then bound to the actual parameter.

C# allows for pass by result using the out modifier, e.g.:

void Fixer(out int x, out int y) {   // C#, Sebesta p. 402
    x = 17;
    y = 35;
}

// ...

f.Fixer(out a, out a);

Notice that the same variable is passed here, so the value of a depends upon the order in which x and y are assigned.

Pass By Value-Result

Pass by value-result is essentially a combination of pass by value and pass by result that provides for inout mode parameters.

Pass By Reference

Pass by reference is a common way of implementing inout mode parameters. In pass by reference, the address of the actual parameter is passed to the formal parameter so that the formal parameter is essentially an alias for the actual parameter.

For example, in C++:

void setToZero(int& a) {  // C++
    a = 1;
}

// ...

int x = 5;
setToZero(x);
cout << x;  // prints 5

A major advantage of pass by reference is that is efficient in both time and memory. However, since it can modify the passed-in value, the programmer must be careful to not accidentally modify values that should not be changed.

C++ address this problem using const, e.g.:

void display(const string& s) {    // C++
    cout << "s = " << s << endl;
}

Here, s is passed by constant reference. No copy of s is made, and there is no way to modify s within the display function due to the const.

Note that while C only has pass by value, you can simulate pass by reference using pointers. For example:

void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

// ...

int x = 3;
int y = 5;
swap(&x, &y);
// now x == 6 and y == 3

Pass By Name

Pass by name is a somewhat unusual parameter passing method that is not used in many modern languages. In pass by name, formal parameters in the function are, essentially, replaced by a textual copy of the actual parameter. This is how parameters are passed in C/C++ macros, for instance.

Lets see some examples of pass by name. Suppose Go used pass by name:

func adda(n int) int {
    var a int = 3
    return n + a
}

In a simple expression like adda(4), 7 is returned. But consider this example:

func main() {
    x := 4
    fmt.Println(adda(x))  // error if passed by name!
}

When using pass by name, this is an error, because inside adda n is replaced with x. But x is not in scope in adda, and so an error results. We could fix it like this:

var x int

func main() {
    x = 4
    fmt.Println(adda(x))  // prints 7
}

If we rename x to a, we get a different result:

var a int

func main() {
    a = 4
    fmt.Println(adda(a))  // error: prints(?) 6
}

The problem with this example is that the passed-in a gets captured by the local a inside adda, and so it is the local value of a that is used in the return value.

Eager Evaluation and Lazy Evaluation

A subtle detail about parameter passing is when, exactly, the actual parameters passed to a function get evaluated.

Consider this Go code:

package main

import "fmt"

func forever() int {
    for { // infinite loop!
    }
    return 5
}

func two(n int) int {
    fmt.Println("two")
    return 2
}

func main() {
    a := two(3)  // a gets 2
    b := two(forever())

    fmt.Println(a, b)
}

The function call two(forever()) never returns a value because forever() contains an infinite loop. This shows that when two(forever()) is called, the actual parameter is first evaluated, and the result of its evaluation is passed to two.

This idea of evaluating the actual parameter before calling the function is known as eager evaluation, and it is how most languages pass parameters.

Consider another example:

func two() int {
    fmt.Println("two")
    return 2
}

func three() int {
    fmt.Println("three")
    return 3
}

func ifthen(cond bool, trueResult, falseResult int) int {
    if cond {
        return trueResult
    } else {
        return falseResult
    }
}

func main() {
    x := 2
    result := ifthen(x == 2, two(), three())
    fmt.Println(result)
}

The idea here is that we want to create our own version of an if-then statement using the function ifthen. Unfortunately, due to eager evaluation of parameters, it is impossible for ifthen to simulate the real if-then because trueResult and falseResult are both evaluated before being passed to the function. Thus they are both executed, which is wrong: we only want one of them to be executed.

This example shows that eager parameter evaluation means that you can’t write functions that mimic control structures. For most programmers, this is not a big deal because they were raised on languages where you lived with the control structures the language gave you and that’s that.

An alternative to eager evaluation is lazy evaluation. In lazy evaluation, an expression is not evaluated until its value is needed in a computation. So we get different behaviour in this example if lazy evaluation were used:

func forever() int {
    for { // infinite loop!
    }
    return 5
}

func two(n int) int {
    fmt.Println("two")
    return 2
}

func main() {
    a := two(3)  // a gets 2
    b := two(forever())  // b gets 2 (using lazy evaluation)

    fmt.Println(a, b)
}

Using lazy evaluation, this program would end and print “2 2”. When two(forever()) is called, forever() is not evaluated because its return value is not required (we just pass the whole function call to the expression). The two does actually need the return result of forever() anywhere in its body, and so it returns 2 without ever evaluating forever().

More usefully, lazy evaluation makes functions like ifthen work correctly. Only the result that is actually needed is returned.

The well-known Haskell language uses lazy evaluation, and so it has these benefits. Indeed, Haskell has some practical idioms that take advantage of infinite data structures that, in a language without lazy evaluation, would be infinite loops.

On the down side, experience with Haskell shows that it can be quite difficult to estimate the performance characteristics of a program that uses lazy evaluation. The problem is that you don’t know ahead of time when exactly an expression will be evaluated.