Basic Functions¶
Subprograms are an important part of most programming languages. Typically, they are implemented as functions, procedures (i.e. functions that don’t return a value but instead), or methods (i.e. functions that operate on an object in OOP).
Consider a simple function in a C-style language:
int f(int a, int b) {
int result = 2 * (a + b);
return result;
}
The only way to execute the code in this function is to call it, e.g.
f(1, 2)
. When f
is finished executing, it returns a copy of the value of
result
.
Note that we can create local variables inside the function. Local variables are only visible inside functions, and are automatically deleted when the function ends.
In this example, the only way for f
to end is when it executes return
.
Most programming languages use a call-stack to store parameters, local variables, and return values. So when a function returns a value, it actually pushes it to the top of the call stack.
A function ends when it executes return
, and it is possible for a function
to have multiple returns, e.g:
int sign(int x) {
if (x == 0) {
return 0;
} else if (x < 0) {
return -1;
} else {
return 1;
}
}
We could write it so that it uses a single return
like this:
int sign(int x) {
int result = 1;
if (x == 0) {
result = 0;
} else if (x < 0) {
result = -1;
}
return result;
}
Some programmers prefer having a single return
statement because it makes
it clear when the function ends. This is especially useful in long, complex
functions. However, for short functions, many programmers find this to be more
trouble than it is worth and so use multiple returns.
Even though a C-style function can contain multiple return
statements, a
return
statement can only return a single value. For example, there is no
way in traditional C-style languages to write a function that returns two
ints. Instead, you would need to create a struct or object to hold the two
ints, and then return that. Some languages, such as Go and Python, allow for
functions to return multiple objects in the form of a tuple, e.g.:
func origin() (int, int) { // Go
return 0, 0
}
This seems to be quite convenient in practice.
Formal and Actual Parameters¶
Consider again function f
:
int f(int a, int b) {
int result = 2 * (a + b);
return result;
}
The header, or signature, of f
is this line:
int f(int a, int b)
This tells us quite a lot of useful information about the function: it’s name,
return type, and parameter list. Using this information we can, for instance,
deduce that f(f(2), f(3))
is a valid way to call f
.
The parameters listed in a functions header are often called its formal parameters. When you call a function, the values you pass are called the actual parameters, or arguments. These get bound to the corresponding formal parameters.
For instance, in the call f(2, x)
, 2 and x
are the actual parameters,
and 2 gets bound to the formal parameter a
, while x
gets bound to the
formal parameter b
.
Keyword Parameters¶
The parameters in function f
are positional parameters because each
actual parameter is bound to the formal parameter that is in the corresponding
position, i.e. the first actual parameter is bound to the first formal
parameter, the second actual parameter is bound to the second formal
parameter, and so on.
This works well for many functions with a few parameters, but, especially in functions with many parameters, it can be confusing.
Consider this Python function:
def copy(a, b):
# ... copies a into b ...
You would call copy
like this: copy(x, y)
.
The issue here is that some programmers might reasonably believe that b
is
being copied into a
, and so might write the parameters in the wrong order.
But using keywords parameters we could write copy
like this:
def copy(copy_from=a, copy_to=b):
# ... copies a into b ...
Now you can all it like this copy(copy_from=x, copy_to=y)
, or,
equivalently, like this copy(copy_to=y, copy_from=x)
. The order of the
parameters doesn’t matter here: it is the keyword names that specify the
bindings.
Default Parameters¶
Languages such as C++ and Python let you give a parameter a default value. For example:
def calc_tax(rate = 0.12):
# ...
If you call calc_tax
with a parameter, e.g. calc_tax(0.14)
, than
rate
gets the value 0.14. If you call it without a parameter, then
rate
gets the default value of 0.12.
Variable Number of Parameters¶
Many languages provide some way to pass in an unknown number of parameters. For example, consider this Go function:
func sum(nums ...int) int {
result := 0
for _, n := range nums {
result += n
}
return result
}
func main() {
fmt.Println(sum(2))
fmt.Println(sum(2, 4))
fmt.Println(sum(2, 4, 5))
}
Using ...
in the formal parameter list causes nums
to be treated like
an array of values. Thus, you can pass in as many values a you like.
Some languages, such as Ruby and Python, have much more elaborate mechanisms for handling variable parameters. The text gives some examples if you are curious.
Nested Functions¶
Some languages, particularly those from the functional programming tradition, let you define a function within a function. You might want to do this for reasons similar to why you would want to declare a local variable: a locally defined function is only usable within the function itself.
For example, in Python:
def f(n):
def addn(x): return n + x
print addn(4)
Calling f(3)
would print 7.
In many C-style languages, you cannot declare a function inside a function. For instance, in C, you would need to write the above code like this:
int addn(int n, int x) {
return n + x
}
void f(int n) {
printf("%d\n", addn(4));
}
Notice that addn
has an extra parameter in this case. Since addn
is
outside of the scope of f
, this is the only way it can get a copy of
n
.
Some C-based languages, such as C++ and Java, do not allow you to declare named functions within a function, but do permit so-called lambda functions (i.e. functions with no name) to be declared inside named functions.
Parameter Passing¶
When we talk about passing parameters to functions, it is useful to distinguish three basic kinds of passing:
- in mode is when a parameter is used pass a value into the function
- out mode is when a parameter is used to pass a value out of a function (this is different than the return value)
- inout mode is when a parameter is used to pass a value both in and out of a function (i.e. a combination of the first two modes)
These general categories of parameter passing can be implemented in a variety of different ways.
Pass By Value¶
In pass by value, which is an in mode way to pass a parameter, a copy of the actual parameter is assigned to the corresponding formal parameter. The formal parameter then acts like a local parameter in the function.
For example, in Java-like pseudocode:
void setToZero(int a) { // Java
a = 1;
}
// ...
int x = 5;
setToZero(x);
print(x); // prints 5
Variable x
is not modified because the a
in setToZero
is a copy of the
value in x
.
An obvious benefit of pass by value is that it does not allow the passed-in value to be modified. This can prevent errors where a function unintentionally modifies its parameters.
But this can also be a problem. As inn the example, pass by value cannot be used to modify the contents of a variable because the function only has access to a copy of the passed-in variables value (and no access to the variable itself). Indeed, pass by value works even if you pass in a constant value that is not a variable.
Another potential problem with pass by value is that copying a value takes time and memory, and so for large objects, like arrays or strings, this could be inefficient.
Go uses pass by value for all its parameter passing. However, it avoids expensive copies of arrays by using slices, which are small objects that refer to a subsequence of an array.
Pass By Result¶
Pass by result is a kind of out mode parameter passing where the function puts a value in the parameter that can then be used by the calling program. More specifically, the formal parameter would be initialized like a local variable, and then, after the function ends, the last value in the formal parameter is then bound to the actual parameter.
C# allows for pass by result using the out
modifier, e.g.:
void Fixer(out int x, out int y) { // C#, Sebesta p. 402
x = 17;
y = 35;
}
// ...
f.Fixer(out a, out a);
Notice that the same variable is passed here, so the value of a
depends
upon the order in which x
and y
are assigned.
Pass By Value-Result¶
Pass by value-result is essentially a combination of pass by value and pass by result that provides for inout mode parameters.
Pass By Reference¶
Pass by reference is a common way of implementing inout mode parameters. In pass by reference, the address of the actual parameter is passed to the formal parameter so that the formal parameter is essentially an alias for the actual parameter.
For example, in C++:
void setToZero(int& a) { // C++
a = 1;
}
// ...
int x = 5;
setToZero(x);
cout << x; // prints 5
A major advantage of pass by reference is that is efficient in both time and memory. However, since it can modify the passed-in value, the programmer must be careful to not accidentally modify values that should not be changed.
C++ address this problem using const
, e.g.:
void display(const string& s) { // C++
cout << "s = " << s << endl;
}
Here, s
is passed by constant reference. No copy of s
is made, and
there is no way to modify s
within the display
function due to the
const
.
Note that while C only has pass by value, you can simulate pass by reference using pointers. For example:
void swap(int *a, int *b) {
int temp = *a;
*a = *b;
*b = temp;
}
// ...
int x = 3;
int y = 5;
swap(&x, &y);
// now x == 6 and y == 3
Pass By Name¶
Pass by name is a somewhat unusual parameter passing method that is not used in many modern languages. In pass by name, formal parameters in the function are, essentially, replaced by a textual copy of the actual parameter. This is how parameters are passed in C/C++ macros, for instance.
Lets see some examples of pass by name. Suppose Go used pass by name:
func adda(n int) int {
var a int = 3
return n + a
}
In a simple expression like adda(4)
, 7 is returned. But consider this
example:
func main() {
x := 4
fmt.Println(adda(x)) // error if passed by name!
}
When using pass by name, this is an error, because inside adda
n
is
replaced with x
. But x
is not in scope in adda
, and so an error
results. We could fix it like this:
var x int
func main() {
x = 4
fmt.Println(adda(x)) // prints 7
}
If we rename x
to a
, we get a different result:
var a int
func main() {
a = 4
fmt.Println(adda(a)) // error: prints(?) 6
}
The problem with this example is that the passed-in a
gets captured by the
local a
inside adda
, and so it is the local value of a
that is
used in the return value.
Eager Evaluation and Lazy Evaluation¶
A subtle detail about parameter passing is when, exactly, the actual parameters passed to a function get evaluated.
Consider this Go code:
package main
import "fmt"
func forever() int {
for { // infinite loop!
}
return 5
}
func two(n int) int {
fmt.Println("two")
return 2
}
func main() {
a := two(3) // a gets 2
b := two(forever())
fmt.Println(a, b)
}
The function call two(forever())
never returns a value because
forever()
contains an infinite loop. This shows that when
two(forever())
is called, the actual parameter is first evaluated, and the
result of its evaluation is passed to two
.
This idea of evaluating the actual parameter before calling the function is known as eager evaluation, and it is how most languages pass parameters.
Consider another example:
func two() int {
fmt.Println("two")
return 2
}
func three() int {
fmt.Println("three")
return 3
}
func ifthen(cond bool, trueResult, falseResult int) int {
if cond {
return trueResult
} else {
return falseResult
}
}
func main() {
x := 2
result := ifthen(x == 2, two(), three())
fmt.Println(result)
}
The idea here is that we want to create our own version of an if-then
statement using the function ifthen
. Unfortunately, due to eager
evaluation of parameters, it is impossible for ifthen
to simulate the real
if-then because trueResult
and falseResult
are both evaluated before
being passed to the function. Thus they are both executed, which is wrong: we
only want one of them to be executed.
This example shows that eager parameter evaluation means that you can’t write functions that mimic control structures. For most programmers, this is not a big deal because they were raised on languages where you lived with the control structures the language gave you and that’s that.
An alternative to eager evaluation is lazy evaluation. In lazy evaluation, an expression is not evaluated until its value is needed in a computation. So we get different behaviour in this example if lazy evaluation were used:
func forever() int {
for { // infinite loop!
}
return 5
}
func two(n int) int {
fmt.Println("two")
return 2
}
func main() {
a := two(3) // a gets 2
b := two(forever()) // b gets 2 (using lazy evaluation)
fmt.Println(a, b)
}
Using lazy evaluation, this program would end and print “2 2”. When
two(forever())
is called, forever()
is not evaluated because its
return value is not required (we just pass the whole function call to the
expression). The two
does actually need the return result of forever()
anywhere in its body, and so it returns 2 without ever evaluating
forever()
.
More usefully, lazy evaluation makes functions like ifthen
work correctly.
Only the result that is actually needed is returned.
The well-known Haskell language uses lazy evaluation, and so it has these benefits. Indeed, Haskell has some practical idioms that take advantage of infinite data structures that, in a language without lazy evaluation, would be infinite loops.
On the down side, experience with Haskell shows that it can be quite difficult to estimate the performance characteristics of a program that uses lazy evaluation. The problem is that you don’t know ahead of time when exactly an expression will be evaluated.