Slices and Arrays in Go
=======================

(The following is based largely on http://blog.golang.org/slices.)

Like most programming languages, Go_ has a data structure called an *array*
that stores a sequence of objects of the same type. Less commonly, Go_ has a
second --- and more useful --- data structure called a *slice* that lets you
refer to a contiguous sub-sequence of elements from an underlying array.


Arrays
------

Arrays are a fundamental data structure in Go_. For instance::
    
    // declares arr to be an array of 5 ints, all initialized to 0
    var arr [5]int

    for i, val := range arr {                  // print arr
        fmt.Printf("arr[%d] = %d\n", i, val)
    }

This prints::

    arr[0] = 0
    arr[1] = 0
    arr[2] = 0
    arr[3] = 0
    arr[4] = 0

When you create arrays, Go_ always initializes its element to the zero-value
for the array type.

You can also create an array using an *array literal*, e.g.::

    var arr = [5]int{1, 2, 3, 4, 5}  // array literal

Ranged for-loops (i.e. loops that use the keyword ``range`` as above) are
often the nicest way to traverse arrays in Go_. You can also use a more
traditional C-style loop::

    for i := 0; i < len(arr); i++ {
        fmt.Printf("arr[%d] = %d\n", i, arr[i])
    }

The ``len`` function returns the length of an array, while the usual square-
bracket notion is used to access individual elements at a given position. The
first index position of a Go_ array is always 0.

You can't change the size of an array in Go_. Once an array is created, it's
length remains fixed until it is disposed of by the garbage collector.

You can pass an array to a function. As with all parameters, Go_ passes arrays
by value (i.e. it makes a copy of the array)::

    func add1(arr [5]int) {
        for i := range arr {
            arr[i]++
        }
    }

Now call it with this code::

    var arr [5]int               // arr is initialized to all 0s
    add1(arr)                    // add 1 to every element of arr
    for i, val := range arr {    // print the results
        fmt.Printf("arr[%d] = %d\n", i, val)
    }

This prints::

    arr[0] = 0
    arr[1] = 0
    arr[2] = 0
    arr[3] = 0
    arr[4] = 0

Nothing has changed because ``add1`` modified the *copy* of ``arr``, not the
original.

Copying an array is generally inefficient, and one way around this is to re-
write ``add1`` to accept a pointer to the array, e.g.::

    func add1(arr *[5]int) {
        for i := range *arr {
            (*arr)[i]++
        }
    }

Then you call it like this::

    var arr [5]int               
    add1(&arr)                   // note the & 
    for i, val := range arr {    
        fmt.Printf("arr[%d] = %d\n", i, val)
    }

Now the original ``arr`` is changed::
    
    arr[0] = 1
    arr[1] = 1
    arr[2] = 1
    arr[3] = 1
    arr[4] = 1

The code for this is rather messy. It's easy to forget a ``*`` or ``&``.

Arrays have a significant limitation. Consider this slight change::

    var arr [6]int               // 6 elements instead of 5
    add1(&arr)                   
    for i, val := range arr {    
        fmt.Printf("arr[%d] = %d\n", i, val)
    }

This doesn't even compile! You get a type mis-match error because ``add1``
expects an array of length 5, but we are passing it an array of length 6. In
Go_, the length of an array is part of the array's type, so arrays of
different lengths have different types.

We get the same error in the version without pointers::

    func add1(arr [5]int) {
        for i := range arr {
            arr[i]++
        }
    }

    func main() {
        var arr [6]int               // 6 elements instead of 5          
        add1(arr)                   
        for i, val := range arr {    
            fmt.Printf("arr[%d] = %d\n", i, val)
        }
    }

Again, this is a type mis-match error: you can't pass a 6-element array to a
function expecting a 5-element array because an array's length is part of its
type.

This is a big problem with arrays. It essentially makes it impossible for us
to write one function that works on arrays of different sizes. 

Go_ addresses these problems with arrays by introducing a new data structure
called a *slice*.


Slicing an Array
----------------

A **slice** is a data structure that describes a contiguous sequence of
elements in an array. A slice is *not* an array, but for most purposes it acts
like one.

One way to make a slice is from an array::

    var arr [5]int           // an array with 5 elements

    var sl []int = arr[1:4]  // a slice referring to arr[1], arr[2], arr[3]

Here, ``sl`` is a slice, and it's type is ``[]int``, read "slice of ints". We
can also define ``sl`` like this::

    var sl = arr[1:4]  // type of sl inferred from arr[1:4]

Or this::

    sl := arr[1:4]  // can only use := if this line is in a function

You can use a slice pretty much the same way that you use an array::

    var arr [5]int
    for i := range arr {   // arr is {0, 1, 2, 3, 4}
        arr[i] = i
    }
    
    sl := arr[1:4]         // sl is a slice of length 3
    
    for i := 0; i < len(sl); i++ {            // print sl
        fmt.Printf("sl[%d] = %d\n", i, sl[i])
    }

It's important to understand that a slice always refers to an underlying
array, and if that array changes, then so to does the slice::

    arr[1] = -100   // change second element of arr
    
    for i := 0; i < len(sl); i++ {              // print sl
        fmt.Printf("sl[%d] = %d\n", i, sl[i])
    }

This prints::

    sl[0] = -100
    sl[1] = 2
    sl[2] = 3

You can pass slices to functions like this::

    // sl can be any int slice of any length
    func display(sl []int) {
        for i, val := range sl {
            fmt.Printf("sl[%d] = %d\n", i, val)
        }
    }

    func main() {
        var arr [5]int
        
        sl1 := arr[1:4]  // sl1 refers to arr[1], arr[2], arr[3]
        display(sl1)
        
        sl2 := arr[2:5]  // sl2 refers to arr[2], arr[3], arr[4]
        display(sl2)
    }

Unlike arrays, the length of a slice is *not* part of its type, so the
``display`` function can be passed any ``int`` slice.


Slice Notation in General
-------------------------

Suppose that ``arr`` is of length ``n``, and consider this statement::

    s := arr[a:b]

Since the indices of the underlying array ``arr`` start at 0 and end at ``n``
- 1, a few basic restrictions on ``a`` and ``b`` are clear:

- Both ``a`` and ``b`` must be non-negative. This is because the 0 is the 
  first index for the underlying array (all Go_ arrays start at 0). 

  You will get either a compiler error or a run-time error if you use negative
  indices. For example, this causes a compiler error::

    var arr [5]int
    
    s := arr[-2:4]  // compiler error: negative index not allowed in a slice

  Or sometimes you get a run-time error like this::

    var arr [5]int
    
    i := -2
    
    s := arr[i:4]      // runtime error when this line is executed
    
    fmt.Println(s[0])  // s must be used somewhere to avoid an 
                       // "s is unused error"

  It's unfortunate that Go_ can't recognize this indexing error at compile
  time. But, in general, the compiler can't always know the value of a
  variable before the program runs. For example, if ``i`` was read from the
  user, then the value of ``i`` is not known until the user types it.

- Both ``a`` and ``b`` must be less than, or equal to, ``n``. Again, this is a
  consequence of the fact that the underlying array has ``n`` elements.

  An important detail is that a slice may refer to location ``n``, even
  through ``arr[n]`` does not exist. For example::

    var arr [5]int
    
    s := arr[0:5]   // okay: s is the entire array

    s = arr[0:4]    // okay: s is the first 4 elements of the array

    s = arr[5:5]    // okay: s is the empty slice, []

  Note that we use ``:=`` only for the first assignment to ``s``. That's
  because ``:=`` assigns *and* creates the variable, and it would be an error
  to create ``s`` more than once.

- The first index, ``a``, must be less than, or equal to, ``b``. Otherwise, 
  you get an inverted slice error, e.g.::

    var arr [5]int
    
    s := arr[3:1]  // compiler error: slice indices are inverted

To summarize, if ``arr`` is an array of non-negative length ``n``, then the
slice ``arr[a:b]`` is a valid slice if 0 <= ``a`` <= ``b`` <= ``n``.

Finally, there are a couple of slice indexing shorthands that are useful to
know. Assuming ``arr`` is of length ``n``, then:

- ``arr[a:]`` is the same as ``arr[a:n]``

- ``arr[:b]`` is the same as ``arr[0:b]``

- ``arr[:]`` is the same as ``arr[a:n]``


Slices of Slices
----------------

We've seen that you can create a slice from an array. You can also create a
slice from another slice. For example::

    var arr [5]int
    
    for i := range arr { arr[i] = i }
    
    s1 := arr[1:4]  // a slice of an array
    s2 := s1[0:2]   // a slice of a slice
    
    fmt.Printf("s1 is %v\n", s1)  // %v prints the value of a Go object
    fmt.Printf("s2 is %v\n", s2)  // in a standard notation

This prints::

    s1 is [1 2 3]
    s2 is [1 2]

Both slices refer to the same underlying array, so if that changes, the slices
change too::

    // ... code same as above ...

    arr[1] = 77
    
    fmt.Printf("\ns1 is %v\n", s1)
    fmt.Printf("s2 is %v\n", s2)

The entire program prints::

    s1 is [1 2 3]
    s2 is [1 2]
    
    s1 is [77 2 3]
    s2 is [77 2]

We can also change the underlying array through a slice, e.g.::

    // ... code same as above two examples ...

    s2[0] = -55

    fmt.Printf("\narr is %v\n", arr)
    fmt.Printf("s1 is %v\n", s1)
    fmt.Printf("s2 is %v\n", s2)

The entire program now prints::

    s1 is [1 2 3]
    s2 is [1 2]

    s1 is [77 2 3]
    s2 is [77 2]

    arr is [0 -55 2 3 4]
    s1 is [-55 2 3]
    s2 is [-55 2]


How Slices are Implemented
--------------------------

It's instructive to see how slices are implemented in Go_, as that can help
explain their behaviour. Conceptually, a slice could be implemented like
this::

    type sliceHeader struct {
        Length        int
        Capacity      int
        ZerothElement *int     // pointer to an int
    }

**Note:** We can't mimic slices perfectly because every variable in a Go_
structure must have an explicit type. Here, we've given ``ZerothElement`` the
type ``*int``, so ``sliceHeader`` can only refer to arrays of ``int``\ s. In
general, the ``ZerothElement`` must be of type ``T*``, where ``T`` is the type
of the elements in the underlying array. Go_ currently has no way of
expressing ``T``: that would require something like templates as in C++ or
Java.

Importantly, a ``sliceHeader`` uses a small, fixed amount of memory. It
consists of two integers and a pointer. The elements of the slice are stored
in the underlying array. Thus, the cost of creating or copying a slice is
extremely low.

The capacity of a slice is the number of elements in the underlying array from
the start of the slice to the end of the array (not the end of the slice). The
function ``cap`` returns a slice's capacity::

    var arr [5]int
    
    s := arr[1:4]  // a slice of an array

    fmt.Printf("length of s is %d\n", len(s))
    fmt.Printf("capacity of s is %d\n", cap(s))

This prints::

    length of s is 3
    capacity of s is 4

So why does a slice store the capacity? The capacity of a slice is used for
extending its size. Basically, you can extend the size of a slice up to its
capacity. If you extend it past its capacity, you get an error.

Consider this function for appending an integer onto the end of a slice::

    func extend(slice []int, x int) []int {
        n := len(slice)
        slice = slice[0 : n+1]  // can only slice up to a slice's capacity 
        slice[n] = x
        return slice
    }

Then run this code::

    func main() {
        var arr [5]int

        s := arr[0:0]  // initially empty slice: length 0, capacity 5 

        s = extend(s, 1)
        s = extend(s, 2)
        s = extend(s, 3)
        s = extend(s, 4)
        s = extend(s, 5)

        fmt.Printf("s is %v\n", s)
        
        s = extend(s, 6)  // runtime error: slice bounds out of range
    }

The sixth time ``extend`` is called gives a runtime error because there is no
more space in the slice's underlying array. A slice can never be longer than
the array it refers to.


Passing a Slice to a Function
-----------------------------

When you pass an array to a function, the array is passed by value, i.e. a
copy of the array is made. This is both inefficient for large arrays, and also
a limitation because the function cannot modify the original array (it can
only modify the copy).

Slices behave differently. When you pass a slice to a function, the slice data
structure is copied, but the underlying array is *not* copied. The copied
slice points to the same underlying array as the original slice, and so both
slices access the same array. Thus passing a slice to a function is not only
efficient, but it lets you modify the underlying array.

For example::

    func add1(s []int) {     // s is slice
        for i := range s {
            s[i]++
        }
    }

    func main() { 
        var arr [5]int      // arr is an array
        s := arr[0:5]

        add1(s)

        for i, val := range s {
            fmt.Printf("s[%d] = %d\n", i, val)
        }
    }

This prints::

    s[0] = 1
    s[1] = 1
    s[2] = 1
    s[3] = 1
    s[4] = 1


Making Slices with make
-----------------------

We've seen that one way to create a slice is directly from an array. Another
option is to use Go's ``make`` function like this::

    s := make([]int, 10, 15)  // makes an int slice of length 10, capacity 15

This statement creates both an array of ints of length 15, and a slice of
length 10 (and capacity 15) that refers to that array.

If you want to create a slice whose length and capacity are equal, you don't
need to pass the third parameter to ``make``::

    s := make([]int, 10)  // same as make([]int, 10, 10)

If you need to increase the length of a slice past its capacity, your only
choice is to create a new slice. This is usually done using the ``append``
function  discussed below, but here we will do it "by hand" using ``make``::

    s := make([]int, 10, 15)

    fmt.Printf("length of s is %d\n", len(s))
    fmt.Printf("capacity of s is %d\n", cap(s))

    // make a new slice with the same length of s but twice the capacity
    s2 := make([]int, len(s), 2 * cap(s))
    
    for i := range s {  // copy values of s into s2
        s2[i] = s[i]
    }

    s = s2
    fmt.Printf("\nlength of s is %d\n", len(s))
    fmt.Printf("capacity of s is %d\n", cap(s))


The copy Function
-----------------

In the example code above we copied the values from one slice to another using
a loop. Copying slices is common enough in Go_ that a built-in copying
function is provided. Conveniently, it's called ``copy``::

    s1 := []int{1, 2, 3}                     // slice literal
    s2 := make([]int, len(s1), 2 * cap(s1))

    copy(s2, s1)  // copy values from s1 into s2

    fmt.Printf("%v\n", s2)  // prints: [1 2 3]

Notice that ``s1`` is initialized using a slice literal, which is yet another
way to create a slice. If we wrote ``[3]int{1, 2, 3}``, then it would instead
be an array literal.

Not only is ``copy`` easier to type than a loop, but it correctly handles a
few special cases. First, the number of values copied by ``copy`` equals the
minimum of the lengths of its two arguments. For example::

    s1 := []int{1, 2, 3}   // slice literal
    s2 := make([]int, 2)   // slice of length 2, capacity 2

    copy(s2, s1)  // copy values from s1 into s2

    fmt.Printf("s2 = %v\n", s2) // prints: [1 2]

The other thing ``copy`` does is to correctly handle over-lapping slices
within the same underlying array. This lets you use ``copy`` to move items
around inside a slice. For example::

    s := []int{0, 1, 2, 3, 4}   // s is a slice of ints

    copy(s[3:5], s[2:4])        // move [2 3] one position up in s

    fmt.Printf("s = %v\n", s)   // prints: [0 1 2 2 3]


The append Function
-------------------

What happens if you want a slice to grow longer than it's capacity? As
mentioned above, once a slice is created it's capacity can't change because
it's limited by the length of the underlying array (which can't change). So
you need to create a new, longer slice, and this is what the built-in
``append`` function does (when necessary).

For example::

    s := []int{0, 1, 2, 3, 4}

    fmt.Printf("len(s) is %d\n", len(s))
    fmt.Printf("cap(s) is %d\n", cap(s))

    s = append(s, 5)  // append one new element onto the end of s

    fmt.Printf("\ns = %v\n", s)
    fmt.Printf("len(s) is %d\n", len(s))
    fmt.Printf("cap(s) is %d\n", cap(s))

Running this code prints this::

    len(s) is 5
    cap(s) is 5

    s = [0 1 2 3 4 5]
    len(s) is 6
    cap(s) is 12

Appending 1 new element to ``s`` increases its length by 1, but its capacity
has increased from 5 to 12. That's because when ``append`` is forced to
increase the slice's capacity, it allocates more memory than is strictly
needed so that future appends will be faster. Note that `the documentation for
append <https://golang.org/pkg/builtin/#append>`_ does **not** say exactly how
much the capacity of the array will be increased, and so it is possible that
different versions of Go_ might print different capacities in this example.

The implementation of ``append`` is something like this::

    func Append(s []int, elements ...int) []int {
        n := len(s)
        total := len(s) + len(elements)
        if total > cap(s) {
            // reallocate
            newSize := 2 * total + 1
            newSlice := make([]int, total, newSize)
            copy(newSlice, s)
            s = newSlice
        }
        s = s[:total]
        copy(s[n:], elements)
        return s
    }

(This only works with ``int`` slices, while the built-in ``append`` works with
any type of slice. But the essential code is the same.)

As you can see, a new slice is created only when there is no more capacity in
``s`` to hold all everything in ``elements``. Thus appending is often
extremely efficient, and a new slice is only created when necessary.

One new feature used here is the argument type ``...int``. This lets you pass
in a single ``int``, or an entire slice of ints. For example::

    s := []int{0, 1, 2, 3, 4}
    t := []int{5, 6, 7}

    s = append(s, t...)  // the ... is necessary

    fmt.Printf("\ns = %v\n", s)
    fmt.Printf("len(s) is %d\n", len(s))
    fmt.Printf("cap(s) is %d\n", cap(s))


Exercises
---------

#. Write a function called ``shrink(s []int)`` that returns a new slice with 
   the same elements as ``s``, except the capacity of the newly returned 
   slice is the same as its length.

.. _Go: http://golang.org/