Slices and Arrays in Go ======================= (The following is based largely on http://blog.golang.org/slices.) Like most programming languages, Go_ has a data structure called an *array* that stores a sequence of objects of the same type. Less commonly, Go_ has a second --- and more useful --- data structure called a *slice* that lets you refer to a contiguous sub-sequence of elements from an underlying array. Arrays ------ Arrays are a fundamental data structure in Go_. For instance:: // declares arr to be an array of 5 ints, all initialized to 0 var arr [5]int for i, val := range arr { // print arr fmt.Printf("arr[%d] = %d\n", i, val) } This prints:: arr[0] = 0 arr[1] = 0 arr[2] = 0 arr[3] = 0 arr[4] = 0 When you create arrays, Go_ always initializes its element to the zero-value for the array type. You can also create an array using an *array literal*, e.g.:: var arr = [5]int{1, 2, 3, 4, 5} // array literal Ranged for-loops (i.e. loops that use the keyword ``range`` as above) are often the nicest way to traverse arrays in Go_. You can also use a more traditional C-style loop:: for i := 0; i < len(arr); i++ { fmt.Printf("arr[%d] = %d\n", i, arr[i]) } The ``len`` function returns the length of an array, while the usual square- bracket notion is used to access individual elements at a given position. The first index position of a Go_ array is always 0. You can't change the size of an array in Go_. Once an array is created, it's length remains fixed until it is disposed of by the garbage collector. You can pass an array to a function. As with all parameters, Go_ passes arrays by value (i.e. it makes a copy of the array):: func add1(arr [5]int) { for i := range arr { arr[i]++ } } Now call it with this code:: var arr [5]int // arr is initialized to all 0s add1(arr) // add 1 to every element of arr for i, val := range arr { // print the results fmt.Printf("arr[%d] = %d\n", i, val) } This prints:: arr[0] = 0 arr[1] = 0 arr[2] = 0 arr[3] = 0 arr[4] = 0 Nothing has changed because ``add1`` modified the *copy* of ``arr``, not the original. Copying an array is generally inefficient, and one way around this is to re- write ``add1`` to accept a pointer to the array, e.g.:: func add1(arr *[5]int) { for i := range *arr { (*arr)[i]++ } } Then you call it like this:: var arr [5]int add1(&arr) // note the & for i, val := range arr { fmt.Printf("arr[%d] = %d\n", i, val) } Now the original ``arr`` is changed:: arr[0] = 1 arr[1] = 1 arr[2] = 1 arr[3] = 1 arr[4] = 1 The code for this is rather messy. It's easy to forget a ``*`` or ``&``. Arrays have a significant limitation. Consider this slight change:: var arr [6]int // 6 elements instead of 5 add1(&arr) for i, val := range arr { fmt.Printf("arr[%d] = %d\n", i, val) } This doesn't even compile! You get a type mis-match error because ``add1`` expects an array of length 5, but we are passing it an array of length 6. In Go_, the length of an array is part of the array's type, so arrays of different lengths have different types. We get the same error in the version without pointers:: func add1(arr [5]int) { for i := range arr { arr[i]++ } } func main() { var arr [6]int // 6 elements instead of 5 add1(arr) for i, val := range arr { fmt.Printf("arr[%d] = %d\n", i, val) } } Again, this is a type mis-match error: you can't pass a 6-element array to a function expecting a 5-element array because an array's length is part of its type. This is a big problem with arrays. It essentially makes it impossible for us to write one function that works on arrays of different sizes. Go_ addresses these problems with arrays by introducing a new data structure called a *slice*. Slicing an Array ---------------- A **slice** is a data structure that describes a contiguous sequence of elements in an array. A slice is *not* an array, but for most purposes it acts like one. One way to make a slice is from an array:: var arr [5]int // an array with 5 elements var sl []int = arr[1:4] // a slice referring to arr[1], arr[2], arr[3] Here, ``sl`` is a slice, and it's type is ``[]int``, read "slice of ints". We can also define ``sl`` like this:: var sl = arr[1:4] // type of sl inferred from arr[1:4] Or this:: sl := arr[1:4] // can only use := if this line is in a function You can use a slice pretty much the same way that you use an array:: var arr [5]int for i := range arr { // arr is {0, 1, 2, 3, 4} arr[i] = i } sl := arr[1:4] // sl is a slice of length 3 for i := 0; i < len(sl); i++ { // print sl fmt.Printf("sl[%d] = %d\n", i, sl[i]) } It's important to understand that a slice always refers to an underlying array, and if that array changes, then so to does the slice:: arr[1] = -100 // change second element of arr for i := 0; i < len(sl); i++ { // print sl fmt.Printf("sl[%d] = %d\n", i, sl[i]) } This prints:: sl[0] = -100 sl[1] = 2 sl[2] = 3 You can pass slices to functions like this:: // sl can be any int slice of any length func display(sl []int) { for i, val := range sl { fmt.Printf("sl[%d] = %d\n", i, val) } } func main() { var arr [5]int sl1 := arr[1:4] // sl1 refers to arr[1], arr[2], arr[3] display(sl1) sl2 := arr[2:5] // sl2 refers to arr[2], arr[3], arr[4] display(sl2) } Unlike arrays, the length of a slice is *not* part of its type, so the ``display`` function can be passed any ``int`` slice. Slice Notation in General ------------------------- Suppose that ``arr`` is of length ``n``, and consider this statement:: s := arr[a:b] Since the indices of the underlying array ``arr`` start at 0 and end at ``n`` - 1, a few basic restrictions on ``a`` and ``b`` are clear: - Both ``a`` and ``b`` must be non-negative. This is because the 0 is the first index for the underlying array (all Go_ arrays start at 0). You will get either a compiler error or a run-time error if you use negative indices. For example, this causes a compiler error:: var arr [5]int s := arr[-2:4] // compiler error: negative index not allowed in a slice Or sometimes you get a run-time error like this:: var arr [5]int i := -2 s := arr[i:4] // runtime error when this line is executed fmt.Println(s[0]) // s must be used somewhere to avoid an // "s is unused error" It's unfortunate that Go_ can't recognize this indexing error at compile time. But, in general, the compiler can't always know the value of a variable before the program runs. For example, if ``i`` was read from the user, then the value of ``i`` is not known until the user types it. - Both ``a`` and ``b`` must be less than, or equal to, ``n``. Again, this is a consequence of the fact that the underlying array has ``n`` elements. An important detail is that a slice may refer to location ``n``, even through ``arr[n]`` does not exist. For example:: var arr [5]int s := arr[0:5] // okay: s is the entire array s = arr[0:4] // okay: s is the first 4 elements of the array s = arr[5:5] // okay: s is the empty slice, [] Note that we use ``:=`` only for the first assignment to ``s``. That's because ``:=`` assigns *and* creates the variable, and it would be an error to create ``s`` more than once. - The first index, ``a``, must be less than, or equal to, ``b``. Otherwise, you get an inverted slice error, e.g.:: var arr [5]int s := arr[3:1] // compiler error: slice indices are inverted To summarize, if ``arr`` is an array of non-negative length ``n``, then the slice ``arr[a:b]`` is a valid slice if 0 <= ``a`` <= ``b`` <= ``n``. Finally, there are a couple of slice indexing shorthands that are useful to know. Assuming ``arr`` is of length ``n``, then: - ``arr[a:]`` is the same as ``arr[a:n]`` - ``arr[:b]`` is the same as ``arr[0:b]`` - ``arr[:]`` is the same as ``arr[a:n]`` Slices of Slices ---------------- We've seen that you can create a slice from an array. You can also create a slice from another slice. For example:: var arr [5]int for i := range arr { arr[i] = i } s1 := arr[1:4] // a slice of an array s2 := s1[0:2] // a slice of a slice fmt.Printf("s1 is %v\n", s1) // %v prints the value of a Go object fmt.Printf("s2 is %v\n", s2) // in a standard notation This prints:: s1 is [1 2 3] s2 is [1 2] Both slices refer to the same underlying array, so if that changes, the slices change too:: // ... code same as above ... arr[1] = 77 fmt.Printf("\ns1 is %v\n", s1) fmt.Printf("s2 is %v\n", s2) The entire program prints:: s1 is [1 2 3] s2 is [1 2] s1 is [77 2 3] s2 is [77 2] We can also change the underlying array through a slice, e.g.:: // ... code same as above two examples ... s2[0] = -55 fmt.Printf("\narr is %v\n", arr) fmt.Printf("s1 is %v\n", s1) fmt.Printf("s2 is %v\n", s2) The entire program now prints:: s1 is [1 2 3] s2 is [1 2] s1 is [77 2 3] s2 is [77 2] arr is [0 -55 2 3 4] s1 is [-55 2 3] s2 is [-55 2] How Slices are Implemented -------------------------- It's instructive to see how slices are implemented in Go_, as that can help explain their behaviour. Conceptually, a slice could be implemented like this:: type sliceHeader struct { Length int Capacity int ZerothElement *int // pointer to an int } **Note:** We can't mimic slices perfectly because every variable in a Go_ structure must have an explicit type. Here, we've given ``ZerothElement`` the type ``*int``, so ``sliceHeader`` can only refer to arrays of ``int``\ s. In general, the ``ZerothElement`` must be of type ``T*``, where ``T`` is the type of the elements in the underlying array. Go_ currently has no way of expressing ``T``: that would require something like templates as in C++ or Java. Importantly, a ``sliceHeader`` uses a small, fixed amount of memory. It consists of two integers and a pointer. The elements of the slice are stored in the underlying array. Thus, the cost of creating or copying a slice is extremely low. The capacity of a slice is the number of elements in the underlying array from the start of the slice to the end of the array (not the end of the slice). The function ``cap`` returns a slice's capacity:: var arr [5]int s := arr[1:4] // a slice of an array fmt.Printf("length of s is %d\n", len(s)) fmt.Printf("capacity of s is %d\n", cap(s)) This prints:: length of s is 3 capacity of s is 4 So why does a slice store the capacity? The capacity of a slice is used for extending its size. Basically, you can extend the size of a slice up to its capacity. If you extend it past its capacity, you get an error. Consider this function for appending an integer onto the end of a slice:: func extend(slice []int, x int) []int { n := len(slice) slice = slice[0 : n+1] // can only slice up to a slice's capacity slice[n] = x return slice } Then run this code:: func main() { var arr [5]int s := arr[0:0] // initially empty slice: length 0, capacity 5 s = extend(s, 1) s = extend(s, 2) s = extend(s, 3) s = extend(s, 4) s = extend(s, 5) fmt.Printf("s is %v\n", s) s = extend(s, 6) // runtime error: slice bounds out of range } The sixth time ``extend`` is called gives a runtime error because there is no more space in the slice's underlying array. A slice can never be longer than the array it refers to. Passing a Slice to a Function ----------------------------- When you pass an array to a function, the array is passed by value, i.e. a copy of the array is made. This is both inefficient for large arrays, and also a limitation because the function cannot modify the original array (it can only modify the copy). Slices behave differently. When you pass a slice to a function, the slice data structure is copied, but the underlying array is *not* copied. The copied slice points to the same underlying array as the original slice, and so both slices access the same array. Thus passing a slice to a function is not only efficient, but it lets you modify the underlying array. For example:: func add1(s []int) { // s is slice for i := range s { s[i]++ } } func main() { var arr [5]int // arr is an array s := arr[0:5] add1(s) for i, val := range s { fmt.Printf("s[%d] = %d\n", i, val) } } This prints:: s[0] = 1 s[1] = 1 s[2] = 1 s[3] = 1 s[4] = 1 Making Slices with make ----------------------- We've seen that one way to create a slice is directly from an array. Another option is to use Go's ``make`` function like this:: s := make([]int, 10, 15) // makes an int slice of length 10, capacity 15 This statement creates both an array of ints of length 15, and a slice of length 10 (and capacity 15) that refers to that array. If you want to create a slice whose length and capacity are equal, you don't need to pass the third parameter to ``make``:: s := make([]int, 10) // same as make([]int, 10, 10) If you need to increase the length of a slice past its capacity, your only choice is to create a new slice. This is usually done using the ``append`` function discussed below, but here we will do it "by hand" using ``make``:: s := make([]int, 10, 15) fmt.Printf("length of s is %d\n", len(s)) fmt.Printf("capacity of s is %d\n", cap(s)) // make a new slice with the same length of s but twice the capacity s2 := make([]int, len(s), 2 * cap(s)) for i := range s { // copy values of s into s2 s2[i] = s[i] } s = s2 fmt.Printf("\nlength of s is %d\n", len(s)) fmt.Printf("capacity of s is %d\n", cap(s)) The copy Function ----------------- In the example code above we copied the values from one slice to another using a loop. Copying slices is common enough in Go_ that a built-in copying function is provided. Conveniently, it's called ``copy``:: s1 := []int{1, 2, 3} // slice literal s2 := make([]int, len(s1), 2 * cap(s1)) copy(s2, s1) // copy values from s1 into s2 fmt.Printf("%v\n", s2) // prints: [1 2 3] Notice that ``s1`` is initialized using a slice literal, which is yet another way to create a slice. If we wrote ``[3]int{1, 2, 3}``, then it would instead be an array literal. Not only is ``copy`` easier to type than a loop, but it correctly handles a few special cases. First, the number of values copied by ``copy`` equals the minimum of the lengths of its two arguments. For example:: s1 := []int{1, 2, 3} // slice literal s2 := make([]int, 2) // slice of length 2, capacity 2 copy(s2, s1) // copy values from s1 into s2 fmt.Printf("s2 = %v\n", s2) // prints: [1 2] The other thing ``copy`` does is to correctly handle over-lapping slices within the same underlying array. This lets you use ``copy`` to move items around inside a slice. For example:: s := []int{0, 1, 2, 3, 4} // s is a slice of ints copy(s[3:5], s[2:4]) // move [2 3] one position up in s fmt.Printf("s = %v\n", s) // prints: [0 1 2 2 3] The append Function ------------------- What happens if you want a slice to grow longer than it's capacity? As mentioned above, once a slice is created it's capacity can't change because it's limited by the length of the underlying array (which can't change). So you need to create a new, longer slice, and this is what the built-in ``append`` function does (when necessary). For example:: s := []int{0, 1, 2, 3, 4} fmt.Printf("len(s) is %d\n", len(s)) fmt.Printf("cap(s) is %d\n", cap(s)) s = append(s, 5) // append one new element onto the end of s fmt.Printf("\ns = %v\n", s) fmt.Printf("len(s) is %d\n", len(s)) fmt.Printf("cap(s) is %d\n", cap(s)) Running this code prints this:: len(s) is 5 cap(s) is 5 s = [0 1 2 3 4 5] len(s) is 6 cap(s) is 12 Appending 1 new element to ``s`` increases its length by 1, but its capacity has increased from 5 to 12. That's because when ``append`` is forced to increase the slice's capacity, it allocates more memory than is strictly needed so that future appends will be faster. Note that `the documentation for append <https://golang.org/pkg/builtin/#append>`_ does **not** say exactly how much the capacity of the array will be increased, and so it is possible that different versions of Go_ might print different capacities in this example. The implementation of ``append`` is something like this:: func Append(s []int, elements ...int) []int { n := len(s) total := len(s) + len(elements) if total > cap(s) { // reallocate newSize := 2 * total + 1 newSlice := make([]int, total, newSize) copy(newSlice, s) s = newSlice } s = s[:total] copy(s[n:], elements) return s } (This only works with ``int`` slices, while the built-in ``append`` works with any type of slice. But the essential code is the same.) As you can see, a new slice is created only when there is no more capacity in ``s`` to hold all everything in ``elements``. Thus appending is often extremely efficient, and a new slice is only created when necessary. One new feature used here is the argument type ``...int``. This lets you pass in a single ``int``, or an entire slice of ints. For example:: s := []int{0, 1, 2, 3, 4} t := []int{5, 6, 7} s = append(s, t...) // the ... is necessary fmt.Printf("\ns = %v\n", s) fmt.Printf("len(s) is %d\n", len(s)) fmt.Printf("cap(s) is %d\n", cap(s)) Exercises --------- #. Write a function called ``shrink(s []int)`` that returns a new slice with the same elements as ``s``, except the capacity of the newly returned slice is the same as its length. .. _Go: http://golang.org/