Chapter 7 Notes

Please read chapter 7 of the textbook. Note that we have already covered vectors and strings (chapter 8), and so in these notes we will occasionally make reference to them.

Arrays

an array in C++ (and C) is, essentially, a contiguous chunk of memory that can hold a fixed number of values of a specific type

arrays are similar to vectors, but much lower-level

indeed, the C++ vector class itself is typically implemented using arrays

if you are using C (instead of C++), then you must use arrays since C does not have the vector class (or the string class)

as we will see, C arrays are, in a sense, very simple and thus easy to use

but they are also very error prone, and for many (perhaps most) applications vector (and string) are better choices

here’s an example of declaring an array and printing its values

double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

// regular for-loop
for(int i = 0; i < 5; ++i) {
        cout << temps[i] << "\n";
}

cout << "\n";

// for-each loop
for(double t : temps) {
        cout << t << "\n";
}

notice how the array temps is declared using []-brackets

double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

this declares temps to be an array of double values whose size (length) is 5

in other words, temps can hold exactly 5 double values

the doubles are guaranteed to be stored contiguously in memory

and so, assuming a double is 8 bytes (i.e. 64 bits), then the array temps takes up \(5 \cdot 8 = 40\) bytes of memory

once created, the size of an array never changes

we could also have defined temps like this

double temps[5] = {3.3, 3.1, -4, 1.5, 2.6};

being explicit like this is often a good idea, although you can run into confusing situations where fewer values are given than the size, and the compiler does not complain:

// temps is size 6, but is initialized with only 5 values;
// the last element, temps[5], is set to 0
double temps[6] = {3.3, 3.1, -4, 1.5, 2.6};

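sometimes an explicit size is necessary because you don’t know the initial values for the array, e.g. when they are read from the user (note that g++ accepts an array whose size is a run-time variable such as n as an extension; it is not part of standard C++)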

cout << "How many temperatures do you have? ";
int n;
cin >> n;
double temps[n];
for(int i = 0; i < n; ++i) {
        cout << "temperature " << (i + 1) << " = ";
        cin >> temps[i];
}

for(double x : temps) {  // note the for-each loop
        cout << x << " ";
}
cout << "\n";

Array Indexing

arrays are indexed like vectors and strings

e.g. temps[0] is the first element, temps[1] is the second element, and so on

   0     1     2     3     4
 ----- ----- ----- ----- -----
| 3.3 | 3.1 | -4  | 1.5 | 2.6 |
 ----- ----- ----- ----- -----

C++ (and C) do not do any range-checking for arrays

so if you write code like this you typically don’t get any warning until something goes wrong in your program

cout << temps[-1]  // oops: -1 is not a valid index
     << temps[5];  // oops: 5 is not a valid index

the value returned by temps[-1] is whatever happens to be in the memory location just before temps[0]

  -1     0     1     2     3     4     5
 ----- ----- ----- ----- ----- ----- -----
| ??? | 3.3 | 3.1 | -4  | 1.5 | 2.6 | ??? |
 ----- ----- ----- ----- ----- ----- -----

such errors can be hard to track down because the values in the ??? positions could be different on different runs of the program

there are also security concerns with this

it’s possible that, say, temps[5] is a region of memory that your program is not supposed to have access to

experience has shown that clever hackers can exploit this sort of issue to help break into computers

this particular flaw is known as a “buffer over-run”, and is a well-known weakness of many C/C++ programs

on the plus side, the lack of range-checking makes array accesses fast and simple

it’s instructive to understand the idea of how array indexing works

double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

when this code runs, C++ allocates \(5 \cdot 8 = 40\) bytes of memory

temps stores the address of the first element of this array

  654   662   670   678   686
 ----- ----- ----- ----- -----
| 3.3 | 3.1 | -4  | 1.5 | 2.6 |
 ----- ----- ----- ----- -----

let’s assume the first double (i.e. 3.3) is at the memory location whose address is 654

then the second double (i.e. 3.1) is at byte \(654 + 8 = 662\) (we are assuming a double is 8 bytes long)

the third double (i.e. -4) is at byte \(654 + 2\cdot 8 = 670\)

and so on

the exact starting address is usually different from run to run of this code, since the exact placement in memory of a program is up to your computer’s operating system

for an array to work, we need to know its starting address

suppose 654 (the address of the first double in the array) is stored at memory location 879

        879
       -----
temps | 654 |
       -----

now, what is temps[0]?

temps[0] is the double at memory location \(654 + 8 \cdot 0 = 654\)

temps[1] is the double at memory location \(654 + 1 \cdot 8 = 662\)

temps[2] is the double at memory location \(654 + 2 \cdot 8 = 670\)

and so on

in general, temps[i] refers to the memory location \(654 + i \cdot 8\)

notice that this indexing scheme requires that the elements in the array all be of the same size (i.e. 8 bytes in this example)

that’s important!

this won’t work (as easily, at least) with values of different sizes

note also that we know to multiply i by 8 because temps stores double values, and a double is 8 bytes

The Size of an Array

C++ arrays don’t store their size

so it’s up to the programmer to keep track of an array’s size “by hand”

// n is the number of elements in arr
    void print(double arr[], int n) {
            for(int i = 0; i < n; ++i) {
                    cout << arr[i] << " ";
            }
    }

// ...

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
    print(temps, 5);

notice that the parameter in the function header is double arr[] with no size in the []-brackets

that allows arr to be of any size

n is passed as a second parameter, and it is the number of elements in arr

if n is greater than the true size of the array, then bad values can be printed (if n is 0 or negative, the loop never runs and nothing is printed)

double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
print(temps, 6); // oops: 6 is not the size of temps

this will print temps, plus the value at location temps[5], which has some unknown value

in contrast, C++ vectors keep track of their own size, and so (as we’ve seen previously), a function that prints the elements of a vector only needs the vector

void print(vector<double> arr) {
        for(int i = 0; i < arr.size(); ++i) {
                cout << arr[i] << " ";
        }
}

this is simpler, and also means you will never pass the wrong size to the function

Passing Arrays to Functions

consider these functions

double sum(double arr[], int n) {
        double total = 0.0;
        for(int i = 0; i < n; ++i) {
                total += arr[i];
        }
        return total;
}

double mean(double arr[], int n) {
        return sum(arr, n) / n;
}

// ...

double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
cout << sum(temps, 5) << "\n"
     << mean(temps, 5) << "\n";

C++ passes arr by value

but that doesn’t mean the array elements are copied!

the value stored in arr is the address of the first element of arr

it’s that address that is copied when arr is passed by value

so that means you have access to the actual array locations

this lets you write functions like this

void set_all_zero(double arr[], int n) {
        for(int i = 0; i < n; ++i) {
                arr[i] = 0.0;
        }
}

// ...

double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
set_all_zero(temps, 5);
print(temps, 5);  // 0 0 0 0 0

Multi-dimensional Arrays

temps is a 1-dimensional array

double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

you can create arrays with more than 1 dimension, e.g.

double matrix[3][3] = {
        {1, 3, 4},
        {4, 3, 1},
        {0, 8, 2}
};

for(int i = 0; i < 3; ++i) {
        for(int j = 0; j < 3; ++j) {
                cout << matrix[i][j] << " ";
        }
        cout << "\n";
}

// 1 3 4
// 4 3 1
// 0 8 2

matrix is an array of arrays, and so you can write code like this

// sum of the 2nd row
cout << sum(matrix[1], 3)  // sum was defined previously
     << "\n";

C-style Strings

a C-style string is an array of characters that ends with the special character '\0'

they are more difficult and error-prone to work with than C++ strings

Comparing Arrays and Vectors

pros of C++ vectors include

  • they know their size; arrays don’t
  • use the same []-bracket notation as arrays for indexing
  • they are dynamic, i.e. their size can shrink/grow as a program is running; arrays have a fixed size that never changes
  • vector is a class, and so you can use it to quickly create your own vector-like classes; arrays are not classes

cons of C++ vectors include

  • they may use more memory than necessary to store their values, and in some situations, the memory used could be twice the size of the memory needed to store the values; the use of extra memory is intentional, and is meant to make re-sizing the vector more efficient
  • you can’t use them in C
  • in functions that pass/return arrays, you can’t directly use vectors in their place