.. highlight:: c++

Chapter 7 Notes
===============

Please read chapter 7 of the textbook.

Note that we have already covered vectors and strings (chapter 8), and so in these notes we will occasionally make reference to them.


Arrays
------

an array in C++ (and C) is, essentially, a contiguous chunk of memory that can hold a fixed number of values of a specific type

arrays are similar to vectors, but much lower-level

indeed, the C++ ``vector`` class itself is typically implemented using arrays

if you are using C (instead of C++), then you must use arrays since C does not have the ``vector`` class (or the ``string`` class)

as we will see, C arrays are, in a sense, very simple and thus easy to use

but they are also very error-prone, and for many (perhaps most) applications ``vector`` (and ``string``) are better choices


Arrays
------

here's an example of declaring an array and printing its values ::

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

    // regular for-loop
    for(int i = 0; i < 5; ++i) {
        cout << temps[i] << "\n";
    }

    cout << "\n";

    // for-each loop
    for(double t : temps) {
        cout << t << "\n";
    }

notice how the array ``temps`` is declared using []-brackets ::

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

this declares ``temps`` to be an array of ``double`` values whose size (length) is 5

in other words, ``temps`` can hold exactly 5 ``double`` values

the ``double`` values are guaranteed to be stored contiguously in memory

and so, assuming a ``double`` is 8 bytes (i.e.
64 bits), then the array ``temps`` takes up :math:`5 \cdot 8 = 40` bytes of memory

once created, **the size of an array never changes**


Arrays
------

we could also have defined ``temps`` like this ::

    double temps[5] = {3.3, 3.1, -4, 1.5, 2.6};

being explicit like this is often a good idea, although (with g++, at least) you can run into confusing situations like this::

    // temps is size 6, but is initialized with only 5 values;
    // the remaining element, temps[5], is set to 0
    double temps[6] = {3.3, 3.1, -4, 1.5, 2.6};

sometimes giving the size explicitly is necessary, for instance if you don't know the initial values for the array, e.g. ::

    cout << "How many temperatures do you have? ";
    int n;
    cin >> n;

    // note: an array whose size is only known at run time is not
    // standard C++; g++ accepts it as an extension
    double temps[n];
    for(int i = 0; i < n; ++i) {
        cout << "temperature " << (i + 1) << " = ";
        cin >> temps[i];
    }

    for(double x : temps) {  // note the for-each loop
        cout << x << " ";
    }
    cout << "\n";


Array Indexing
--------------

arrays are indexed like vectors and strings

e.g. ``temps[0]`` is the first element, ``temps[1]`` is the second element, and so on ::

       0     1     2     3     4
     ----- ----- ----- ----- -----
    | 3.3 | 3.1 | -4  | 1.5 | 2.6 |
     ----- ----- ----- ----- -----

C++ (and C) do **not** do any range-checking for arrays

so if you write code like this you typically don't get any warning until something goes wrong in your program ::

    cout << temps[-1]   // oops: -1 is not a valid index
         << temps[5];   // oops: 5 is not a valid index

the value returned by ``temps[-1]`` is whatever happens to be in the memory location just before ``temps[0]`` ::

      -1     0     1     2     3     4     5
     ----- ----- ----- ----- ----- ----- -----
    | ??? | 3.3 | 3.1 | -4  | 1.5 | 2.6 | ??? |
     ----- ----- ----- ----- ----- ----- -----

such errors can be hard to track down because the values in the ???
positions could be different on different runs of the program

there are also security concerns with this

it's possible that, say, ``temps[5]`` is a region of memory that your program is not supposed to have access to

experience has shown that clever hackers can exploit this sort of issue to help break into computers

this particular flaw is known as a "buffer over-run", and is a well-known weakness of many C/C++ programs


Array Indexing
--------------

on the plus side, the lack of range-checking makes array accesses fast and simple

it's instructive to understand the idea of how array indexing works ::

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

when this code runs, C++ allocates :math:`5 \cdot 8 = 40` bytes of memory

``temps`` stores the address of the first element of this array ::

      654   662   670   678   686
     ----- ----- ----- ----- -----
    | 3.3 | 3.1 | -4  | 1.5 | 2.6 |
     ----- ----- ----- ----- -----

let's assume the first ``double`` is at the memory location whose address is 654

then the second ``double`` (i.e. 3.1) is at byte :math:`654 + 1 \cdot 8 = 662` (we are assuming a ``double`` is 8 bytes long)

the third ``double`` (i.e. -4) is at byte :math:`654 + 2 \cdot 8 = 670`

and so on

the exact starting address is usually different from run to run of this code, since the exact placement in memory of a program is up to your computer's operating system

for an array to work, we need to know its starting address

suppose 654 (the address of the first ``double`` in the array) is stored at memory location 879 ::

            879
           -----
    temps | 654 |
           -----

now, what is ``temps[0]``?

``temps[0]`` is the ``double`` at memory location :math:`654 + 0 \cdot 8 = 654`

``temps[1]`` is the ``double`` at memory location :math:`654 + 1 \cdot 8 = 662`

``temps[2]`` is the ``double`` at memory location :math:`654 + 2 \cdot 8 = 670`

and so on

in general, ``temps[i]`` refers to the memory location :math:`654 + i \cdot 8`

notice that this indexing scheme requires that the elements in the array all be of the same size (i.e.
8 bytes in this example)

that's important!

this won't work (as easily, at least) with values of different sizes

note also that we know to multiply ``i`` by 8 because ``temps`` stores ``double`` values, and a ``double`` is 8 bytes


The Size of an Array
--------------------

C++ arrays don't store their size

so it's up to the programmer to keep track of an array's size "by hand" ::

    // n is the number of elements in arr
    void print(double arr[], int n) {
        for(int i = 0; i < n; ++i) {
            cout << arr[i] << " ";
        }
    }

    // ...

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
    print(temps, 5);

notice that the parameter in the function header is ``double arr[]`` with no size in the []-brackets

that allows ``arr`` to be of any size

``n`` is passed as a second parameter, and it is the number of elements in ``arr``

if ``n`` is wrong, the function misbehaves: if it is too small (or negative) then some or all of the elements are skipped, and if it is greater than the true size of the array then bad values can be printed ::

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
    print(temps, 6);  // oops: 6 is not the size of temps

this will print ``temps``, plus the value at location ``temps[5]``, which has some unknown value

in contrast, C++ vectors keep track of their own size, and so (as we've seen previously), a function that prints the elements of a vector only needs the vector ::

    void print(vector<double> arr) {
        for(int i = 0; i < arr.size(); ++i) {
            cout << arr[i] << " ";
        }
    }

this is simpler, and also means you will never pass the wrong size to the function


Passing Arrays to Functions
---------------------------

consider these functions ::

    double sum(double arr[], int n) {
        double total = 0.0;
        for(int i = 0; i < n; ++i) {
            total += arr[i];
        }
        return total;
    }

    double mean(double arr[], int n) {
        return sum(arr, n) / n;
    }

    // ...

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
    cout << sum(temps, 5) << "\n"
         << mean(temps, 5) << "\n";

C++ passes ``arr`` by value

but that doesn't mean the array *elements* are copied!
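
a small sketch can make this concrete. It is only an illustration, not code from the textbook, and the helper names ``address_held_by``, ``overwrite_first``, and ``callee_shares_callers_array`` are made up for this example. It checks that the address a function receives is the address of the caller's first element, and that a write made through the parameter is visible to the caller

```cpp
// Hypothetical helper (not from the notes): returns the address that the
// parameter arr holds inside the function.
const double* address_held_by(const double arr[]) {
    return arr;
}

// Hypothetical helper: writes to the first element through the parameter.
void overwrite_first(double arr[], double value) {
    arr[0] = value;
}

// Returns true if the callee sees the caller's array rather than a copy.
bool callee_shares_callers_array() {
    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

    // the function received the address of temps[0]
    bool same_address = (address_held_by(temps) == &temps[0]);

    // a write made inside the function changes the caller's array
    overwrite_first(temps, 100.0);
    bool change_visible = (temps[0] == 100.0);

    return same_address && change_visible;
}
```

calling ``callee_shares_callers_array()`` returns ``true``: no second array exists, only a second copy of the address
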

the value stored in ``arr`` is the address of the first element of the array

it's *that* address that is copied when ``arr`` is passed by value

so that means you have access to the actual array locations

this lets you write functions like this ::

    void set_all_zero(double arr[], int n) {
        for(int i = 0; i < n; ++i) {
            arr[i] = 0.0;
        }
    }

    // ...

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
    set_all_zero(temps, 5);
    print(temps, 5);  // 0 0 0 0 0


Multi-dimensional Arrays
------------------------

``temps`` is a 1-dimensional array ::

    double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

you can create arrays with more than 1 dimension, e.g. ::

    double matrix[3][3] = {
        {1, 3, 4},
        {4, 3, 1},
        {0, 8, 2}
    };

    for(int i = 0; i < 3; ++i) {
        for(int j = 0; j < 3; ++j) {
            cout << matrix[i][j] << " ";
        }
        cout << "\n";
    }

    // 1 3 4
    // 4 3 1
    // 0 8 2

``matrix`` is an array of arrays, and so you can write code like this ::

    // sum of the 2nd row
    cout << sum(matrix[1], 3)  // sum was defined previously
         << "\n";


C-style Strings
---------------

a C-style string is an array of characters that ends with the special character ``'\0'``

they are more difficult and error-prone to work with than C++ strings


Comparing Arrays and Vectors
----------------------------

pros of C++ vectors include

- they know their size; arrays don't

- they use the same []-bracket notation as arrays for indexing

- they are dynamic, i.e. their size can shrink/grow as a program is running; arrays have a fixed size that never changes

- ``vector`` is a class, and so you can use it to quickly create your own vector-like classes; arrays are not classes

cons of C++ vectors include

- they may use more memory than necessary to store their values, and in some situations, the memory used could be twice the size of the memory needed to store the values; the use of extra memory is intentional, and is meant to make re-sizing the vector more efficient

- you can't use them in C

- if a function expects (or returns) an array, you can't directly use a vector in its place
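
the trade-offs listed above can be seen in a short sketch. The function names ``sum_array``, ``sum_vector``, and ``spare_slots`` are hypothetical, chosen only for this example, and how much spare capacity a vector reserves depends on the standard library being used, so no particular amount is assumed

```cpp
#include <cstddef>
#include <vector>

// Array version: the caller must supply the element count n, and nothing
// checks that n is correct.
double sum_array(const double arr[], int n) {
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        total += arr[i];
    }
    return total;
}

// Vector version: the vector reports its own size.
double sum_vector(const std::vector<double>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Number of allocated-but-unused element slots in v. The exact value depends
// on the library's growth policy; it is only guaranteed to be non-negative.
std::size_t spare_slots(const std::vector<double>& v) {
    return v.capacity() - v.size();
}
```

the vector version cannot be called with the wrong size, and ``v.push_back(x)`` can grow ``v`` after it has been created; the price is whatever ``spare_slots(v)`` reports, which is memory the vector holds but is not currently using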