Chapter 7 Notes¶
Please read chapter 7 of the textbook. Note that we have already covered vectors and strings (chapter 8), and so in these notes we will occasionally make reference to them.
Arrays¶
an array in C++ (and C) is, essentially, a contiguous chunk of memory that can hold a fixed number of values of a specific type
arrays are similar to vectors, but much lower-level
indeed, the C++ vector class itself is typically implemented using arrays
if you are using C (instead of C++), then you must use arrays since C does not have the vector class (or the string class)
as we will see, C arrays are, in a sense, very simple and thus easy to use
but they are also very error prone, and for many (perhaps most) applications vector (and string) are better choices
Arrays¶
here’s an example of declaring an array and printing its values
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
// regular for-loop
for(int i = 0; i < 5; ++i) {
cout << temps[i] << "\n";
}
cout << "\n";
// for-each loop
for(double t : temps) {
cout << t << "\n";
}
notice how the array temps is declared using []-brackets
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
this declares temps to be an array of double values whose size (length) is 5
in other words, temps can hold exactly 5 double values
the doubles are guaranteed to be stored contiguously in memory
and so, assuming a double is 8 bytes (i.e. 64 bits), the array temps takes up \(5 \cdot 8 = 40\) bytes of memory
once created, the size of an array never changes
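the size arithmetic above can be checked with sizeof; the following is a minimal sketch, where the function names temps_bytes and temps_length are made up for this illustration

```cpp
#include <cstddef>

// same array as in the notes
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

// total number of bytes occupied by the whole array
std::size_t temps_bytes() {
    return sizeof(temps);
}

// number of elements: total bytes divided by the bytes in one element
std::size_t temps_length() {
    return sizeof(temps) / sizeof(temps[0]);
}
```

temps_length() is 5 on any machine; temps_bytes() is 40 only under the assumption that a double is 8 bytes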
Arrays¶
we could also have defined temps like this
double temps[5] = {3.3, 3.1, -4, 1.5, 2.6};
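being explicit like this is often a good idea, although you can run into confusing situations like the following, which compiles without complaint: the array really does have 6 elements, and the sixth, which has no initial value listed, is set to 0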
// temps is size 6, but is initialized with only 5 values
double temps[6] = {3.3, 3.1, -4, 1.5, 2.6};
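sometimes giving an explicit size is necessary because you don’t know the initial values for the array, e.g. when the size is read from the user (note that an array size that is not a compile-time constant, like n below, is accepted by g++ as an extension but is not standard C++)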
cout << "How many temperatures do you have? ";
int n;
cin >> n;
double temps[n];
for(int i = 0; i < n; ++i) {
cout << "temperature " << (i + 1) << " = ";
cin >> temps[i];
}
for(double x : temps) { // note the for-each loop
cout << x << " ";
}
cout << "\n";
Array Indexing¶
arrays are indexed like vectors and strings
e.g. temps[0] is the first element, temps[1] is the second element, and so on
0 1 2 3 4
----- ----- ----- ----- -----
| 3.3 | 3.1 | -4 | 1.5 | 2.6 |
----- ----- ----- ----- -----
C++ (and C) do not do any range-checking for arrays
so if you write code like this you typically don’t get any warning until something goes wrong in your program
cout << temps[-1] // oops: -1 is not a valid index
<< temps[5]; // oops: 5 is not a valid index
the value returned by temps[-1] is whatever happens to be in the memory location just before temps[0]
-1 0 1 2 3 4 5
----- ----- ----- ----- ----- ----- -----
| ??? | 3.3 | 3.1 | -4 | 1.5 | 2.6 | ??? |
----- ----- ----- ----- ----- ----- -----
such errors can be hard to track down because the values in the ??? positions could be different on different runs of the program
there are also security concerns with this
it’s possible that, say, temps[5] is a region of memory that your program is not supposed to have access to
experience has shown that clever hackers can exploit this sort of issue to help break into computers
this particular flaw is known as a “buffer over-run”, and is a well-known weakness of many C/C++ programs
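one way to guard against out-of-range indexes is to check them by hand before touching the array; the sketch below assumes the caller passes the true size n, and checked_get is a hypothetical helper name (vector offers a similar check through its at function)

```cpp
#include <stdexcept>

// returns arr[i], but throws if i is outside the range 0 .. n-1
// (checked_get is a hypothetical helper; n must be the true size of arr)
double checked_get(const double arr[], int n, int i) {
    if (i < 0 || i >= n) {
        throw std::out_of_range("array index out of range");
    }
    return arr[i];
}
```

the check costs a little time on every access, which is the price that plain array indexing avoids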
Array Indexing¶
on the plus side, the lack of range-checking makes array accesses fast and simple
it’s instructive to understand the idea of how array indexing works
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
when this code runs, C++ allocates \(5 \cdot 8 = 40\) bytes of memory
temps stores the address of the first element of this array
654 662 670 678 686
----- ----- ----- ----- -----
| 3.3 | 3.1 | -4 | 1.5 | 2.6 |
----- ----- ----- ----- -----
let’s assume the first double (i.e. 3.3) is at the memory location whose address is 654
then the second double (i.e. 3.1) is at byte \(654 + 8 = 662\) (we are assuming a double is 8 bytes long)
the third double (i.e. -4) is at byte \(654 + 2 \cdot 8 = 670\)
and so on
the exact starting address is usually different from run to run of this code, since the exact placement in memory of a program is up to your computer’s operating system
for an array to work, we need to know its starting address
suppose 654 (the address of the first double in the array) is stored at memory location 879
879
-----
temps | 654 |
-----
now, what is temps[0]?
temps[0] is the double at memory location \(654 + 0 \cdot 8 = 654\)
temps[1] is the double at memory location \(654 + 1 \cdot 8 = 662\)
temps[2] is the double at memory location \(654 + 2 \cdot 8 = 670\)
and so on
in general, temps[i] refers to the memory location \(654 + i \cdot 8\)
notice that this indexing scheme requires that the elements in the array all be of the same size (i.e. 8 bytes in this example)
that’s important!
this won’t work (as easily, at least) with values of different sizes
note also that we know to multiply i by 8 because temps stores double values, and a double is 8 bytes
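the address arithmetic described above can be observed directly; in this sketch, byte_offset and element_at are hypothetical helper names, and the result is written in terms of sizeof(double) rather than assuming it is 8

```cpp
#include <cstddef>

// number of bytes between the start of arr and the element arr[i]
// (byte_offset is a hypothetical helper name)
std::size_t byte_offset(const double arr[], int i) {
    const char* start = reinterpret_cast<const char*>(arr);
    const char* elem = reinterpret_cast<const char*>(&arr[i]);
    return static_cast<std::size_t>(elem - start);
}

// the same element as arr[i], written as explicit pointer arithmetic;
// arr + i moves forward by i elements, i.e. by i * sizeof(double) bytes
double element_at(const double arr[], int i) {
    return *(arr + i);
}
```

for the temps array, byte_offset(temps, 2) is 2 * sizeof(double), which is 16 when a double is 8 bytes, matching \(670 - 654 = 16\) in the example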
The Size of an Array¶
C++ arrays don’t store their size
so it’s up to the programmer to keep track of an array’s size “by hand”
// n is the number of elements in arr
void print(double arr[], int n) {
for(int i = 0; i < n; ++i) {
cout << arr[i] << " ";
}
}
// ...
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
print(temps, 5);
notice that the parameter in the function header is double arr[] with no size in the []-brackets
that allows arr to be of any size
n is passed as a second parameter, and it is the number of elements in arr
if n is greater than the true size of the array, then the loop reads past the end of the array and bad values can be printed (if n is zero or negative, the loop body never runs and nothing is printed)
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
print(temps, 6); // oops: 6 is not the size of temps
this will print temps, plus the value at location temps[5], which has some unknown value
in contrast, C++ vectors keep track of their own size, and so (as we’ve seen previously), a function that prints the elements of a vector only needs the vector
void print(vector<double> arr) {
for(int i = 0; i < arr.size(); ++i) {
cout << arr[i] << " ";
}
}
this is simpler, and also means you will never pass the wrong size to the function
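for an array that is still in the scope where it was declared, the compiler does know the size, and a template can recover it; this is a sketch that goes a little beyond the chapter, and array_length is a hypothetical helper name

```cpp
#include <cstddef>

// N is deduced by the compiler from the declared type of the array
// (array_length is a hypothetical helper name)
template <std::size_t N>
std::size_t array_length(const double (&)[N]) {
    return N;
}
```

this only works on a real array variable such as temps; inside a function with a parameter like double arr[], the size information is already gone, which is why n has to be passed separately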
Passing Arrays to Functions¶
consider these functions
double sum(double arr[], int n) {
double total = 0.0;
for(int i = 0; i < n; ++i) {
total += arr[i];
}
return total;
}
double mean(double arr[], int n) {
return sum(arr, n) / n;
}
// ...
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
cout << sum(temps, 5) << "\n"
<< mean(temps, 5) << "\n";
C++ passes arr by value
but that doesn’t mean the array elements are copied!
the value stored in arr is the address of the first element of arr
it’s that address that is copied when arr is passed by value
so that means you have access to the actual array locations
this lets you write functions like this
void set_all_zero(double arr[], int n) {
for(int i = 0; i < n; ++i) {
arr[i] = 0.0;
}
}
// ...
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
set_all_zero(temps, 5);
print(temps, 5); // 0 0 0 0 0
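since only an address is passed, an array parameter can equally be written as a pointer, and const can be added when a function should only read the elements; a minimal sketch, where scale and largest are hypothetical helper names

```cpp
// in a parameter list, double* arr means the same thing as double arr[]
// (scale is a hypothetical helper name)
void scale(double* arr, int n, double factor) {
    for (int i = 0; i < n; ++i) {
        arr[i] *= factor;  // writes to the caller's array
    }
}

// const promises that this function will not modify the elements
// (largest is a hypothetical helper name; it assumes n >= 1)
double largest(const double* arr, int n) {
    double best = arr[0];
    for (int i = 1; i < n; ++i) {
        if (arr[i] > best) {
            best = arr[i];
        }
    }
    return best;
}
```

after scale(temps, 5, 2.0) every element of temps in the caller is doubled, while largest(temps, 5) leaves temps unchanged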
Multi-dimensional Arrays¶
temps is a 1-dimensional array
double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
you can create arrays with more than one dimension, e.g.
double matrix[3][3] = {
{1, 3, 4},
{4, 3, 1},
{0, 8, 2}
};
for(int i = 0; i < 3; ++i) {
for(int j = 0; j < 3; ++j) {
cout << matrix[i][j] << " ";
}
cout << "\n";
}
// 1 3 4
// 4 3 1
// 0 8 2
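a 2-dimensional array can also be passed to a function, as long as every dimension except the first is written out in the parameter; a minimal sketch, where sum_matrix is a hypothetical helper name and the column count is fixed at 3

```cpp
// sums every element of a matrix that has exactly 3 columns
// (sum_matrix is a hypothetical helper; the number of rows is passed
// by hand, but the 3 must appear in the parameter type)
double sum_matrix(double m[][3], int rows) {
    double total = 0.0;
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < 3; ++j) {
            total += m[i][j];
        }
    }
    return total;
}
```

for the matrix above, sum_matrix(matrix, 3) adds up all nine values and returns 26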
matrix is an array of arrays, and so you can write code like this
// sum of the 2nd row
cout << sum(matrix[1], 3) // sum was defined previously
<< "\n";
C-style Strings¶
a C-style string is an array of characters that ends with the special character '\0'
they are more difficult and error-prone to work with than C++ strings
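to illustrate the role of the '\0' character, here is a sketch of how the length of a C-style string can be found by scanning for it; c_string_length is a hypothetical name, and the standard library already provides std::strlen in <cstring> for the same job

```cpp
// counts the characters that come before the terminating '\0'
// (c_string_length is a hypothetical name for this illustration)
int c_string_length(const char s[]) {
    int n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}
```

note that char word[] = "cat"; creates an array of 4 characters, not 3, because the '\0' is stored too; if the '\0' is ever missing, a loop like this one runs past the end of the array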
Comparing Arrays and Vectors¶
pros of C++ vectors include
- they know their size; arrays don’t
- use the same []-bracket notation as arrays for indexing
- they are dynamic, i.e. their size can shrink/grow as a program is running; arrays have a fixed size that never changes
- vector is a class, and so you can use it to quickly create your own vector-like classes; arrays are not classes
cons of C++ vectors include
- they may use more memory than necessary to store their values, and in some situations, the memory used could be twice the size of the memory needed to store the values; the use of extra memory is intentional, and is meant to make re-sizing the vector more efficient
- you can’t use them in C
- functions written to take or return arrays cannot be given vectors directly in their place
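the last point can be softened somewhat: a vector stores its elements contiguously, so (assuming C++11 or later) its data function gives the address of the first element, which an array-style function can use; a minimal sketch, where sum_vector is a hypothetical helper name

```cpp
#include <vector>

// an array-style function in the style used throughout these notes
double sum(const double arr[], int n) {
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        total += arr[i];
    }
    return total;
}

// passes a vector's elements to the array-style function above
// (sum_vector is a hypothetical helper name)
double sum_vector(const std::vector<double>& v) {
    return sum(v.data(), static_cast<int>(v.size()));
}
```

the reverse direction is not as convenient: a function that takes a vector cannot be handed a plain array without first copying the array’s values into a vector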