.. highlight:: c++

Chapter 7 Notes
===============

Please read chapter 7 of the textbook. Note that we have already covered
vectors and strings (chapter 8), and so in these notes we will occasionally
make reference to them.


Arrays
------

an array in C++ (and C) is, essentially, a contiguous chunk of memory that can
hold a fixed number of values of a specific type

arrays are similar to vectors, but much lower-level

indeed, the C++ ``vector`` class itself is typically implemented using arrays

if you are using C (instead of C++), then you must use arrays since C does not
have the ``vector`` class (or the ``string`` class)

as we will see, C arrays are, in a sense, very simple and thus easy to use

but they are also very error prone, and for many (perhaps most) applications
``vector`` (and ``string``) are better choices


Arrays
------

here's an example of declaring an array and printing its values

::

	double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

	// regular for-loop
	for(int i = 0; i < 5; ++i) {
		cout << temps[i] << "\n";
	}

	cout << "\n";

	// for-each loop
	for(double t : temps) {
		cout << t << "\n";
	}

notice how the array ``temps`` is declared using []-brackets

::

	double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

this declares ``temps`` to be an array of ``double`` values whose size
(length) is 5

in other words, ``temps`` can hold exactly 5 ``double`` values

the ``double``\ s are guaranteed to be stored contiguously in memory

and so, assuming a ``double`` is 8 bytes (i.e. 64 bits), then the array
``temps`` takes up :math:`5 \cdot 8 = 40` bytes of memory

once created, **the size of an array never changes**
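
to check this arithmetic on your own machine, here is a small added sketch
(not from the textbook) that uses the ``sizeof`` operator, which reports a
size in bytes; the 40 above assumes an 8-byte ``double``, which is common but
not guaranteed

::

	#include <iostream>
	using namespace std;

	int main() {
		double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

		// sizeof gives a size in bytes
		cout << "one double:  " << sizeof(double) << " bytes\n";
		cout << "whole array: " << sizeof(temps) << " bytes\n";

		// the array is exactly 5 doubles, with nothing in between
		cout << "elements:    " << sizeof(temps) / sizeof(temps[0]) << "\n";
	}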


Arrays
------

we could also have defined ``temps`` like this

::

	double temps[5] = {3.3, 3.1, -4, 1.5, 2.6};

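being explicit like this is often a good idea, although you need to make sure
the size matches the number of initial values; if you give fewer values than
the size, the remaining elements are set to 0, which can be confusing::

	// temps is size 6, but is initialized with only 5 values,
	// so temps[5] is 0
	double temps[6] = {3.3, 3.1, -4, 1.5, 2.6};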
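
here is a short added sketch of what happens to an element that is given no
initial value: C++ sets it to 0 (the array name ``zeros`` is made up for this
illustration)

::

	#include <iostream>
	using namespace std;

	int main() {
		// only 5 initial values are given for 6 elements
		double temps[6] = {3.3, 3.1, -4, 1.5, 2.6};

		// the element with no initial value is set to 0
		cout << temps[5] << "\n";  // prints 0

		// a handy consequence: empty braces zero the entire array
		double zeros[6] = {};
		for(double z : zeros) {
			cout << z << " ";
		}
		cout << "\n";  // prints 0 0 0 0 0 0
	}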

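an explicit size is necessary if you don't know the initial values for the
array; note that in the example below the size ``n`` is not known until the
program runs, which g++ accepts as an extension but standard C++ does not
allow, e.g.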

::

	cout << "How many temperatures do you have? ";
	int n;
	cin >> n;
	double temps[n];
	for(int i = 0; i < n; ++i) {
		cout << "temperature " << (i + 1) << " = ";
		cin >> temps[i];
	}

	for(double x : temps) {  // note the for-each loop
		cout << x << " ";
	}
	cout << "\n";
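
when the size is only known while the program is running, a ``vector`` is the
standard way to get the same effect; the following is a sketch, not the
textbook's code, and ``read_temps`` is a made-up helper name

::

	#include <iostream>
	#include <sstream>
	#include <vector>
	using namespace std;

	// reads a count n, then n temperatures, from the given stream
	// (read_temps is a made-up helper for this sketch)
	vector<double> read_temps(istream& in) {
		int n = 0;
		in >> n;
		if(!in || n < 0) {
			return {};  // bad input: give back an empty vector
		}
		vector<double> temps(n);  // n elements, all initially 0
		for(int i = 0; i < n; ++i) {
			in >> temps[i];
		}
		return temps;
	}

	int main() {
		// canned input so the sketch runs unattended;
		// pass cin instead to read from the keyboard
		istringstream input("3 3.3 3.1 -4");
		vector<double> temps = read_temps(input);

		for(double x : temps) {
			cout << x << " ";
		}
		cout << "\n";  // prints 3.3 3.1 -4
	}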


Array Indexing
--------------

arrays are indexed like vectors and strings

e.g. ``temps[0]`` is the first element, ``temps[1]`` is the second element,
and so on

::

	   0     1     2     3     4
	 ----- ----- ----- ----- ----- 
	| 3.3 | 3.1 | -4  | 1.5 | 2.6 |
	 ----- ----- ----- ----- ----- 

C++ (and C) do **not** do any range-checking for arrays

so if you write code like this you typically don't get any warning until
something goes wrong in your program

::

	cout << temps[-1]  // oops: -1 is not a valid index
	     << temps[5];  // oops: 5 is not a valid index

the value returned by ``temps[-1]`` is whatever happens to be in the memory
location just before ``temps[0]``

::

	  -1     0     1     2     3     4     5
	 ----- ----- ----- ----- ----- ----- -----
	| ??? | 3.3 | 3.1 | -4  | 1.5 | 2.6 | ??? |
	 ----- ----- ----- ----- ----- ----- -----

such errors can be hard to track down because the values in the ??? positions
could be different on different runs of the program

there are also security concerns with this

it's possible that, say, ``temps[5]`` is a region of memory that your program
is *not* supposed to have access to

experience has shown that clever hackers can exploit this sort of issue to
help break into computers

this particular flaw is known as a "buffer over-run", and is a well-known
weakness of many C/C++ programs
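
for contrast, ``vector`` offers a range-checked way to get an element: the
``at`` function throws an ``out_of_range`` exception when the index is
invalid (plain ``[]`` on a vector is unchecked, just like an array); a small
added sketch

::

	#include <iostream>
	#include <stdexcept>
	#include <vector>
	using namespace std;

	int main() {
		vector<double> temps = {3.3, 3.1, -4, 1.5, 2.6};

		cout << temps.at(4) << "\n";  // fine: prints 2.6

		try {
			cout << temps.at(5) << "\n";  // 5 is not a valid index
		} catch (const out_of_range&) {
			// we end up here instead of reading unknown memory
			cout << "caught an out-of-range index\n";
		}
	}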


Array Indexing
--------------

on the plus side, the lack of range-checking makes array accesses fast and
simple

it's instructive to understand the idea of how array indexing works

::

	double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

when this code runs, C++ allocates :math:`5 \cdot 8 = 40` bytes of memory

``temps`` stores the address of the first element of this array

::

	  654   662   670   678   686
	 ----- ----- ----- ----- ----- 
	| 3.3 | 3.1 | -4  | 1.5 | 2.6 |
	 ----- ----- ----- ----- ----- 

let's assume the first ``double`` is at the memory location whose address is
654

then the second ``double`` (i.e. 3.1) is at byte :math:`654 + 8 = 662` (we are
assuming a ``double`` is 8 bytes long)

the third ``double`` (i.e. -4) is at byte :math:`654 + 2 \cdot 8 = 670`

and so on

the exact starting address is usually different from run to run of this code,
since the exact placement in memory of a program is up to your computer's
operating system

for an array to work, we need to know its starting address

suppose 654 (the address of the first ``double`` in the array) is stored at
memory location 879

::

	        879
	       -----
	temps | 654 |
	       -----

now, what is ``temps[0]``?

``temps[0]`` is the ``double`` at memory location :math:`654 + 8 \cdot 0 = 654`

``temps[1]`` is the ``double`` at memory location :math:`654 + 1 \cdot 8 =
662`

``temps[2]`` is the ``double`` at memory location :math:`654 + 2 \cdot 8 =
670`

and so on

in general, ``temps[i]`` refers to the memory location :math:`654 + i \cdot 8`

notice that this indexing scheme requires that the elements in the array all
be of the same size (i.e. 8 bytes in this example)

that's important!

this won't work (as easily, at least) with values of different sizes

note also that we know to multiply ``i`` by 8 because ``temps`` stores
``double`` values, and a ``double`` is 8 bytes
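
the address arithmetic above can be checked with a short added sketch; it
assumes that converting an address to the integer type ``uintptr_t`` gives an
ordinary byte address, which holds on typical desktop systems, and
``byte_offset`` is a made-up helper name

::

	#include <cstdint>
	#include <iostream>
	using namespace std;

	// number of bytes between the start of the array and one of its elements
	// (byte_offset is a made-up helper for this sketch)
	uintptr_t byte_offset(const double* first, const double* elem) {
		return reinterpret_cast<uintptr_t>(elem)
		     - reinterpret_cast<uintptr_t>(first);
	}

	int main() {
		double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

		for(int i = 0; i < 5; ++i) {
			cout << "temps[" << i << "] is "
			     << byte_offset(&temps[0], &temps[i])
			     << " bytes from the start, predicted "
			     << i * sizeof(double) << "\n";
		}

		// temps[i] is shorthand for "go i elements past the start"
		cout << (temps[2] == *(temps + 2)) << "\n";  // prints 1
	}

the actual starting address is not printed since, as noted above, it usually
changes from run to run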


The Size of an Array
--------------------

C++ arrays don't store their size

so it's up to the programmer to keep track of an array's size "by hand"

::

	// n is the number of elements in arr
	void print(double arr[], int n) {
		for(int i = 0; i < n; ++i) {
			cout << arr[i] << " ";
		}
	}

	// ...

	double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
	print(temps, 5);

notice that the parameter in the function header is ``double arr[]`` with no
size in the []-brackets

that allows ``arr`` to be of any size

``n`` is passed as a second parameter, and it is the number of elements in
``arr``

if ``n`` is greater than the true size of the array, then bad values can be
printed (if ``n`` is 0 or negative, the loop body never runs and nothing is
printed)

::

	double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
	print(temps, 6);  // oops: 6 is not the size of temps

this will print ``temps``, plus the value at location ``temps[5]``, which has
some unknown value
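
one way to avoid typing the wrong size by hand is to let the compiler compute
it at the place where the array is declared; this is a sketch of that idiom,
and it does *not* work inside ``print``, where ``arr`` is only an address

::

	#include <iostream>
	using namespace std;

	// n is the number of elements in arr
	void print(double arr[], int n) {
		for(int i = 0; i < n; ++i) {
			cout << arr[i] << " ";
		}
		cout << "\n";
	}

	int main() {
		double temps[] = {3.3, 3.1, -4, 1.5, 2.6};

		// total bytes divided by bytes per element = number of elements
		int n = sizeof(temps) / sizeof(temps[0]);

		print(temps, n);  // prints 3.3 3.1 -4 1.5 2.6
	}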

in contrast, C++ vectors keep track of their own size, and so (as we've seen
previously), a function that prints the elements of a vector only needs the
vector

::

	void print(vector<double> arr) {
		for(int i = 0; i < arr.size(); ++i) {
			cout << arr[i] << " ";
		}
	}

this is simpler, and also means you will never pass the wrong size to the
function


Passing Arrays to Functions
---------------------------

consider these functions

::

	double sum(double arr[], int n) {
		double total = 0.0;
		for(int i = 0; i < n; ++i) {
			total += arr[i];
		}
		return total;
	}

	double mean(double arr[], int n) {
		return sum(arr, n) / n;
	}

	// ...

	double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
	cout << sum(temps, 5) << "\n"
	     << mean(temps, 5) << "\n";

C++ passes ``arr`` by value

but that doesn't mean the array *elements* are copied!

the value stored in ``arr`` is the address of the first element of ``arr``

it's *that* address that is copied when ``arr`` is passed by value

so that means you have access to the actual array locations 

this lets you write functions like this

::

	void set_all_zero(double arr[], int n) {
		for(int i = 0; i < n; ++i) {
			arr[i] = 0.0;
		}
	}

	// ...

	double temps[] = {3.3, 3.1, -4, 1.5, 2.6};
	set_all_zero(temps, 5);
	print(temps, 5);  // 0 0 0 0 0
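
to see why this matters, compare it with a ``vector`` passed by value, where
the elements *are* copied; in this added sketch the names
``set_all_zero_copy`` and ``set_all_zero_ref`` are made up for illustration

::

	#include <iostream>
	#include <vector>
	using namespace std;

	// arr holds the address of the caller's array, so the caller's
	// elements are changed
	void set_all_zero(double arr[], int n) {
		for(int i = 0; i < n; ++i) {
			arr[i] = 0.0;
		}
	}

	// v is a copy of the caller's vector, so only the copy is changed
	void set_all_zero_copy(vector<double> v) {
		for(double& x : v) {
			x = 0.0;
		}
	}

	// v refers to the caller's vector, so the caller's elements are changed
	void set_all_zero_ref(vector<double>& v) {
		for(double& x : v) {
			x = 0.0;
		}
	}

	int main() {
		double arr[] = {3.3, 3.1};
		vector<double> vec = {3.3, 3.1};

		set_all_zero(arr, 2);
		set_all_zero_copy(vec);
		cout << arr[0] << " " << vec[0] << "\n";  // prints 0 3.3

		set_all_zero_ref(vec);
		cout << vec[0] << "\n";  // prints 0
	}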


Multi-dimensional Arrays
------------------------

``temps`` is a 1-dimensional array

::

	double temps[] = {3.3, 3.1, -4, 1.5, 2.6};


you can create arrays with more than 1 dimension, e.g.

::

	double matrix[3][3] = {
		{1, 3, 4},
		{4, 3, 1},
		{0, 8, 2}
	};

	for(int i = 0; i < 3; ++i) {
		for(int j = 0; j < 3; ++j) {
			cout << matrix[i][j] << " ";
		}
		cout << "\n";
	}

	// 1 3 4 
	// 4 3 1 
	// 0 8 2 

``matrix`` is an array of arrays, and so you can write code like this

::

	// sum of the 2nd row
	cout << sum(matrix[1], 3)  // sum was defined previously
	     << "\n";
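
passing the whole matrix to a function takes a little more care: the number
of columns must appear in the parameter, and only the number of rows can be
left out; a sketch, where ``sum`` is repeated from earlier and ``sum_all`` is
a made-up name

::

	#include <iostream>
	using namespace std;

	double sum(double arr[], int n) {
		double total = 0.0;
		for(int i = 0; i < n; ++i) {
			total += arr[i];
		}
		return total;
	}

	// m is an array of rows, and each row is an array of exactly 3 doubles
	double sum_all(double m[][3], int rows) {
		double total = 0.0;
		for(int i = 0; i < rows; ++i) {
			total += sum(m[i], 3);  // m[i] is the i-th row
		}
		return total;
	}

	int main() {
		double matrix[3][3] = {
			{1, 3, 4},
			{4, 3, 1},
			{0, 8, 2}
		};

		cout << sum(matrix[1], 3) << "\n";   // prints 8
		cout << sum_all(matrix, 3) << "\n";  // prints 26
	}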


C-style Strings
---------------

a C-style string is an array of characters that ends with the special
character ``'\0'``

they are more difficult and error-prone to work with than C++ strings


Comparing Arrays and Vectors
----------------------------

pros of C++ vectors include

- they know their size; arrays don't

- use the same []-bracket notation as arrays for indexing

- they are dynamic, i.e. their size can shrink/grow as a program is running;
  arrays have a fixed size that never changes

- ``vector`` is a class, and so you can use it to quickly create your own
  vector-like classes; arrays are not classes


cons of C++ vectors include

- they may use more memory than necessary to store their values, and in some
  situations, the memory used could be twice the size of the memory needed to
  store the values; the use of extra memory is intentional, and is meant to
  make re-sizing the vector more efficient

- you can't use them in C

- in functions that pass/return arrays, you can't directly use vectors in
  their place (although a vector's ``data()`` function returns the address of
  its underlying array)
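
the extra memory a vector may hold can be seen by comparing ``size()`` with
``capacity()``; the exact capacities depend on your compiler's library, so
this added sketch prints them without promising particular values, and then
uses ``data()`` to hand the vector's elements to the array function ``sum``
from earlier

::

	#include <iostream>
	#include <vector>
	using namespace std;

	double sum(double arr[], int n) {
		double total = 0.0;
		for(int i = 0; i < n; ++i) {
			total += arr[i];
		}
		return total;
	}

	int main() {
		vector<double> temps;
		for(int i = 0; i < 5; ++i) {
			temps.push_back(i);
			// capacity is how many elements fit before the vector
			// must get more memory; it is never less than size
			cout << "size = " << temps.size()
			     << ", capacity = " << temps.capacity() << "\n";
		}

		// data() is the address of the vector's first element
		int n = static_cast<int>(temps.size());
		cout << sum(temps.data(), n) << "\n";  // prints 10
	}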