Chapter 9 Notes¶
Please read chapter 9 of the textbook.
Warm-up: Hexadecimal¶
hexadecimal bit strings are a popular way of encoding long sequences of bits
hexadecimal (or just hex for short) is essentially a base-16 encoding of numbers
there are 16 “digits” in hexadecimal, 0-9 and then a, b, c, d, e, f
each hexadecimal digit corresponds to a pattern of 4 bits
Base 10 | Base 16 | Binary |
---|---|---|
0 | 0 | 0000 |
1 | 1 | 0001 |
2 | 2 | 0010 |
3 | 3 | 0011 |
4 | 4 | 0100 |
5 | 5 | 0101 |
6 | 6 | 0110 |
7 | 7 | 0111 |
8 | 8 | 1000 |
9 | 9 | 1001 |
10 | a | 1010 |
11 | b | 1011 |
12 | c | 1100 |
13 | d | 1101 |
14 | e | 1110 |
15 | f | 1111 |
you can encode one byte (i.e. 8-bits) using two hex digits, e.g.:
0100 0011
4 3
so the bit string 01000011 can be written as the hex string 43
any 32-bit string can be written using 8 hex digits
any 64-bit string can be written using 16 hex digits
C++ has good support for hexadecimal strings
for example, here is a program that
// hex.cpp
#include <iostream>
#include <iomanip>
#include <vector>
#include <string>
using namespace std;
int main() {
// a table of the binary values for the 16 hex digits, e.g.
// hexbits[11] is "1011", the bit string for hex digit b
vector<string> hexbits = {
"0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
"1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
};
//
// print a table of decimal, hexadecimal, and binary
//
cout << "Dec Hex Bin\n";
cout << "----------------\n";
for(int i = 0; i < 16; ++i) {
cout << std::dec << std::setw(2) << i << " ";
cout << std::hex << std::setw(5) << i << " ";
cout << std::setw(7) << hexbits[i] << "\n";
}
}
note the cout
statements in the for-loop do some fancy formatting
std::dec
means “print the next number in decimal format”
std::hex
means “print the next number in hexadecimal format”
std::setw(2)
means “print the next thing right-justified in a 2-space
width”
std::setw(5)
means “print the next thing right-justified in a 5-space
width”
Pointers¶
extensive use of pointers is one of the defining features of C (and C++)
in C/C++, pointers represent memory addresses
for example
int n = 3;
we can imagine the computers memory as something like this
5482 5483 5484
----- ----- -----
| ? | 3 | ? |
----- ----- -----
n
when the code fragment above runs the variable n
is assigned to the memory
location 5483
the number 5483 is just made up
C++ and the operating system determines the exact memory location of n
on
each run of the program
Pointers¶
we can get the address of n
using &
, the address-of operator
int n = 3;
cout << &n
<< "\n";
here are three different runs of the above code on my computer:
$ ./pointers
0xbfe4d8e8
$ ./pointers
0xbfbf2068
$ ./pointers
0xbfa73268
the output is in hexadecimal, i.e. base-16
the 0x
at the start indicates this is a hexadecimal number
notice that for each run of the program n
is located at a different place
in memory
generally, you can’t predict where in memory a variable will be stored
Pointer Variables¶
a pointer stores an address of a variable
for example
int n = 3;
int* n_addr = &n;
cout << n_addr
<< "\n";
the variable n_addr
has the type int*
int*
is the type “int
pointer”
that means that n_addr
can store the address of an int
in this example, n_addr
stores the address of n
Pointer Variables¶
to see what’s going on, look at this picture of memory
after this statement runs
int n = 3;
memory looks like this
5482 5483 5484
----- ----- -----
| ? | 3 | ? |
----- ----- -----
n
then after this statement runs
int* n_addr = &n;
5482 5483 5484
----- ----- -----
| ? | 3 | ? |
----- ----- -----
n
3939
------
| 5483 |
------
n_addr
the variable n_addr
is stored at (made up) memory location 3939
n_addr
holds 5483, i.e. the address of where in memory n
is stored
we usually say n_addr
points to n
it’s often useful to draw arrow diagrams showing pointers as arrows
Dereferencing a Pointer¶
pointers let us access the memory they refer to
int n = 3;
int* n_addr = &n;
cout << *n_addr // note the *
<< "\n";
n_addr
points to an int
*n_addr
is the value that n_addr
points to
i.e. 3 in this case
Dereferencing a Pointer¶
here’s another example
string s = "apple";
string* p = &s;
string* q = p;
cout << p << ", " << *p << "\n"
<< q << ", " << *q << "\n";
p
and q
are string pointers
they both point to the same address, i.e. the location where s
is stored
you can access the string there via s
, or *p
, or *q
careful!
they all share the same underlying string (no copies of the characters are made)
string s = "apple";
string* p = &s;
string* q = p;
cout << p << ": " << *p << "\n"
<< q << ": " << *q << "\n";
s = "orange";
cout << p << ": " << *p << "\n"
<< q << ": " << *q << "\n";
s[0] = 'g';
cout << p << ": " << *p << "\n"
<< q << ": " << *q << "\n";
(*p)[5] = 'y';
cout << p << ": " << *p << "\n"
<< q << ": " << *q << "\n";
Passing Pointers to Functions¶
you can pass pointers to functions
it acts like passing by reference
void set_zero(double* x) {
*x = 0.0;
}
// ...
int main() {
double a = 54;
set_zero(&a); // note use of &
}
void get_double(double* x, const string& prompt) {
cout << prompt;
cin >> *x;
}
// ...
int main() {
double t = 0.0;
get_double(&t, "Please enter a number: "); // note use of &
}
Passing Pointers to Functions¶
// swapping using pointers
void swap(int* a, int* b) {
int temp = *a;
*a = *b;
*b = temp;
}
you use this version of swap
like this
int x = 2;
int y = 3;
cout << x << " " << y << "\n"; // 2 3
swap(&x, &y); // note the use of &
cout << x << " " << y << "\n"; // 3 2
the programmer must remember to write the ampersands when calling this
swap
, i.e. &x
and &y
also, the code inside swap
must use *
throughout to dereference the
pointers
this is messier and more error-prone than the pass-by-reference version C++ lets you write:
// how you could write swap in C
void swap(int& a, int& b) {
int temp = a;
a = b;
b = temp;
}
especially in C, pointers are used frequently in functions, both as parameters and return values (although be careful with return values — see the discussion of the free store below)
Defining a Pointer¶
here are two common syntax styles for defining a pointer
int n = 5;
int* a = &n; // * attached to int
int *b = &n; // * attached to b
the compiler doesn’t care where you put the *
when a
are b
defined
it’s up the programmer
some programmers like the int* a
way, because it emphasizes that int*
is an “int pointer” data type
some programmers like the int *b
way, because it makes it clear that
*b
is of type int
be careful if you declare more than one variable at a time, e.g.
double *c, d; // c is double*, d is double
c
is of type double*
but d
is of type double
if you want both to be points you’d have to write it like this
double *c, *d;
now both c
and d
are of type double*
a good rule of thumb is to avoid declaring multiple variables like this
instead, explicitly declare each variable on its own line, thus avoiding this problem entirely
or, if you like, you can use typedef
to avoid pointer declaration errors
typedef int* IntPtr;
int n = 6;
IntPtr p, q; // both p and q are type IntPtr
p = &n;
q = &n;
cout << *p // 6
<< *q // 6
<< "\n";
Null Pointers¶
if you declare a pointer variable without giving it an initial value, then, in general, you can’t be sure what value it has
double* p; // p has some unknown value
cout << *p; // oops: you don't know where p points,
// so you have no idea what this will print
if you need a pointer variable, but don’t have a value for it to point to yet,
use nullptr
double* q = nullptr; // q points to no value
cout << q
<< "\n";
nullptr
is a new C++11 keyword, and in C and older C++ versions you would
write 0
instead of nullptr
but don’t use 0
(or NULL
) for the null pointer in this course!
always use nullptr
note that it’s always an error to de-reference a null pointer
double* q = nullptr; // q points to no value
cout << *q // oops: can't dereference a null pointer
<< "\n";
on my computer, executing *q
causes the program to crash with a
Segmentation fault
error
C++ Memory Organization¶
before we move on to the next topic, lets discuss for a moment how C++ organizes a programs memory
basically, all C++ programs are divided into three memory regions
static memory: global variables (variables declared outside of any function) are stored here
stack memory: variables declared inside functions (local variables) are stored here, along with function parameters and return values; stack memory is automatically managed by C++, growing and shrinking as needed, i.e. when a function ends all the local variables in that function are automatically de- allocated
free store (heap) memory: this is memory that can be manually allocated and de-allocated by the programmer when needed; it’s big advantage over stack memory is that it allows for memory to persist between function calls
the great thing about stack memory is that it works automatically without any special effort from the programmer
but is not flexible enough for many applications
so we also need free store memory, which is manually allocated and de- allocated by the user
Free Store Memory¶
in C++, we will use new
to allocate memory on the free store and
delete
or delete[]
(for arrays) to de-allocate memory
warning: don’t use malloc
and free
functions for free store
management in this course! those functions are used for memory management in C
warning: mixing malloc
/free
and new
/delete
in the same
program can corrupt the free store!
stick to new
/delete
in this course!
// allocate an int, initialized to 3, on the free store
int* p = new int(3);
cout << *p
<< "\n";
// de-allocate the memory when we're done with it
delete p;
the expression new int(3)
allocated a new int
on the free store,
initialized to 3
it then returns the address of that int
so p
points to an int
on the free store
*p
is the int
stored there (in this case, 3)
you can use *p
just like a regular int
but, and this is surprisingly hard to do correctly in general, you must
remember to de-allocate the memory p
points to (using the statement
delete p;
) when you are done with it
Memory Leaks¶
consider this code
void test() {
int* p = new int(5);
int n = 2;
p = &n;
cout << *p // 2
<< "\n";
}
this compiles, runs, and prints 2
so it might seem to work
but it has a subtle bug known as a memory leak
when the test
function ends, the variable p
is automatically deleted
but the free store memory p
points to is not deleted
the memory is still there and, because p
is gone, it can no longer be
accessed
in particular, the memory p
pointed to cannot be de-allocated
so every time you call the test()
function, one int
is allocated on
the free store and then left there for the rest of the running time of the
program
if you call test()
a lot of times, this could eat up a lot of memory and
cause your program’s performance to degrade
Memory Leaks¶
memory leaks are very common programming errors
in practice, they can be extremely difficult to find and fix
this is in part because it is not always clear when a free store memory can be safely deleted
also, most programs can run at least for a little while suffering from a memory leak
serious problems might not occur until a lot of memory is wasted, which for some programs, might take a while (or even not occur at all)
many (most?) other languages provide automatic memory managers called garbage collectors that can automatically delete unused free store memory
garbage collected languages are nice because then the programmer never needs to worry about when to delete unused memory — the language takes care of it automatically
but you pay a performance price for automated garbage collection, and for some applications that price might be too high
Arrays and Pointers¶
arrays and pointers are closely related in C/C++
in this code, a
is an array of 5 int
values (with unknown initial
values)
a
is local to the test()
function, so it will be automatically de-
allocated when test()
ends
void test() {
int a[5];
for(int i = 0; i < 5; ++i) {
a[i] = i;
}
// we can also treat a is if it were of type int*, and so access
// elements of a like this
cout << *a << "\n"
<< *(a + 1) << "\n"
<< *(a + 2) << "\n"
<< *(a + 3) << "\n"
<< *(a + 4) << "\n";
cout << "---\n";
// p is an int pointer that points to the first element of a
// we can use it to access individual elements of a
int* p = a;
cout << *p << "\n"
<< *(p + 1) << "\n"
<< *(p + 2) << "\n"
<< *(p + 3) << "\n"
<< *(p + 4) << "\n";
}
the point here is that a
is essentially a pointer to an int
, and so we
can use it like a pointer
Arrays and Pointers¶
we can also write code like this
int a[5];
for(int i = 0; i < 5; ++i) {
a[i] = 0;
}
for(int* p = a; p < a + 5; ++p) {
cout << "*p = " << *p << "\n";
}
the second loop iterates over the array a
using the pointer p
instead
of an index variable
we could have written the first loop that initializes a
like this
for(int* p = a; p < a + 5; ++p) {
*p = 0;
}
this style of iteration is actually quite common in C++, and is used throughout the STL (the C++ standard template library)
Dynamic Arrays¶
the arrays we’ve seen so far have all been allocated locally
thus, they are automatically de-allocated when the function they are defined within ends
but we can also allocate arrays on the free store for when we want an array to persist beyond a single function
double* temp = new double[6];
for(int i = 0; i < 6; ++i) {
temp[i] = i;
}
for(double* p = temp; p < temp + 6; ++p) {
cout << "*p = " << *p << "\n";
}
delete[] temp; // de-allocate the array temp points to
the expression new double[6]
create an array of 6 double
values on the
free store (with unknown initial values)
it returns a value of type double*
that points to the first element of
this newly created array
thus, temp
points to the first value of this newly created array
we can access individual elements of temp
using the []-bracket notation,
or pointer notation (as shown in the loops above)
when we are done with the array, it is up to us, the programmer to de-allocate
the memory used by the array by calling delete[] temp
this gives the memory used by the array back to the freestore so that can be
allocated again in future calls to new
because temp
points to an array (and not a single double
), we must use
delete[]
, i.e. we need to []
after delete
if we called just delete temp
that would be an error and may corrupt the
freestore
if we don’t call delete[]
on the array, then we have a memory leak — the
array sits there taking up memory for the rest of the lifetime of the program
Dynamic Arrays¶
this function has a serious bug
void bad_get_median() {
cout << "How many numbers do you want to enter? ";
int n = 0;
cin >> n;
if (n <= 0) cmpt::error("n must be positive");
cout << "Please enter " << n << " numbers: ";
double* nums = new double[n];
for(int i = 0; i < n; ++i) {
cin >> nums[i];
}
// re-arrange the numbers to be in ascending sorted order
sort(nums, nums + n);
cout << "median = " << nums[n / 2] << "\n";
}
note that sort
is from the standard C++ algorithm
library
the input to sort
is a pointer to the first element, and a pointer to
one past the last element we want sorted
the problem with this code is that it has a memory leak
this line creates a new array on the freestore
double* nums = new double[n];
but the array is not deleted anywhere, i.e. nowhere is delete[]
called
the variable nums
is local to bad_get_median()
, and so it disappears
when bad_get_median()
finishes
but the array that nums
pointed to stays untouched on the freestore
we no longer have a pointer to it, so there’s no way to delete it once
bad_get_median
ends
so every time we call bad_get_median()
, a little bit of memory is forever
lost to the program
call it enough times, the program might run out of memory and thus crash
Dynamic Arrays¶
we can fix the memory leak by adding delete[] nums
to the end
void fixed_get_median() {
cout << "How many numbers do you want to enter? ";
int n = 0;
cin >> n;
if (n <= 0) cmpt::error("n must be positive");
cout << "Please enter " << n << " numbers: ";
double* nums = new double[n];
for(int i = 0; i < n; ++i) {
cin >> nums[i];
}
// re-arrange the numbers to be in ascending sorted order
sort(nums, nums + n);
cout << "median = " << nums[n / 2] << "\n";
delete[] nums;
}
Dynamic Arrays¶
another issue with dynamic memory is when you have two pointers to the same array
int* arr = new int[3];
arr[0] = 0;
arr[1] = 0;
arr[2] = 0;
int* p = arr;
int* q = arr;
p
, q
, and arr
all point to the same underlying array
a tricky issue that arises here is how do you de-allocate the array when you are done using it?
you must call delete
, and you could call delete arr
, delete p
, or
delete q
the problem is that in C++ it’s an error to call delete
on memory that
has already been deleted
so this is wrong
delete p;
delete q; // oops: the array q points to has already been deleted
in general, it can be very hard to know how many pointers are pointing to an array
whenever you call delete
you need to be certain that no other pointers
point to it
in modern C++, this (and related) problems are often best solved by using smart pointers”, such as shared_ptr, which automatically delete memory and guarantee that double-deletion is impossible