In C++, vector is a standard container for holding a sequence of values (of the same type). On the surface C++ vectors look and behave like ordinary C-style arrays, but vectorss are much more sophisticated.
Here’s how we might use a vector to store temperature data:
vector<double> temps(5); // make a vector of 5 doubles
temps[0] = -2; // temperatures are in Celsius
temps[1] = 2;
temps[2] = 5;
temps[3] = 7;
temps[4] = 5;
for(int i = 0; i < temps.size(); ++i) { // print temps
cout << temps[i] << "\n";
}
This code fragment declares a vector of doubles called temps. It is initialized to hold exactly 5 doubles. Square-bracket notation is used to access individual elements of a vector, e.g. temps[0] is the first element of the vector, temps[1] is the second, and so on. The number inside the square-brackets is called an index, the first index for a C++ vector is always 0.
It’s often useful to draw diagrams representing vectors:
0 1 2 3 4
+-----+-----+-----+-----+-----+
temps | -2 | 2 | 5 | 7 | 5 |
+-----+-----+-----+-----+-----+
The numbers in the boxes are the values stored in the vector, while the numbers on to of the boxes are the index values. Essentially, an index is a name for the value in the corresponding box.
It’s important understand index numbering for vectors because it is a common source of bugs. As mentioned, the first index of a vector is always 0. That means if you your vector has n elements, then the index of the last element is n - 1 — not n!
So in our example, the last element of temps is temps[4]. There is no temps[5] in our array.
The confusion on this point is so common that it even has a name: off-by-one error. Most programmers can tell you stories about getting stuck on off-by-one errors as they can be surprisingly hard to find.
Unfortunately, the C++ vector does not check if you go outside the index bounds of an array. For example, this code compiles and runs without complaint:
cout << temps[-1] << "\n" // temps[-1] doesn't exist
<< temps[5] << "\n"; // temps[5] doesn't exist
This prints two unknown values. -1 and 5 are not index values for temps, and so junk values are printed. It’s up to the programmer to make sure that the index values are never wrong.
C++ vectors are dynamic: they’re size can change. Probably the most common way to change a vector’s size is to add new items to the end. For example:
vector<string> pets; // initially empty, i.e. pets.size() == 0
pets.push_back("Fluffy");
pets.push_back("Fido");
pets.push_back("Newton");
pets.push_back("Glubglub");
// print the contents of pets
for(int i = 0; i < pets.size(); ++i) {
cout << pets[i] << "\n";
}
The statement pets.push_back("Fluffy") appends the string "Fluffy" to the right end of the array pets. It also increases the size of the vector by 1.
Note
A C++ vector sometimes uses more memory than is strictly necessary to store all of the elements it contains. This extra memory helps functions like push_back be more efficient by not always needing to allocate new memory each time they’re called. Thus, it is possible that for certain memory-sensitive applications a vector may use more memory than you would like, in which case some other data structure (such as an array) might be better.
A common task is to arrange the elements of a vector into ascending sorted order. That turns out to be quite easy to do in C++ since it has a standard sorting function:
vector<double> temps(5); // make a vector of 5 doubles
temps[0] = -2; // temperatures are in Celsius
temps[1] = 2;
temps[2] = 5;
temps[3] = 7;
temps[4] = 5;
sort(temps.begin(), temps.end());
for(int i = 0; i < temps.size(); ++i) { // print temps
cout << temps[i] << "\n";
}
The statement sort(temps.begin(), temps.end()) efficiently re-arranges the elements of temps to be in order from smallest to largest.
Here’s another example. This program reads in strings from the user and then prints them in sorted order:
cout << "Please enter some words: ";
vector<string> words;
string w;
while (cin >> w) {
words.push_back(w);
}
sort(words.begin(), words.end());
for(int i = 0; i < words.size(); ++i) {
cout << words[i] << "\n";
}
Notice that we don’t worry about the size of the words vector: it automatically grows to the necessary size when push_back is called. In other words, a vector automatically manages its own memory (in contrast to, say, a C-style array where the programmer is responsible for managing memory).
What if you want to sort a vector into descending order? The simplest approach is probably to use sort and then reverse, e.g.:
sort(temps.begin(), temps.end());
reverse(temps.begin(), temps.end());
Note
Later in the course we’ll see how to write our own sorting function. In practice, however, it is almost always better to use C++’s standard sort, since it is both extremely efficient and an quite general-purpose.
Here’s how you can sum the elements of a vector:
vector<double> v; // define and initialize v
v.push_back(6);
v.push_back(-1);
v.push_back(4);
v.push_back(5);
double sum = 0.0; // sum elements of v
for(int i = 0; i < v.size(); ++i) {
sum += v[i];
}
cout << sum;
The for-loop that does the summing scans through v one element at a time. It adds each element to sum, which keeps a running sum. It’s important to remember to initialize sum to 0 at the start.
This is a useful code idiom to remember:
for(int i = 0; i < v.size(); ++i) {
// ... do something to v[i] ...
}
It is applicable to many different situations.
For instance, suppose you want to print just the negative values of a vector. Then you could do this:
vector<double> v; // define and initialize v
v.push_back(6);
v.push_back(-1);
v.push_back(4);
v.push_back(5);
for(int i = 0; i < v.size(); ++i) {
if (v[i] < 0)
cout << v[i] << "\n";
}
Or suppose you wanted to separate positive and negative values into their own vectors:
vector<double> v; // define and initialize v
v.push_back(6);
v.push_back(-1);
v.push_back(4);
v.push_back(5);
vector<double> positive;
vector<double> negative;
for(int i = 0; i < v.size(); ++i) {
if (v[i] < 0) {
negative.push_back(v[i]);
} else {
positive.push_back(v[i]);
}
}
Printing out a vector is a common task, and so it makes sense to create a function that does that for us. One not-so-good way to do that is like this:
void print(vector<double> v) { // v is copied!
for(int i = 0; i < v.size(); ++i)
cout << v[i] << " ";
}
// ...
vector<double> x; // define and initialize x
x.push_back(6);
x.push_back(-1);
x.push_back(4);
x.push_back(5);
print(x);
While this works, it is unnecessarily inefficient: the vector that is passed to print is copied, which needlessly wastes time and memory. This kind of parameter-passing is known as pass by value. Pass by value always makes a copy of the thing being passed. On the plus side, this means the original object can’t be modified in any way. But since print doesn’t change the vector in any way (it just reads it), there’s no need to make a copy of it.
So instead of pass by value, C++ lets you use pass by reference:
void print(vector<double>& v) {
for(int i = 0; i < v.size(); ++i)
cout << v[i] << " ";
}
vector<double> x; // define and initialize x
x.push_back(6);
x.push_back(-1);
x.push_back(4);
x.push_back(5);
print(x);
The only change here is the addition of the & in the header. This means that when print is called, v refers to the very same data that x refers to. No copy is made, and so passing by reference in this case is much more efficient.
One more refinement is useful. Since print only reads the values of v, and never changes them, we can use pass by constant reference:
void print(const vector<double>& v) {
for(int i = 0; i < v.size(); ++i)
cout << v[i] << " ";
}
Here we’ve added const in front of vector<double>&. It tells the compiler that we don’t intend for any code inside print to change v, and so there’s no need to make a copy of the vector. Plus, the compiler can check that no code in print modifies v. Thus it is both efficient and safe.
As a rule of thumb, we will pass vectors by constant reference whenever possible.
Of course, you can only use pass by constant reference for functions that don’t change the vectors you pass to them. If you do need to change a vector in a function you must use regular pass by reference, e.g.:
void read_vector(vector<double>& v) {
double x;
while (cin >> x) {
v.push_back(x);
}
}
If you included const in front of vector<double>, then you would get a compiler error indicating you aren’t allowed to modify v.
Suppose you forgot the & and passed v by value:
void read_vector(vector<double> v) { // v passed by value
double x;
while (cin >> x) {
v.push_back(x);
}
}
There is no compiler error or runtime error, but this does not work. The problem is that v is a copy of the passed-in vector, and it is this copy that we modify — not the original vector. When read_vector ends, v is automatically deleted.
Like vectors, we also have to carefully choose how we want to pass strings to functions. We can pass a string by value like this:
double get_double(string prompt) { // pass by value
cout << prompt;
double x;
cin >> x;
return x;
}
Here prompt is a copy of whatever string is passed to get_double. If the string is long, then this could needlessly waste time and memory.
Instead we could pass it by reference:
double get_double(string& prompt) { // pass by reference
cout << prompt;
double x;
cin >> x;
return x;
}
But this won’t work in some cases. For example, this line of code causes a compiler error:
double x = get_double("x? ");
The problem is that passing by reference allows for the possibility that the passed-in string might be modified (even if it is not actually modified). But the literal string "x? " is a constant value that cannot be modified, and so you get a compiler error saying the string can’t be changed.
So the best solution is to use pass by constant reference:
double get_double(const string& prompt) { // pass by constant reference
cout << prompt;
double x;
cin >> x;
return x;
}
Since get_double doesn’t modify prompt in any way, the best approach is to use pass by constant reference. Including a default value, it would look like this:
double get_double(const string& prompt = "Please enter a double: ") {
cout << prompt;
double x;
cin >> x;
return x;
}
As a test of our understanding of the difference between pass-by-value and pass-by-reference, consider this code:
int x = 3;
int y = 5;
swap(x, y);
cout << x << " " << y << endl; // prints 3 5
The swap function is supposed to exchange the values of x and y; this turns out to be quite handy when writing sorting algorithms.
Here’s how you could write it in C++:
void swap(int& a, int& b) { // a and b passed by reference: correct!
int temp = a;
a = b;
b = temp;
}
The vital detail here is that the parameters are passed by reference. That is, a is another name for the value x refers to, and so if you change the value of a you are changing the value of x. Same for b and y.
Here’s a wrong way to implement this function:
void swap(int a, int b) { // wrong: a and b passed by value
int temp = a;
a = b;
b = temp;
}
The problem here is that a and b are copies of x and y. If you modify a, then this has no effect whatsoever on x, because they are distinct variables.