12. Introduction to Object-oriented Programming (OOP)

Object-oriented programming, usually abbreviated OOP, is a popular style of programming that most modern languages support. OOP is particularly useful for creating code libraries, and for organizing large programs.

C++ was one of the languages that helped make OOP popular in practice. In C++, OOP is essentially a way of creating user-defined types. OOP has many technical details that you should eventually learn, although for this introduction we will focus on a few main concepts.

12.1. The Idea of OOP

The basic idea of OOP is to organize programs using objects. An object can be almost anything, such as:

  • a string
  • a vector
  • a stream
  • a date
  • a document
  • a letter in a document
  • a record in a database
  • a web page
  • a tree (in a file system, say)
  • a circle
  • a particle in an animated explosion
  • etc.

For example, C++ strings are objects. Like all objects in C++, a string is made up of two kinds of things:

  1. The data for representing the string. Usually — but not always — this data is stored as an array of char values (i.e. a C-style string).
  2. Common functions that operate on that data. For instance, if s is a string, then you can call these functions on it:
    • s.size() returns the number of characters in s
    • s.empty() returns true if s is the empty string, and false otherwise
    • s.substr(i, n) returns a new string consisting of the n characters s[i], s[i + 1], ..., s[i + n - 1] (or fewer, if s has fewer than n characters from position i onward)

An object is a collection of data, plus functions that operate on that data. Well-designed objects are often easy to use because they contain everything you need in one easy-to-access place.
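As a small illustration of calling an object's functions, here is a sketch that uses only size(), empty(), and substr() from the list above. The helper name first_chars is made up for this example; it is not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the first n characters of s, or all of s
// if it has fewer than n characters.
std::string first_chars(const std::string& s, std::size_t n) {
  if (s.empty()) {
    return "";                 // nothing to copy from an empty string
  }
  if (n > s.size()) {
    n = s.size();              // never ask for more characters than s has
  }
  std::string result = s.substr(0, n);
  assert(result.size() == n);  // sanity check on what substr returned
  return result;
}
```

The caller never touches the characters inside s directly; everything goes through the functions the string object provides.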

12.2. Creating Your Own Objects

Suppose we want to create our own object for representing two-dimensional (x, y) points (such objects turn out to have many practical uses in graphics).

In C++, like many OOP languages, before you can create an object you need to create a class that describes the parts of the object. A class is like a factory that creates objects.

Note

Not all OOP languages use classes to create objects. Some, such as JavaScript, instead create objects by copying existing objects. That is, you create an initial object by hand, and then copy it and change its values to make new objects. This style of object creation is known as prototyping, and it can be more flexible than class-based object creation. However, we will only be discussing class-based object creation in this course.

Here is the class for our point objects:

struct Point {
  double x;
  double y;
}; // Point

int main() {
  Point p;
  p.x = 4;
  p.y = 2;
  cout << p.x << " " << p.y << endl;
}

Note

Even though we write this code using struct, we will generally refer to it as a class as a reminder that it is used for creating objects. We will see a bit later in these notes that the keyword class can be used instead of struct, although with a few technical differences.

Here, Point is a struct that describes the contents of Point objects. Creating a Point object is easy:

Point p;
Point t;

To access the variables in p and t we use dot notation:

p.x = 4;
p.y = 2;

t.x = 0;
t.y = 0;

cout << p.x << ' ' << p.y << endl  // prints: 4 2
     << t.x << ' ' << t.y << endl; // prints: 0 0

p.x, p.y, t.x, and t.y are all variables of type double, and so you can use them anywhere you could use a double variable.

Notice that p and t get their own personal copies of x and y. The variable p.x is different than the variable t.x.

12.3. Adding Functions

Now let’s add some printing functions to our Point class:

struct Point {
  double x;
  double y;

  void print() {
    cout << "(" << x << ", " << y << ")";
  }

  void println() {
    print();
    cout << endl;
  }
}; // Point

int main() {
  Point p;
  p.x = 4;
  p.y = 2;
  p.println();
}

Look at the print function. Its body is allowed to use x and y directly without any dot-notation. Similarly, note that in the println function we call print. This shows that any function defined inside a class can access any of the variables or functions also defined within it.
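The same rule applies to functions that change an object. The sketch below adds two functions, move_by and reset, which are invented for this example and are not part of the Point class developed in these notes.

```cpp
#include <cassert>

struct Point {
  double x;
  double y;

  // Hypothetical function: shifts this point by (dx, dy).
  // x and y are the variables of the object the function is called on.
  void move_by(double dx, double dy) {
    x += dx;
    y += dy;
  }

  // Hypothetical function: moves the point back to (0, 0) by calling
  // another function of the same class, just as println calls print.
  void reset() {
    move_by(-x, -y);
  }
}; // Point
```

Calling p.move_by(1, -2) changes p.x and p.y only; any other Point object is unaffected.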

Note

Sometimes functions defined inside a class are called methods. However, we will usually just refer to them as functions.

12.4. Constructors

One problem with Point objects is that they don’t initialize their x and y values to anything sensible:

Point p;
p.println();   // prints unknown values

It’s up to the programmer to remember to give them initial values, e.g.:

Point p;
p.x = 0;
p.y = 0;

The problem with this is that it is inconvenient to have to write two assignment statements every time you create a point. It is easy to forget, or to do incorrectly.

A better approach is to use constructors to initialize our Points. A constructor is a special function designed specifically for initializing objects. For example:

struct Point {
  double x;
  double y;

  Point() : x(0), y(0) {    // default constructor
    // empty body
  }

  // ...

}; // Point

In C++, a constructor always has the same name as the class it resides in. The constructor we’ve added here does not take any input, and so is used like this:

Point p;
p.println();   // prints (0, 0)

When p is created, the default constructor for Point is automatically called, which sets the values of x and y to 0.

This is extremely useful: now there is no way to create a Point without initializing its variables. Errors due to random initialization values are no longer an issue.

Constructors look like functions, but have some important differences you need to know:

  • Constructors do not have a return type, not even void.

  • The name of a constructor is always the name of the class.

  • After the constructor’s input parameter list comes an initializer list:

    Point()        // default constructor
    : x(0), y(0)   // initializer list
    {
      // empty body
    }
    

    The purpose of the initializer list is to assign values to the variables of the object before any other code is executed. You can also put whatever code you like inside the code block if any further initialization is required.
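One case where the initializer list is required, not just convenient, is a const variable: it cannot be assigned to inside the constructor body, so it must get its value in the initializer list. The Circle class below is a made-up example used only to show the order of events.

```cpp
#include <cassert>

// Hypothetical class: a circle whose radius can never change.
struct Circle {
  const double radius;   // const: must be set in the initializer list
  double area;

  Circle(double r)
  : radius(r), area(0)   // initializer list: runs first
  {
    // The body runs second; radius already has its value here.
    area = 3.14159 * radius * radius;
  }
}; // Circle
```

Writing radius = r; inside the body instead would be a compiler error, because radius is const.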

Classes often have more than one constructor. Let’s add a second constructor to Point that lets the programmer provide x and y values:

struct Point {
  double x;
  double y;

  Point() : x(0), y(0) {   // default constructor
    // empty body
  }

  Point(double a, double b) : x(a), y(b) {   // constructor (double, so 1.5 is not cut to 1)
    // empty body
  }

  // ...

}; // Point

Now we can write code like this:

Point p(4, 2);
p.println();  // prints (4, 2)

Notice the notation for how p is initialized. We pass in 4 and 2, and they get assigned to p.x and p.y.

The other constructor, the default constructor that takes no parameters, still works as before:

Point origin;
origin.println();  // prints (0, 0)

Of course, the default constructor is not strictly necessary any more, since we could write this:

Point origin(0, 0);
origin.println();  // prints (0, 0)

Another useful type of constructor is a copy constructor. As the name suggests, a copy constructor is used to make a copy of another object:

struct Point {
  double x;
  double y;

  Point() : x(0), y(0) {   // default constructor
    // empty body
  }

  Point(double a, double b) : x(a), y(b) {   // constructor (double, so 1.5 is not cut to 1)
    // empty body
  }

  Point(const Point& p) : x(p.x), y(p.y) {  // copy constructor
    // empty body
  }

  // ...

}; // Point

The copy constructor lets us write code like this:

Point start(4, 2);
Point home(start);  // make a copy of start
start.println();    // prints (4, 2)
home.println();     // prints (4, 2)

Keep in mind that start and home are separate objects with their own personal x and y variables.
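The sketch below checks that independence directly. It also shows a second place where the copy constructor is used: when a Point is passed to a function by value. The function shifted_right is a hypothetical helper written for this example.

```cpp
#include <cassert>

struct Point {
  double x;
  double y;

  Point() : x(0), y(0) { }
  Point(double a, double b) : x(a), y(b) { }
  Point(const Point& p) : x(p.x), y(p.y) { }   // copy constructor
}; // Point

// Hypothetical helper: returns a copy of p moved dx units to the right.
// p is passed by value, so the copy constructor makes a local copy and
// the caller's Point is left unchanged.
Point shifted_right(Point p, double dx) {
  p.x += dx;
  return p;
}
```

After Point home(start); home.x = 10; the value of start.x is still 4.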

12.5. Testing for Equality

Another useful function to add to Point is equals, which tests if two Point objects are the same:

struct Point {
  double x;
  double y;

  // ...

  bool equals(const Point& p) {
    return p.x == x && p.y == y;
  }

  // ...

}; // Point

Recall that && means “and”, and so the expression p.x == x && p.y == y returns true if both p.x == x is true, and p.y == y is true. If one, or both, are false, then the entire expression is false.

Now you can write code like this:

Point p;       // (0, 0)
Point target;  // (0, 0)

cin >> target.x >> target.y;  // read in target's value

if (p.equals(target)) {
   cout << "same";
} else {
   cout << "different";
}

But there’s a problem with equals: double arithmetic is not exact, i.e. calculations done with doubles suffer from small but unavoidable round-off errors. That means that you might have two doubles that are, for all practical purposes, equal, but are not exactly the same according to ==. For example, in pretty much any practical program we would like to treat 0.0 and 0.000000000000001 as being the same.

To solve this problem, let’s rewrite equals:

// If the absolute value of the difference of two doubles is less
// than min_diff, then they will be considered equal.
const double min_diff = 0.00000000001;

struct Point {
  // ...

  bool equals(const Point& p) {
    return abs(p.x - x) < min_diff && abs(p.y - y) < min_diff;
  }

  // ...

}; // Point

Here, two doubles are considered the same if the absolute value of their difference is less than the constant min_diff, and two points are equal when both their x values and their y values are the same in this sense. While it takes a little more time to do this equality check, it tolerates small round-off errors and is probably more useful in general.
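A classic case of round-off is that 0.1 + 0.2 is not exactly equal to 0.3 in double arithmetic. The sketch below pulls the comparison out into a stand-alone function, nearly_equal, which is a name invented for this example. It calls std::fabs from <cmath> so that the double version of absolute value is certain to be used.

```cpp
#include <cassert>
#include <cmath>

// Same tolerance as in the Point code above.
const double min_diff = 0.00000000001;

// Hypothetical free function: true if a and b differ by less than min_diff.
bool nearly_equal(double a, double b) {
  return std::fabs(a - b) < min_diff;
}
```

With this function, nearly_equal(0.1 + 0.2, 0.3) is true even though 0.1 + 0.2 == 0.3 is false.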

Note

Dealing with the round-off errors inherent in floating-point computer arithmetic turns out to be highly non-trivial in general. Long, complicated calculations can suffer from huge amounts of error if they are not done carefully.

Numeric computation is an important sub-topic of computer science, although we won’t go into it any further here.

12.6. Using Points in Other Functions

We can pass Point objects to other functions similarly to how built-in data types are passed. For example:

// calculates the distance between p and q
double dist(const Point& p, const Point& q) {
  double dx = p.x - q.x;
  double dy = p.y - q.y;
  return sqrt(dx * dx + dy * dy);
}

Note that this function is not inside the Point class.
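Functions outside the class can return Point objects as well as take them as input. The sketch below repeats dist so that it compiles on its own, and adds midpoint, a hypothetical function that is not part of these notes. Both use a stripped-down Point with no constructors.

```cpp
#include <cassert>
#include <cmath>

struct Point {
  double x;
  double y;
}; // Point

// calculates the distance between p and q
double dist(const Point& p, const Point& q) {
  double dx = p.x - q.x;
  double dy = p.y - q.y;
  return std::sqrt(dx * dx + dy * dy);
}

// Hypothetical function: returns the point halfway between p and q.
Point midpoint(const Point& p, const Point& q) {
  Point m;
  m.x = (p.x + q.x) / 2;
  m.y = (p.y + q.y) / 2;
  return m;
}
```

For the points (0, 0) and (3, 4), dist returns 5 and midpoint returns (1.5, 2).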

12.7. Destructors

A destructor is a special kind of function that an object calls automatically when it is destroyed, i.e. when it goes out of scope, or is given back to the free store (using delete).

Our Point objects don’t have any practical need for a destructor, but let’s add one anyway to see how they work:

struct Point {

  // ...

  ~Point() {
    cout << "(Point destructor called)\n";
  }
}; // Point

C++ destructors start with the ~ symbol followed by the name of the class. Destructors never take any input parameters, and are usually used for “cleaning up” resources the object used. In this case, all we are doing is printing a message when the destructor is called. This can be quite useful for debugging: it tells you when an object no longer exists.

It’s often useful to think of constructors and destructors as working together to manage some computer resource: the constructor initializes the resource, and the destructor de-initializes it. Since both constructors and destructors are automatically called, the programmer will never forget to initialize/de-initialize the resource.

For instance, suppose you’ve created an object for a printer:

struct Printer {

   // ...

   Printer() {
      printer.open();    // printer: some lower-level device object (not shown)
   }

   // ...

   ~Printer() {
      printer.close();
   }

};

Now Printer objects will automatically open and close the printer without the programmer needing to do anything more than create a Printer object.
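Here is a runnable sketch of the same idea. Since there is no real printer to talk to, it assumes a stand-in: a global counter, open_printers, that records how many Printer objects currently have the printer open. The counter and the function printers_open_during_job are made up for this example.

```cpp
#include <cassert>

// Stand-in for a real printer (an assumption of this sketch): counts
// how many Printer objects currently have the printer open.
int open_printers = 0;

struct Printer {
  Printer() {
    ++open_printers;      // "open" the printer
  }

  ~Printer() {
    --open_printers;      // "close" the printer
  }
}; // Printer

// Hypothetical function: the printer is open only while this function runs.
int printers_open_during_job() {
  Printer p;              // constructor runs here
  return open_printers;   // the value is read, then p's destructor runs
}
```

Before and after a call to printers_open_during_job, open_printers is 0; during the call it is 1. The function never closes the printer explicitly.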

12.8. Public and Private

Let’s create a new class for representing a person:

struct Person {
   string name;
   int age;

  Person(const string& n, int a) : name(n), age(a) {
      if (age <= 0) error("illegal age");
   }

}; // struct Person

What’s interesting here is that the constructor checks to see if the age is positive. If not, it throws an error. While this stops us from creating a Person object with a nonsensical age, it doesn’t stop code from later setting age to be a bad value:

Person p("Harry Potter", 14);
p.age = -5; // oops: age should never be negative!

C++ provides a solution to this problem. Let’s rewrite the Person class like this:

struct Person {
private:
   string name;
   int age;

public:
  Person(const string& n, int a) : name(n), age(a) {
      if (age <= 0) error("illegal age");
   }

}; // struct Person

Here we divided the class into two separate regions, one labeled private and the other labeled public. Now name and age cannot be accessed outside of Person. For example, this code now causes a compiler error:

Person p("Harry Potter", 14);
p.age = -5;   // compiler error: age is private

The public part of Person contains everything we want code outside of Person to have access to. In this case, the only public thing is the constructor (otherwise how can you create the object?).

By default, everything in a struct is public. C++ also has a construct called class which is just like struct except by default everything is private. For instance, we could re-write Person like this:

class Person {
   string name;
   int age;

public:
  Person(const string& n, int a) : name(n), age(a) {
      if (age <= 0) error("illegal age");
   }

}; // class Person

We’ve made two changes here: struct has been replaced with class, and the private label has been removed. We don’t need it because in a class variables and functions are private by default.

12.9. Setters and Getters

While making age private stops us from ever giving it a nonsensical value, it is too strict: we have no way to make sensible changes to age. Even worse, we have no way to read age, i.e. this code causes a compiler error:

cout << p.age;   // compiler error: age is private

In OOP, the general solution to this problem is to use functions known as getters and setters. Roughly, getters return the value of a variable, and setters write the value of a variable. Since getters and setters are functions you can add whatever protection code you need or want.

So let’s add a setter and getter for age to Person:

class Person {
  string name;
  int age;

public:
  Person(const string& n, int a) : name(n), age(a) {
      if (age <= 0) error("illegal age");
   }

  int get_age() {                      // getter
    return age;
  }

  void set_age(const int a) {          // setter (void: it returns nothing)
    if (a <= 0) error("illegal age");
    age = a;
  }

}; // class Person

It’s essential that the setters and getters be in the public part of the class so that code outside the class can access them.

Now to access the age of a Person we call get_age():

Person p("Harry Potter", 14);
cout << p.get_age() << "\n";

If we try to set the age to a nonsensical value an error is thrown:

p.set_age(-5);  // error thrown at runtime

But sensible ages cause no error:

p.set_age(15);   // ok
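The sketch below shows what "an error is thrown" can look like from the caller's side. It assumes that error behaves like throwing a std::runtime_error, which is an assumption about the course header rather than something stated in these notes, and so it throws that exception directly. The helper try_set_age is hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

class Person {
  std::string name;
  int age;

public:
  Person(const std::string& n, int a) : name(n), age(a) {
    if (age <= 0) throw std::runtime_error("illegal age");
  }

  int get_age() {                      // getter
    return age;
  }

  void set_age(int a) {                // setter
    if (a <= 0) throw std::runtime_error("illegal age");
    age = a;
  }
}; // class Person

// Hypothetical helper: tries to set the age and reports whether it worked.
bool try_set_age(Person& p, int a) {
  try {
    p.set_age(a);
    return true;
  } catch (const std::runtime_error&) {
    return false;                      // p keeps its old, valid age
  }
}
```

Because the check happens before the assignment, a rejected call leaves the old age in place.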

To be complete, we should also add a setter and getter for the name:

class Person {
   string name;
   int age;

public:
  Person(const string& n, int a) : name(n), age(a) {
      if (age <= 0) error("illegal age");
   }

  int get_age() {                      // getter
    return age;
  }

  void set_age(const int a) {          // setter (void: it returns nothing)
    if (a <= 0) error("illegal age");
    age = a;
  }

  string get_name() {                  // getter
    return name;
  }

  void set_name(const string& n) {     // setter
    if (n.empty()) error("illegal name");
    name = n;
  }

}; // class Person

Now we can write code like this:

Person p("Harry Potter", 14);
cout << p.get_name() << ", " << p.get_age() << "\n";

Setters and getters are important because they give the programmer complete control over how the variables of an object are accessed. Often objects have special variables that code outside the object either doesn’t need to know about, or should not be able to change. In such a case the variable should be declared private, and no setter/getter should be created for it.

This general technique of keeping variables and functions hidden from the rest of a program is called information hiding, and experience shows that it is a very useful technique for creating large, complex programs. By hiding implementation details we not only reduce the mental burden on the programmer using the object, but we also make it hard for them to accidentally “mess it up” by assigning a variable a nonsensical value.
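As a small sketch of a variable that outside code may read but should never set directly, consider the made-up Counter class below. It has a getter for count but deliberately no setter, so the only way to change count is through increment.

```cpp
#include <cassert>

// Hypothetical class: counts events.
class Counter {
  int count;               // private: hidden from the rest of the program

public:
  Counter() : count(0) { }

  void increment() {       // the only way to change count
    ++count;
  }

  int get_count() {        // getter; there is no set_count
    return count;
  }
}; // class Counter
```

Code that uses a Counter cannot reset it or make it go backwards, because no function for doing so has been provided.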

12.10. Constant Objects

Recall that you can declare a variable to be const, which means it is read-only. Suppose we do this with a Person:

const Person baby("Emily", 1);
cout << baby.get_name() << ", " << baby.get_age() << "\n";  // error!

Unfortunately, when you compile this it doesn’t work: you get a compiler error indicating that you are not allowed to call get_name() on baby because of the const.

In other words, C++ is saying that calling get_name() might change baby somehow. We know it won’t, but C++ doesn’t.

So to get const to work with Person, we need to indicate which of its functions can be called with a constant Person object:

class Person {
   string name;
   int age;

public:
  Person(const string& n, int a) : name(n), age(a) {
      if (age <= 0) error("illegal age");
   }

  int get_age() const {                // const added
    return age;
  }

  void set_age(const int a) {
    if (a <= 0) error("illegal age");
    age = a;
  }

  string get_name() const {            // const added
    return name;
  }

  void set_name(const string& n) {
    if (n.empty()) error("illegal name");
    name = n;
  }

}; // class Person

Only get_age() and get_name() are declared to be const because they are the only functions in Person that don’t change one of its variables.

After we mark get_age() and get_name() as const this code works as expected:

const Person baby("Emily", 1);
cout << baby.get_name() << ", " << baby.get_age() << "\n";

However, this would (correctly) give a compiler error:

const Person baby("Emily", 1);
baby.set_age(2);  // compiler error!
cout << baby.get_name() << ", " << baby.get_age() << "\n";
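The same rule matters whenever an object is passed to a function by const reference, which is the usual way to pass objects. In the sketch below, describe is a hypothetical helper, and Person is cut down (no error checking) to keep the example short.

```cpp
#include <cassert>
#include <string>

class Person {
  std::string name;
  int age;

public:
  Person(const std::string& n, int a) : name(n), age(a) { }

  int get_age() const { return age; }
  std::string get_name() const { return name; }
  void set_age(int a) { age = a; }     // not const: it changes age
}; // class Person

// Hypothetical helper. p is a const reference, so inside this function
// only the const functions of Person may be called on it.
std::string describe(const Person& p) {
  return p.get_name() + ", " + std::to_string(p.get_age());
}
```

If get_name and get_age were not marked const, describe would not compile, even though it never tries to change p.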

12.11. Operator Overloading

Let’s return to our Point class:

// If the absolute value of the difference of two doubles is less
// than min_diff, then they will be considered equal.
const double min_diff = 0.00000000001;

struct Point {
  double x;
  double y;

  Point() : x(0), y(0) {   // default constructor
    // empty body
  }

  Point(double a, double b) : x(a), y(b) {   // constructor (double, so 1.5 is not cut to 1)
    // empty body
  }

  Point(const Point& p) : x(p.x), y(p.y) {  // copy constructor
    // empty body
  }

  bool equals(const Point& p) {
    return abs(p.x - x) < min_diff && abs(p.y - y) < min_diff;
  }

  void print() {
    cout << "(" << x << ", " << y << ")";
  }

  void println() {
    print();
    cout << endl;
  }

  ~Point() {
    cout << "(Point destructor called)\n";
  }
}; // Point

While this is useful, it is a bit awkward. For instance, writing p.equals(q) is not as nice as using the == operator the way we can with other C++ values. And using print and println is not as convenient as using cout and <<.

To deal with this sort of problem, C++ lets you overload built-in operators to work with your own objects. For instance, here’s how we make == and != work with points:

struct Point {

  // ...

  bool operator==(const Point& p) {
    return equals(p);
  }

  bool operator!=(const Point& p) {
    return !equals(p);
  }

  // ...

}; // Point

This lets us write code like this:

Point start(4, 2);
Point home(start);

if (start == home) {
  cout << "They're the same!\n";
} else {
  cout << "They're different!\n";
}

It makes the code a little easier to read, which is always a good thing in programming.
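Other operators can be overloaded in the same way. As a sketch, here is a + operator for points that adds the coordinates. It is not part of the Point class in these notes, and the meaning chosen for + is an assumption made for this example.

```cpp
#include <cassert>

struct Point {
  double x;
  double y;

  Point(double a, double b) : x(a), y(b) { }

  // Hypothetical addition operator. In a + b, x and y belong to the
  // left operand a, and the parameter p is the right operand b.
  Point operator+(const Point& p) {
    return Point(x + p.x, y + p.y);
  }
}; // Point
```

With this in place, Point s = a + b; creates a new point and leaves a and b unchanged.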

Now let’s do something about the print functions. To print a Point directly onto cout with <<, we need to overload the << operator:

struct Point {

  // ...

}; // Point


ostream& operator<<(ostream& out, const Point& p) {
  out << "(" << p.x << ", " << p.y << ")";
  return out;
}

The << operator is not a part of Point, and so is defined outside of the struct.

Having a << for Point is quite convenient:

Point start(4, 2);
Point home(start);

cout << start << endl << home << endl;

Now printing Point objects works just like printing any other kind of object.
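Because the operator takes any ostream, not just cout, the same code can also write a Point into a string by using a std::ostringstream. The sketch below uses a cut-down Point, and the helper point_to_string is a name invented for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct Point {
  double x;
  double y;

  Point(double a, double b) : x(a), y(b) { }
}; // Point

std::ostream& operator<<(std::ostream& out, const Point& p) {
  out << "(" << p.x << ", " << p.y << ")";
  return out;
}

// Hypothetical helper: prints p into a string stream instead of cout
// and returns the text that was written.
std::string point_to_string(const Point& p) {
  std::ostringstream out;
  out << p;                 // calls the operator<< defined above
  return out.str();
}
```

Returning out from operator<< is what allows several << to be chained in one statement.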

12.12. Putting Point in its Own File

Points are quite useful in many different kinds of programs, and so it makes sense to do a little more work to make them easily re-usable.

What we’ll do here is put the Point class, and its related functions and variables, in a file called Point.h. The .h indicates this is a header file, which, strictly speaking, should not contain implementation code but instead just header information.

Here are the contents of Point.h:

// Point.h

// By defining point_cmpt125, we avoid problems caused by including
// this file more than once: if point_cmpt125 is already defined,
// then the code is *not* included.
#ifndef point_cmpt125
#define point_cmpt125 201201L

#include "std_lib_cmpt125.h"

// If the absolute value of the difference of two doubles is less
// than min_diff, then they will be considered equal.
const double min_diff = 0.00000000001;

struct Point {
  double x;
  double y;

  Point() : x(0), y(0) {   // default constructor
    // empty body
  }

  Point(double a, double b) : x(a), y(b) {   // constructor (double, so 1.5 is not cut to 1)
    // empty body
  }

  Point(const Point& p) : x(p.x), y(p.y) {  // copy constructor
    // empty body
  }

  bool equals(const Point& p) {
    return abs(p.x - x) < min_diff && abs(p.y - y) < min_diff;
  }

  bool operator==(const Point& p) {
    return equals(p);
  }

  bool operator!=(const Point& p) {
    return !equals(p);
  }

  void print() {
    cout << "(" << x << ", " << y << ")";
  }

  void println() {
    print();
    cout << endl;
  }

  ~Point() {
    cout << "(Point destructor called)\n";
  }
}; // Point

ostream& operator<<(ostream& out, const Point& p) {
  out << "(" << p.x << ", " << p.y << ")";
  return out;
}

double dist(const Point& p, const Point& q) {
  double dx = p.x - q.x;
  double dy = p.y - q.y;
  return sqrt(dx * dx + dy * dy);
}

#endif

To use it we do this:

#include "Point.h"

int main() {
   Point p;
   // ...
}

#include is a pre-processor command that textually includes Point.h, i.e. the #include statement gets replaced by the contents of the file.

While #include is simple and straightforward to understand, it causes a problem if you try to include the same file more than once, e.g.:

#include "Point.h"

// ...

#include "Point.h"

Now everything in Point.h is included two times, and so you get compile-time errors when you try to compile the program.

In large programs consisting of dozens of files, it is surprisingly easy to accidentally include the same file more than once. It is not always obvious if, and where, a file has been included.

So to deal with this problem of multiple inclusion, Point.h uses the standard trick of defining a unique pre-processor symbol the first time the file is included:

#ifndef point_cmpt125
#define point_cmpt125 201201L

// ... code for Point.h ...

#endif

#ifndef is a pre-processor command that checks to see if the pre-processor symbol point_cmpt125 is undefined. If it is undefined, that means this is the first time the file has been included, and so point_cmpt125 is immediately defined (here, as the long integer 201201L).

Now if Point.h is ever included again, point_cmpt125 will be defined, and the code between #ifndef and #endif will not be included.
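The effect can be seen in a single file by writing the same guarded text twice, which is what the pre-processor produces when a header is included twice. The symbol demo_guard_h and the struct Demo are invented for this sketch.

```cpp
#include <cassert>

// First "inclusion": demo_guard_h is not yet defined, so this is compiled.
#ifndef demo_guard_h
#define demo_guard_h
struct Demo {
  int value;
};
#endif

// Second "inclusion" of the same text: demo_guard_h is now defined, so
// everything up to #endif is skipped and Demo is not defined twice.
#ifndef demo_guard_h
#define demo_guard_h
struct Demo {
  int value;
};
#endif
```

Without the #ifndef lines, the second definition of Demo would be a compile-time error.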

Note

Commands beginning with a # are pre-processor commands, not C++ commands. The pre-processor is a program that runs automatically before compilation and modifies the text of the source files. It is necessary in C++ for things like #include and this #ifndef trick, and it can also be used for other things, such as creating macros. However, over-use of the pre-processor is generally considered bad practice in C++ (most other modern languages dispense with it) because it is quite primitive: it manipulates the source code as plain text, and knows little about the rules of C++.