Lecture 13

Notes on Object-Oriented Programming

Object-oriented programming (OOP) was first implemented in the language Simula 67, and was more fully explored in Smalltalk. It is now the major form of abstraction in most programming languages.

In these notes we will look at three different approaches to OOP in the languages C++, Python, Go, JavaScript, and Dart.

C++

C++ uses a standard class/object approach to OOP. In C++ source code, you write classes that specify the data and methods for objects of that class’s type.

For example, here is a class that lets you create objects of type Person:

#include <iostream>
#include <string>
using namespace std;

class Person {
private:
    string name;
    int age;

public:
    Person(const string& n, int a) : name(n), age(a) {}

    string get_name() const { return name; }
    int get_age() const { return age; }

    virtual void print() const {
        cout << name << " is "
             << age << " years old.\n";
    }
};

The private: and public: labels divide the functions and variables of Person into two kinds: the private things that can only be accessed within the class itself, and the public things that can be accessed outside the class.

The idea of making data private is to allow the methods in Person to have complete control over them. Any function in the program can change a public variable, which could lead to bugs or errors due to unintentional changes. Private variables restrict access in a useful way: you’ll get a compiler error if you try to access a private variable outside the class.

The keyword virtual at the start of the function header for print() tells C++ that classes that inherit from Person are permitted, if they wish, to supply their own version of print. We’ll see below how this is used.

We can use Person like this:

Person p("Mary", 67);
p.print();

Person requires that you create a Person object by supplying a name and age. There’s no way to create an uninitialized Person object, which helps prevent errors.

get_name() and get_age() are called getters because what they do is get the value of some variable in the class. If you write p.age or p.name directly outside the class, you get a compiler error, because they are private variables, which means such access is forbidden.

Inside the print method we don’t need to call get_name() and get_age() because methods in a class are allowed to refer directly to the private variables of that class.

Notice that there is no way to change a person’s name or age once a Person object is created. It only has getters, which makes it a read-only object. Whether or not you want a particular class of object to be read-only is a design decision (and in this case it makes little sense, because people’s ages change every year, and, occasionally, so too do their names).
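
If a class should be modifiable, one common option is to add setters that check new values before storing them. The following is a minimal sketch and not part of the Person class above: the class name MutablePerson, the setter names, and the validation rules are all assumptions made for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical variant of Person that can be modified after construction.
class MutablePerson {
private:
    std::string name;
    int age;

public:
    MutablePerson(const std::string& n, int a) : name(n), age(a) {}

    std::string get_name() const { return name; }
    int get_age() const { return age; }

    // Setters are the only way to change the private data, so the class
    // can reject values that make no sense (the rules here are assumed).
    void set_name(const std::string& n) {
        if (n.empty()) throw std::invalid_argument("name must not be empty");
        name = n;
    }

    void set_age(int a) {
        if (a < 0) throw std::invalid_argument("age must not be negative");
        age = a;
    }
};
```

For example, after creating MutablePerson p("Mary", 67), calling p.set_age(68) changes the age, while p.set_age(-1) throws an exception and leaves the age unchanged.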

An important technique in class-based OOP is inheritance. Inheritance is a way to create a new class based on some other class. For example:

class Student : public Person {
private:
    string school;
public:
    Student(const string& n, int a, const string& s)
    : Person(n, a), school(s)
    { }

    string get_school() const { return school; }

    // override asks the compiler to check that this replaces a virtual function
    void print() const override {
        cout << get_name() << " is "
             << get_age() << " years old and attends "
             << school << ".\n";
    }
};

We say that Student subclasses, or extends, the Person class. That means that all the data and methods in Person are automatically put into Student.

Now you can write code like this:

Person p("Mary", 67);
p.print();

Student s("Barry", 12, "Sun Ray Elementary");
s.print();

Notice that the Student class defines its own version of print, and so when s.print() is called, it is the Student version of print that is executed.
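
To make the inheritance relationship concrete, here is a small self-contained sketch. It uses simplified versions of Person and Student, and the names describe() and is_adult(), as well as the age threshold of 18, are assumptions introduced only for this example. It shows that a Student has the getters it inherits from Person, and that a Student can be passed to a function that expects a Person.

```cpp
#include <string>

// Simplified versions of the classes above; describe() and is_adult()
// are hypothetical names used only in this sketch.
class Person {
private:
    std::string name;
    int age;

public:
    Person(const std::string& n, int a) : name(n), age(a) {}
    virtual ~Person() = default;

    std::string get_name() const { return name; }
    int get_age() const { return age; }
};

class Student : public Person {
private:
    std::string school;

public:
    Student(const std::string& n, int a, const std::string& s)
        : Person(n, a), school(s) {}

    std::string get_school() const { return school; }

    // name and age are private to Person, so Student must use the getters.
    std::string describe() const {
        return get_name() + " attends " + get_school();
    }
};

// Accepts any Person, including objects of classes derived from Person.
// The threshold of 18 is an assumption for illustration.
bool is_adult(const Person& p) { return p.get_age() >= 18; }
```

With Student s("Barry", 12, "Sun Ray Elementary"), both s.get_name() and is_adult(s) compile, even though neither get_name() nor get_age() is written in Student.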

When C++ encounters the statement s.print(), how does it decide what version of print to call? Since s is of type Student, it calls the print associated with Student. C++ can determine this fact at compile time because the compiler can see everything it needs to know to infer it.

But things get more interesting in this example:

vector<Person*> people = {new Person{"Mary", 67},
                          new Student{"Barry", 12, "Sun Ray Elementary"}
                         };
for(Person* p : people) {
    p->print();   // same as (*p).print()
}

Here, people is a vector of pointers to Person objects. A new expression, such as new Person{"Mary", 67}, returns a pointer to a newly allocated object.

Inside the for-loop, the statement p->print() is executed. What version of print is called? The one for Person, or the one for Student? The variable p is of type Person*, so we know that p is pointing to either a Person object or a Student object. Both of those kinds of objects have a method called print(), so which code gets executed by p->print()?

The answer is that it depends on the type of object p points to. If p points to a Person object, then the Person print() function is called. If instead p points to a Student object, then the Student print() function is called.

What’s interesting here is that C++ does not know until run-time what type of object p points to. By looking at the statement p->print() alone, it is impossible to tell which version of print() is called. Thus, the compiler, which only has access to the source code, cannot know whether it is the Person print() or Student print() that will be executed here.

Even though we don’t know for sure which print() is called, there is no type error because we do know that both Person objects and Student objects have a method named print(). So we can be certain that a print() can be called at that point.
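
The difference between a choice made at compile time and one made at run time can be seen by putting a virtual and a non-virtual method side by side. In this sketch the classes are cut-down stand-ins for Person and Student, and the names describe(), kind(), via_virtual(), and via_non_virtual() are hypothetical.

```cpp
#include <string>

class Person {
public:
    virtual ~Person() = default;

    // Virtual: the version is chosen at run time from the object's actual type.
    virtual std::string describe() const { return "Person"; }

    // Non-virtual: the version is chosen at compile time from the static type.
    std::string kind() const { return "Person"; }
};

class Student : public Person {
public:
    std::string describe() const override { return "Student"; }

    // Hides Person::kind() rather than overriding it.
    std::string kind() const { return "Student"; }
};

// p has static type const Person&, but may refer to a Student.
std::string via_virtual(const Person& p) { return p.describe(); }
std::string via_non_virtual(const Person& p) { return p.kind(); }
```

Passing a Student to via_virtual() yields "Student", while via_non_virtual() yields "Person" for the same object, since only the virtual call looks at the run-time type.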

Notice also that we must call get_name() and get_age() inside the Student class. That’s because the variables name and age are private, and so cannot be directly accessed outside of the Person class. Even though Student inherits those variables from Person, code in Student does not have the ability to directly access them.

Warning

In C++, you must write the above code using pointers. The following code runs, but won’t work the way we would like:

vector<Person> people = {Person{"Mary", 67},
                         Student{"Barry", 12, "Sun Ray Elementary"}
                        };
for(Person p : people) {
    p.print();
}

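Here, the people vector contains Person objects instead of pointers to Person objects. When the Student is put into the vector, it is copied into a plain Person and its Student-specific parts are dropped (this is known as object slicing). That means when p.print() is called, the code for the Person version of print is always executed, because the choice is made based on the type of p itself (instead of the type of the object a pointer points to).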
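
The contrast between storing objects and storing pointers can be checked with a short experiment. This is a sketch using cut-down stand-ins for Person and Student; the method describe() and the two helper functions are hypothetical names, and they return strings rather than printing so the results are easy to compare.

```cpp
#include <string>
#include <vector>

class Person {
public:
    virtual ~Person() = default;
    virtual std::string describe() const { return "Person"; }
};

class Student : public Person {
public:
    std::string describe() const override { return "Student"; }
};

// Stores copies: each Student is converted to a plain Person on the way in.
std::vector<std::string> describe_by_value() {
    std::vector<Person> people = {Person{}, Student{}};
    std::vector<std::string> result;
    for (const Person& p : people) result.push_back(p.describe());
    return result;
}

// Stores pointers: the original objects, and their real types, are kept.
std::vector<std::string> describe_by_pointer() {
    Person mary;
    Student barry;
    std::vector<const Person*> people = {&mary, &barry};
    std::vector<std::string> result;
    for (const Person* p : people) result.push_back(p->describe());
    return result;
}
```

Here describe_by_value() returns "Person" twice, while describe_by_pointer() returns "Person" followed by "Student".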

Thus, for most practical purposes, OOP in C++ requires that you use pointers to objects instead of objects themselves. The problem with this approach is that it is up to the programmer to remember to use the correct techniques, and, also, to deal with any pointer errors.

Some other languages, such as Java, avoid this problem by making all object variables pointers (references). Thus, in Java, you simply cannot make a variable that directly names an object; it is always a pointer. This avoids the above sort of problem.
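
Back in C++, one way to reduce the risk of the pointer errors mentioned above is to store std::unique_ptr objects in the vector, so that each object is deleted automatically. This is offered as a possible sketch, not as the approach these notes rely on. It assumes C++14 or later (for std::make_unique), and the method describe() and the helper describe_all() are hypothetical names. The virtual destructor is added so that deleting a Student through a Person pointer is well defined.

```cpp
#include <memory>
#include <string>
#include <vector>

class Person {
private:
    std::string name;

public:
    explicit Person(const std::string& n) : name(n) {}
    virtual ~Person() = default;  // needed for safe deletion via Person*

    std::string get_name() const { return name; }
    virtual std::string describe() const { return name + " (person)"; }
};

class Student : public Person {
public:
    explicit Student(const std::string& n) : Person(n) {}
    std::string describe() const override { return get_name() + " (student)"; }
};

std::vector<std::string> describe_all() {
    // unique_ptr owns each object and deletes it when the vector is destroyed.
    std::vector<std::unique_ptr<Person>> people;
    people.push_back(std::make_unique<Person>("Mary"));
    people.push_back(std::make_unique<Student>("Barry"));

    std::vector<std::string> result;
    for (const auto& p : people) result.push_back(p->describe());
    return result;
}
```

The call p->describe() still selects the version based on the run-time type of the object, but there is no explicit new or delete for the programmer to get wrong.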