Lecture 13
Notes on Object-Oriented Programming
Object-oriented programming (OOP) was first implemented in the language Simula 67, and was more fully explored in Smalltalk. It is now the major form of abstraction in most programming languages.
In these notes we will look at three different approaches to OOP in the languages C++, Python, Go, JavaScript, and Dart.
C++
C++ uses a standard class/object approach to OOP. In C++ source code, you write classes that specify the data and methods for objects of the class's type.
For example, here is a class that lets you create objects of type Person:
class Person {
private:
    string name;
    int age;
public:
    Person(const string& n, int a) : name(n), age(a) {}
    string get_name() const { return name; }
    int get_age() const { return age; }
    virtual void print() const {
        cout << name << " is "
             << age << " years old.\n";
    }
};
The private: and public: labels divide the functions and variables of Person into two kinds: the private things that can only be accessed within the class itself, and the public things that can be accessed outside the class.
The idea of making data private is to allow the methods in Person to have complete control over them. Any function in the program can change a public variable, which could lead to bugs or errors due to unintentional changes. Private variables restrict access in a useful way: you’ll get a compiler error if you try to access a private variable outside the class.
The keyword virtual at the start of the function header for print() tells C++ that classes that inherit from Person are permitted, if they wish, to supply their own version of print. We’ll see below how this is used.
We can use Person like this:
Person p("Mary", 67);
p.print();
Person requires that you create a Person object by supplying a name and age. There’s no way to create an uninitialized Person object, which helps prevent errors.
get_name() and get_age() are called getters because what they do is get the value of some variable in the class. You can’t write p.age or p.name directly because they are private variables, which means such access is forbidden.
Inside the print method we don’t need to call get_name() and get_age() because methods in a class are allowed to refer directly to the private variables of that class.
Notice that there is no way to change a person’s name or age once a Person object is created. It only has getters, which makes it a read-only object. Whether or not you want a particular class of object to be read-only is a design decision (and in this case it makes little sense, because people’s ages change every year, and, occasionally, so too do their names).
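If a read-only object is too restrictive, one option is to add setters that check their input before changing the private data. The following is only a minimal sketch and is not part of the Person class above: the class name MutablePerson, the setters set_name and set_age, and the rule that an age must be non-negative are all assumptions made for illustration.

```cpp
#include <stdexcept>
#include <string>
using namespace std;

// Hypothetical variant of Person with setters (not from the notes above).
class MutablePerson {
private:
    string name;
    int age;
public:
    MutablePerson(const string& n, int a) : name(n), age(a) {
        if (a < 0) throw invalid_argument("age must be non-negative");
    }
    string get_name() const { return name; }
    int get_age() const { return age; }

    // Setters are the only way outside code can change the private data,
    // so they are a natural place to reject bad values.
    void set_name(const string& n) { name = n; }
    void set_age(int a) {
        if (a < 0) throw invalid_argument("age must be non-negative");
        age = a;   // only reached when a is acceptable
    }
};
```

Because age stays private, code outside the class can only change it through set_age, so a negative age is rejected and the old value is kept.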
An important technique in class-based OOP is inheritance. Inheritance is a way to create a new class based on some other class. For example:
class Student : public Person {
private:
    string school;
public:
    Student(const string& n, int a, const string& s)
    : Person(n, a), school(s)
    { }
    string get_school() const { return school; }
    void print() const {
        cout << get_name() << " is "
             << get_age() << " years old and attends "
             << school << ".\n";
    }
};
We say that Student subclasses, or extends, the Person class. That means that all the data and methods in Person are automatically put into Student.
Now you can write code like this:
Person p("Mary", 67);
p.print();
Student s("Barry", 12, "Sun Ray Elementary");
s.print();
Notice that the Student class defines its own version of print, and so when s.print() is called, it is the Student version of print that is executed.
When C++ encounters the statement s.print(), how does it decide which version of print to call? Since s is of type Student, it calls the print associated with Student. C++ can determine this fact at compile-time because the compiler can see everything it needs to know to infer it.
But things get more interesting in this example:
vector<Person*> people = {new Person{"Mary", 67},
                          new Student{"Barry", 12, "Sun Ray Elementary"}
                         };
for(Person* p : people) {
    p->print(); // same as (*p).print()
}
Here, people is a vector of pointers to Person objects. A new expression, such as new Person{"Mary", 67}, returns a pointer to a newly allocated object.
Inside the for-loop, the statement p->print() is executed. Which version of print is called? The one for Person, or the one for Student? The variable p is of type Person*, so we know that p is pointing to either a Person object or a Student object. Both of those kinds of objects have a method called print(), so which code gets executed by p->print()?
The answer is that it depends on the type of object p points to. If p points to a Person object, then the Person print() function is called. If instead p points to a Student object, then the Student print() function is called.
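This lookup by object type happens only for functions declared virtual in the base class. The sketch below contrasts a virtual and a non-virtual function called through the same base pointer. It uses cut-down stand-ins for Person and Student, and the names v_label, nv_label, and through_base_pointer are hypothetical, chosen only for this illustration.

```cpp
#include <string>
using namespace std;

class Person {
public:
    virtual ~Person() {}
    // virtual: the version is chosen at run-time from the object's real type
    virtual string v_label() const { return "Person"; }
    // not virtual: the version is chosen at compile-time from the pointer type
    string nv_label() const { return "Person"; }
};

class Student : public Person {
public:
    string v_label() const { return "Student"; }   // overrides Person's version
    string nv_label() const { return "Student"; }  // only hides Person's version
};

// Calls both functions through a Person* that really points to a Student.
string through_base_pointer() {
    Student s;
    const Person* p = &s;
    return p->v_label() + "/" + p->nv_label();   // "Student/Person"
}
```

The virtual call reaches the Student version, while the non-virtual call is resolved from the type Person* and so reaches the Person version.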
What’s interesting here is that C++ does not know until run-time what type of object p points to. By looking at the statement p->print() alone, it is impossible to tell which version of print() is called. Thus, the compiler, which only has access to the source code, cannot know whether it is the Person print() or Student print() that will be executed here.
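To make the run-time nature of this choice concrete, here is a small sketch in which the object a pointer refers to is picked by a value that is only known when the program runs. The classes are cut-down stand-ins for Person and Student, and kind(), which(), and pick() are hypothetical names used only for this example, with kind() standing in for print() so the result can be inspected as a string.

```cpp
#include <string>
using namespace std;

class Person {
public:
    virtual ~Person() {}
    virtual string kind() const { return "Person"; }
};

class Student : public Person {
public:
    string kind() const { return "Student"; }
};

// The compiler sees only a Person* here; the version of kind() that runs
// is decided by the object p actually points to.
string which(const Person* p) { return p->kind(); }

// is_student could come from user input, a file, and so on, so the
// compiler cannot know in advance which object p will point to.
string pick(bool is_student) {
    Person person;
    Student student;
    const Person* p = &person;
    if (is_student) p = &student;
    return which(p);
}
```

The body of which() is compiled once, yet it returns "Person" or "Student" depending on the argument it receives at run-time.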
Even though we don’t know for sure which print() is called, there is no type error because we do know that both Person objects and Student objects have a method named print(). So we can be certain that a print() can be called at that point.
Notice also that we must call get_name() and get_age() inside the Student class. That’s because the variables name and age are private, and so cannot be directly accessed outside of the Person class. Even though Student inherits those variables from Person, code in Student does not have the ability to directly access those private variables.
Warning
In C++, you must write the above code using pointers. The following code runs, but won’t work the way we would like:
vector<Person> people = {Person{"Mary", 67},
                         Student{"Barry", 12, "Sun Ray Elementary"}
                        };
for(Person p : people) {
    p.print();
}
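Here, the people vector contains Person objects instead of pointers to Person objects. When the Student is stored in the vector, only its Person part is copied (this is known as object slicing), so every element really is a plain Person. That means when p.print() is called, the code for the Person version of print is executed no matter what, because the choice is made based on the type of p itself (there is no pointed-to object whose type could differ).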
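The same effect can be seen with function parameters. The sketch below, which again uses cut-down stand-ins for Person and Student, passes a Student once by value and once by reference. The names kind(), kind_by_value, and kind_by_reference are hypothetical and exist only for this illustration.

```cpp
#include <string>
using namespace std;

class Person {
public:
    virtual ~Person() {}
    virtual string kind() const { return "Person"; }
};

class Student : public Person {
public:
    string kind() const { return "Student"; }
};

// Pass by value: the argument is copied into a plain Person, so any
// Student-specific part is lost before kind() is called.
string kind_by_value(Person p) { return p.kind(); }

// Pass by reference: p refers to the original object, whatever its type,
// so the call is dispatched at run-time just as it is through a pointer.
string kind_by_reference(const Person& p) { return p.kind(); }
```

Given a Student s, kind_by_value(s) reports "Person" while kind_by_reference(s) reports "Student", which suggests that references behave like pointers for this purpose.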
Thus, for most practical purposes, OOP in C++ requires that you use pointers (or references) to objects instead of objects themselves. The problem with this approach is that it is up to the programmer to remember to use the correct techniques and to deal with any pointer errors.
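One common way to reduce pointer errors, which these notes do not otherwise cover, is to store std::unique_ptr (from the standard header <memory>) instead of raw pointers, so that each object is deleted automatically when the vector goes away. The following is a sketch under that assumption rather than the approach used above; the classes are cut-down stand-ins, and kind() and kinds() are hypothetical names.

```cpp
#include <memory>
#include <string>
#include <vector>
using namespace std;

class Person {
public:
    virtual ~Person() {}   // virtual destructor: deleting through Person* is safe
    virtual string kind() const { return "Person"; }
};

class Student : public Person {
public:
    string kind() const { return "Student"; }
};

// Builds a vector of owning pointers and collects what each object reports.
vector<string> kinds() {
    vector<unique_ptr<Person>> people;
    people.push_back(unique_ptr<Person>(new Person()));
    people.push_back(unique_ptr<Person>(new Student()));

    vector<string> result;
    for (const auto& p : people) {
        result.push_back(p->kind());   // still dispatched at run-time
    }
    return result;   // people is destroyed here, which frees both objects
}
```

The calls through p-> are dispatched exactly as with raw pointers, but no explicit delete is needed.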
Some other languages, such as Java, avoid this problem by making all object variables pointers (references). Thus, in Java, you simply cannot make a variable that directly names an object; it is always a pointer. This avoids the above sort of problem.