Polymorphism
The word polymorphism literally means “many forms.” In the context of programming, polymorphism refers to the ability of a piece of code to behave differently depending on the context in which it is used. Appropriately, there are several forms of polymorphism:
ad hoc polymorphism, which refers to function overloading
parametric polymorphism in the form of templates
subtype polymorphism, which allows a derived-class object to be used where a base-class object is expected
The unqualified term “polymorphism” usually refers to subtype polymorphism.
We proceed to discuss ad hoc and subtype polymorphism, deferring parametric polymorphism until later.
Function Overloading
Ad hoc polymorphism refers to function overloading, which is the ability to use a single name to refer to many different functions in a single scope. C++ allows both top-level functions and member functions to be overloaded. The following is an example of overloaded member functions:
class Base {
public:
void foo(int a);
int foo(string b);
};
int main() {
Base b;
b.foo(42);
b.foo("test");
}
When we invoke an overloaded function, the compiler resolves the function call by comparing the types of the arguments to the parameters of the candidate functions and finding the best match. The call b.foo(42) calls the member function foo() with parameter int, since 42 is an int. The call b.foo("test") calls the function with parameter string: the literal "test" is treated as a const char * here, which can be implicitly converted to a string but not to an int, so the string overload is the one that matches.
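As a minimal sketch of the resolution just described, the following variant of Base has each overload return a label naming the overload that was selected. The std::string return values and the label text are assumptions added for illustration; they are not part of the original class.

```cpp
#include <string>

// Hypothetical variant of Base: each overload reports which one was chosen.
class Base {
public:
  std::string foo(int) { return "foo(int)"; }
  std::string foo(std::string) { return "foo(string)"; }
};
```

Given a Base object b, the call b.foo(42) selects the int overload, and b.foo("test") selects the string overload after converting the literal to a std::string.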
In C++, functions can only be overloaded when defined within the same scope. If functions of the same name are defined in a different scope, then those that are defined in a closer scope hide the functions defined in a further scope:
class Derived : public Base {
public:
int foo(int a);
double foo(double b);
};
int main() {
Derived d;
d.foo("test"); // ERROR
}
When handling the member access d.foo, under the name-lookup process we saw last time, the compiler finds the name foo in Derived. It then applies function-overload resolution; however, none of the functions with name foo can be invoked on a const char *, resulting in a compile error. The functions inherited from Base are not considered, since they were defined in a different scope.
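If the intent is for the Base overloads to take part in overload resolution as well, one common approach is a using-declaration in the derived class, which brings the base-class name into the derived scope. The sketch below is an assumption about how the example might be adjusted, not part of the original code; the string return values exist only to show which overload runs.

```cpp
#include <string>

class Base {
public:
  std::string foo(int) { return "Base::foo(int)"; }
  std::string foo(std::string) { return "Base::foo(string)"; }
};

class Derived : public Base {
public:
  using Base::foo;  // make the Base overloads visible in Derived's scope
  std::string foo(double) { return "Derived::foo(double)"; }
};
```

With the using-declaration in place, a call such as d.foo("test") on a Derived object compiles and reaches the string overload from Base, while d.foo(3.5) still selects the overload declared in Derived.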
Function overloading requires the signatures of the functions to differ, so that overload resolution can choose the overload with the most appropriate signature. Here, “signature” refers to the function name and parameter types – the return type is not part of the signature and is not considered in overload resolution.
class Person {
public:
void greet();
void greet(int x); // OK
void greet(string x); // OK
void greet(int x, string s); // OK
void greet(string s, int x); // OK
bool greet(); // ERROR: signature the same as the first overload
void greet() const; // OK: implicit this parameter different
};
For member functions, the const keyword after the parameter list is part of the signature: it changes the implicit this parameter from being a pointer to non-const to a pointer to const. Thus, it is valid for two member-function overloads to differ solely in whether or not they are declared as const.
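To see the effect of overloading on const, the following sketch gives Person two greet() overloads that differ only in the const qualifier. The return values and the helper greet_through_const_ref() are hypothetical names introduced for illustration.

```cpp
#include <string>

class Person {
public:
  // Chosen when the receiver is a non-const object.
  std::string greet() { return "non-const greet"; }
  // Chosen when the receiver is const.
  std::string greet() const { return "const greet"; }
};

// Hypothetical helper: the parameter is a reference to const, so only the
// const overload can be called on it.
std::string greet_through_const_ref(const Person &p) {
  return p.greet();
}
```

Calling greet() directly on a non-const Person picks the non-const overload, while the same object accessed through a reference to const picks the const overload.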
Subtype Polymorphism
Subtype polymorphism allows a derived-class object to be used where a base-class one is expected. In order for this to work, however, we need indirection. Consider what happens if we directly copy a Chicken object into a Bird:
int main() {
Chicken chicken("Myrtle");
// ...
Bird bird = chicken;
}
While C++ allows this, the value of a Chicken does not necessarily fit into a Bird object, since a Chicken has more member variables than a Bird. The copy above results in object slicing: the members defined by Bird are copied, but the Chicken ones are not, as illustrated in Figure 37.
To avoid slicing, we need indirection through a reference or a pointer, so that we avoid making a copy:
Bird &bird_ref = chicken;
Bird *bird_ptr = &chicken;
The above initializes bird_ref as an alias for the chicken object. Similarly, bird_ptr is initialized to hold the address of the chicken object. In either case, a copy is avoided.
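The difference between a sliced copy and an alias can be checked by comparing addresses. The classes below are simplified stand-ins for Bird and Chicken (their members, including roads_crossed, are assumptions), and is_same_object() is a hypothetical helper.

```cpp
#include <string>

// Simplified stand-ins for the Bird and Chicken classes (assumed layout).
class Bird {
public:
  explicit Bird(const std::string &name_in) : name(name_in) {}
  std::string get_name() const { return name; }
private:
  std::string name;
};

class Chicken : public Bird {
public:
  explicit Chicken(const std::string &name_in) : Bird(name_in) {}
  int roads_crossed = 0;  // Chicken-only member, not copied when slicing
};

// True if b is the Bird part of c itself rather than a separate copy.
bool is_same_object(const Bird &b, const Chicken &c) {
  const Bird *base_part = &c;  // implicit upcast of the address
  return &b == base_part;
}
```

A Bird initialized by copying a Chicken is a separate object that keeps only the Bird members, so is_same_object() reports false for it; a Bird reference or pointer bound to the Chicken refers to the original object, so it reports true.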
C++ allows a reference or pointer of a base type to refer to an object of a derived type. It allows implicit upcasts, which are conversions that go upward in the inheritance hierarchy, such as from Chicken to Bird, as in the examples above. On the other hand, implicit downcasts are prohibited:
Chicken &chicken_ref = bird_ref; // ERROR: implicit downcast
Chicken *chicken_ptr = bird_ptr; // ERROR: implicit downcast
The implicit downcasts are prohibited by C++ even though bird_ref and bird_ptr actually refer to Chicken objects. In the general case, they can refer to objects that aren’t of Chicken type, such as Duck or just plain Bird objects. Since the conversions may be unsafe, they are disallowed by the C++ standard.
While implicit downcasts are prohibited, we can do explicit downcasts with static_cast:
Chicken &chicken_ref = static_cast<Chicken &>(bird_ref);
Chicken *chicken_ptr = static_cast<Chicken *>(bird_ptr);
These conversions are unchecked at runtime, so we need to be certain from the code that the underlying object is a Chicken.
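A small sketch of an explicit downcast in use follows. The class members and the function cross_road() are hypothetical; the important part is the documented precondition, since static_cast performs no runtime check.

```cpp
// Simplified stand-ins (assumed); only Chicken has roads_crossed.
class Bird {
public:
  int age = 0;
};

class Chicken : public Bird {
public:
  int roads_crossed = 0;
};

// Precondition: bird really refers to a Chicken. static_cast does not
// verify this; passing any other kind of Bird is undefined behavior.
void cross_road(Bird &bird) {
  Chicken &chicken = static_cast<Chicken &>(bird);
  ++chicken.roads_crossed;
}
```

Calling cross_road() on a Bird reference that is bound to a Chicken updates that Chicken's counter, because the cast yields a reference to the same object.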
In order to be able to bind a base-class reference or pointer to a derived-class object, the inheritance relationship must be accessible. From outside the classes, this means that the derived class must publicly inherit from the base class. Otherwise, the outside world is not allowed to take advantage of the inheritance relationship. Consider this example:
class A {
};
class B : A { // default is private when using the class keyword
};
int main() {
B b;
A *a_ptr = &b; // ERROR: inheritance relationship is private
}
This results in a compiler error:
main.cpp:9:16: error: cannot cast 'B' to its private base class 'A'
A *a_ptr = &b; // ERROR: inheritance relationship is private
^
main.cpp:4:13: note: implicitly declared private here
class B : A { // default is private when using the class keyword
^
1 error generated.
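For comparison, here is a sketch of the same pair of classes with the inheritance made public, in which case the upcast is accepted. The helper as_base() is a hypothetical name used only to show the conversion.

```cpp
class A {
};

class B : public A {  // public inheritance: visible to outside code
};

// Hypothetical helper: compiles only because B publicly derives from A.
A *as_base(B *b_ptr) {
  return b_ptr;  // implicit upcast
}
```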
Static and Dynamic Binding
Subtype polymorphism allows us to pass a derived-class object to a function that expects a base-class object:
void Image_init(Image* img, istream& is);
int main() {
Image *image = /* ... */;
istringstream input(/* ... */);
Image_init(image, input);
}
Here, we have passed an istringstream object to a function that expects an istream. Extracting from the stream will use the functionality that istringstream defines for extraction.
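The same pattern can be tried without the Image code. In this sketch, read_sum() is a hypothetical function that takes a std::istream by reference, so it accepts an istringstream, an ifstream, or cin alike.

```cpp
#include <istream>
#include <sstream>

// Accepts any input stream by reference to the base class std::istream.
int read_sum(std::istream &is) {
  int total = 0;
  int value;
  while (is >> value) {  // stops at end of input or a non-integer token
    total += value;
  }
  return total;
}
```

Passing a std::istringstream that holds "1 2 3" produces a sum of 6; the function itself never needs to know which kind of stream it was given.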
Another common use case is to have a container of base-class pointers, each of which points to different derived-class objects:
void all_talk(Bird *array[], int length) {
for (int i = 0; i < length; ++i) {
array[i]->talk();
}
}
int main() {
Chicken c1 = /* ... */;
Duck d = /* ... */;
Chicken c2 = /* ... */;
Bird *array[] = { &c1, &d, &c2 };
all_talk(array, 3);
}
Unfortunately, given the way we defined the talk() member function of Bird last time, this code will not use the derived-class versions of the function. Instead, all three calls to talk() will use the Bird version:
$ ./main.exe
tweet
tweet
tweet
In the invocation array[i]->talk(), the declared type of the receiver, the object that is receiving the member-function call, is different from the actual runtime type. The declared or static type is Bird, while the runtime or dynamic type is Chicken when i == 0. This disparity can only exist when we have indirection, either through a reference or a pointer.
For a particular member function, C++ gives us the option of either static binding, where the compiler determines which function to call based on the static type of the receiver, or dynamic binding, where the program also takes the dynamic type into account. The default is static binding, since it is more efficient and can be done entirely at compile time.
In order to get dynamic binding instead, we need to declare the member function as virtual in the base class:
class Bird {
...
virtual void talk() const {
cout << "tweet" << endl;
}
};
Now when we call the all_talk() function above, the program will use the dynamic type of the receiver in the invocation array[i]->talk():
$ ./main.exe
bawwk
quack
bawwk
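The two kinds of binding can be compared side by side in one self-contained sketch. The classes here are simplified stand-ins: the member names static_talk() and dynamic_talk(), the string return values, and the use_virtual flag are all assumptions made so that the result can be inspected without printing.

```cpp
#include <string>

class Bird {
public:
  // Non-virtual: bound at compile time using the static type.
  std::string static_talk() const { return "tweet"; }
  // Virtual: bound at runtime using the dynamic type.
  virtual std::string dynamic_talk() const { return "tweet"; }
};

class Chicken : public Bird {
public:
  std::string static_talk() const { return "bawwk"; }   // not reached via Bird*
  std::string dynamic_talk() const { return "bawwk"; }  // reached via Bird*
};

class Duck : public Bird {
public:
  std::string static_talk() const { return "quack"; }
  std::string dynamic_talk() const { return "quack"; }
};

// Collects what each bird says through a base-class pointer.
std::string all_talk(Bird *birds[], int length, bool use_virtual) {
  std::string result;
  for (int i = 0; i < length; ++i) {
    result += use_virtual ? birds[i]->dynamic_talk()
                          : birds[i]->static_talk();
    result += ' ';
  }
  return result;
}
```

For an array holding a Chicken, a Duck, and another Chicken, the non-virtual calls all resolve to the Bird version, while the virtual calls reach each derived-class version.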
The virtual keyword is necessary in the base class, but optional in the derived classes. It can only be applied to the declaration within a class; if the function is subsequently defined outside of the class, the definition cannot include the virtual keyword:
class Bird {
...
virtual void talk() const;
};
void Bird::talk() const {
cout << "tweet" << endl;
}
dynamic_cast
With dynamic binding, the only change we need to make to our code is to add the virtual keyword when declaring the base-class member function. No changes are required to the actual function calls (e.g. in all_talk()).
Consider an alternative to dynamic binding, where we manually check the runtime type of an object to call the appropriate function. In C++, a dynamic_cast conversion checks the dynamic type of the receiver object:
Chicken chicken("Myrtle");
Bird *b_ptr = &chicken;
Chicken *c_ptr = dynamic_cast<Chicken *>(b_ptr);
if (c_ptr) { // check for null
// do something chicken-specific
}
If the dynamic type is not actually a Chicken, the conversion results in a null pointer. Otherwise, it results in the address of the Chicken object. Thus, we can check for null after the conversion to determine if it succeeded.
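The null check can be exercised with a compact sketch. The classes below are simplified stand-ins; the virtual destructor is included only so that Bird has a virtual member function, which dynamic_cast needs for this kind of conversion, and roads_crossed_or_default() is a hypothetical helper.

```cpp
class Bird {
public:
  virtual ~Bird() {}  // at least one virtual function, as dynamic_cast needs
};

class Chicken : public Bird {
public:
  int roads_crossed = 0;
};

class Duck : public Bird {
};

// Returns the Chicken's count, or -1 if bird does not point to a Chicken.
int roads_crossed_or_default(Bird *bird) {
  Chicken *c_ptr = dynamic_cast<Chicken *>(bird);
  if (c_ptr) {  // non-null only when the object is a Chicken
    return c_ptr->roads_crossed;
  }
  return -1;
}
```

Passing the address of a Chicken returns its counter, while passing the address of a Duck takes the null branch and returns -1.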
There are two significant issues with dynamic_cast:
It generally results in messy and unmaintainable code. For instance, we would need to modify all_talk() as follows to use dynamic_cast rather than dynamic binding:
void all_talk(Bird * birds[], int length) {
  for (int i = 0; i < length; ++i) {
    Chicken *c_ptr = dynamic_cast<Chicken*>(birds[i]);
    if (c_ptr) {
      c_ptr->talk();
    }
    Duck *d_ptr = dynamic_cast<Duck*>(birds[i]);
    if (d_ptr) {
      d_ptr->talk();
    }
    Eagle *e_ptr = dynamic_cast<Eagle*>(birds[i]);
    if (e_ptr) {
      e_ptr->talk();
    }
    ...
  }
}
We would need a branch for every derived type of Bird, and we would have to add a new branch every time we wrote a new derived class. The code also takes time that is linear in the number of derived classes.
In C++, dynamic_cast can only be applied to classes that are polymorphic, meaning that they define at least one virtual member function. Thus, we need to use virtual one way or another.
Code that uses dynamic_cast is usually considered to be poorly written. Almost universally, it can be rewritten to use dynamic binding instead.
Member Lookup Revisited
We have already seen that when a member is accessed on an object, the compiler first looks in the object’s class for a member of that name before proceeding to its base class. With indirection, the following is the full lookup process:
The compiler looks up the member in the static type of the receiver object, using the lookup process we discussed before (starting in the class itself, then looking in the base class if necessary). It is an error if no member of the given name is found in the static type or its base types.
If the member found is an overloaded function, then the arguments of the function call are used to determine which overload is called.
If the member is a variable or non-virtual function (including static member functions, which we will see later), the access is statically bound at compile time.
If the member is a virtual function, the access uses dynamic binding. At runtime, the program will look for a function of the same signature, starting at the dynamic type of the receiver, then proceeding to its base type if necessary.
As indicated above, dynamic binding requires two conditions to be met to use the derived-class version of a function:
The member function found at compile time using the static type must be virtual.
The derived-class function must have the same signature as the function found at compile time.
When these conditions are met, the derived-class function overrides the base-class one – it will be used instead of the base-class function when the dynamic type of the receiver is the derived class. If these conditions are not met, the derived-class function hides the base-class one – it will only be used if the static type of the receiver is the derived class.
As an example, consider the following class hierarchy:
class Top {
public:
int f1() const {
return 1;
}
virtual int f2() const {
return 2;
}
};
class Middle : public Top {
public:
int f1() const {
return 3;
}
virtual int f2() const {
return 4;
}
};
class Bottom : public Middle {
public:
int f1() const {
return 5;
}
virtual int f2() const {
return 6;
}
};
Each class has a non-virtual f1() member function; since the function is non-virtual, the derived-class versions hide the ones in the base classes. The f2() function is virtual, so the derived-class ones override the base-class versions.
The following are some examples of invoking these functions:
int main() {
Top top;
Middle mid;
Bottom bot;
Top *top_ptr = &bot;
Middle *mid_ptr = &mid;
cout << top.f2() << endl; // prints 2
cout << mid.f1() << endl; // prints 3
cout << top_ptr->f1() << endl; // prints 1
cout << top_ptr->f2() << endl; // prints 6
cout << mid_ptr->f2() << endl; // prints 4
mid_ptr = &bot;
cout << mid_ptr->f1() << endl; // prints 3
cout << mid_ptr->f2() << endl; // prints 6
}
We discuss each call in turn:
There is no indirection in the calls top.f2() and mid.f1(), so there is no difference between the static and dynamic types of the receivers. The former calls the Top version of f2(), resulting in 2, while the latter calls the Middle version of f1(), producing 3.
The static type of the receiver in top_ptr->f1() and top_ptr->f2() is Top, while the dynamic type is Bottom. Since f1() is non-virtual, static binding is used, resulting in 1. On the other hand, f2() is virtual, so dynamic binding uses the Bottom version, producing 6.
In the first call to mid_ptr->f2(), both the static and dynamic types of the receiver are Middle, so Middle's version is used regardless of whether f2() is virtual. The result is 4.
The assignment mid_ptr = &bot changes the dynamic type of the receiver to Bottom in calls on mid_ptr. The static type remains Middle, so the call mid_ptr->f1() results in 3. The second call to mid_ptr->f2(), however, uses dynamic binding, so the Bottom version of f2() is called, resulting in 6.
The override Keyword
A common mistake when attempting to override a function is to inadvertently change the signature, so that the derived-class version hides rather than overrides the base-class one. The following is an example:
class Chicken : public Bird {
...
virtual void talk() {
cout << "bawwk" << endl;
}
};
int main() {
Chicken chicken("Myrtle");
Bird *b_ptr = &chicken;
b_ptr->talk();
}
This code compiles, but it prints tweet when run. Under the lookup process above, the program looks for an override of Bird::talk() at runtime. However, no such override exists: Chicken::talk() has a different signature, since it is not const. Thus, the dynamic lookup finds Bird::talk() and calls it instead.
Rather than having the code compile and then behave incorrectly, we can ask the compiler to detect bugs like this with the override keyword. Specifically, we can place the override keyword after the signature of a member function to let the compiler know we intended to override a base-class member function. If the derived-class function doesn’t actually do so, the compiler will report this:
class Chicken : public Bird {
...
void talk() override {
cout << "bawwk" << endl;
}
};
Here, we have removed the virtual keyword, since it is already implied by override: only a virtual function can be overridden, and the “virtualness” is inherited from the base class. Since we are missing the const, the compiler reports the following:
main.cpp:39:15: error: non-virtual member function marked 'override' hides
virtual member function
void talk() override {
^
main.cpp:20:16: note: hidden overloaded virtual function 'Bird::talk' declared
here: different qualifiers (const vs none)
virtual void talk() const {
^
1 error generated.
Adding in the const fixes the issue:
class Chicken : public Bird {
...
void talk() const override {
cout << "bawwk" << endl;
}
};
int main() {
Chicken chicken("Myrtle");
Bird *b_ptr = &chicken;
b_ptr->talk();
}
The code now prints bawwk.
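A complete version that compiles on its own might look like the following sketch. It is a simplified stand-in for the classes above: talk() returns a string instead of printing, and talk_via_base() is a hypothetical helper that makes the call through a base-class reference.

```cpp
#include <string>

class Bird {
public:
  virtual std::string talk() const { return "tweet"; }
};

class Chicken : public Bird {
public:
  // Matches Bird::talk() exactly, including const, so override is accepted.
  std::string talk() const override { return "bawwk"; }

  // Dropping const would not match any virtual function in Bird, and the
  // override keyword would turn that mistake into a compile error:
  //   std::string talk() override { return "bawwk"; }
};

// Calls talk() through a base-class reference.
std::string talk_via_base(const Bird &bird) {
  return bird.talk();
}
```

Through the base-class reference, a Chicken produces "bawwk" and a plain Bird produces "tweet".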
Abstract Classes and Interfaces
In some cases, there isn’t enough information in a base class to define a particular member function, but we still want that function to be part of the interface provided by all its derived classes. In the case of Bird, for example, we may want a get_wingspan() function that returns the average wingspan for a particular kind of bird. There isn’t a default value that makes sense to put in the Bird class. Instead, we declare get_wingspan() as a pure virtual function, without any implementation in the base class:
class Bird {
...
virtual int get_wingspan() const = 0;
};
The syntax for declaring a function as pure virtual is to put = 0; after its signature. This is just syntax; we aren’t actually setting its value to 0.
Since Bird is now missing part of its implementation, we can no longer create objects of Bird type. The Bird class is said to be abstract. We can still declare Bird references and pointers, however, since that doesn’t create a Bird object. We can then have such references and pointers refer to derived-class objects:
Bird bird("Big Bird"); // ERROR: Bird is abstract
Chicken chicken("Myrtle"); // OK, as long as Chicken is not abstract
Bird &bird_ref = chicken; // OK
Bird *bird_ptr = &chicken; // OK
In order for a derived class to not be abstract itself, it must provide implementations of the pure virtual functions in its base classes:
class Chicken : public Bird {
...
int get_wingspan() const override {
return 20; // inches
}
};
With a virtual function, a base class provides its derived classes with the option of overriding the function’s behavior. With a pure virtual function, the base class requires its derived classes to override the function, since the base class does not provide an implementation itself. If a derived class fails to override the function, the derived class is itself abstract, and objects of that class cannot be created.
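Whether a class is abstract can be checked at compile time with the standard std::is_abstract type trait. In the sketch below, the classes are reduced to the one function under discussion, and MysteryBird and wingspan_of() are hypothetical names introduced for illustration.

```cpp
#include <type_traits>

class Bird {
public:
  virtual int get_wingspan() const = 0;  // pure virtual: no implementation
};

class Chicken : public Bird {
public:
  int get_wingspan() const override {
    return 20;  // inches
  }
};

// Derived class that does not override get_wingspan(), so it stays abstract.
class MysteryBird : public Bird {
};

// Compile-time checks using the standard std::is_abstract trait.
static_assert(std::is_abstract<Bird>::value, "Bird is abstract");
static_assert(std::is_abstract<MysteryBird>::value, "still abstract");
static_assert(!std::is_abstract<Chicken>::value, "Chicken is concrete");

// Works with any concrete Bird through a base-class reference.
int wingspan_of(const Bird &bird) {
  return bird.get_wingspan();
}
```

Attempting to declare a MysteryBird object would be rejected by the compiler for the same reason that declaring a Bird object is.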
We can also define an interface, which is a class that consists only of pure virtual functions. Such a class provides no implementation; rather, it merely defines the interface that must be overridden by its derived classes. The following is an example:
class Shape {
public:
virtual double area() const = 0;
virtual double perimeter() const = 0;
virtual void scale(double s) = 0;
};
With subtype polymorphism, we end up with two use cases for inheritance:
implementation inheritance, where a derived class inherits functionality from a base class
interface inheritance, where a derived class inherits the interface of its base class, but not necessarily any implementation
Deriving from a base class that isn’t an interface results in both implementation and interface inheritance. Deriving from an interface results in just interface inheritance. The latter is useful to work with a hierarchy of types through a common interface, using a base-class reference or pointer, even if the derived types don’t share any implementation.
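As a sketch of interface inheritance in use, the following code has two classes that implement the Shape interface without sharing any implementation. Rectangle, Square, and total_area() are hypothetical names chosen for illustration.

```cpp
#include <vector>

class Shape {
public:
  virtual double area() const = 0;
  virtual double perimeter() const = 0;
  virtual void scale(double s) = 0;
};

class Rectangle : public Shape {
public:
  Rectangle(double w_in, double h_in) : w(w_in), h(h_in) {}
  double area() const override { return w * h; }
  double perimeter() const override { return 2 * (w + h); }
  void scale(double s) override { w *= s; h *= s; }
private:
  double w;
  double h;
};

class Square : public Shape {
public:
  explicit Square(double side_in) : side(side_in) {}
  double area() const override { return side * side; }
  double perimeter() const override { return 4 * side; }
  void scale(double s) override { side *= s; }
private:
  double side;
};

// Uses only the Shape interface; does not know the concrete types.
double total_area(const std::vector<Shape *> &shapes) {
  double total = 0;
  for (const Shape *shape : shapes) {
    total += shape->area();
  }
  return total;
}
```

The function total_area() depends only on Shape, so it continues to work unchanged if further classes implementing the interface are added later.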