We continue our discussion of container ADTs by examining several
forms of polymorphism in the context of containers. We will start
by looking at operator overloading, a form of ad hoc polymorphism,
and then proceed to discuss parametric polymorphism.
C++ follows the philosophy that user-defined types should have the
same access to language facilities as built-in types. Since operators
can be used with built-in atomic types, C++ allows operators to be
applied to class types through operator overloading.
Most operators in C++ can be overloaded. An operator overload
requires at least one of the operands to be of class type [1] –
the behavior of operators on atomic types cannot be changed.
When the compiler encounters an operator where at least one operand is
of class type, it looks for a function whose name is operator
followed by the symbol for the actual operator. For example, if +
is applied to two IntSets, the compiler looks for a function
with name operator+ that can be applied to two IntSet objects.
For most operators, the function can either be a top-level function or
a member of the type of the left-most operand, if it is of class type.
The following is a member function that defines the + operation
to compute the union of two IntSets:
The function first copies over the receiver object into a new local
IntSet. It then iterates over the elements in the other
IntSet, inserting them into the result. As we saw last time, the
insert() function only inserts an element if the set does not
already contain the item.
We can now call the member function directly, or we can apply the
+ operator instead. When we do so, the compiler finds the
overloaded operator+() and calls it for us.
intmain(){IntSetset1;set1.insert(32);set1.insert(42);set1.insert(7);IntSetset2;set2.insert(12);set2.insert(-3);set2.insert(42);IntSetset3=set1.operator+(set2);set3.print(cout);// prints { 32, 42, 7, 12, -3 }IntSetset4=set1+set2;set4.print(cout);// prints { 32, 42, 7, 12, -3 }}
In theory, we could instead define the overloaded operator as a
top-level function, declared as follows:
IntSetoperator+(constIntSet&lhs,constIntSet&rhs);
However, we did not define an interface for iterating through the
elements of an IntSet from outside the class, so we cannot
actually implement this function.
Overloaded operators can take arguments by value or by reference, like
any other function. In most cases, we pass the arguments by reference
to avoid making a copy.
Though most operators can be overloaded either as top-level or member
functions, there are some cases where we must use a top-level
function:
The first operand is of atomic type. Atomic types are not classes,
so they do not have member functions.
The first operand is of class type, but we do not have access to the
class definition, so we cannot define a new member function.
An example of the latter is overloading the stream-insertion operator,
where we do not have access to the definition of ostream. The
following overloads insertion of an IntSet:
We saw previously that inserting to a stream evaluates back to the
stream object. To support this properly, our overload returns the
given stream object. It must return the object by reference:
Streams cannot be copied, so the code would not compile if it
returned a stream by value.
Even if streams could be copied, we want to return the original
stream object itself, not a copy.
Even if a copy would work, we would end up with object slicing,
since os actually will refer to an object of a class that
derives from ostream.
The parameters are in the same order as the operands, from left to
right. The function need only call the print() member function on
the IntSet and then return the given ostream object. Then
we can insert an IntSet directly into a stream:
intmain(){IntSetset;set.insert(32);set.insert(42);cout<<set<<endl;// prints { 32, 42 }}
In other cases, we need to define an operator as a member function:
If the overload needs access to private members, a member function
would have access because it is part of the class. [2]
Some operators can only be overloaded as member functions: the
assignment operator (=), the function-call operator (()),
the subscript operator ([]), and the arrow operator (->).
(We will see examples of overloading the first two operators later.)
As an example, let’s overload the subscript operator to check whether
the given value is in the set. The following does so:
The IntSet container only holds elements that are of type int.
Suppose we wanted another container that holds char values. One
solution is to copy and paste the IntSet code, then change int
to char everywhere it refers to the element type:
classCharSet{public:// Maximum size of a set.staticconstintMAX_SIZE=10;// EFFECTS: Initializes this set to be empty.CharSet();// REQUIRES: size() < MAX_SIZE// MODIFIES: *this// EFFECTS: Adds value to the set, if it isn't already in the set.voidinsert(charvalue);// MODIFIES: *this// EFFECTS: Removes value from the set, if it is in the set.voidremove(charvalue);// EFFECTS: Returns whether value is in the set.boolcontains(charvalue)const;// EFFECTS: Returns the number of elements.intsize()const;// EFFECTS: Prints out the set in an arbitrary order.voidprint(std::ostream&os)const;private:charelements[MAX_SIZE];intnum_elements;// INVARIANTS:// 0 <= num_elements && num_elements <= MAX_SIZE// the first num_elements items in elements are the items in the set// the first num_elements items in elements contain no duplicates};
This is not a very satisfying solution. It leads to duplication of
nearly identical code. Furthermore, if we then wanted a set of
doubles, we would have to define an almost identical
DoubleSet, and so on.
We already know how to avoid code duplication when we have a value
that can be different: add a function parameter that allows the user
to specify the value they care about. The problem here is that our
entity that differs is not a value, but a type. While function
parameters can represent different argument values, we need another
mechanism for specifying type arguments. The mechanism that C++
provides is a template.
A template is a model for producing code. We write a generic version,
parameterized by one or more template parameters. The compiler then
instantiates a specific version of the code by substituting
arguments for the parameters and compiling the resulting code. We
specify a template and its parameters by placing a template header
before the entity that we are defining.
template<typenameT>classUnsortedSet{...};
The template header can go on the same line or the previous one:
whitespace generally does not matter in C++. The header begins with
the template keyword, followed by a parameter list surrounded by
angle brackets. Within the parameter list, we introduce a template
parameter by specifying the kind of entity the parameter can
represent. The typename keyword indicates that the parameter is a
type parameter. [3] The parameter name then follows. Here, we have
chosen the name T, since we don’t have further information about
the type. (Value_type or Element_type are other common names
to use with a container.)
A template may have more than one parameter, and it can also have a
parameter that is of integral type. The following is how the
std::array template is declared:
template<typenameT,std::size_tN>classarray;
Since the entity that follows the template header is a class, it is a
class template. It takes two arguments, one for the element type and
the other for the size of the container, which must be a compile-time
constant. We can then create a std::array of 10 ints as
follows:
std::array<int,10>items;
The syntax for using a class template is to follow the name of the
template with an argument list enclosed by angle brackets. We can
similarly create and use an unsorted set of chars:
UnsortedSet<char>char_set;char_set.insert('e');char_set.insert('a');char_set.insert('e');cout<<char_set<<endl;// prints { e, a }
In order for this to work, we write the definition of UnsortedSet
to use the template parameter T for the element type. The scope
of a template parameter is the entire entity that follows; if it is
a class, then the scope is the entire class definition.
template<typenameT>classUnsortedSet{public:// Maximum size of a set.staticconstintMAX_SIZE=10;// EFFECTS: Initializes this set to be empty.UnsortedSet();// REQUIRES: size() < MAX_SIZE// MODIFIES: *this// EFFECTS: Adds value to the set, if it isn't already in the set.voidinsert(constT&value);// MODIFIES: *this// EFFECTS: Removes value from the set, if it is in the set.voidremove(constT&value);// EFFECTS: Returns whether value is in the set.boolcontains(constT&value)const;// EFFECTS: Returns the number of elements.intsize()const;// EFFECTS: Prints out the set in an arbitrary order.voidprint(std::ostream&os)const;private:Telements[MAX_SIZE];intnum_elements;// INVARIANTS:// 0 <= num_elements && num_elements <= MAX_SIZE// the first num_elements items in elements are the items in the set// the first num_elements items in elements contain no duplicates};
The actual type argument used to instantiate UnsortedSet may be
something small like int, or it may be a large class type such as
string. Thus, it is good practice to pass objects of a template
parameter by reference rather than by value, avoiding copying
potentially large objects.
When we use UnsortedSet with a particular type argument, the
compiler actually plugs the argument in for T, generating code for
that instantiation and compiling it with the rest of the program. For
example, UnsortedSet<string> is instantiated as follows:
classUnsortedSet<string>{public:// Maximum size of a set.staticconstintMAX_SIZE=10;// EFFECTS: Initializes this set to be empty.UnsortedSet();// REQUIRES: size() < MAX_SIZE// MODIFIES: *this// EFFECTS: Adds value to the set, if it isn't already in the set.voidinsert(conststring&value);// MODIFIES: *this// EFFECTS: Removes value from the set, if it is in the set.voidremove(conststring&value);// EFFECTS: Returns whether value is in the set.boolcontains(conststring&value)const;// EFFECTS: Returns the number of elements.intsize()const;// EFFECTS: Prints out the set in an arbitrary order.voidprint(std::ostream&os)const;private:stringelements[MAX_SIZE];intnum_elements;// INVARIANTS:// 0 <= num_elements && num_elements <= MAX_SIZE// the first num_elements items in elements are the items in the set// the first num_elements items in elements contain no duplicates};
We can also define a function as a template, resulting in a function
template. For example, the following computes the maximum of two
items of the same type:
As with a class template, we define a function template by preceding
it with a template header, which introduces one or more template
parameters. We can then use the parameter anywhere in the function
definition. Here, our function template takes two arguments of the
same type and compares them. The ?: operator is a conditional. It
evaluates its first operand, and if the result is true, it produces
the second operand. Otherwise, it produces the third operand.
Similar to a class template, we can use a function template by
following its name with an argument list enclosed by angle brackets:
[4]
Unlike a class template, however, the compiler is able in most cases
to deduce the template argument from the arguments to the function
call. In the first call above, the arguments are both of type int,
so the compiler can deduce that we want int as the template
argument. Similarly, the arguments are both of type double in the
second call, so the compiler can deduce we want double. Thus, we
can leave off the explicit template arguments:
intmain(){inti1=3;inti2=-3;cout<<max(i1,i2)<<endl;// deduced as max<int>(i1, i2)cout<<max(3.1,7.5)<<endl;// deduced as max<double>(3.1, 7.5)}
The max() function template can only be applied to a type that
supports the > operator. If we try to call it on objects that
don’t support the operator, the result is a compile error:
main.cpp: In instantiation of 'T max(const T&, const T&) [with T = Duck]':
main.cpp:20:30: required from here
main.cpp:14:17: error: no match for 'operator>' (operand types are 'const Duck' and 'const Duck')
return value2 > value1 ? value2 : value1;
~~~~~~~^~~~~~~~
The offending line of code is the one marked as “required from here”
– it is the line that attempts to call max() on two Duck
objects. The compiler actually instantiates max<Duck>(), but the
resulting instantiation produces errors when the compiler tries to
compile it.
We saw previously that the C++ compiler only needs access to
declarations when compiling code that uses a
class or a function defined in some other source file. However, this
is not the case for class and function templates. The compiler must
actually instantiate the definitions for each set of template
arguments, so it must have access to the full definition of a
template.
To provide the compiler with access to the full definition of a
template wherever it is used, we must arrange for the header file to
contain the full definition. We can just define the template directly
in the header file itself; it is still good practice to separate the
declarations from the definitions for the benefit of anyone using our
code.
// max.hpp// EFFECTS: Returns the maximum of value1 and value2.template<typenameT>Tmax(constT&value1,constT&value2);...template<typenameT>Tmax(constT&value1,constT&value2){returnvalue2>value1?value2:value1;}
A better organization is to separate the definitions into their own
file; common convention is to use a suffix like .tpp for this
file. We can then use a #include directive to pull the code into
the header file:
// max.hpp// EFFECTS: Returns the maximum of value1 and value2.template<typenameT>Tmax(constT&value1,constT&value2);#include"max.tpp"
In complex programs, it is often the case that a single file
inadvertently #includes the same header file more than once. For
instance, it may be that module A depends on module B and thus
includes its header. Then module C depends on both A and B, so it
includes both their headers. This results in module C including B’s
header twice, which can cause compiler errors that a function or a
class is defined more than once.
To avoid problems with a header being included more than once, headers
generally have include guards that arrange for the compiler to
ignore the code if the header is included a second time. The following
is an example of include guards:
#ifndef MAX_HPP#define MAX_HPP// max.hpp// EFFECTS: Returns the maximum of value1 and value2.template<typenameT>Tmax(constT&value1,constT&value2);#include"max.tpp"#endif /* MAX_HPP */
The #ifndef and #define directives are the opening of the
include guard, and the #endif closes it. The code within is
ignored if the header is included again. [5]
Defining a member function within the definition for a class template
is no different than a member function within a class. Defining a
member function outside the definition of a class template differs
from that of a regular class; we must inform the compiler that the
function is a member of a class template.
Here, we must tell the compiler that contains() is a member of
UnsortedSet<T>. However, before we can use the name T, we need
to introduce it with a template header. The scope of a template header
is the entity that immediately follows; thus, each member-function
definition needs its own template header.
Now that we have an UnorderedSet class template, we can write an
insertion operator that works on any instantiation by defining it as
a function template:
The function template is parameterized by the element type of an
UnorderedSet, and we use the template parameter as part of the
type for the second function parameter. The compiler is able to deduce
the template argument when we insert a set into a stream:
UnsortedSet<char>char_set;char_set.insert('e');char_set.insert('a');char_set.insert('e');cout<<char_set<<endl;// prints { e, a }
Let us write a function template that copies elements from an array
into a set. We would like to use it as follows:
intmain(){UnsortedSet<int>set1;intarr1[4]={1,2,3,2};fill_from_array(set1,arr1,4);cout<<set1<<endl;// prints { 1, 2, 3 }UnsortedSet<char>set2;chararr2[3]={'a','b','a'};// prints { a, b }fill_from_array(set2,arr2,3);}
The fill_from_array() template must be parameterized by the
element type, and we use that type for both the set and array function
parameters:
The set is passed by reference, to avoid making a copy and to allow
the function to modify the original set. The array decays into a
pointer when it is passed, and we declare the pointer as a pointer to
const, since we do not need to modify the array elements. The body of
the function template just iterates over the array, inserting each
element into the set; the set ignores insertion of duplicate values,
so there is no need to check whether the set already contains a value
before inserting it.
The template mechanism gives us parametric polymorphism; a template
type parameter can take on the form of any type. The compiler
instantiates the code at compile time, so we also refer to this as
static polymorphism. Compare this with subtype polymorphism, where
the program resolves virtual function calls at runtime; we therefore
use dynamic polymorphism to also refer to it.
In designing an ADT, we have the choice of static polymorphism,
dynamic polymorphism, or both. For instance, SortedSet and
UnsortedSet share the same interface, so we could define a common
base interface and make use of dynamic binding. However, dynamic
polymorphism comes with a runtime performance cost; a virtual function
call generally requires two extra pointer dereferences compared to a
non-virtual call. For a container, a program may make many calls to
insert items, remove items, check whether an item is in the container,
and so on. Thus, the general convention in C++ is to avoid dynamic
polymorphism with containers.
Another reason why dynamic polymorphism is not a good fit for a
container is that we usually make the decision of which container to
use when we write our code, not when we run it. However, it is still
good practice to write our code in such a way that we can easily
change the container type we are using. Ideally, we would have a
single location in our code that needs to change if we decide to use
something else. We’ve seen this pattern already when it comes to a
value that is used in many places: define a constant or variable, and
use the associated name rather than the value directly. Thus,
introducing a new name for an entity is a powerful form of
abstraction, allowing us to avoid hardcoding the entity everywhere in
our code. C++ gives us this capability for types with type aliases.
A type alias is a new name for an existing type. We can introduce
a type alias with a using declaration:
usingUnsortedIntSet=UnsortedSet<int>;
The syntax for introducing a type alias with a using declaration is
the using keyword, followed by the name we want to introduce, the
= symbol, and the existing type that we want the name to alias.
The example above introduces the name UnsortedIntSet as an alias
for the type UnsortedSet<int>.
A simple alias can also be defined with the typedef keyword:
typedefUnsortedSet<int>UnsortedIntSet;
The syntax is in the reverse order of a using declaration: the
existing type first, then the new alias.
A using declaration (but not a typedef) can also define an alias
template, which introduces a new name that is a template for a
set of types. The following is an example:
template<typenameT>usingSet=UnsortedSet<T>;
Similar to any template, we introduce an alias template by placing a
template header with the requisite parameters before the using
declaration. Then we can use Set in the same way we would a class
template:
The compiler treats Set<int> and Set<char> as
UnsortedSet<int> and UnsortedSet<char>, since Set is an
alias for UnsortedSet. If we decide to use sorted sets instead,
there is just one place we need to change:
template<typenameT>usingSet=SortedSet<T>;
Now the compiler will treat Set<int> and Set<char> as
SortedSet<int> and SortedSet<char>.
With dynamic polymorphism, we can defer the decision of which derived
class to use until runtime. For example, the following code asks the
user for what color bird they want, then calls a factory function to
create the actual object:
intmain(){stringcolor;cin>>color;Bird*bird=Bird_factory(color,"Myrtle");cout<<"Bird "<<bird->get_name()<<" is "<<bird->get_age()<<" and says: "<<bird->talk()<<endl;deletebird;cin>>color;bird=Bird_factory(color,"Heihei");cout<<"Bird "<<bird->get_name()<<" is "<<bird->get_age()<<" and says: "<<bird->talk()<<endl;deletebird;}
The factory function creates an object of the appropriate derived type
and returns a pointer to it. It cannot create the object in local
memory – a local object dies when the function that created it
returns, so it would not live past the call to Bird_factory().
Instead, we create it in dynamic memory with the new operator:
When designing an inheritance hierarchy, an important property is
substitutability of a derived-class object for a base-class one.
This notion is formalized by the Liskov substitution principle,
which states that in order for type \(S\) to be a subtype of
type \(T\), the following requirement must be met:
Any property of objects of type \(T\) should also be a property
of objects of type \(S\).
—Barbara Liskov, MIT
This implies that in any code that depends on \(T\)’s interface,
an object of type \(S\) can be substituted without any undesirable
effects. If the requirement above is not satisfied, \(S\) would be
a derived type of \(T\) but not a subtype.
A classic example is a ToyDuck class that derives from Duck:
classToyDuck:publicDuck{...public:// EFFECTS: Prints out quack if battery level >= 10.voidtalk()constoverride{if(battery_level>=10){cout<<"quack"<<endl;--battery_level;}}};
Code that uses a Duck may assume that a quacking noise gets
printed out each time talk() is called. A ToyDuck would
violate this expectation, since it does not print anything out if the
ToyDuck has insufficient battery. Thus, ToyDuck is a derived
type of Duck, but not a subtype.