Machine Model II
Pointer Errors
A pointer is an atomic type, since it cannot be subdivided into smaller objects. As with other atomic types, a variable of pointer type that isn’t explicitly initialized is default initialized to an undefined value:
int x = 3;
int *ptr; // undefined value
ptr = &x; // well-defined value -- the address of x
Dereferencing a default-initialized pointer results in undefined behavior – the program may crash, or it may not; reading the dereferenced value can result in zero, or some other random value. Undefined behavior is notoriously difficult to debug, as the behavior can be different on different machines or in different runs of the program on the same machine. Tools like Valgrind or an address sanitizer can help detect undefined behavior.
The following is another example of default initializing pointers:
int main() {
int *x;
int *y;
int a = -1;
x = &a;
cout << *x << endl; // prints -1
*x = 42;
cout << a << endl; // prints 42
*y = 13; // UNDEFINED BEHAVIOR
}
Figure 13 illustrates the execution of this
code. Both x
and y
are default initialized to indeterminate
values. The code proceeds to assign the address of a
into x
,
so x
now has a well-defined value, and it is valid to dereference
x
. Assigning to the object it is pointing at changes the value of
that object, the one named by a
. Dereferencing y
, on the other
hand, produces undefined behavior, since its value is a junk address.
The program may crash, or it may not. It may overwrite a memory
location that is in use by something else, or some location that is
not in use. With undefined behavior, anything is possible.
A null pointer is a pointer that holds an address value of 0x0
.
No object can be located at that address, making the null value a
useful value for a pointer that does not point to a valid object. In
C++, the nullptr
literal represents the null value, and it can be
used with any pointer type:
int *ptr1 = nullptr;
double *ptr2 = nullptr;
cout << (ptr1 == nullptr) << endl; // prints 1 (i.e. true)
Dereferencing a null pointer also results in undefined behavior. However, in most implementations, doing so will crash the program, which is generally easier to debug than other forms of undefined behavior.
A null pointer is sometimes used to indicate the lack of a value, as in the following:
// EFFECTS: Returns a pointer to the first string in vec of the given
// length, or a null pointer if no such string is in vec.
string * find_by_length(vector<string> &vec, int length) {
for (size_t i = 0; i < vec.size(); ++i) {
if (vec[i].size() == length) {
return &vec[i];
}
}
return nullptr; // no string of given length
}
int main() {
vector<string> v = { "hello", "world", "from",
"EECS", "280" };
string *found = find_by_length(v, 3);
if (found) { // null pointer has false value
cout << "found string: " << *found << endl;
} else {
cout << "no such string" << endl;
}
}
In this example, the find_by_length()
function returns a pointer
to a string of the given length. However, if no such string exists,
the function returns a null pointer to indicate this. The caller must
check whether the return value is null before attempting to
dereference it.
Since a pointer is an object in its own right, it can live past the lifetime of the object it is pointing at. The following is an example:
int * get_address(int x) {
return &x;
}
void print(int val) {
cout << val << endl;
}
int main() {
int a = 3;
int *ptr = get_address(a);
print(42);
cout << *ptr << endl; // UNDEFINED BEHAVIOR
}
In this code, the parameter of the get_address()
function is
passed by value. So the x
parameter is a new object in the
activation record for get_address()
, and it is initialized as a
copy of a
. The function returns the address of x
, and that
value is placed in the ptr
object in main()
. However, x
dies along with the activation record of get_address()
, so ptr
is now pointing at a dead object. The code then calls print()
, and
it so happens that its activation record is placed in the same
location previously used by get_address()
, as shown in
Figure 14. At this point, ptr
happens
to point at the val
object, whose value is 42. When the
print()
function returns, its activation record is also reclaimed.
Proceeding to dereference ptr
produces undefined behavior. It so
happens in this implementation that 42 is printed, but other
implementations may have different behavior.
We can fix the code above by passing the parameter to
get_address()
by reference:
int * get_address(int &x) {
return &x;
}
void print(int val) {
cout << val << endl;
}
int main() {
int a = 3;
int *ptr = get_address(a);
print(42);
cout << *ptr << endl; // prints 3
}
Now x
aliases the object a
in main()
, as shown in
Figure 15. Thus, get_address()
returns the address of a
, which is still alive when *ptr
is
printed.
Function Calls and the Call Stack
Previously, we saw a basic machine model in which the program places each object at a different location in memory. We now examine a more structured model, stack-based memory management, that is used by many language implementations.
In most implementations, the data for a function call are collectively stored within an activation record, which contains space for each of the function’s parameters and local variables, temporary objects, the return address, and other items that are needed by the function. In this course, we will generally only consider the parameters and local variables in an activation record, ignoring the other pieces of data there.
In stack-based memory management, activation records are stored in a data structure called a stack. A stack works just like a stack of pancakes: when a new pancake is made, it is placed on top of the stack, and when a pancake is removed from the stack, it is the top pancake that is taken off. Thus, the last pancake to be made is the first to be eaten, resulting in last-in, first-out (LIFO) behavior. Activation records are similarly stored: when an activation record is created, it is placed on top of the stack, and the first activation record to be destroyed is the last one that was created. This gives rise to an equivalent term stack frame for an activation record.
As an example, consider the following program:
void bar() {
}
void foo() {
bar();
}
int main() {
foo();
}
When the program is run, the main()
function is called, so an
activation record is created and added to the top of the stack. Then
main()
calls foo()
, which places an activation record for
foo()
on the top of the stack. Then bar()
is called, so its
activation record is put on the stack. When bar()
returns, its
activation record is removed from the stack. Then foo()
completes,
removing its activation record. Finally, the activation record for
main()
is destroyed when the function returns. Figure 16
shows the state of the stack after each call and return.
In many implementations, the stack is actually stored upside down in memory, so that it grows downward rather than upward as shown in Figure 17. However, it still has the same LIFO behavior as a right-side-up stack.
As a more complex example, consider the following program:
int plus_one(int x) {
return x + 1;
}
int plus_two(int x) {
return plus_one(x + 1);
}
int main() {
int result = 0;
result = plus_one(0);
result = plus_two(result);
cout << result; // prints 3
}
At program startup, the main()
function is called, creating an
activation record that holds the single local variable result
. The
declaration of result
initializes its value to 0, and the program
proceeds to call plus_one(0)
. This creates an activation record
for plus_one()
that holds the parameter x
. The program
initializes the value of x
to the argument value 0 and runs the
body of plus_one()
. The body computes the value of x + 1
by
obtaining the value of x
and adding 1 to it, resulting in a value
of 1, which the function then returns. The return value replaces the
original call to plus_one(0)
, and the activation record for
plus_one
is discarded before main()
proceeds. The code then
assigns the return value of 1 to result
.
Figure 18 illustrates the activation records
up to this point.
The program then proceeds to call plus_two(result)
. First,
result
is evaluated to obtain the value 1. Then an activation
record is created for plus_two()
, with space for its parameter
x
. Observe that this activation record is located in memory where
the previous activation record for plus_one()
was – the latter is
no longer in use, so the memory can be reused. After the new
activation record is created, the parameter x
is initialized with
the argument value 1. Then the program runs the body of
plus_two()
.
The body of plus_two()
in turn calls plus_one(x + 1)
. This
evaluates x + 1
to obtain the value 2, creates an activation
record for plus_one()
, initializes the value of x
in the new
activation record to be 2, and then runs the body of plus_one()
.
The state of memory at this point is shown in
Figure 19.
Observe that the new activation record for plus_one()
is distinct
from the previous one – each invocation of a function gets its own
activation record. In addition, there are now two variables x
in
the program. Within the scope of plus_one()
, x
refers to the
object located in the activation record for plus_one()
, and its
value is 2. Within plus_two()
, x
refers to the object in the
activation record for plus_two()
, and its value is 1.
The invocation of plus_one()
computes 3 as its return value, so
that value replaces the call to plus_one()
, and the activation
record for plus_one()
is discarded. plus_two()
returns that
same value 3, so the value 3 replaces the call to plus_two()
in
main()
, and the activation record for plus_two()
is discarded.
Then main()
proceeds to assign the value 3 to result
and print
it out. Finally, when main()
returns, its activation record too is
discarded.
Function-Call Process
To summarize, the following steps occur in a function call:
For pass-by-value parameters, the argument expressions are evaluated to obtain their values.
For a pass-by-reference parameter, the corresponding argument expression is evaluated to obtain an object [1] rather than its value.
The order in which arguments are evaluated is unspecified in C++.
A new and unique activation record is created for the call, with space for the function’s parameters, local variables, and metadata. The activation record is pushed onto the stack.
The parameters are passed, using the corresponding arguments to initialize the parameters. For a pass-by-value parameter, the corresponding argument value is copied into the parameter. For a pass-by-reference parameter, the parameter is initialized as an alias of the argument object.
The body of the called function is run. This transfer of control is often called active flow, since the code actively tells the computer which function to run.
When the called function returns, if the function returns a value, that value replaces the function call in the caller.
The activation record for the called function is destroyed. In simple cases, implementations will generally just leave in memory what is already there, simply marking the memory as no longer in use.
Execution proceeds from the point of the function call in the caller. This transfer of control is often called passive flow, since the code does not explicitly tell the computer which function to run.
The following program is an example of pass by reference:
void swap(int &x, int &y) {
int tmp = x;
x = y;
y = tmp;
}
int main() {
int a = 3;
int b = 7;
cout << a << ", " << b << endl; // prints 3, 7
swap(a, b);
cout << a << ", " << b << endl; // prints 7, 3
}
The program starts by creating an activation record for main()
,
with space for the local variables a
and b
. It initializes
a
to 3 and b
to 7 and then prints out their values. The
program then calls swap(a, b)
, which evaluates the expressions
a
and b
to obtain their objects, creates an activation record
for swap()
, and initializes the parameters from the argument
objects. Since the two parameters are references, the activation
record does not contain user-accessible memory for the parameters.
(The metadata for the function, however, may include information about
the parameters.) The activation record does contain explicit space for
the local variable tmp
, since it is not a reference.
Figure 20 illustrates the activation record.
Here, the reference parameters are depicted with a dotted line between the names and the objects they reference.
The program proceeds to run the body of swap()
. First, tmp
is
initialized to the value of x
– since x
refers to the same
object as a
in main()
, the value 3 of that object is copied
into the memory for tmp
. Then the assignment x = y
copies the
value 7 from the object associated with y
(b
in main()
) to
the object associated with x
(a
in main()
). Finally, the
assignment y = tmp
copies the value 3 from tmp
into the object
associated with y
(b
in main()
). When swap()
returns,
its activation record is destroyed, and execution continues in
main()
. The latter prints out the new values of 7 and 3 for a
and b
, respectively.
A function can also return an object by reference, though we need to ensure that the lifetime of the object extends beyond the function invocation. The following is an example:
int & larger(int &x, int &y) {
if (y > x) {
return y;
}
return x;
}
int main() {
int a = 3;
int b = 7;
cout << a << ", " << b << endl; // prints 3, 7
larger(a, b) = -1;
cout << a << ", " << b << endl; // prints 3, -1
}
The call larger(a, b)
returns the object a
or b
whose
value is larger, and we then assign the new value -1 to this object.
If larger()
were to return by value instead, attempting to assign
to the resulting value would result in a compiler error.
Returning by reference is most common in data structures, where we wish to access an element by reference. For instance, this is how the indexing (or subscript) operator works on a vector:
std::vector<int> data = /* ... */;
data[0] = -1;
This operator is implemented using operator overloading, which we will discuss later.