Managing Dynamic Memory

We saw previously that the compiler and runtime automatically manage the lifetime of objects associated with static, local, and member variables. For instance, if we declare an array as a local variable, it will die when the variable goes out of scope:

int main() {
  int arr[10];

  for (int i = 0; i < 10; ++i) {
    arr[i] = i;
  }

  ...
} // arr dies here

An automatically managed array must have a size that is known at compile time, so that the compiler can properly manage it. The compiler does not automatically manage dynamic arrays:

int main() {
  int size;
  cout << "Enter a size:" << endl;
  cin >> size;

  int *darr = new int[size];

  for (int i = 0; i < size; ++i) {
    darr[i] = i;
  }

  ...
} // the pointer darr dies here, but not the memory it is pointing to

The code above contains a memory leak, since we did not delete the memory we dynamically allocated.

We also have seen that when a class-type object dies, its destructor is run:

class C {
public:
  ~C() {
    cout << "C dtor" << endl;
  }
};

int main() {
  C c;

  ...

  cout << "end of c's scope" << endl;
} // c dies here

This prints out the following when run:

end of c's scope
C dtor

Here, the compiler automatically manages the lifetime of the object associated with the local variable c, and since it is of class type, its destructor is run when the object dies.

RAII

We can leverage automatic memory management and destructors by wrapping up the management of a dynamic resource in a class. In particular, we will arrange for the constructor of a class-type object to allocate a resource and for the destructor to release that resource. Doing so ties the management of the resource to the lifetime of the class-type object. This strategy is called resource acquisition is initialization (RAII), and it is also known as scope-based resource management.

The following is a class that manages a dynamic array:

class DynamicIntArray {
public:
  DynamicIntArray(int size_in)
    : elements(new int[size_in]), num_elements(size_in) {}

  ~DynamicIntArray() {
    delete[] elements;
  }

  int size() const {
    return num_elements;
  }

  int & operator[](int i) {
    return elements[i];
  }

  const int & operator[](int i) const {
    return elements[i];
  }

private:
  int *elements;
  int num_elements;
};

When a DynamicIntArray object is created, it allocates a dynamic array of the specified size. The subscript operator is overloaded to index into the array. There are two overloads, one that returns a modifiable element by reference if applied to a non-const DynamicIntArray, and another that returns a non-modifiable element by reference to const when applied to a const DynamicIntArray. The class also provides a size() member function to query the size of the array. When the DynamicIntArray object dies, it deletes the dynamic array that it had allocated, so that memory does not leak.

The following is an example that uses DynamicIntArray:

int main() {
  int size;
  cout << "Enter a size:" << endl;
  cin >> size;

  DynamicIntArray darr(size);      // internally allocates an array

  for (int i = 0; i < darr.size(); ++i) { // size() member function
    darr[i] = i;                            // overloaded subscript
  }

  ...
}     // darr dies here, causing its destructor to run and free the
      // array it allocated

When the object associated with darr is created, it allocates a dynamic array, storing the address of the first element in its elements member.

_images/14_dynamicintarray.svg — Figure 56 A `DynamicIntArray` object manages an array in dynamic memory.

When darr goes out of scope, its associated object dies, and the DynamicIntArray destructor is run. The destructor deletes the array, cleaning up the resource that it was using.

_images/14_dynamicintarray_destructor.svg — Figure 57 The destructor for `DynamicIntArray` deletes the array that it is managing.

Thus, with the RAII pattern, we have leveraged automatic memory management and a class-type object to successfully manage a dynamic array.

Growable Set

Let’s use a similar strategy to write a new version of UnsortedSet without a fixed-capacity limitation. We allow the set to have an arbitrary number of elements by decoupling the storage of the elements from that of the set object, using dynamic memory for the former.

We modify the data representation so that the member variable elements is now a pointer to the first element of a dynamic array. We also add a capacity member variable to keep track of the size of that dynamic array. The resulting class definition for UnsortedSet is as follows:

template <typename T>
class UnsortedSet {
public:
  // EFFECTS:  Initializes this set to be empty.
  UnsortedSet();

  // MODIFIES: *this
  // EFFECTS:  Adds value to the set, if it isn't already in the set.
  void insert(const T &value);

  // MODIFIES: *this
  // EFFECTS:  Removes value from the set, if it is in the set.
  void remove(const T &value);

  // EFFECTS:  Returns whether value is in the set.
  bool contains(const T &value) const;

  // EFFECTS:  Returns the number of elements.
  int size() const;

  // EFFECTS:  Prints out the set in an arbitrary order.
  void print(std::ostream &os) const;

private:
  // MODIFIES: *this
  // EFFECTS:  Doubles the capacity of this set.
  void grow();

  // Initial size of a set.
  static const int INITIAL_SIZE = 5;

  T *elements;
  int capacity;
  int num_elements;
  // INVARIANTS:
  // elements points to the sole dynamic array associated with this set
  // capacity is the size of the array that elements points to
  // 0 <= num_elements && num_elements <= capacity
  // the first num_elements items in elements are the items in the set
  // the first num_elements items in elements contain no duplicates
};

We have added two representation invariants:

elements points to the start of the sole dynamic array associated with the set, and each set has its own dynamic array
capacity is the size of dynamic array that elements points to

We have also added two private members:

grow() doubles the capacity of the set
INITIAL_SIZE is a constant whose value is the initial size of a set

The following is the new constructor for UnsortedSet:

template <typename T>
UnsortedSet<T>::UnsortedSet()
  : elements(new T[INITIAL_SIZE]),
    capacity(INITIAL_SIZE),
    num_elements(0) {}

The constructor allocates a dynamic array of size INITIAL_SIZE and stores the address of its first element in elements. We satisfy the remaining representation invariants by setting capacity to INITIAL_SIZE and num_elements to 0.

We modify insert() so that if the set is out of space, it calls the grow() member function to increase the available capacity:

template <typename T>
void UnsortedSet<T>::insert(const T &value) {
  if (contains(value)) {
    return;
  }
  if (num_elements == capacity) {
    grow(); // double the capacity;
  }
  elements[num_elements] = value;
  ++num_elements;
}

The grow() member function doubles the capacity of the set. In C++, an array has a fixed size, even if it is dynamic, so we cannot change the size of the existing elements array. Instead, we allocate a new dynamic array, copy over the elements, and discard the old array, as shown in Figure 58.

_images/14_grow.svg — Figure 58 The `grow()` operation creates a new, larger array, copies the elements to the new array, and deletes the old array.

The following implements this:

template <typename T>
void UnsortedSet<T>::grow() {
  T *new_arr = new T[2 * capacity]; // allocate a new array that is twice as big
  for (int i = 0; i < num_elements; ++i) { // copy the elements to the new array
    new_arr[i] = elements[i];
  }
  capacity *= 2;                                              // update capacity
  delete[] elements;                                       // free the old array
  elements = new_arr; // set elements to point to first element of the new array
}

The function allocates a new array and deletes the old one, satisfying the invariant that there is exactly one dynamic array associated with the set. It sets elements to point to that array and updates capacity to be the array’s size. Copying the elements to the new array satisfies the representation invariant that the first num_elements items in elements are the ones in the set.

The grow() operation demonstrates the power of indirection. By decoupling the storage for the elements from that of the UnsortedSet object, we have also decoupled their lifetimes. Thus, if the old storage no longer meets our needs, we can replace it with different storage that does, killing the old array and creating a new one. Compare this to storing the array directly within the memory for the set, which both limits the array to be of a fixed size and ties its lifetime to that of the set.

Since UnsortedSet now manages the resource of dynamic memory, we need to write a destructor that frees the resource when the UnsortedSet object dies.

Before we proceed to write the destructor, a clarification is in order. The destructor does not cause the object to die. The object dies when its lifetime ends:

if it is associated with a local variable, when the variable goes out of scope
if it is associated with a static variable, when the program ends
if it is a member of a class-type object, when the latter object dies
if it is an element of an array, when the array dies
if it is a dynamic object, when delete is applied to its address

The destructor merely runs when the object dies – it gives the object a chance to put its affairs in order while it is on its deathbed. If the object is managing dynamic memory, it needs to release that memory.

template <typename T>
UnsortedSet<T>::~UnsortedSet() {
  delete[] elements;
}

The representation invariant that there is exactly one dynamic array associated with each UnsortedSet ensures that after the destructor runs, there is no longer a live array associated with the dying set.

With the destructor, we have ensured that in simple cases, UnsortedSet properly manages memory. However, there is one case that we missed: copying a set. Consider the following code:

int main() {
  UnsortedSet<int> s1;
  for (int i = 0; i < 5; ++i) {
    s1.insert(i);
  }

  UnsortedSet<int> s2 = s1;
  cout << s1 << endl;    // prints { 0, 1, 2, 3, 4 }
  cout << s2 << endl;    // prints { 0, 1, 2, 3, 4 }

  s2.insert(5);          // causes a grow

  cout << s2 << endl;    // prints { 0, 1, 2, 3, 4, 5 }
  cout << s1 << endl;    // UNDEFINED BEHAVIOR
}

The code creates a set s1 and adds five items to the set, filling it to capacity. It then creates s2 as a copy of s1; by default, this does a member-by-member copy, so that both s1.elements and s2.elements point to the same dynamic array. We then add an item into s2, causing it to grow and delete its old array. This is the same array that s1.elements is pointing to, so that when we proceed to print out s1, it accesses a dead object. The result is undefined behavior.

Had we not caused a grow, there would have been a double delete when s2 and s1 die, since they would both delete the same array. This still results in undefined behavior. The fundamental problem is that the copy violates the representation invariant that each set has its own array. We will see how to fix this next time.