Structural Recursion

Last time, we discussed the concept of recursion, where an abstraction uses itself as part of its implementation, and we saw its application to procedural abstractions. We turn our attention now to structural recursion, where we use recursion in defining the representation of an abstract data type.

Recursive Lists

The data representation of a linked list is an example of structural recursion: the Node type uses itself in its own representation:

struct Node {
  int datum;
  Node *next;
};

This representation satisfies the requirements for a recursive abstraction:

  • The empty list is the base case, represented by a null pointer.

  • A non-empty list is a recursive case; it can be subdivided into the first node and the rest of the list, which represents a list in its own right.

_images/19_list_self_similarity.svg

Figure 80 The representation of a non-empty list consists of a node followed by the representation of a smaller list.

Independent of its representation, a linked list can actually be defined recursively as either an empty list, or a datum followed by a smaller list.

_images/19_recursive_list.svg

Figure 81 Recursive definition of a list.

Given the recursive definition of a list, it is natural to process a list with a recursive function. The base case of the recursive function will be the minimum-size list allowed by the function, and larger lists will be handled in the recursive case. As an example, the following is a recursive function to compute the length of a list:

// REQUIRES: node represents a valid list
// EFFECTS:  Computes the length of the list that starts at the
//           given node.
int length(const Node *list) {
  if (list == nullptr) {               // empty list
    return 0;
  } else {                             // non-empty list
    return 1 + length(list->next);     // list->next is a smaller list
  }
}

The length of an empty list is 0. The length of a non-empty list is one more than the length of the rest of the list; the elements in the list consist of the initial datum and the elements in the rest of the list. We use the length() function itself as an abstraction to compute the number of elements in the remainder of the list, taking the recursive leap of faith that it will compute the right answer.

As another example, consider the problem of finding the maximum element in a list. Unlike for length(), the minimal list is not an empty one, since an empty list has no elements in it. Instead, the minimum required size is a list with a single datum, in which case the maximum element is just that lone datum, constituting our base case. For a larger list, we can break it down recursively as follows:

  1. Find the maximum element in the rest of the list. This element is at least as large as any other element in the rest of the list.

  2. Compare the first element to the max of the remainder. The larger of the two is transitively at least as large as the elements in the rest of the list.

The following implements this algorithm:

// REQUIRES: node represents a valid, non-empty list
// EFFECTS:  Returns the maximum element in the list that starts at
//           the given node.
int list_max(const Node *list) {
  if (list->next == nullptr) {              // list has only one element
    return list->datum;
  } else {                                  // list has more than one element
    return std::max(list->datum,            // compare first datum to
                    list_max(list->next));  // max of rest of list
  }
}

The base case is a list with one element. Such a list has an empty next list, so we check for that and return the list’s lone datum. The recursive case computes the max of the rest of the list and then uses std::max() to determine the maximum of that item and the first item in the list. As always, we take the recursive leap of faith, assuming that the recursive call to list_max() computes the right answer for the smaller list.

Exercise 2

Write a function list_sum() that recursively computes the sum of the elements in a list.

// EFFECTS: Computes the sum of the elements in the list that
//          starts at the given node.
int list_sum(const Node *node) {
  // your code here
}
Exercise 3

Write a function last() that recursively finds and returns the last element of a non-empty list.

// REQUIRES: node represents a valid, non-empty list
// EFFECTS:  Returns the last element in the list that starts at
//           the given node.
int last(const Node *node) {
  // your code here
}

Trees

A list is a linear-recursive data structure: each non-empty list is subdivided into a datum and a single smaller list. A tree is a data structure that is, naturally, tree recursive. A non-empty tree is subdivided into a datum and several smaller trees. In this course, we only consider binary trees, where non-empty trees are subdivided into exactly two smaller trees.

_images/19_tree.svg

Figure 82 Recursive definition of a tree. A non-empty tree consists of a datum and two smaller trees.

The term tree stems [1] from the fact that its branching structure resembles that of a botanical tree. Terminology with respect to tree data structures borrows from both botanical and family trees.

  • The root is the node that originates the branching structure. In our diagrams, the root is pictured at the top of the tree.

  • A non-empty tree consists of a parent node and two child nodes. For a binary tree, there is a left child and a right child. Nodes that have the same parent are siblings.

  • A node whose children are all empty is a leaf.

  • The size of a tree is the number of elements it contains.

  • The height of a tree is the number of levels at which it has elements. Equivalently, it is the length of the longest path from the root to a leaf node.

_images/19_tree_height.svg

Figure 83 The height of a tree is the number of levels that contain elements, or equivalently, the length of the longest path from root to a leaf.

Algorithms on trees are often written as tree-recursive functions, so that the recursive case makes more than one recursive call. The general strategy is to directly compute the result for the smallest tree allowed by the function, constituting the base case. The recursive case makes recursive calls to compute the results for the left and right children, then combines those results with the datum at the root to compute the answer for the whole tree.

As an example, the following algorithm computes the size of a tree:

  • The size of an empty tree is zero.

  • The size of a non-empty tree is the size of the left child, plus the size of the right child, plus one for the root datum.

Before we can implement this algorithm in code, we need a data representation. As with a list, we use a Node struct, but it now has a datum and pointers to left and right Nodes:

struct Node {
  int datum;
  Node *left;
  Node *right;
};

Like a list, we use a null pointer to represent an empty tree.

_images/19_tree_representation.svg

Figure 84 Data representation of a tree, using a node for each element and null pointers for empty trees.

We can now implement the size() function:

// REQUIRES: node represents a valid tree
// EFFECTS:  Returns the number of elements in the tree represented
//           by the given node.
int size(const Node *tree) {
  if (!tree) {                                        // empty tree
    return 0;
  } else {                                            // non-empty tree
    return 1 + size(tree->left) + size(tree->right);
  }
}

As with other recursive functions, we take the recursive leap of faith that size() computes the right answer on smaller trees.

The height of a tree is the number of levels it contains. An empty tree contains no levels, so its height is zero. For a non-empty tree, we exploit the alternate definition of height, that it is the length of the longest path from root to leaf. The longest such patch is just one node longer than the longest path in the child subtrees, since the root adds one additional node to the part of the path contained in a child. Thus, we compute the height as follows:

// REQUIRES: node represents a valid tree
// EFFECTS:  Returns the height of the tree represented by the given
//           node.
int height(const Node *tree) {
  if (!tree) {                                        // empty tree
    return 0;
  } else {                                            // non-empty tree
    return 1 + std::max(height(tree->left),
                        height(tree->right));
  }
}

We use std::max() to obtain the longer path from the two child trees, then add one to the result to account for the root node.

Exercise 4

Write a function tree_contains() that determines whether the given value is contained in the given tree.

// EFFECTS: Returns whether the tree rooted at the given node
//          contains the given value.
bool tree_contains(const Node *node, int value) {
  // your code here
}
Exercise 5

Write a function count_leaves() that counts the number of leaf nodes in the given tree.

Hint: You will likely need two base cases.

// EFFECTS: Returns the number of leaf nodes in the tree rooted
//          at the given node.
int count_leaves(const Node *node) {
  // your code here
}

Tree Traversals

The following function prints every element in a tree to standard output:

// REQUIRES: node represents a valid tree
// MODIFIES: cout
// EFFECTS:  Prints each element in the given tree to standard out,
//           with each element followed by a space.
void print(const Node *tree) {
  if (tree) {              // non-empty tree
    cout << tree->datum << " ";
    print(tree->left);
    print(tree->right);
  }
}

This algorithm processes a root datum before processing any of the data in the children. It thus implements a preorder traversal. For the tree in Figure 83, it prints the elements in the order 6 4 1 7 2 9 5.

Moving the print statement results in alternate orderings. The following implements an inorder traversal:

if (tree) {              // non-empty tree
  print(tree->left);
  cout << tree->datum << " ";
  print(tree->right);
}

The data in the left child is processed first, then the root datum, then the data in the right child. For the tree with height 5, it prints 1 4 6 2 5 9 7.

A postorder traversal processes the root datum after the data in the children:

if (tree) {              // non-empty tree
  print(tree->left);
  print(tree->right);
  cout << tree->datum << " ";
}

For the same tree, the postorder traversal prints 1 4 5 9 2 7 6.

The structure of a binary tree can be uniquely determined by knowing any two of these orderings.