Recursion

We have seen several forms of abstraction, where we use something for what it does rather than how it works. It can be useful to refer to an abstraction when we are still in the middle of implementing it. For instance, our definition of a linked-list node refers to itself:

struct Node {
  int datum;
  Node *next;
};

Such a definition is recursive: it refers to itself as part of its own definition.
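
As a minimal sketch (the node values and variable names are our own illustration, not from the text above), the self-reference is what allows one Node to point to another object of the same type:

Node second = { 42, nullptr };  // last node; its next pointer is null
Node first = { 17, &second };   // first node links to second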

Functions can also be recursive. As an example, consider the factorial of a nonnegative integer:

\[\begin{split}n! = \begin{cases} 1 &\textrm{if}~ n = 0~ \textrm{or}~ n = 1\\ n * (n - 1) * \cdots * 1 &\textrm{otherwise} \end{cases}\end{split}\]

We can implement an iterative function to compute this:

// REQUIRES: n >= 0
// EFFECTS: Computes and returns n!.
int factorial(int n) {
  int result = 1;
  while (n > 0) {
    result *= n;
    --n;
  }
  return result;
}

On the other hand, we can observe that \((n - 1)! = (n - 1) * \cdots * 1\), so that we can mathematically define factorial in terms of itself:

\[\begin{split}n! = \begin{cases} 1 &\textrm{if}~ n = 0~ \textrm{or}~ n = 1\\ n * (n - 1)! &\textrm{otherwise} \end{cases}\end{split}\]

Such a mathematical definition is called a recurrence relation. It translates into code as follows:

// REQUIRES: n >= 0
// EFFECTS: Computes and returns n!.
int factorial(int n) {
  if (n == 0 || n == 1) {
    return 1;
  } else {
    return n * factorial(n - 1);
  }
}

This function is recursive, since it calls itself as part of its definition. We can then call factorial() as follows:

int main() {
  int result = factorial(3);
  cout << "3! = " << result << endl;
}

The activation records created by this program are shown in Figure 76.

_images/18_recursive_factorial.svg

Figure 76 Activation records for a call to the recursive version of factorial().

When the call factorial(3) is evaluated, an activation record is created, and the parameter n is initialized with value 3. The body of factorial() runs, and it calls factorial(2). This creates another activation record, and its parameter n is initialized to 2. Within the context of that activation record, the body of factorial() is executed, and it calls factorial(1). Another activation record is created, with n initialized to 1. The body runs, returning 1, which replaces the function call in the caller. The activation record for factorial(1) is destroyed, and factorial(2) resumes. The latter computes 2 * 1 and returns 2. The activation record for factorial(2) is destroyed, and factorial(3) continues where it left off. It computes 3 * 2, returning 6. Its activation record is reclaimed, and main() resumes, initializing result to 6, the correct value for \(3!\).
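
Written out as a chain of substitutions, this evaluation expands and then collapses as follows:

\[factorial(3) = 3 * factorial(2) = 3 * (2 * factorial(1)) = 3 * (2 * 1) = 6\]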

Operationally, the recursive definition works because each invocation of factorial() gets its own activation record, and the body of the function is executed within the context of that activation record. More conceptually, it works by using factorial() as an abstraction, even while we are still in the midst of implementing that abstraction! We call this “the recursive leap of faith” – we trust that the abstraction works, even if we haven’t finished writing it yet.

In general, a recursive abstraction requires the following:

  • base cases, which are cases we can implement directly without recursion

  • recursive cases, where we break down the problem into subproblems that are similar to the problem but “smaller,” meaning closer to a base case

As another example, the following recursive function prints the integers from a start value up to, but not including, an end value:

// REQUIRES: end >= start
// MODIFIES: cout
// EFFECTS:  Prints the numbers in [start, end) in order.
void print_numbers(int start, int end) {
  // base case
  if (start == end) {
    return;
  }
  // recursive case
  cout << start << endl;
  print_numbers(start + 1, end);
}

The base case is when the start and end are the same, in which case the function immediately returns without printing anything. The recursive case prints out a single number and then recursively computes the subproblem of printing the numbers in \([start + 1, end)\). The latter is smaller than the original problem, since it requires printing out one fewer number.
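
For instance, a call like the following (a hypothetical usage, not shown above) prints 0, 1, and 2, each on its own line:

int main() {
  print_numbers(0, 3);
}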

Let us consider another example, that of computing the number of ducks on a duck farm. Suppose the farm starts with five baby ducklings. Assume that a duck lays three eggs each month, once the duck is at least a month old itself. Furthermore, an egg takes a month to hatch, and our ducks never die. How many ducks does the farm have after \(n\) months?

We start by working out the first few months by hand. In month 0, there are 5 baby ducklings, which are not old enough to lay eggs. In month 1, there are now 5 grown ducks, each of which lays 3 eggs, resulting in 15 eggs in total. Let’s use a table to keep track of what we have:

Month   Adult Ducks   Ducklings   Eggs   Total Ducks
0       0             5           0      5
1       5             0           15     5
2       5             15          15     20
3       20            15          60     35
4       35            60          105    95
5       95            105         285    200

In month 2, the 15 eggs hatch, resulting in 15 ducklings. The 5 adult ducks lay 15 more eggs. In month 3, these eggs hatch, resulting in 15 ducklings. The previous 15 ducklings are now adults, so that we have a total of 20 adult ducks, which lay 60 eggs. And so on in months 4 and 5. The total number of ducks is 200 at month 5.

We observe that the number of adult ducks in any particular month is the same as the total number of ducks in the previous month. The number of ducklings in a month is three times the number of adult ducks in the previous month, which in turn is the total number of ducks two months before. This results in the following recurrence relation for the total number of ducks:

\[\begin{split}ducks(n) = \begin{cases} 5 &\textrm{if}~ n = 0~ \textrm{or}~ n = 1\\ ducks(n - 1) + 3 * ducks(n - 2) &\textrm{otherwise} \end{cases}\end{split}\]

The base cases are the first two months, when there are 5 total ducks. (We cannot apply the recurrence for these cases, since it would rely on the number of ducks in month -1, which is ill-defined.)

We can now write a function to compute the number of ducks:

// REQUIRES: n >= 0
// EFFECTS:  Computes the number of ducks in month n, assuming 5
//           ducklings to start.
int num_ducks(int n) {
  // base cases
  if (n <= 1) {
    return 5;
  }
  // recursive case
  return num_ducks(n - 1) + 3 * num_ducks(n - 2);
}

Observe that there are two subproblems here, ducks(n - 1) and ducks(n - 2). Both are closer to a base case than ducks(n), so the recursive computation still works.
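
As a quick sanity check against the table above, a small test program (hypothetical, not part of the original text) prints 200 for month 5:

int main() {
  cout << num_ducks(5) << endl;   // prints 200, matching the table
}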

Recursive algorithms are not just for math problems. The following is an algorithm for lifting a box of heavy books:

void lift_box(Box &box) {
  if (too_heavy(box)) {             // recursive case
    Book book = remove_book(box);
    lift_box(box);                  // "smaller" box
    add_book(box, book);
  } else {                          // base case
    move_box(box);
  }
}

Here, the base case is when we can directly move the box, such as when it is empty. Otherwise, we reduce the problem of moving a heavy box to the subproblem of lifting a box with one fewer book, using recursion to solve that subproblem. We take the recursive leap of faith that the function will correctly solve that subproblem. We then need only add back the book we removed.

_images/18_books.svg

Figure 77 Recursive algorithm for lifting a box of heavy books.

As another non-math example, we write a recursive function to reverse the contents of an array. For an empty array or an array with one element, nothing needs to be done, constituting the base case. Otherwise, we reverse a smaller array, one that excludes the ends, and then swap the two ends:

// EFFECTS: Reverses the array starting at 'left' and ending at
//          (and including) 'right'.
void reverse(int *left, int *right) {
  if (left < right) {
    reverse(left + 1, right - 1);
    int temp = *left;
    *left = *right;
    *right = temp;
  }
}

The function reverses the elements in \([left, right]\). The subproblem is the set of elements in \([left + 1, right - 1]\), which is closer to the base case of an array with zero or one element. As always, we take the recursive leap of faith that the subproblem will be computed correctly.

We can call reverse() as follows:

int main() {
  const int SIZE = 8;
  int array[SIZE] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  reverse(array, array + SIZE - 1);             // reverse whole array
  reverse(array + 1, array + 3);                // reverse elements 1-3
}

Tail Recursion

Consider the efficiency of the reverse() function above. For an array of size \(n\), it performs \(\lfloor n / 2 \rfloor\) swap operations, taking \(\textrm{O}(n)\) time. The algorithm also uses \(\textrm{O}(n)\) space: each recursive call has an activation record, and there are \(\lfloor n / 2 \rfloor + 1\) total calls.

On the other hand, the following iterative algorithm uses only a single activation record, resulting in constant, \(\textrm{O}(1)\), space:

// EFFECTS: Reverses the array starting at 'left' and ending at
//          (and including) 'right'.
void reverse(int *left, int *right) {
  while (left < right) {
    int temp = *left;
    *left = *right;
    *right = temp;
    ++left;
    --right;
  }
}

The fundamental difference is that this algorithm solves the subproblem after doing the extra work for the original problem, i.e. swapping the two ends. An equivalent recursive implementation would be the following:

// EFFECTS: Reverses the array starting at 'left' and ending at
//          (and including) 'right'.
void reverse(int *left, int *right) {
  if (left < right) {
    int temp = *left;
    *left = *right;
    *right = temp;
    reverse(left + 1, right - 1);
  }
}

Here, the recursive call is a tail call, meaning that the call is the last thing that happens in the function, and no work is done after the tail call returns. [1] Since no work is done after the tail call, there is no need to retain the activation record of the caller, and it can be discarded.

Many compilers recognize tail calls and perform tail-call optimization (TCO), where the activation record for the tail call reuses the space of the caller’s activation record. In the g++ compiler, TCO is enabled at optimization level 2 (-O2). TCO is not restricted to recursive functions; as long as a function call is the last thing that happens in its caller, it is a tail call and the optimization can apply. However, tail-call optimization is most important for recursive functions, since it can reduce the space usage of the computation by a large factor.

A function is said to be tail recursive if it is recursive, and all recursive calls are tail calls. Rather than using a linear amount of space, a tail-recursive computation requires only constant space when tail-call optimization is applied. For instance, the tail-recursive version of reverse() uses space for only a single activation record under TCO.

In order for a computation to be tail recursive, it must do all its work in the active flow of the computation, before making a recursive call. A computation that does its work in the passive flow must wait until a recursive call completes before doing work, so that the recursive call is not a tail call.

As a concrete example, the prior recursive implementation of factorial() is not tail recursive, since it does its work in the passive flow:

int factorial(int n) {
  if (n == 0 || n == 1) {
    return 1;
  } else {
    return n * factorial(n - 1);
  }
}

The multiplication must be done after the recursive call returns, so the function is not tail recursive.

For the function to be tail recursive, we need the multiplication to be in the active flow. To do so, we compute \(n\) in the initial call to factorial() on \(n\), \(n * (n - 1)\) in the call on \(n - 1\), \(n * (n - 1) * (n - 2)\) in the call on \(n - 2\), and so on until we reach \(1\). At each step, we keep track of the product so far:

int factorial(int n, int resultSoFar) {
  if (n == 0 || n == 1) {
    return resultSoFar;
  } else {
    return factorial(n - 1, n * resultSoFar);
  }
}

To call this function, we need to seed resultSoFar with the multiplicative identity, 1:

cout << factorial(5, 1) << endl;

However, this is a different interface than our previous versions of factorial(). To retain the same interface, we move the actual computation into a helper function and wrap the call to it in a function with the original signature:

static int factorial_helper(int n, int resultSoFar) {
  if (n == 0 || n == 1) {
    return resultSoFar;
  } else {
    return factorial_helper(n - 1, n * resultSoFar);
  }
}

int factorial(int n) {
  return factorial_helper(n, 1);
}
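
With the helper in place, callers use the same single-argument interface as before; for example (a hypothetical usage):

cout << factorial(5) << endl;   // prints 120; factorial() supplies the seed of 1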

Helper functions are a common pattern for tail-recursive computations that require a seed value. Not all tail-recursive algorithms require a helper function, however; the tail-recursive print_numbers() and reverse() functions above do not need a seed value, so they work without helper functions.

Kinds of Recursion

We have now seen several different kinds of recursion. A function is linear recursive if it is recursive, but each invocation of the function makes at most one recursive call. Such a function reduces each recursive case to a single subproblem. The various recursive factorial() and reverse() functions above are all linear recursive.

A function is tail recursive if it is linear recursive, and every recursive call is the last thing that happens in the invocation that makes the recursive call. We have seen both tail-recursive and non-tail-recursive variants of reverse() and factorial().

A function is tree recursive if a single invocation of the function can make more than one recursive call. Such a function subdivides a recursive case into multiple subproblems. The num_ducks() function above is tree recursive. Drawing out the recursive call graph for a tree-recursive function, we end up with a branching structure that resembles a tree, explaining the nomenclature.

_images/18_tree_recursion.svg

Figure 78 Call structure of a tree-recursive function.
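
As a rough text sketch of the same idea (a simplified rendering, not the figure itself), the calls made by num_ducks(3) nest as follows, with each call indented under its caller:

num_ducks(3)
  num_ducks(2)
    num_ducks(1)
    num_ducks(0)
  num_ducks(1)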

Iteration vs. Recursion

Iteration and recursion are two approaches to solving complex problems. Conceptually, an iterative algorithm often divides a computation into individual discrete steps, the combination of which forms a solution to the whole problem. In contrast, recursive algorithms generally express a computation in terms of smaller versions of the same problem.

Both iteration and recursion have the same computational power; an iterative algorithm can be converted to a tail-recursive implementation and vice versa. A non-tail-recursive computation can also be rewritten iteratively; however, the iterative version may require explicit storage. The non-tail-recursive algorithm can store data in multiple activation records, while the iterative one only has a single activation record to work with, so it may need an explicit data structure such as a vector to store its data. Techniques such as dynamic programming can be used in general to convert a recursive algorithm to an iterative one, but they are beyond the scope of this course.
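
As a sketch of that idea (the name num_ducks_iterative and the use of a vector are our own illustration, not from the text), the tree-recursive duck computation can be rewritten iteratively by storing each month's total explicitly:

#include <vector>

// REQUIRES: n >= 0
// EFFECTS:  Computes the number of ducks in month n, assuming 5
//           ducklings to start.
int num_ducks_iterative(int n) {
  std::vector<int> totals = { 5, 5 };   // totals for months 0 and 1
  for (int month = 2; month <= n; ++month) {
    // same recurrence as before: ducks(n) = ducks(n-1) + 3 * ducks(n-2)
    totals.push_back(totals[month - 1] + 3 * totals[month - 2]);
  }
  return totals[n];
}

Here the vector plays the role that the stack of activation records plays in the tree-recursive version.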