EECS 280: Debugging Tutorial

How to track a bug in your visual debugger

Let’s say your program isn’t working: maybe it crashes, or maybe it does the wrong thing. Debugging is the process of tracking down precisely where your program’s actual behavior deviates from what you intended. It’s a mix of science and a subtle art that requires you to

  1. Ask the right questions about what might be going wrong, and
  2. Pick an appropriate tool to get the information you need to check your suspicions.

Getting Started

We’ve written the spell-checker program from a previous lab with a number of bugs. In this tutorial we’ll show our overall debugging philosophy in action using a visual debugger.

Start by creating a new project with spellcheck.cpp and words.txt, which are available with the Lab 04 starter files. You might want to refer to the setup guide for Project 1.

Note: If you are using VS Code, set externalConsole to true in launch.json so that you can enter input while debugging.

Detecting a bug

Before we start debugging, it makes sense to make sure there is actually a bug in our program, so try compiling and running the spell checker to see what happens.

Segmentation fault

Well, that wasn’t supposed to happen. A segmentation fault (often shortened to segfault) means your program tried to access memory it is not allowed to access. The cause can be anything from dereferencing a null pointer to running off the end of an array. The challenge in debugging this type of bug is that the program just crashes: it doesn’t give you any feedback about where the problem occurred.

Fixing Bug 1

However, since we are using a debugger, we can inspect things like the call stack, the contents of memory, and the values of variables. The basic idea is to run the spell checker in the debugger and, when it crashes, check where execution was at that moment.

Now we have several pieces of information about our program when we reached the segfault. First, we see that the segfault occurred on line 21 of spellcheck.cpp.

We also notice that dereferencing ptr1 and ptr2, the pointers used to traverse the two C-style strings, apparently causes the segfault. Here’s a hypothesis for the bug:

Hypothesis: ptr1 or ptr2 is running off the end of an array

Another piece of information we have is the value each variable held when the program crashed.

We observe that ptr1 and ptr2 hold the same address. We also notice that the addresses stored in both str1 and str2 are actually the same as well. It turns out that the compiler is allowed to set up different string literals with the same content ("lizard" in the testing function) by reusing the same memory. This isn’t guaranteed, but will often happen.
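To see this literal sharing for yourself, a small sketch like the following can help. The function names here are made up for illustration and are not part of the lab code. Because the language leaves the outcome unspecified, the sketch reports whatever your compiler chose rather than assuming a result.

```cpp
#include <cassert>
#include <cstring>
#include <iostream>

// Returns true if two identical string literals share one address.
// The language leaves this unspecified, so either answer is legal.
bool literals_share_storage() {
  const char *str1 = "lizard";
  const char *str2 = "lizard";
  // The contents always match, whatever the addresses turn out to be.
  assert(std::strcmp(str1, str2) == 0);
  return str1 == str2;
}

// Prints the verdict, much like inspecting str1 and str2 in the debugger.
void report_literal_storage() {
  std::cout << "same address: " << std::boolalpha
            << literals_share_storage() << "\n";
}
```

Calling report_literal_storage() from main prints true or false depending on your compiler and its settings, which is why code should never rely on either outcome.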

Now, we will verify our hypothesis that we’re going off the end of our array. Take a look at the call stack and select the caller test_strcmp_eecs280 stack frame.

This should take you to the place where this function was called, in this case here:

We can now set a breakpoint at this line. Debuggers allow us to break (i.e. “pause”) a program’s execution at some point of interest and inspect its state. When we run it again, execution will stop at the first breakpoint it reaches.

Note: When paused, the debugger will highlight the next line it will execute.

Once paused at a line, use the step over and step into commands to move through the program. Both move forward to the next piece of code, but the step over command will step over function calls while the step into command will step into function calls.

Now, step into the strcmp_eecs280 function to check its behavior.

Then step over to the while loop and take a look at how ptr1 and ptr2 are initialized.

We can now keep track of ptr1 and ptr2 as we step through the loop multiple times.

As we suspected, we fail to exit the while loop once we reach the end of the character array. The "strcmp..." text is just whatever sits at the next location in memory, as you can see from the address. The loop advances the pointers as long as they point to the same character, but since ptr1 and ptr2 point to the same place, that is always true, and the loop keeps going until the pointers reach memory the program is not allowed to access. We forgot to check for the sentinel!

If the words are the same, the loop should stop at the null character. Change the condition to:

21  while (*ptr1 && *ptr1 == *ptr2) {
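For reference, here is a minimal sketch of a comparison function built around the corrected loop condition. Only the while condition comes from the fix above. The name strcmp_sketch, the parameter types, the return expression, and the self-check are assumptions made for illustration and may not match the lab's code.

```cpp
#include <cassert>

// Hypothetical stand-in for the lab's comparison function.
// Returns 0 if the strings are equal, a negative value if str1 sorts
// first, and a positive value otherwise.
int strcmp_sketch(const char *str1, const char *str2) {
  const char *ptr1 = str1;
  const char *ptr2 = str2;
  // Stop at the null character (the sentinel) or at the first mismatch.
  while (*ptr1 && *ptr1 == *ptr2) {
    ++ptr1;
    ++ptr2;
  }
  return *ptr1 - *ptr2;
}

// Quick self-check, including the equal-strings case that used to crash.
void check_strcmp_sketch() {
  assert(strcmp_sketch("lizard", "lizard") == 0);
  assert(strcmp_sketch("lizard", "lizards") < 0);
  assert(strcmp_sketch("b", "a") > 0);
}
```

Checking only *ptr1 for the sentinel is enough: if str2 ends first, the characters differ at that position and the equality test stops the loop.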

Fixing Bug 2

After fixing the previous bug, we can compile and run the program again. There is no segfault this time, but it still doesn’t work: it claims nothing is spelled correctly, and it fails to recognize that “quit” should exit the program (if you are running in a terminal, use Ctrl-C to stop it).

Based on a quick inspection of the program’s behavior, we might form these hypotheses:

Hypothesis 1: The find_word function is broken.

Hypothesis 2: The find_word function gets the wrong argument.

The find_word function is responsible for opening the dictionary file and checking whether a given word is found in that file. Set a breakpoint at the beginning of the function and rerun the program. Inspecting the argument confirms Hypothesis 2 from above (find_word gets the wrong argument).

Now it makes sense to switch focus to the code that called find_word to see why the wrong argument is passed in. Once again, select the main stack frame in the call stack.

Checking the local variables, we see that user_word is incorrect here as well:

Looking at the code in main, we arrive at a new hypothesis.

Hypothesis: The get_user_word function is broken.

Investigate get_user_word by setting a breakpoint at the start of that function.

Actually, it might be more useful to break at the end of the function, right before the return. We can set a breakpoint at the appropriate line and use the continue button, which resumes execution of the program until the next breakpoint.

Note: The program will also pause temporarily to wait for you to enter input needed by cin.

Right before the function returns, we see that the actual behavior of our program deviates from the expected, correct behavior. Once you find precisely where this happens, you can just go ahead and fix the bug (make sure that the function is returning the right thing).

Note: More descriptive variable names might have prevented this bug in the first place!

Fixing Bug 3

After fixing the previous bug, run the program again and test it with the following words.

Did you notice anything strange? It claims that “hello” is not a correctly spelled word.

Let’s continue with the same hypothesis as last time:

Hypothesis: find_word is still receiving the wrong input.

This is simple to check by setting a breakpoint at find_word and checking the arguments.

It is clear that it is receiving the correct word now. Some other possibilities we could consider:

Hypothesis 1: “hello” is somehow missing from words.txt.

Hypothesis 2: The strcmp_eecs280 function doesn’t work for "hello".

Hypothesis 3: Something in find_word itself is incorrect.

What do we do next? Generally, we will start by checking the hypotheses that are easier to test and make our way to the ones that require more work until hopefully we have found the bug.

Therefore we can do the following for the first two hypotheses.

Hypothesis 1 - Open words.txt and confirm that “hello” is there, which rules out this hypothesis.

Hypothesis 2 - Add a test to check the result of the comparison in question.
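A test of this kind might look like the sketch below. The signature of strcmp_eecs280 is assumed here, and its body is a placeholder that forwards to the standard library so the snippet compiles by itself. In the lab, you would drop the placeholder and call the real function in spellcheck.cpp. The test function name is also made up for illustration.

```cpp
#include <cassert>
#include <cstring>

// Placeholder so this sketch is self-contained. The signature is an
// assumption; replace this with the real function from the lab code.
int strcmp_eecs280(const char *str1, const char *str2) {
  return std::strcmp(str1, str2);
}

// If either assert fails, the program aborts and reports the line,
// which tells us the comparison itself is the problem.
void test_strcmp_hello() {
  assert(strcmp_eecs280("hello", "hello") == 0);
  assert(strcmp_eecs280("hello", "help") != 0);
}
```

Including a near-miss such as "help" guards against a comparison that wrongly reports every pair of words as equal.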

Since strcmp_eecs280 passes this test when we run the code (i.e. the assert doesn’t crash our program), we reject Hypothesis 2 as well, which narrows our focus a bit more.

Hypothesis 1: “hello” is somehow missing from words.txt. (Rejected)

Hypothesis 2: The strcmp_eecs280 function doesn’t work for "hello". (Rejected)

Hypothesis 3: Something in find_word itself is incorrect.

We’ll leave tracking down the final bug for you: somehow find_word never notices that "hello" is in fact in our dictionary file. Think about what information would be useful to have, and then pick the strategy that gets you that information most efficiently.

Once you find this bug, make sure to fix it in spellcheck.cpp!

Final Note

At this point you may be familiar with debugging by adding print statements to your code. This will still work, but using a debugger to check the value of variables is usually faster.

On the other hand, staring at your code without actively examining its behavior, or making random code changes in hopes of fixing the bug, will not get you anywhere.

Happy debugging!