Courses/CS 2124/Lab Manual/Using a Debugger

From A-State Computer Science Wiki

Using a Debugger

A portion of this lab is to be done during the scheduled lab time. The take-home programming assignment is to be turned in before the next lab; see the lab website. The in-lab portion is worth 40% of the lab credit; the programming assignment is worth the other 60%. See the website for details on how the programming assignment will be graded. You are not responsible for user errors in input unless specified in the assignment. Feedback will be provided explaining your grade on each assignment.

It is important that you complete each step before going on to the next, as the exercises build upon one another. You will find it helpful to diagram the action of each method or function as you go along. If you have difficulty in some step, DO NOT proceed before resolving it; seek assistance from your lab proctor. Otherwise, you will not be able to fully appreciate the remaining content of the lab, and you are likely to compound the problem.

Introduction

Finding and eliminating runtime errors is one of the more challenging and frustrating parts of learning to program. Learning a few basic skills with a debugging tool can help.

Topics Covered in this Lab:
  • debugging tools
  • basic gdb commands
  • debugging terminology
Questions Answered in this Lab:
  • What is a debugger?
  • How do I start a debugging session?
  • What commands can I use in gdb?
Demonstrable Skills Acquired in this Lab:
  • ability to perform a debugging session in gdb
  • understanding of basic gdb commands


Learning to Use a Debugger

First, download the resource files from this assignment's resource site.

Create a new text document debug_inlab_result.txt (use your programming text editor for this). As you work through this in-lab, you will answer questions and paste debugger output into this document. During checkpoints, and at the end, this is the file you will demonstrate or submit.

Open debug_inlab_1.cpp in your editor and get ready to compile it on the virtual server (or in your local environment if you prefer).

When you compile the program, add the -g -fno-inline flags to your usual compilation command. This will give you something similar to:

g++ -Wall -Wextra -pedantic -std=c++17 -g -fno-inline *.cpp

Compile debug_inlab_1.cpp. It should compile with no warnings or errors.

Note: The Foo.h library that is required by the program also requires the Foo.cpp implementation file to be compiled and linked with the program; the command shown above will accomplish this.


The Program

The idea of the program debug_inlab_1.cpp is that it will create an array of 15 Foo objects, and then perform several common operations on the array (sorting, searching, reversal). The whole process is then repeated with an array of 20 objects.

The main() function uses two "artificial" blocks to facilitate simply repeating the body of the code, but with two different array sizes (this allows all the names to be easily re-used). You would never want to do this in any production-quality software, but it can be a useful technique for writing tests.
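The "artificial block" idea can be sketched as follows. This is only an illustration, not the code from debug_inlab_1.cpp: it uses int arrays instead of Foo objects, and the function name sum_two_sizes is hypothetical. Each pair of braces opens a new scope, so the names size and data can be declared a second time with different values.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical illustration: the same names are reused in two
// brace-delimited blocks, each with a different array size.
int sum_two_sizes() {
    int total = 0;
    {   // first block: 15 elements
        constexpr std::size_t size = 15;
        int data[size];
        std::iota(data, data + size, 1);                 // fills 1..15
        total += std::accumulate(data, data + size, 0);  // adds 120
        assert(total == 120);
    }   // size and data go out of scope here
    {   // second block: same names, 20 elements
        constexpr std::size_t size = 20;
        int data[size];
        std::iota(data, data + size, 1);                 // fills 1..20
        total += std::accumulate(data, data + size, 0);  // adds 210
    }
    return total;
}
```

Because nothing declared inside a block survives past its closing brace, the second block cannot accidentally use the first block's array.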

Try running the program. Does it work?


FOR IN-LAB CREDIT: Answer questions posed during this exercise in full sentences in your "result" text file. Copy and paste code or output into the file as instructed, always working from top to bottom.


Copy and paste the output produced by your attempt at running the program into the text file following your answer to the question above. Here is the output produced by the author's machine:

Original values: 
[Foo #1 tag: "aa"], [Foo #2 tag: "hf"], [Foo #3 tag: "ok"], [Foo #4 tag: "vp"], 
[Foo #5 tag: "cu"], [Foo #6 tag: "jz"], [Foo #7 tag: "qe"], [Foo #8 tag: "xj"], 
[Foo #9 tag: "eo"], [Foo #10 tag: "lt"], [Foo #11 tag: "sy"], [Foo #12 tag: "zd"], 
[Foo #13 tag: "gi"], [Foo #14 tag: "nn"], [Foo #15 tag: "us"]

Segmentation fault (core dumped)

It looks like the program is creating the array, and the print() function seems fine. Something went wrong after that. By looking at main(), you can probably make the educated guess that it has something to do with the sort() function, since that is the next non-trivial operation on the array before it would be printed again.

Take a moment to look over the sort() function. Do you see any issues? If not, don't worry (and if you do, don't fix them yet).

The sort() function performs a Bubble Sort on the array, using observing (non-owning) pointers to the first element and to the position one past the last element. In other words, the pointers form the ends of the half-open range:

[begin,end)

The element pointed to by end is not actually a member of the array being sorted. This is an important consideration going forward. If you are not sure how this is supposed to work, be sure to ask a question now!
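As a minimal sketch of how a [begin,end) range is normally walked, consider the function below. It is not part of the lab's source; the name count_positive and the use of int elements are assumptions made for illustration. Notice that end is only ever compared against, never dereferenced.

```cpp
#include <cassert>
#include <cstddef>

// Counts the positive values in the half-open range [begin, end).
// `end` marks where to stop; it is never dereferenced.
std::size_t count_positive(const int* begin, const int* end) {
    assert(begin != nullptr && end != nullptr);
    std::size_t count = 0;
    for (const int* current = begin; current != end; ++current) {
        if (*current > 0) {   // safe: current is strictly inside the range
            ++count;
        }
    }
    return count;
}
```

For an array declared as int values[5], the call would be count_positive(values, values + 5). An empty range is simply one where begin and end are equal, and the loop body never runs.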

Debugging

A debugger is a software tool designed to help programmers track down runtime errors in programs. Runtime errors are difficult to track down since the compiler cannot help you find them. Your options for fixing them are:

  • Step through the code by hand.
  • Add lots of "debug output" to the program to find where things are going wrong.
  • Use a debugger to examine the program as it is running.
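For comparison, the "debug output" option might look something like the sketch below. The helper name debug_line and the "[debug]" prefix are hypothetical; they are not part of the lab's code.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: formats a labeled value as one line of debug output.
template <typename T>
std::string debug_line(const char* label, const T& value) {
    assert(label != nullptr);
    std::ostringstream out;
    out << "[debug] " << label << " = " << value << '\n';
    return out.str();
}
```

A typical use would be std::cerr << debug_line("pass", pass); inside a loop. Writing to std::cerr rather than std::cout is a common choice because std::cerr is not buffered, so the message is more likely to appear before a crash. The drawback is that every such line has to be added, recompiled, and later removed by hand.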

The most direct approach is to use the debugger, but it also involves the steepest learning curve. Good news! Learning a debugger can be simplified when you realize you really only need to know a few commands to get a lot of work done. You can learn all the others later, over time.

There are many debuggers available; here are a few:

gdb 
part of the gnu/gcc toolchain - available on most Linux/UNIX systems.
lldb 
part of the llvm/clang toolchain - standard on all Apple Mac computers and available on Linux/UNIX and Windows platforms as well.
Visual Studio Debugger
part of the Microsoft Visual Studio IDE - a great visual debugger.
Visual Studio Code Debugger
part of the Microsoft Visual Studio Code editor - also a great visual debugger, but you must set it up with development tools before using it.
Xcode Debugger 
a visual front-end to lldb on the Mac platform.
Nemiver
a visual front-end for gdb on Linux platforms (free, may be abandoned).

This lab will introduce you to gdb. Even if you compile with clang++ (llvm/clang toolchain) instead of g++, gdb will still work just fine as a debugging tool. The commands for gdb are somewhat easier to remember (and type), since it was designed more for human interaction (lldb was designed more for programmatic interaction). You may also want to experiment with a visual debugger on your platform of choice (they are easy to learn once you know the basic concepts).

To start a debugging session with gdb, type:

gdb a.out

This assumes that you have already compiled your program and that you allowed the compiler to produce the default a.out executable; if that is not the case, adjust the command to match your executable file's name.

You should now see something like the following:

$ gdb a.out
GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...done.
(gdb)

Your program has been loaded by gdb and you are ready to begin debugging.

Type run and press Enter.

This runs your program inside the debugging environment. When it "crashes", you will see some information about what went wrong (or a message saying that the program terminated normally if that was the case).

Copy the message you receive (just the last part -- the part your program did not produce) and paste it into your results file.

Your message may have included some information about "signal SIGSEGV, Segmentation fault". Let's find out what happened.

Type bt and press Enter.

The bt command is short for backtrace (which you could also have typed). gdb uses lots of short abbreviations for things; if you know them you can use it much faster, but if you don't, the full commands are usually pretty easy to figure out.

A "backtrace" is a listing of the program's call stack at the moment that execution was paused (or stopped by a crash). Notice that the listing is in a kind of reverse-chronological order. The most "recent" function that was called is listed first, and main() is at the bottom. Also notice that each function that was called has a number to its left (starting from zero). These are called stack frames, and we can use them to choose which part of the call stack we want to investigate.
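The ordering of a backtrace can be mimicked with a toy model. The sketch below is not how gdb works internally, and all of the names in it are hypothetical: each function records its name on entry and removes it on exit, and the innermost function takes a snapshot listed newest-first, the same order that bt uses.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the call stack.
using CallStack = std::vector<std::string>;

// Returns the recorded calls newest-first, like the frames in a backtrace.
std::vector<std::string> snapshot(const CallStack& stack) {
    return std::vector<std::string>(stack.rbegin(), stack.rend());
}

std::vector<std::string> innermost(CallStack& stack) {
    stack.push_back("innermost");
    std::vector<std::string> trace = snapshot(stack);  // "paused" here
    stack.pop_back();
    return trace;
}

std::vector<std::string> helper(CallStack& stack) {
    stack.push_back("helper");
    std::vector<std::string> trace = innermost(stack);
    stack.pop_back();
    return trace;
}

std::vector<std::string> entry_point() {
    CallStack stack;
    stack.push_back("entry_point");
    std::vector<std::string> trace = helper(stack);
    assert(trace.size() == 3);  // three calls were active at the snapshot
    stack.pop_back();
    return trace;
}
```

In the vector returned by entry_point(), index 0 holds "innermost" and index 2 holds "entry_point", in the same way that frame #0 is the most recent call and main() has the highest frame number.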

Copy and paste your backtrace into the results file.

The author's backtrace looked like this:

(gdb) bt
#0  __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:157
#1  0x00007ffff7b78e87 in void std::__cxx11::basic_string<char, std::char_traits<char>, 
      std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007ffff7b78ecf in std::__cxx11::basic_string<char, std::char_traits<char>, 
      std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, 
      std::allocator<char> > const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x0000000000403643 in Foo::tag (this=0x7fffffffd7a8) at Foo.cpp:49
#4  0x0000000000402c4e in sort (begin=0x7fffffffd550, end=0x7fffffffd7a8)
    at debug_inlab_1.cpp:129
#5  0x00000000004018ca in main () at debug_inlab_1.cpp:28

Those first few frames look pretty nasty! Notice that they refer to code that exists in an external library (not your program). That is a strong hint that those frames are not of interest to us for debugging. Most of the time if you are using a well-known library, the error is in your code, not theirs.

In the backtrace shown above, frame #3 is the first one that is part of the source code in this project. It was a call to the Foo::tag() method. For the purpose of this assignment, treat the Foo library as if it is beyond reproach as well; so we will ignore that frame and keep looking.

Frame #4 in the backtrace above shows a call to the sort() function. Gotcha! This one is part of the code we are working on; that seems like a good place to start. Also, we were already suspicious of sort() from our observation earlier that it seems to be the last thing the program tried to do before crashing. (Notice that a line number is shown as well: 129.)

So, we want to select frame #4, which will let us examine the arguments and local variables there in exactly the state they were in when the program crashed. We will be "seeing into" the program at the instant that it crashed!

To select the frame, type frame 4 and press Enter. Note: If your backtrace is different, be sure to select the frame number associated with sort() in your backtrace!

You should see something similar to:

(gdb) frame 4
#4  0x0000000000402c4e in sort (begin=0x7fffffffd550, end=0x7fffffffd7a8)
    at debug_inlab_1.cpp:129
129             if(current->tag() > (current+1)->tag()){

This shows the frame (with parameter values) and the file and line number (line 129), and even shows the code on the offending line. Looks like we crashed in the condition check of that if...

Type info args and press Enter. (This repeats the argument values.)

Then type info locals and press Enter. (This gets you the local variable values.)

You should now see something similar to the following. Copy and paste your output for the two "info" requests into your results file.

(gdb) info args
begin = 0x7fffffffd550
end = 0x7fffffffd7a8
(gdb) info locals
current = 0x7fffffffd780
did_swap = true
pass = 0
(gdb)

Wow. A lot of those are big hexadecimal numbers. Those are memory addresses, which makes sense, since begin, end, and current are pointers.

The pass counter is still 0 -- we didn't get very far. It looks like begin is less than end (which is good), and current is also less than end (also good).

You might notice, though, that part of the condition is evaluating (current+1)->tag(). Always be suspicious of any attempt to dereference a pointer (and an arrow operation counts as a dereference). Let's make sure that the thing we are trying to dereference here is valid:

Type print current + 1 and press Enter.

You should see something like the following. Copy and paste your command and output into the results text file.

$1 = (Foo *) 0x7fffffffd7a8

Look closely at that number. Compare it to the other values you printed with the "info" commands...

So, current + 1 yields the exact same address as the one stored in end — but from our earlier discussion, we know that end points to an element that is not contained in the array! So, we are trying to call the tag() method on something that isn't a Foo. No wonder we crashed!
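The arithmetic behind those addresses can be checked by hand. In the author's session, end minus current is 0x28, or 40 bytes, and end minus begin is 0x258, or 600 bytes, which is consistent with 15 elements of 40 bytes each. The sketch below reproduces that reasoning under the assumption that a Foo occupied 40 bytes on the author's machine (an inference from the addresses, not something the lab states); the Element type and byte_distance function are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 40-byte element type standing in for Foo.
struct Element {
    char bytes[40];
};

// Number of bytes between two pointers into the same array.
std::size_t byte_distance(const Element* from, const Element* to) {
    assert(from != nullptr && to != nullptr);
    return static_cast<std::size_t>(reinterpret_cast<const char*>(to) -
                                    reinterpret_cast<const char*>(from));
}

// Addresses copied from the author's session above.
constexpr unsigned long long begin_addr   = 0x7fffffffd550ULL;
constexpr unsigned long long current_addr = 0x7fffffffd780ULL;
constexpr unsigned long long end_addr     = 0x7fffffffd7a8ULL;

// Compile-time checks of the hand arithmetic.
static_assert(end_addr - begin_addr == 15 * 40, "15 elements of 40 bytes");
static_assert(current_addr + 40 == end_addr, "current + 1 lands on end");
```

Adding 1 to a pointer moves it forward by one whole element, not by one byte. That is why current + 1 lands exactly on end when current points at the last element of the array.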

Now we've identified the problem; let's see what events led to this outcome.

Type list and press Enter.

This will print the 10 lines around the line that we are stopped on. (You can adjust this number with the set listsize command.)

It looks like the obvious culprit here is the for loop. current is being allowed to run one item too far to the right (it's a classic off-by-one error, but obscured by the pointer notation). The stopping condition should not allow current to take on the value of end, so it should be:
current != end - 1 - pass

Make this change in your code.
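To see the corrected stopping condition in context, here is a minimal sketch of a pointer-based Bubble Sort. It is not the lab's sort() function: it sorts int values rather than Foo objects, the name bubble_sort is hypothetical, and the loop structure is only a guess modeled on the local variable names seen in the debugger (current, did_swap, pass).

```cpp
#include <cassert>
#include <utility>

// Hypothetical bubble sort over the half-open range [begin, end).
void bubble_sort(int* begin, int* end) {
    assert(begin != nullptr && end != nullptr);
    if (begin == end) {
        return;  // empty range: end - 1 would not be a valid pointer
    }
    bool did_swap = true;
    for (int pass = 0; did_swap; ++pass) {
        did_swap = false;
        // Stop one element early so that current + 1 never reaches end.
        for (int* current = begin; current != end - 1 - pass; ++current) {
            if (*current > *(current + 1)) {
                std::swap(*current, *(current + 1));
                did_swap = true;
            }
        }
    }
}
```

The early return matters in this sketch because the inner loop compares with != rather than <. With an empty range, end - 1 would lie before begin and the loop would never reach its stopping point.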

Now, we need to quit the debugger so that we can re-compile and start testing again. Type quit (or just q) and press Enter. gdb will warn you that the program is still "active"; type "y" to exit anyway.


FOR IN-LAB CREDIT: Show your work to the lab assistant, and demonstrate the edit you made to the source code, along with the result of running the program with the error fixed.


That fixed one problem, but the program still will not run to completion. You should see the sorted values now, but the program will crash shortly after that (probably with a "Segmentation fault").

Load the program in the debugger:

gdb a.out

Run it:

run

When it crashes, produce the backtrace:

bt
(or backtrace)

Copy the backtrace output into your results file.

Look for the frame with the lowest frame number that is still part of "your" code. On the author's system, this was frame 6, but yours may vary. Select that frame:

frame 6

You can see the values of the arguments (from the header) and the locals (from info locals)... But it might not be obvious what is wrong. We need some context. List the area around the line we crashed on:

list

It looks like the crash occurred here (note the line number in the backtrace):

        *begin     = *end;

If you don't see what might be wrong, you can print the values involved to look for clues:

print *begin

Looks reasonable...

print *end

Looks very odd...

Hopefully now you realize that end should never actually be dereferenced. Remember that the end pointer points to an element that is beyond the end of the array!

This line contains an off-by-one error as well. Update it so that the element pointed to by begin is assigned the last element that is actually in the array.

If you are paying attention, you will notice that the same off-by-one error occurs again on the next line. Fix that one as well.

Copy the gdb commands and output you used to identify the problem, and also explain the problem in your results text file, then you are ready for the next checkpoint.


FOR IN-LAB CREDIT: Show the lab instructor the updated code and the steps from your debugging session.


Quit your gdb session and try to compile-and-run again.

You won't get far... Again the program crashes, so again you should use gdb to find out why.

This will lead you back to the reverse() function, but for a different reason.

Select the frame where the reverse() function was at work, then take a close look at the frame's information: (Yours will differ slightly...)

#6  0x0804a03e in reverse (begin=0xbffff348, end=0xbffff2d0) at debug_inlab_1.cpp:148
148         auto temp   = *begin;

Do you see anything unusual there? Look closely at the values of begin and end. Yes, these are horrible-looking hexadecimal numbers, but they are still numbers. Keep in mind that begin should be "to the left" (on the number line) of end. Is that the case?

How could begin and end have "passed" each other without stopping the loop? Answer in your results file.

Now update the loop condition to fix this problem. Do not use less-than or greater-than operators with pointers. Fix the loop using only strict equality (or non-equality) comparisons.


FOR IN-LAB CREDIT: Demonstrate the solution for the lab instructor.


In-Lab Exercises

Now you are on your own. Continue working on the program, and use gdb each time the program misbehaves or crashes.

Any time you get stuck, ask for assistance. For each problem you fix, explain what the problem was and copy-and-paste the debugging commands and output that led you to the problem into your results file.



FOR IN-LAB CREDIT: Demonstrate the working program before leaving the lab.


Upload the following files: Submit a zip file containing your source code and the "results" text file you created.