Remote Debugging EMAC OE SDK Projects with gdbserver
Revision as of 16:39, 10 May 2013
Table 1: Conventions | |
---|---|
target_program | The name of the application being debugged. This is the result of the Makefile build process. |
target_machine | Connection information for the target machine. This can be either a serial port (e.g. /dev/ttyS2) or a TCP connection in the form HOST:PORT. |
/path/to/sdk/ | The path to the EMAC OE SDK on the development system. |
Sometimes a program has no technical errors that cause the compile to fail, but fails to meet the developer's expectations when run. This is typically due to algorithm or data structure design errors which can be difficult to find with just visual inspection of the code. Because of this, it can be beneficial to run a debugger targeting the binary resulting from the compile process. Debugging is the process of watching what is going on inside of another program while it is running. When a program is compiled with debug symbols included in the binary, it is possible to observe the source code and corresponding assembly while running the debugger.
When working with embedded systems the binary is usually compiled on a development machine with a different CPU architecture than what is on the target machine. This can be a problem when, as is typically the case, the target machine lacks the system resources to run a debugger. In these cases, it is possible to use the GNU debugger, or GDB, on the development machine to remotely debug the target machine provided it has a program called gdbserver. All EMAC OE builds are packaged with gdbserver to simplify the setup process for developers.
This guide is intended to build a basic understanding of how to use gdbserver with EMAC products. It is not intended as a general guide to debugging computer programs. For help with that, see the GDB man pages on the development system or read [this manual] on debugging with GDB.
Setup
Using gdbserver involves setting up both the target machine and the development machine. This requires that the binary application be present on both development and target machines. The development machine copy of the application must be compiled with debug flags whereas this is not strictly necessary for the target machine. See the [Optional global.properties Modifications Section] on the New EMAC OE SDK Project Guide for more information. See the [EMAC OE Getting Started Guide] for more information on how to connect to the target EMAC product using a serial port or Ethernet connection.
Target Machine
Because EMAC OE builds are distributed with gdbserver, installation is not a concern. The only setup necessary is to run gdbserver with target_program:
- If the target application is already running, use the --attach option to connect gdbserver to the application as shown below. The PID argument can be determined using pidof.
   developer@ldc:~$ pidof target_program
   developer@ldc:~$ gdbserver target_machine --attach PID
- If the target application is not already running, the name of the binary may be included as an argument to the gdbserver program call.
   developer@ldc:~$ gdbserver target_machine target_program [ARGS]
This establishes a gdbserver port on the target machine that listens for incoming connections from GDB on the development machine. In debug terminology, gdbserver is "attached" to the process ID of the program being debugged. In reality, though, GDB is attached to the process ID of a proxy which passes the messages to and from the remote device under test.
The next step is to run GDB on the development machine using target_program.
Development Machine
1. First, cd to the directory where the target executable is stored.
2. Run the EMAC OE SDK GDB:
   developer@ldc:~$ /path/to/sdk/EMAC-OE-arm-linux-gnueabi-SDK_4.0/gcc-4.2.4-arm-linux-gnueabi/bin/arm-linux-gnueabi-gdb target_program
3. Run the following commands in GDB to prepare for the debug session:
   (gdb) target remote target_machine
Note that the location of GDB in the toolchain may differ from what is shown above depending on which version of the SDK is used.
If the gdb executable, or any other executable you run, is located in a directory which is in the PATH environment variable, you can simply run that command without the long path prefix. Sourcing the environment variables generated by the EMAC script for this will provide you with such a path. The script which creates the file to source also creates a symbolic link for arm-linux-gnueabi-gdb called, simply, gdb. With your shell environment set up this way, you could simply execute:
developer@ldc:~$ gdb target_program
Sample GDB Session
This example GDB session uses the EMAC OE SDK example project named pthread_demo. It consists of the single source file pthread_demo.c. The program is called with a single integer argument indicating how many reader threads the user wishes to create. The following describes the tasks of the main thread:
1. The main thread performs user input validation. It prints a usage message according to the argument passed to it on the command line. The function expects the user to pass a number indicating how many threads should be spawned.
2. The main thread initiates a new thread which uses the generator() function to perform the following tasks:
   2.1. Checks to see if the number of reader threads matches the number of times a reader thread has acquired the mutex lock and performed its task. If the two values do match, then the generator thread unlocks the mutex, breaks out of the while loop and moves on to line 167 to gracefully exit. If the two values do not match, then the generator thread continues through the rest of the while loop described in steps 2.2 and 2.3.
   2.2. Generates random data to be stored in the data struct shared by all the threads. To do this, it protects the data struct with the use of a mutex variable.
   2.3. Sleeps after giving up its lock on the mutex so that another thread might have a chance to acquire the lock.
3. After creating the generator thread, the main thread iteratively creates as many reader threads as indicated by the single integer argument. Each reader thread performs the following tasks:
   3.1. Waits for a chance to acquire the mutex lock. Once the mutex lock is acquired, it prints the value of the random number generated by the generator thread in its last run.
   3.2. Increments an integer in the data struct to indicate that it has completed its task.
   3.3. Gives up its lock on the mutex and exits.
4. After creating the prescribed number of reader threads, the main thread then waits for each thread created to exit gracefully.
5. The main thread exits.
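The thread structure described above can be sketched in a few lines of C. This is not the SDK's pthread_demo.c; it is a minimal illustration assuming a shared struct guarded by one mutex. The names run_demo, readers_done, and value are hypothetical and chosen only for this sketch, and error handling for pthread_create is omitted for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX_THREAD 100

/* Shared state; field names other than num_threads are illustrative. */
struct thread_data {
    pthread_mutex_t lock;
    int num_threads;   /* number of reader threads requested */
    int readers_done;  /* readers that have completed their task */
    int value;         /* last random number stored by the generator */
};

/* Generator: refresh the shared value until every reader has run. */
static void *generator(void *arg)
{
    struct thread_data *data = arg;
    struct timespec nap = { 0, 1000000L };  /* 1 ms */

    for (;;) {
        pthread_mutex_lock(&data->lock);
        if (data->readers_done == data->num_threads) {
            pthread_mutex_unlock(&data->lock);
            break;
        }
        data->value = rand();
        pthread_mutex_unlock(&data->lock);
        nanosleep(&nap, NULL);  /* give other threads a chance at the lock */
    }
    return NULL;
}

/* Reader: print the current value once, mark itself done, and exit. */
static void *reader(void *arg)
{
    struct thread_data *data = arg;

    pthread_mutex_lock(&data->lock);
    printf("reader saw %d\n", data->value);
    data->readers_done++;
    pthread_mutex_unlock(&data->lock);
    return NULL;
}

/* Hypothetical driver: returns the number of readers that ran,
 * or -1 if num_threads is outside 1..MAX_THREAD. */
int run_demo(int num_threads)
{
    struct thread_data data = { .num_threads = num_threads };
    pthread_t gen, readers[MAX_THREAD];
    int i;

    if (num_threads < 1 || num_threads > MAX_THREAD)
        return -1;

    pthread_mutex_init(&data.lock, NULL);
    pthread_create(&gen, NULL, generator, &data);
    for (i = 0; i < num_threads; i++)
        pthread_create(&readers[i], NULL, reader, &data);

    /* Wait for every thread to exit gracefully. */
    for (i = 0; i < num_threads; i++)
        pthread_join(readers[i], NULL);
    pthread_join(gen, NULL);

    pthread_mutex_destroy(&data.lock);
    return data.readers_done;
}
```

Built with the -pthread compiler option, a call such as run_demo(3) should start one generator and three readers and return 3 once all of them have been joined.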
The SDK version of pthread_demo.c works according to the description above with a MAX_THREAD value of 100. However, for the purpose of this example debug session it is instructive to use a faulty version of the same program. Replace lines 75-80 in pthread_demo.c with the code snippet shown in Listing 1 below.
if ((data.num_threads < 1) || (data.num_threads < MAX_THREAD)) {
        fprintf(stderr,
                "The number of thread should between 1 and %d\n",
                MAX_THREAD);
        exit(EXIT_FAILURE);
}
Useful GDB Commands
The following is a brief description of some essential GDB commands. Each description is followed by a link to the official GDB documentation page that has more specific information about what the command does and how to use it. Please note that the official GDB documentation targets the latest GDB release, which at the time of writing is 7.4. The version of GDB that EMAC distributes with the OE products, however, is 6.8, so the linked documentation may differ slightly from the behavior of the distributed debugger. The biggest difference between the two versions of GDB is in the support for debugging programs with multiple threads, and this is reflected in the documentation as well. For this reason, EMAC has set up ftp access to GDB 6.8 documentation on its web server. It is highly recommended that the GDB 6.8 documentation be referenced in cases where the program does not seem to support commands or options specified in the current official documentation.
Command | Description |
---|---|
start/run | These commands are used to start the debugged program, with the only difference being that start automatically pauses execution at the beginning of the program's main function whereas run must be told explicitly where to pause using the break command listed below. See also [Debugging with GDB, Section 4.2: Starting your Program] |
kill | Used to kill the currently-running instance of target_program. See also [Debugging with GDB, Section 4.9: Killing the Child Process] |
print | Used to print the value of an expression. See also [Debugging with GDB, Section 10: Examining Data] |
list | List contents of function or specified line. See also [Debugging with GDB, Section 9: Examining Source Files] |
layout | This is a TUI (Text User Interface) command that enables the programmer to view multiple debug views at once including source code, assembly, and registers. See also [Debugging with GDB, Section 25.4: TUI Commands] |
disassemble | This command allows the programmer to see assembler instructions. See also [Debugging with GDB, Section 9.6: Source and Machine Code] |
break | This command specifies a function name, line number, or instruction at which GDB is to pause execution. See also [Debugging with GDB, Section 5.1: Breakpoints] |
next/nexti, step/stepi | Allow the programmer to step through a program without specifying breakpoints. The next/nexti commands step over function calls, stopping on the next line of the same stack frame; step/stepi step into function calls, stopping on the first line in the next stack frame. The difference between step/next and stepi/nexti is that the i indicates instruction-by-instruction stepping at the assembly language level. See also [Debugging with GDB, Section 5.2: Continuing and Stepping] |
continue | Used to continue program execution from the address where it was last stopped. See the Debugging with GDB link for |
bt | Short for "backtrace," which displays to the programmer a brief summary of execution up to the current point in the program. This is useful because it shows a nested list of stack frames starting with the current one. See also [Debugging with GDB, Section 8.2: Backtrace] |
quit | This will quit the debugging session and return you to the shell. The Control-D key combination is another way to accomplish this. |
Session Walk-through
This debug session walk-through assumes that the program has been compiled using the modified source code above and that both the target machine and the development machine have been set up according to the above Setup section. The walk-through is divided into multiple “lessons” with the intent of first introducing the use of the commands described above and then actually running GDB to debug a known programming problem. Each lesson may be run independently of the others, but it is recommended that each be run in order starting from Lesson 1 for the first time through.
This lesson assumes that gdbserver has been run as in the [Target Machine Setup] section above with an ARG value of 3. Other values are fine so long as they fall within the range of 1 to 100. The number '3' was arbitrarily chosen to avoid having to use a symbolic variable in the explanations below.
1. Type b main to set a breakpoint at the main function in the source code.
2. Type continue. This will cause the program to continue from the breakpoint set by GDB at startup. The program was passed an argument of 3, indicating that three threads should be created.
3. Type b 73 to set a breakpoint at line 73 in the source code, which should be the line containing data.num_threads = atoi(argv[1]);
4. Type continue. The program will continue execution up until line 73 in the source code. At this point, type layout split to view a split screen containing both the source code and the assembly-level machine instructions. Both screens show the program's current location in execution. The assembly-level display shows what the target's processor is actually executing at that point in the source code as shown in the source-level display. To view either of these without the other, type layout asm for just assembly-level and layout src for just source-level.
5. Type nexti. This will cause the program to execute the next instruction in the current stack frame, which is a mov instruction beginning to prepare the current stack for a call to the library function atoi(). The details of this process are beyond the scope of this tutorial; essentially, the program needs to store information about the current execution location in the stack for when the atoi() function finishes. Type ni (alias for nexti) three more times. You should end up on a bl instruction in the assembly view as shown in Listing 2 below. The source layout should still show the program on line 73.
B+ |0x887c <main+112> ldr r3, [r11, #-84] │
   |0x8880 <main+116> add r3, r3, #4 ; 0x4 │
   |0x8884 <main+120> ldr r3, [r3] │
   |0x8888 <main+124> mov r0, r3 │
  >|0x888c <main+128> bl 0x86e0 <atoi> │
Listing 2. GDB Assembly Layout - Note that the assembly may look different depending on the target architecture.
6. Type stepi. This will cause the program to move into the next stack frame and GDB to show the assembly-level instructions of the atoi() call. Since the library containing atoi() was likely not compiled with debug symbols, the source-level layout will show the message [ No Source Available ].
7. Type bt. This will cause GDB to display a human-readable version of the current stack. Each stack "frame" is represented by the name of the function call it represents with that function's location in memory. Type bt full to get a list of the variables local to each stack frame.
8. Type finish. This will cause the current stack frame to return and execution to pause on the next instruction of the previous stack frame.
9. Type kill. This will cause the current process to be killed by gdbserver at the target machine. gdbserver will also terminate at this point. In order to start a new remote debug session, start gdbserver as described in the Target Machine Setup section and re-run step 3 of the [Development Machine Setup] section.
Note: b is an alias for the break command.
ni is an alias for nexti.
Lesson 2: Finding the Bug
Though this sample is contrived, it is still useful to demonstrate how to find a design mistake in an otherwise well-written (no errors or warnings) program. These types of mistakes typically have to do with array boundary miscalculations, logic and comparison operator mistakes, or other simple mistakes. For the sake of demonstration, assume that the actual mistake is unknown. This lesson assumes that gdbserver has just been started as in the [Target Machine] Setup above with an ARG value of 5.
1. Before starting the program in the debugger again, run it by itself on the target machine to see what the actual program output is:
   root@emac-oe:~# /tmp/pthread_demo 5
   The number of threads should be between 1 and 100
   The program was given an input of '5' yet the output message seems to indicate that this is out of range, which is obviously not true.
2. Start the debugger again and connect to the target machine as described in the Setup section.
3. Type b main to set a breakpoint at the main function in the source code.
4. Type continue. This will cause the program to continue from the breakpoint set by GDB at startup.
5. Type n. This will cause the program to step over the next line of source code. The reason for using n rather than s or one of the instruction stepping commands is that the erroneous output indicates that the coding mistake is in the programmer's source code rather than in the C library functions atoi() or fprintf(). Stepping over the function will save all the time required to step through every detail of what the library functions are doing. Later passes through the code can be used to step into functions called from within that stack frame if the first pass proves unsuccessful.
6. Continue to type n until one of the program's exit() calls is reached, but do not actually step into that exit() call. Judging by the program's output above, this should bring you to the conditional block that checks the value of the local variable n used to store the output of atoi() as shown in Listing 3. Note that once execution reaches line 79 of the source code, GDB will display the output of the fprintf() function from line 76. This may cause display problems within the text-based UI library that GDB uses, which will require the command refresh to fix.
B+ |75 if ((data.num_threads < 1) || (data.num_threads < MAX_THREAD)) { |
   |76 fprintf(stderr, |
   |77 "The number of thread should between 1 and %d\n", |
   |78 MAX_THREAD); |
  >|79 exit(EXIT_FAILURE); |
   |80 } |
7. Type p/d data->num_threads. p is an alias for print, /d tells GDB to treat the expression requested as an integer in signed decimal, and data->num_threads is the element num_threads within struct thread_data. This should provide the following output:
   (gdb) p/d data->num_threads
   $6 = 5
   Note that the integer part of $6 will increment with each call to the gdb command print. The above output confirms that the argument '5' was successfully passed to the program and read into a variable to be tested, indicating that one of the logical tests for the current conditional block contains a mistake. This merits a closer look at line 75:
B+ |75 if ((data.num_threads < 1) || (data.num_threads < MAX_THREAD)) { |
   Line 75 consists of a conditional test which is the logical OR of two comparisons involving the values of data.num_threads, 1, and MAX_THREAD. The first tests whether the input integer is less than 1: (data.num_threads < 1). The second tests whether the input integer is less than the symbolic constant MAX_THREAD: (data.num_threads < MAX_THREAD). Judging by the name of this constant and the result of the test (the second comparison must be the one that resolves to true, because the value of data.num_threads in this case is not less than one), we can see that the comparison operator used is the culprit. It should be '>' rather than '<'.
8. Type kill.
This was a simple problem to solve but the method used above could apply in any situation where source code compiles and runs without errors yet provides varied or unexpected output.
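For reference, the difference between the faulty comparison from Listing 1 and the corrected one can be sketched as two small C predicates. The function names rejects_faulty and rejects_fixed are hypothetical and exist only for this illustration; MAX_THREAD is assumed to be 100, as in the SDK version of pthread_demo.c.

```c
#include <stdbool.h>

#define MAX_THREAD 100  /* assumed to match the SDK's pthread_demo.c */

/* Faulty test from Listing 1: returns true (input rejected) for every
 * value below MAX_THREAD, including valid ones such as 5. */
bool rejects_faulty(int num_threads)
{
    return (num_threads < 1) || (num_threads < MAX_THREAD);
}

/* Corrected test: returns true (input rejected) only when the value
 * falls outside the range 1..MAX_THREAD. */
bool rejects_fixed(int num_threads)
{
    return (num_threads < 1) || (num_threads > MAX_THREAD);
}
```

With an input of 5, rejects_faulty() returns true, which matches the unexpected error message seen on the target, while rejects_fixed() returns false and would let the program continue.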