If you’re developing C/C++ on embedded devices, you might already have stumbled upon a corrupt stacktrace like this when trying to debug with gdb:
(gdb) bt
#0  0xb38e32c4 in pthread_getname_np () from /home/enrique/buildroot/output5/staging/lib/libpthread.so.0
#1  0xb38e103c in __lll_timedlock_wait () from /home/enrique/buildroot/output5/staging/lib/libpthread.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
In these cases I usually give up on gdb and try to solve my problems by adding printf()s and resorting to other tools. However, there are times when you really, really need to know what is in that cursed stack.
On ARM devices, subroutine calls work by setting the return address in the Link Register (LR), so the subroutine knows where to point the Program Counter (PC) register to when it returns. Once inside the subroutine, the value of the LR register is saved on the stack (to be restored later, right before the current subroutine returns to the caller), so the register can be used for other tasks (LR is a “scratch register”). This means that the functions in the backtrace are actually there, in the stack, in the form of older saved LRs, waiting for us to get them.
So, the first step is to dump the memory contents of the stack, starting from the address pointed to by the Stack Pointer (SP). Let’s print the first 256 32-bit words and save them to a file from gdb:
(gdb) set logging overwrite on
(gdb) set logging file /tmp/bt.txt
(gdb) set logging on
Copying output to /tmp/bt.txt.
(gdb) x/256wa $sp
0xbe9772b0: 0x821e 0xb38e103d 0x1aef48 0xb1973df0
0xbe9772c0: 0x73d 0xb38dc51f 0x0 0x1
0xbe9772d0: 0x191d58 0x191da4 0x19f200 0xb31ae5ed
...
0xbe977560: 0xb28c6000 0xbe9776b4 0x5 0x10871 <main(int, char**)>
0xbe977570: 0xb6f93000 0xaaaaaaab 0xaf85fd4a 0xa36dbc17
0xbe977580: 0x130 0x0 0x109b9 <__libc_csu_init> 0x0
...
0xbe977690: 0x0 0x0 0x108cd <_start> 0x0
0xbe9776a0: 0x0 0x108ed <_start+32> 0x10a19 <__libc_csu_fini> 0xb6f76969
(gdb) set logging off
Done logging to /tmp/bt.txt.
Gdb can already name some of the functions (like main()), but not all of them. At least not the ones most interesting for our purpose. We’ll have to look for them by hand.
We first get the memory page mapping of the process (WebKit’s WebProcess in my case) by looking in /proc/pid/maps. I’m retrieving it from the device (named metro) via ssh and saving it to a local file. I’m only interested in the code pages, those with executable (‘x’) permissions:
$ ssh metro 'cat /proc/$(ps axu | grep WebProcess | grep -v grep | { read _ P _ ; echo $P ; })/maps | grep " r.x. "' > /tmp/maps.txt
The file looks like this:
00010000-00011000 r-xp 00000000 103:04 2617 /usr/bin/WPEWebProcess
...
b54f2000-b6e1e000 r-xp 00000000 103:04 1963 /usr/lib/libWPEWebKit-0.1.so.2.2.1
b6f6b000-b6f82000 r-xp 00000000 00:02 816 /lib/ld-2.24.so
be957000-be978000 rwxp 00000000 00:00 0 [stack]
be979000-be97a000 r-xp 00000000 00:00 0 [sigpage]
be97b000-be97c000 r-xp 00000000 00:00 0 [vdso]
ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors]
Now we process the backtrace to remove address markers and have one word per line:
$ cat /tmp/bt.txt | sed -e 's/^[^:]*://' -e 's/[<][^>]*[>]//g' | while read A B C D; do echo $A; echo $B; echo $C; echo $D; done | sed 's/^0x//' | while read P; do printf '%08x\n' "$((16#"$P"))"; done | sponge /tmp/bt.txt
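For readers who find the one-liner hard to follow, the same normalization can be sketched as a bash function. This is only an illustrative rewrite, not the script used above: the name normalize_dump is made up for this example, it writes to stdout instead of overwriting the file, and it skips any token that is not a plain 0x word (such as leftovers from symbol names that contain nested angle brackets).

```bash
#!/usr/bin/env bash
# normalize_dump (hypothetical helper name): read the output of
# `x/256wa $sp` on stdin and print one zero-padded 8-digit hex word per line.
normalize_dump() {
    local line word
    local -a words
    while IFS= read -r line; do
        # Only lines shaped like "0xADDR: word word ..." carry stack data.
        [[ $line =~ ^0x[0-9a-fA-F]+: ]] || continue
        line=${line#*:}                          # drop the address marker
        line=$(sed 's/<[^>]*>//g' <<<"$line")    # drop <symbol> annotations
        read -ra words <<<"$line"
        for word in "${words[@]}"; do
            # Ignore anything that is not a bare hex word.
            [[ $word =~ ^0x[0-9a-fA-F]+$ ]] || continue
            printf '%08x\n' "$((16#${word#0x}))"
        done
    done
}
```

It could be used as `normalize_dump < /tmp/bt.txt | sponge /tmp/bt.txt` to get the same in-place effect as the pipeline above.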
Then merge and sort both files, so the addresses in the stack appear below their corresponding mappings:
$ cat /tmp/maps.txt /tmp/bt.txt | sort > /tmp/merged.txt
Now we process the resulting file to get each address in the stack with its corresponding mapping:
$ cat /tmp/merged.txt | while read LINE; do if [[ $LINE =~ - ]]; then MAPPING="$LINE"; else echo $LINE '-->' $MAPPING; fi; done | grep '/' | sed -E -e 's/([0-9a-f][0-9a-f]*)-([0-9a-f][0-9a-f]*)/\1 - \2/' > /tmp/mapped.txt
Like this (address in the stack, page start (or base), page end, page permissions, executable file load offset (base offset), etc.):
0001034c --> 00010000 - 00011000 r-xp 00000000 103:04 2617 /usr/bin/WPEWebProcess
...
b550bfa4 --> b54f2000 - b6e1e000 r-xp 00000000 103:04 1963 /usr/lib/libWPEWebKit-0.1.so.2.2.1
b5937445 --> b54f2000 - b6e1e000 r-xp 00000000 103:04 1963 /usr/lib/libWPEWebKit-0.1.so.2.2.1
b5fb0319 --> b54f2000 - b6e1e000 r-xp 00000000 103:04 1963 /usr/lib/libWPEWebKit-0.1.so.2.2.1
...
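The sort-and-merge approach pairs each stack word with the mapping that sorts just before it, without checking the end of the range. As an alternative sketch (not the method used in this post), the lookup can be done with an explicit range check. The function name find_mapping is hypothetical, and the maps file is assumed to be in the /proc/pid/maps format shown earlier.

```bash
#!/usr/bin/env bash
# find_mapping ADDR MAPS_FILE  (hypothetical helper)
# ADDR is hex without the 0x prefix. Prints the mapping whose
# [start, end) range contains ADDR, in the "addr --> start - end ..." layout.
# Returns 1 if no mapping contains the address.
find_mapping() {
    local addr=$((16#$1)) maps=$2
    local range rest start end
    while read -r range rest; do
        start=${range%-*}
        end=${range#*-}
        if (( addr >= 16#$start && addr < 16#$end )); then
            printf '%s --> %s - %s %s\n' "$1" "$start" "$end" "$rest"
            return 0
        fi
    done < "$maps"
    return 1
}
```

Calling it once per line of /tmp/bt.txt, for example `while read A; do find_mapping "$A" /tmp/maps.txt; done < /tmp/bt.txt`, also keeps the words in stack order, at the cost of rescanning the maps file for every word.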
The addr2line tool can give us the exact function an address belongs to, or even the function and source code line if the code has been built with symbols. But the addresses addr2line understands are internal offsets, not absolute memory addresses. We can convert the addresses in the stack to offsets with this expression:
offset = address - page start + base offset
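As a worked example of this expression, here is a small sketch using one of the libWPEWebKit addresses from the listing above (address b5937445, page start b54f2000, base offset 00000000). The function name to_offset is only illustrative.

```bash
#!/usr/bin/env bash
# to_offset ADDR PAGE_START BASE_OFFSET  (all hex, without the 0x prefix)
# Implements: offset = address - page start + base offset
to_offset() {
    printf '%08x\n' "$(( 0x$1 - 0x$2 + 0x$3 ))"
}

to_offset b5937445 b54f2000 00000000   # prints 00445445
```

So the stack word b5937445 corresponds to offset 00445445 inside the library file, which is the value to hand to addr2line.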
I’m using buildroot as my cross-build environment, so I need to pick the library files from the staging directory because those are the unstripped versions. The addr2line tool is the one from the buildroot cross-compiling toolchain. Written as a script:
$ cat /tmp/mapped.txt | while read ADDR _ BASE _ END _ BASEOFFSET _ _ FILE; do OFFSET=$(printf "%08x\n" $((0x$ADDR - 0x$BASE + 0x$BASEOFFSET))); FILE=~/buildroot/output/staging/$FILE; if [[ -f $FILE ]]; then LINE=$(~/buildroot/output/host/usr/bin/arm-buildroot-linux-gnueabihf-addr2line -p -f -C -e $FILE $OFFSET); echo "$ADDR $LINE"; fi; done > /tmp/addr2line.txt
Finally, we filter out the useless [??] entries:
$ cat /tmp/bt.txt | while read DATA; do cat /tmp/addr2line.txt | grep "$DATA"; done | grep -v '[?][?]' > /tmp/fullbt.txt
What remains is something very similar to what the real backtrace should have been if everything had originally worked as it should in gdb:
b31ae5ed gst_pad_send_event_unchecked en /home/enrique/buildroot/output5/build/gstreamer1-1.10.4/gst/gstpad.c:5571
b31a46c1 gst_debug_log en /home/enrique/buildroot/output5/build/gstreamer1-1.10.4/gst/gstinfo.c:444
b31b7ead gst_pad_send_event en /home/enrique/buildroot/output5/build/gstreamer1-1.10.4/gst/gstpad.c:5775
b666250d WebCore::AppendPipeline::injectProtectionEventIfPending() en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WebCore/platform/graphics/gstreamer/mse/AppendPipeline.cpp:1360
b657b411 WTF::GRefPtr<_GstEvent>::~GRefPtr() en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/DerivedSources/ForwardingHeaders/wtf/glib/GRefPtr.h:76
b5fb0319 WebCore::HTMLMediaElement::pendingActionTimerFired() en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WebCore/html/HTMLMediaElement.cpp:1179
b61a524d WebCore::ThreadTimers::sharedTimerFiredInternal() en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WebCore/platform/ThreadTimers.cpp:120
b61a5291 WTF::Function<void ()>::CallableWrapper<WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::{lambda()#1}>::call() en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/DerivedSources/ForwardingHeaders/wtf/Function.h:101
b6c809a3 operator() en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WTF/wtf/glib/RunLoopGLib.cpp:171
b6c80991 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::{lambda(void*)#1}::_FUN(void*) en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WTF/wtf/glib/RunLoopGLib.cpp:164
b6c80991 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::{lambda(void*)#1}::_FUN(void*) en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WTF/wtf/glib/RunLoopGLib.cpp:164
b2ad4223 g_main_context_dispatch en :?
b6c80601 WTF::{lambda(_GSource*, int (*)(void*), void*)#1}::_FUN(_GSource*, int (*)(void*), void*) en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WTF/wtf/glib/RunLoopGLib.cpp:40
b6c80991 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::{lambda(void*)#1}::_FUN(void*) en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WTF/wtf/glib/RunLoopGLib.cpp:164
b6c80991 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::{lambda(void*)#1}::_FUN(void*) en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WTF/wtf/glib/RunLoopGLib.cpp:164
b2adfc49 g_poll en :?
b2ad44b7 g_main_context_iterate.isra.29 en :?
b2ad477d g_main_loop_run en :?
b6c80de3 WTF::RunLoop::run() en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WTF/wtf/glib/RunLoopGLib.cpp:97
b6c654ed WTF::RunLoop::dispatch(WTF::Function<void ()>&&) en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WTF/wtf/RunLoop.cpp:128
b5937445 int WebKit::ChildProcessMain<WebKit::WebProcess, WebKit::WebProcessMain>(int, char**) en /home/enrique/buildroot/output5/build/wpewebkit-custom/build-Release/../Source/WebKit/Shared/unix/ChildProcessMain.h:64
b27b2978 __bss_start en :?
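The final step, which puts the resolved addresses back in stack order and drops the ?? entries, could also be sketched with a bash associative array instead of one grep per stack word. This is an assumption-laden variant, not the command used above: restore_stack_order is a made-up name, it needs bash 4 or newer, it matches on the first column only, and it prints each stack word once per occurrence, so the output may differ slightly from the grep version when addr2line.txt contains repeated lines.

```bash
#!/usr/bin/env bash
# restore_stack_order WORDS_FILE RESOLVED_FILE  (hypothetical helper)
# WORDS_FILE:    one hex word per line, in stack order (like /tmp/bt.txt)
# RESOLVED_FILE: "ADDR description" lines (like /tmp/addr2line.txt)
restore_stack_order() {
    local words_file=$1 resolved_file=$2
    local -A resolved=()
    local addr rest
    # Index the resolved entries by address, skipping unresolved (??) ones.
    while read -r addr rest; do
        [[ -z $addr || $rest == *'??'* ]] && continue
        resolved[$addr]=$rest
    done < "$resolved_file"
    # Walk the stack words in their original order and print the known ones.
    while read -r addr; do
        if [[ -n $addr && -n ${resolved[$addr]+set} ]]; then
            printf '%s %s\n' "$addr" "${resolved[$addr]}"
        fi
    done < "$words_file"
}
```

With the files from this post it would be called as `restore_stack_order /tmp/bt.txt /tmp/addr2line.txt > /tmp/fullbt.txt`.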
I hope you find this trick useful and the scripts handy in case you ever have to resort to examining the raw stack to get a meaningful backtrace.
Happy debugging!
Excellent! I’ve run into this several times. It would be great if gdb could handle this, or if there were a tool/script to do it easily.
Thank you! This was very useful! It helped me debug a case with a coredump.
Hello,
I get below error,
-bash: 16#"24dad": syntax error: invalid arithmetic operator (error token is ""24dad"")
when executing this command,
cat /tmp/bt.txt | sed -e 's/^[^:]*://' -e 's/[<][^>]*[>]//g' | while read A B C D; do echo $A; echo $B; echo $C; echo $D; done | sed 's/^0x//' | while read P; do printf '%08x\n' "$((16#"$P"))"; done | sponge /tmp/bt.txt
Can you please help to overcome this error?
Not sure if I posted my previous answer as a reply or not. Just check the list of comments to see it.
Replacing "$((16#"$P"))" with "$((16#$P))" worked for me in zsh. Maybe the same applies to you if it happens in an older bash version. bash 5.2.26 was happy with this code here.
It works for me here. I’ve split the processing of bt.txt in multiple steps. Please check every step in your case (using my example bt.txt) and ensure that there’s no format change in the generated output caused by any weird LANG or LOCALE settings.
This is too much unformatted text for a comment, so I’ve posted everything in this pastebin: https://pastebin.com/W9vgnvu8
Great stuff!
Another way to find the library address mappings, especially useful if the process is already dead (for example, when dealing with a coredump), is to type the command “info proc mappings” in Gdb.
I had no idea. That’s really cool!