{"id":553,"date":"2020-10-16T19:07:04","date_gmt":"2020-10-16T18:07:04","guid":{"rendered":"http:\/\/eocanha.org\/blog\/?p=553"},"modified":"2020-10-16T19:07:04","modified_gmt":"2020-10-16T18:07:04","slug":"figuring-out-corrupt-stacktraces-on-arm","status":"publish","type":"post","link":"https:\/\/eocanha.org\/blog\/2020\/10\/16\/figuring-out-corrupt-stacktraces-on-arm\/","title":{"rendered":"Figuring out corrupt stacktraces on ARM"},"content":{"rendered":"\n<p>If you&#8217;re developing C\/C++ on embedded devices, you might already have stumbled upon a corrupt stacktrace like this when trying to debug with gdb:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">(gdb) bt \n#0 \u00a00xb38e32c4 in pthread_getname_np () from \/home\/enrique\/buildroot\/output5\/staging\/lib\/libpthread.so.0\n#1 \u00a00xb38e103c in __lll_timedlock_wait () from \/home\/enrique\/buildroot\/output5\/staging\/lib\/libpthread.so.0 \nBacktrace stopped: previous frame identical to this frame (corrupt stack?)<\/pre>\n\n\n\n<p>In these cases I usually give up on gdb and try to solve my problems by adding printf()s and resorting to other tools. However, there are times when you really, really need to know what is in that cursed stack.<\/p>\n\n\n\n<p>On ARM devices, <a href=\"https:\/\/developer.arm.com\/documentation\/ihi0042\/j\/?lang=en#subroutine-calls\">subroutine calls<\/a> work by setting the return address in the Link Register (LR), so the subroutine knows where to point the Program Counter (PC) register when it returns. While not jumping into subroutines, the value of the LR register is saved in the stack (to be restored later, right before the current subroutine returns to the caller) and the register can be used for other tasks (LR is a &#8220;scratch register&#8221;). 
This means that the functions in the backtrace are actually there, in the stack, in the form of older saved LRs, waiting for us to get them.<\/p>\n\n\n\n<p>So, the first step would be to dump the memory contents of the stack, starting from the address pointed to by the Stack Pointer (SP). Let&#8217;s print the first 256 32-bit words and save them as a file from gdb:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">(gdb) set logging overwrite on\n(gdb) set logging file \/tmp\/bt.txt\n(gdb) set logging on\nCopying output to \/tmp\/bt.txt.\n(gdb) x\/256wa $sp\n0xbe9772b0:     0x821e \u00a00xb38e103d   0x1aef48\u00a0\u00a0\u00a00xb1973df0\n0xbe9772c0:      0x73d \u00a00xb38dc51f \u00a0\u00a0  \u00a0\u00a0\u00a00x0 \u00a0\u00a0\u00a0\u00a0     0x1\n0xbe9772d0:   0x191d58   \u00a00x191da4   0x19f200 \u00a0\u00a00xb31ae5ed\n...\n0xbe977560: 0xb28c6000 \u00a00xbe9776b4   \u00a0\u00a0\u00a0\u00a0\u00a00x5 \u00a0\u00a0\u00a0 \u00a00x10871 &lt;main(int, char**)>\n0xbe977570: 0xb6f93000\u00a0\u00a00xaaaaaaab\u00a00xaf85fd4a \u00a0\u00a00xa36dbc17\n0xbe977580: \u00a0 \u00a0\u00a0\u00a00x130 \u00a0      \u00a00x0 \u00a0\u00a0\u00a00x109b9 &lt;__libc_csu_init> 0x0\n...\n0xbe977690: \u00a0\u00a0\u00a0\u00a0   0x0 \u00a0\u00a0\u00a0\u00a0    0x0 \u00a0\u00a0\u00a00x108cd &lt;_start> \u00a00x0\n0xbe9776a0: \u00a0\u00a0   \u00a0\u00a00x0 \u00a0\u00a0\u00a0\u00a00x108ed &lt;_start+32>  0x10a19 &lt;__libc_csu_fini> 0xb6f76969  \n(gdb) set logging off\nDone logging to \/tmp\/bt.txt.<\/pre>\n\n\n\n<p>Gdb can already name some of the functions (like <code>main()<\/code>), but not all of them, and certainly not the most interesting ones for our purpose. We&#8217;ll have to look for them by hand.<\/p>\n\n\n\n<p>We first get the memory page mappings of the process (WebKit&#8217;s WebProcess in my case) by looking in \/proc\/<em>pid<\/em>\/maps. I&#8217;m retrieving it from the device (named <code>metro<\/code>) via ssh and saving it to a local file. 
I&#8217;m only interested in the code pages, those with executable (&#8216;x&#8217;) permissions:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ ssh metro 'cat \/proc\/$(ps axu | grep WebProcess | grep -v grep | { read _ P _ ; echo $P ; })\/maps | grep \" r.x. \"' > \/tmp\/maps.txt<\/pre>\n\n\n\n<p>The file looks like this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">00010000-00011000 r-xp 00000000 103:04 2617 \u00a0\u00a0\u00a0\u00a0\u00a0\/usr\/bin\/WPEWebProcess\n...\nb54f2000-b6e1e000 r-xp 00000000 103:04 1963 \u00a0\u00a0\u00a0\u00a0\u00a0\/usr\/lib\/libWPEWebKit-0.1.so.2.2.1 \nb6f6b000-b6f82000 r-xp 00000000 00:02 816 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\/lib\/ld-2.24.so \nbe957000-be978000 rwxp 00000000 00:00 0 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[stack] \nbe979000-be97a000 r-xp 00000000 00:00 0 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[sigpage] \nbe97b000-be97c000 r-xp 00000000 00:00 0 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[vdso] \nffff0000-ffff1000 r-xp 00000000 00:00 0 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0[vectors]<\/pre>\n\n\n\n<p>Now we process the backtrace to remove address markers and have one word per line:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ cat \/tmp\/bt.txt | sed -e 's\/^[^:]*:\/\/' -e 's\/[&lt;][^>]*[>]\/\/g' | while read A B C D; do echo $A; echo $B; echo $C; echo $D; done | sed 's\/^0x\/\/' | while read P; do printf '%08x\\n' \"$((16#\"$P\"))\"; done | sponge \/tmp\/bt.txt<\/pre>\n\n\n\n<p>Then merge and sort both files, so the addresses in the stack appear below their corresponding mappings:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ cat \/tmp\/maps.txt \/tmp\/bt.txt | sort > \/tmp\/merged.txt<\/pre>\n\n\n\n<p>Now we process the resulting file to get each address in the stack with its corresponding mapping:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ cat \/tmp\/merged.txt | while read LINE; do if [[ $LINE =~ - ]]; then 
MAPPING=\"$LINE\"; else echo $LINE '-->' $MAPPING; fi; done | grep '\/' | sed -E -e 's\/([0-9a-f][0-9a-f]*)-([0-9a-f][0-9a-f]*)\/\\1 - \\2\/' > \/tmp\/mapped.txt<\/pre>\n\n\n\n<p>The result looks like this (address in the stack, page start (or base), page end, page permissions, executable file load offset (base offset), etc.):<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">0001034c --> 00010000 - 00011000 r-xp 00000000 103:04 2617 \/usr\/bin\/WPEWebProcess\n...\nb550bfa4 --> b54f2000 - b6e1e000 r-xp 00000000 103:04 1963 \/usr\/lib\/libWPEWebKit-0.1.so.2.2.1 \nb5937445 --> b54f2000 - b6e1e000 r-xp 00000000 103:04 1963 \/usr\/lib\/libWPEWebKit-0.1.so.2.2.1 \nb5fb0319 --> b54f2000 - b6e1e000 r-xp 00000000 103:04 1963 \/usr\/lib\/libWPEWebKit-0.1.so.2.2.1\n...<\/pre>\n\n\n\n<p>The <code>addr2line<\/code> tool can give us the exact function an address belongs to, or even the function and source code line if the code has been built with symbols. But the addresses <code>addr2line<\/code> understands are internal offsets, not absolute memory addresses. We can convert the addresses in the stack to offsets with this expression:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">offset = address - page start + base offset<\/pre>\n\n\n\n<p>I&#8217;m using buildroot as my cross-build environment, so I need to pick the library files from the staging directory because those are the unstripped versions. The <code>addr2line<\/code> tool is the one from the buildroot cross-compiling toolchain. 
Written as a script:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ cat \/tmp\/mapped.txt | while read ADDR _ BASE _ END _ BASEOFFSET _ _ FILE; do OFFSET=$(printf \"%08x\\n\" $((0x$ADDR - 0x$BASE + 0x$BASEOFFSET))); FILE=~\/buildroot\/output\/staging\/$FILE; if [[ -f $FILE ]]; then LINE=$(~\/buildroot\/output\/host\/usr\/bin\/arm-buildroot-linux-gnueabihf-addr2line -p -f -C -e $FILE $OFFSET); echo \"$ADDR $LINE\"; fi; done > \/tmp\/addr2line.txt<\/pre>\n\n\n\n<p>Finally, we filter out the useless <code>[??]<\/code> entries:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ cat \/tmp\/bt.txt | while read DATA; do cat \/tmp\/addr2line.txt | grep \"$DATA\"; done | grep -v '[?][?]' > \/tmp\/fullbt.txt<\/pre>\n\n\n\n<p>What remains is something very similar to what the real backtrace should have been if everything had originally worked as it should in gdb:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">b31ae5ed gst_pad_send_event_unchecked en \/home\/enrique\/buildroot\/output5\/build\/gstreamer1-1.10.4\/gst\/gstpad.c:5571 \nb31a46c1 gst_debug_log en \/home\/enrique\/buildroot\/output5\/build\/gstreamer1-1.10.4\/gst\/gstinfo.c:444 \nb31b7ead gst_pad_send_event en \/home\/enrique\/buildroot\/output5\/build\/gstreamer1-1.10.4\/gst\/gstpad.c:5775 \nb666250d WebCore::AppendPipeline::injectProtectionEventIfPending() en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WebCore\/platform\/graphics\/gstreamer\/mse\/AppendPipeline.cpp:1360 \nb657b411 WTF::GRefPtr&lt;_GstEvent>::~GRefPtr() en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/DerivedSources\/ForwardingHeaders\/wtf\/glib\/GRefPtr.h:76 \nb5fb0319 WebCore::HTMLMediaElement::pendingActionTimerFired() en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WebCore\/html\/HTMLMediaElement.cpp:1179 \nb61a524d WebCore::ThreadTimers::sharedTimerFiredInternal() en 
\/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WebCore\/platform\/ThreadTimers.cpp:120 \nb61a5291 WTF::Function&lt;void ()>::CallableWrapper&lt;WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::{lambda()#1}>::call() en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/DerivedSources\/ForwardingHeaders\/wtf\/Function.h:101 \nb6c809a3 operator() en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WTF\/wtf\/glib\/RunLoopGLib.cpp:171 \nb6c80991 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&amp;)::{lambda(void*)#1}::_FUN(void*) en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WTF\/wtf\/glib\/RunLoopGLib.cpp:164 \nb6c80991 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&amp;)::{lambda(void*)#1}::_FUN(void*) en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WTF\/wtf\/glib\/RunLoopGLib.cpp:164 \nb2ad4223 g_main_context_dispatch en :? \nb6c80601 WTF::{lambda(_GSource*, int (*)(void*), void*)#1}::_FUN(_GSource*, int (*)(void*), void*) en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WTF\/wtf\/glib\/RunLoopGLib.cpp:40 \nb6c80991 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&amp;)::{lambda(void*)#1}::_FUN(void*) en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WTF\/wtf\/glib\/RunLoopGLib.cpp:164 \nb6c80991 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&amp;)::{lambda(void*)#1}::_FUN(void*) en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WTF\/wtf\/glib\/RunLoopGLib.cpp:164 \nb2adfc49 g_poll en :? \nb2ad44b7 g_main_context_iterate.isra.29 en :? \nb2ad477d g_main_loop_run en :? 
\nb6c80de3 WTF::RunLoop::run() en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WTF\/wtf\/glib\/RunLoopGLib.cpp:97 \nb6c654ed WTF::RunLoop::dispatch(WTF::Function&lt;void ()>&amp;&amp;) en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WTF\/wtf\/RunLoop.cpp:128 \nb5937445 int WebKit::ChildProcessMain&lt;WebKit::WebProcess, WebKit::WebProcessMain>(int, char**) en \/home\/enrique\/buildroot\/output5\/build\/wpewebkit-custom\/build-Release\/..\/Source\/WebKit\/Shared\/unix\/ChildProcessMain.h:64 \nb27b2978 __bss_start en :?<\/pre>\n\n\n\n<p>I hope you find this trick useful and the scripts handy in case you ever need to resort to examining the raw stack to get a meaningful backtrace.<\/p>\n\n\n\n<p>Happy debugging!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you&#8217;re developing C\/C++ on embedded devices, you might already have stumbled upon a corrupt stacktrace like this when trying to debug with gdb: (gdb) bt #0 \u00a00xb38e32c4 in pthread_getname_np () from \/home\/enrique\/buildroot\/output5\/staging\/lib\/libpthread.so.0 #1 \u00a00xb38e103c in __lll_timedlock_wait () from \/home\/enrique\/buildroot\/output5\/staging\/lib\/libpthread.so.0 Backtrace stopped: previous frame identical to this frame (corrupt stack?) 
In these cases I usually &hellip; <a href=\"https:\/\/eocanha.org\/blog\/2020\/10\/16\/figuring-out-corrupt-stacktraces-on-arm\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Figuring out corrupt stacktraces on ARM<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[19,15,16,17,20,14,18],"_links":{"self":[{"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/posts\/553"}],"collection":[{"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/comments?post=553"}],"version-history":[{"count":5,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/posts\/553\/revisions"}],"predecessor-version":[{"id":558,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/posts\/553\/revisions\/558"}],"wp:attachment":[{"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/media?parent=553"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/categories?post=553"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/tags?post=553"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}