GStreamer WebKit debugging by instrumenting source code (1/3)

This is the continuation of the GStreamer WebKit debugging tricks post series. In the next three posts, I’ll focus on what we can get by doing some little changes to the source code for debugging purposes (known as “instrumenting”), but before, you might want to check the previous posts of the series:

Know all the env vars read by a program by using LD_PRELOAD to intercept libc calls

// File getenv.c
// To compile: gcc -shared -Wall -fPIC -o getenv.so getenv.c -ldl
// To use: export LD_PRELOAD="./getenv.so", then run any program you want
// See http://www.catonmat.net/blog/simple-ld-preload-tutorial-part-2/

#define _GNU_SOURCE

#include <stdio.h>
#include <dlfcn.h>

// This function will take the place of the original getenv() in libc
char *getenv(const char *name) {
 printf("Calling getenv(\"%s\")\n", name);

 char *(*original_getenv)(const char*);
 original_getenv = dlsym(RTLD_NEXT, "getenv");

 return (*original_getenv)(name);
}

See the breakpoints with command example to know how to get the same using gdb. Check also Zan’s libpine for more features.

Track lifetime of GObjects by LD_PRELOADing gobject-list

The gobject-list project, written by Thibault Saunier, is a simple LD_PRELOAD library for tracking the lifetime of GObjects. When loaded into an application, it prints a list of living GObjects on exiting the application (unless the application crashes), and also prints reference count data when it changes. SIGUSR1 or SIGUSR2 can be sent to the application to trigger printing of more information.

Overriding the behaviour of a debugging macro

The usual debugging macros aren’t printing messages? Redefine them to make what you want:

#undef LOG_MEDIA_MESSAGE
#define LOG_MEDIA_MESSAGE(...) do { \
  printf("LOG %s: ", __PRETTY_FUNCTION__); \
  printf(__VA_ARGS__); \
  printf("\n"); \
  fflush(stdout); \
} while(0)

This can be done to enable asserts on demand in WebKit too:

#undef ASSERT
#define ASSERT(assertion) \
  (!(assertion) ? \
      (WTFReportAssertionFailure(__FILE__, __LINE__, WTF_PRETTY_FUNCTION, #assertion), \
       CRASH()) : \
      (void)0)

#undef ASSERT_NOT_REACHED
#define ASSERT_NOT_REACHED() do { \
  WTFReportAssertionFailure(__FILE__, __LINE__, WTF_PRETTY_FUNCTION, 0); \
  CRASH(); \
} while (0)

It may be interesting to enable WebKit LOG() and GStreamer GST_DEBUG() macros only on selected files:

#define LOG(channel, msg, ...) do { \
  printf("%s: ", #channel); \
  printf(msg, ## __VA_ARGS__); \
  printf("\n"); \
  fflush(stdout); \
} while (false)

#define _GST_DEBUG(msg, ...) do { \
  printf("### %s: ", __PRETTY_FUNCTION__); \
  printf(msg, ## __VA_ARGS__); \
  printf("\n"); \
  fflush(stdout); \
} while (false)

Note all the preprocessor trickery used here:

  • First arguments (channel, msg) are captured intependently
  • The remaining args are captured in __VA_ARGS__
  • do while(false) is a trick to avoid {braces} and make the code block work when used in if/then/else one-liners
  • #channel expands LOG(MyChannel,....) as printf("%s: ", "MyChannel"). It’s called “stringification”.
  • ## __VA_ARGS__ expands the variable argument list as a comma-separated list of items, but if the list is empty, it eats the comma after “msg”, preventing syntax errors

Print the compile-time type of an expression

Use typeid(<expression>).name(). Filter the ouput through c++filt -t:

std::vector<char *> v; 
printf("Type: %s\n", typeid(v.begin()).name());

Abusing the compiler to know all the places where a function is called

If you want to know all the places from where the GstClockTime toGstClockTime(float time) function is called, you can convert it to a template function and use static_assert on a wrong datatype like this (in the .h):

template <typename T = float> GstClockTime toGstClockTime(float time) { 
  static_assert(std::is_integral<T>::value,
    "Don't call toGstClockTime(float)!");
  return 0;
}

Note that T=float is different to integer (is_integral). It has nothing to do with the float time parameter declaration.

You will get compile-time errors like this on every place the function is used:

WebKitMediaSourceGStreamer.cpp:474:87:   required from here
GStreamerUtilities.h:84:43: error: static assertion failed: Don't call toGstClockTime(float)!

Use pragma message to print values at compile time

Sometimes is useful to know if a particular define is enabled:

#include <limits.h>

#define _STR(x) #x
#define STR(x) _STR(x)

#pragma message "Int max is " STR(INT_MAX)

#ifdef WHATEVER
#pragma message "Compilation goes by here"
#else
#pragma message "Compilation goes by there"
#endif

...

The code above would generate this output:

test.c:6:9: note: #pragma message: Int max is 0x7fffffff
 #pragma message "Int max is " STR(INT_MAX)
         ^~~~~~~
test.c:11:9: note: #pragma message: Compilation goes by there
 #pragma message "Compilation goes by there"
         ^~~~~~~

GStreamer WebKit debugging tricks using GDB (2/2)

This post is a continuation of a series of blog posts about the most interesting debugging tricks I’ve found while working on GStreamer WebKit on embedded devices. These are the other posts of the series published so far:

Print corrupt stacktraces

In some circumstances you may get stacktraces that eventually stop because of missing symbols or corruption (?? entries).

#3  0x01b8733c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

However, you can print the stack in a useful way that gives you leads about what was next in the stack:

  • For i386: x/256wa $esp
  • For x86_64: x/256ga $rsp
  • For ARM 32 bit: x/256wa $sp

You may want to enable asm-demangle: set print asm-demangle

Example output, the 3 last lines give interesting info:

0x7ef85550:     0x1b87400       0x2     0x0     0x1b87400
0x7ef85560:     0x0     0x1b87140       0x1b87140       0x759e88a4
0x7ef85570:     0x1b87330       0x759c71a9 <gst_base_sink_change_state+956>     0x140c  0x1b87330
0x7ef85580:     0x759e88a4      0x7ef855b4      0x0     0x7ef855b4
...
0x7ef85830:     0x76dbd6c4 <WebCore::AppendPipeline::resetPipeline()::__PRETTY_FUNCTION__>        0x4     0x3     0x1bfeb50
0x7ef85840:     0x0     0x76d59268      0x75135374      0x75135374
0x7ef85850:     0x76dbd6c4 <WebCore::AppendPipeline::resetPipeline()::__PRETTY_FUNCTION__>        0x1b7e300       0x1d651d0       0x75151b74

More info: 1

Sometimes the symbol names aren’t printed in the stack memdump. You can do this trick to iterate the stack and print the symbols found there (take with a grain of salt!):

(gdb) set $i = 0
(gdb) p/a *((void**)($sp + 4*$i++))

[Press ENTER multiple times to repeat the command]

$46 = 0xb6f9fb17 <_dl_lookup_symbol_x+250>
$58 = 0xb40a9001 <g_log_writer_standard_streams+128>
$142 = 0xb40a877b <g_return_if_fail_warning+22>
$154 = 0xb65a93d5 <WebCore::MediaPlayerPrivateGStreamer::changePipelineState(GstState)+180>
$164 = 0xb65ab4e5 <WebCore::MediaPlayerPrivateGStreamer::playbackPosition() const+420>
...

Many times it’s just a matter of gdb not having loaded the unstripped version of the library. /proc/<PID>/smaps and info proc mappings can help to locate the library providing the missing symbol. Then we can load it by hand.

For instance, for this backtrace:

#0  0x740ad3fc in syscall () from /home/enrique/buildroot-wpe/output/staging/lib/libc.so.6 
#1  0x74375c44 in g_cond_wait () from /home/enrique/buildroot-wpe/output/staging/usr/lib/libglib-2.0.so.0 
#2  0x6cfd0d60 in ?? ()

In a shell, we examine smaps and find out that the unknown piece of code comes from libgstomx:

$ cat /proc/715/smaps
...
6cfc1000-6cff8000 r-xp 00000000 b3:02 785380     /usr/lib/gstreamer-1.0/libgstomx.so
...

Now we load the unstripped .so in gdb and we’re able to see the new symbol afterwards:

(gdb) add-symbol-file /home/enrique/buildroot-wpe/output/build/gst-omx-custom/omx/.libs/libgstomx.so 0x6cfc1000
(gdb) bt
#0  0x740ad3fc in syscall () from /home/enrique/buildroot-wpe/output/staging/lib/libc.so.6
#1  0x74375c44 in g_cond_wait () from /home/enrique/buildroot-wpe/output/staging/usr/lib/libglib-2.0.so.0
#2  0x6cfd0d60 in gst_omx_video_dec_loop (self=0x6e0c8130) at gstomxvideodec.c:1311
#3  0x6e0c8130 in ?? ()

Useful script to prepare the add-symbol-file:

cat /proc/715/smaps | grep '[.]so' | sed -e 's/-[0-9a-f]*//' | { while read ADDR _ _ _ _ LIB; do echo "add-symbol-file $LIB 0x$ADDR"; done; }

More info: 1

The “figuring out corrupt ARM stacktraces” post has some additional info about how to use addr2line to translate memory addresses to function names on systems with a hostile debugging environment.

Debugging a binary without debug symbols

There are times when there’s just no way to get debug symbols working, or where we’re simply debugging on a release version of the software. In those cases, we must directly debug the assembly code. The gdb text user interface (TUI) can be used to examine the disassebled code and the CPU registers. It can be enabled with these commands:

layout asm
layout regs
set print asm-demangle

Some useful keybindings in this mode:

  • Arrows: scroll the disassemble window
  • CTRL+p/n: Navigate history (previously done with up/down arrows)
  • CTRL+b/f: Go backward/forward one character (previously left/right arrows)
  • CTRL+d: Delete character (previously “Del” key)
  • CTRL+a/e: Go to the start/end of the line

This screenshot shows how we can infer that an empty RefPtr is causing a crash in some WebKit code.

Wake up an unresponsive gdb on ARM

Sometimes, when you continue (‘c’) execution on ARM there’s no way to stop it again unless a breakpoint is hit. But there’s a trick to retake the control: just send a harmless signal to the process.

kill -SIGCONT 1234

Know which GStreamer thread id matches with each gdb thread

Sometimes you need to match threads in the GStreamer logs with threads in a running gdb session. The simplest way is to ask it to GThread for each gdb thread:

(gdb) set output-radix 16
(gdb) thread apply all call g_thread_self()

This will print a list of gdb threads and GThread*. We only need to find the one we’re looking for.

Generate a pipeline dump from gdb

If we have a pointer to the pipeline object, we can call the function that dumps the pipeline:

(gdb) call gst_debug_bin_to_dot_file_with_ts((GstBin*)0x15f0078, GST_DEBUG_GRAPH_SHOW_ALL, "debug")

GStreamer WebKit debugging tricks using GDB (1/2)

I’ve been developing and debugging desktop and mobile applications on embedded devices over the last decade or so. The main part of this period I’ve been focused on the multimedia side of the WebKit ports using GStreamer, an area that is a mix of C (glib, GObject and GStreamer) and C++ (WebKit).

Over these years I’ve had to work on ARM embedded devices (mobile phones, set-top-boxes, Raspberry Pi using buildroot) where most of the environment aids and tools we take for granted on a regular x86 Linux desktop just aren’t available. In these situations you have to be imaginative and find your own way to get the work done and debug the issues you find in along the way.

I’ve been writing down the most interesting tricks I’ve found in this journey and I’m sharing them with you in a series of 7 blog posts, one per week. Most of them aren’t mine, and the ones I learnt in the begining of my career can even seem a bit naive, but I find them worth to share anyway. I hope you find them as useful as I do.

Breakpoints with command

You can break on a place, run some command and continue execution. Useful to get logs:

break getenv
command
 # This disables scroll continue messages
 # and supresses output
 silent
 set pagination off
 p (char*)$r0
continue
end

break grl-xml-factory.c:2720 if (data != 0)
command
 call grl_source_get_id(data->source)
 # $ is the last value in the history, the result of
 # the previous call
 call grl_media_set_source (send_item->media, $)
 call grl_media_serialize_extended (send_item->media, 
  GRL_MEDIA_SERIALIZE_FULL)
 continue
end

This idea can be combined with watchpoints and applied to trace reference counting in GObjects and know from which places the refcount is increased and decreased.

Force execution of an if branch

Just wait until the if chooses a branch and then jump to the other one:

6 if (i > 3) {
(gdb) next
7 printf("%d > 3\n", i);
(gdb) break 9
(gdb) jump 9
9 printf("%d <= 3\n", i);
(gdb) next
5 <= 3

Debug glib warnings

If you get a warning message like this:

W/GLib-GObject(18414): g_object_unref: assertion `G_IS_OBJECT (object)' failed

the functions involved are: g_return_if_fail_warning(), which calls to g_log(). It’s good to set a breakpoint in any of the two:

break g_log

Another method is to export G_DEBUG=fatal_criticals, which will convert all the criticals in crashes, which will stop the debugger.

Debug GObjects

If you want to inspect the contents of a GObjects that you have in a reference…

(gdb) print web_settings 
$1 = (WebKitWebSettings *) 0x7fffffffd020

you can dereference it…

(gdb) print *web_settings
$2 = {parent_instance = {g_type_instance = {g_class = 0x18}, ref_count = 0, qdata = 0x0}, priv = 0x0}

even if it’s an untyped gpointer…

(gdb) print user_data
(void *) 0x7fffffffd020
(gdb) print *((WebKitWebSettings *)(user_data))
{parent_instance = {g_type_instance = {g_class = 0x18}, ref_count = 0, qdata = 0x0}, priv = 0x0}

To find the type, you can use GType:

(gdb) call (char*)g_type_name( ((GTypeInstance*)0x70d1b038)->g_class->g_type )
$86 = 0x2d7e14 "GstOMXH264Dec-omxh264dec"

Instantiate C++ object from gdb

(gdb) call malloc(sizeof(std::string))
$1 = (void *) 0x91a6a0
(gdb) call ((std::string*)0x91a6a0)->basic_string()
(gdb) call ((std::string*)0x91a6a0)->assign("Hello, World")
$2 = (std::basic_string<char, std::char_traits<char>, std::allocator<char> > &) @0x91a6a0: {static npos = <optimized out>, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x91a6f8 "Hello, World"}}
(gdb) call SomeFunctionThatTakesAConstStringRef(*(const std::string*)0x91a6a0)

See: 1 and 2