GStreamer WebKit debugging by using external tools (2/2)

This is the last post of the series showing interesting debugging tools, I hope you have found it useful. Don’t miss the custom scripts at the bottom to process GStreamer logs, help you highlight the interesting parts and find the root cause of difficult bugs. Here are also the previous posts of the series:

How to debug pkgconfig

When pkg-config finds the PKG_CONFIG_DEBUG_SPEW env var, it explains all the steps used to resolve the packages:

PKG_CONFIG_DEBUG_SPEW=1 /usr/bin/pkg-config --libs x11

This is useful to know why a particular package isn’t found and what are the default values for PKG_CONFIG_PATH when it’s not defined. For example:

Adding directory '/usr/local/lib/x86_64-linux-gnu/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/local/lib/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/local/share/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/lib/x86_64-linux-gnu/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/lib/pkgconfig' from PKG_CONFIG_PATH
Adding directory '/usr/share/pkgconfig' from PKG_CONFIG_PATH

If we have tuned PKG_CONFIG_PATH, maybe we also want to add the default paths. For example:

SYSROOT=~/sysroot-x86-64
export PKG_CONFIG_PATH=${SYSROOT}/usr/local/lib/pkgconfig:${SYSROOT}/usr/lib/pkgconfig
# Add also the standard pkg-config paths to find libraries in the system
export PKG_CONFIG_PATH=${PKG_CONFIG_PATH}:/usr/local/lib/x86_64-linux-gnu/pkgconfig:\
/usr/local/lib/pkgconfig:/usr/local/share/pkgconfig:/usr/lib/x86_64-linux-gnu/pkgconfig:\
/usr/lib/pkgconfig:/usr/share/pkgconfig
# This tells pkg-config where the "system" pkg-config dir is. This is useful when cross-compiling for other
# architecture, to avoid pkg-config using the system .pc files and mixing host and target libraries
export PKG_CONFIG_LIBDIR=${SYSROOT}/usr/lib
# This could have been used for cross compiling:
#export PKG_CONFIG_SYSROOT_DIR=${SYSROOT}

Man in the middle proxy for WebKit

Sometimes it’s useful to use our own modified/unminified files with a 3rd party service we don’t control. Mitmproxy can be used as a man-in-the-middle proxy, but I haven’t tried it personally yet. What I have tried (with WPE) is this:

  1. Add an /etc/hosts entry to point the host serving the files we want to change to an IP address controlled by us.
  2. Configure a web server to provide the files in the expected path.
  3. Modify the ResourceRequestBase constructor to change the HTTPS requests to HTTP when the hostname matches the target:
ResourceRequestBase(const URL& url, ResourceRequestCachePolicy policy)
    : m_url(url)
    , m_timeoutInterval(s_defaultTimeoutInterval)
    ...
    , m_isAppBound(false)
{
    if (m_url.host().toStringWithoutCopying().containsIgnoringASCIICase(String("out-of-control-service.com"))
        && m_url.protocol().containsIgnoringASCIICase(String("https"))) {
        printf("### %s: URL %s detected, changing from https to http\n",
            __PRETTY_FUNCTION__, m_url.string().utf8().data()); 
        fflush(stdout);
        m_url.setProtocol(String("http"));
    }
}

:bulb: Pro tip: If you have to debug minified/obfuscated JavaScript code and don’t have a deobfuscated version to use in a man-in-the-middle fashion, use http://www.jsnice.org/ to deobfuscate it and get meaningful variable names.

Bandwidth control for a dependent device

If your computer has a “shared internet connection” enabled in Network Manager and provides access to a dependent device , you can control the bandwidth offered to that device. This is useful to trigger quality changes on adaptive streaming videos from services out of your control.

This can be done using tc, the Traffic Control tool from the Linux kernel. You can use this script to automate the process (edit it to suit to your needs).

Useful scripts to process GStreamer logs

I use these scripts in my daily job to look for strange patterns in GStreamer logs that help me to find the cause of the bugs I’m debugging:

  • h: Highlights each expression in the command line in a different color.
  • mgrep: Greps (only) for the lines with the expressions in the command line and highlights each expression in a different color.
  • filter-time: Gets a subset of the log lines between a start and (optionally) an end GStreamer log timestamp.
  • highlight-threads: Highlights each thread in a GStreamer log with a different color. That way it’s easier to follow a thread with the naked eye.
  • remove-ansi-colors: Removes the color codes from a colored GStreamer log.
  • aha: ANSI-HTML-Adapter converts plain text with color codes to HTML, so you can share your GStreamer logs from a web server (eg: for bug discussion). Available in most distros.
  • gstbuffer-leak-analyzer: Analyzes a GStreamer log and shows unbalances in the creation/destruction of GstBuffer and GstMemory objects.

GStreamer WebKit debugging by using external tools (1/2)

In this new post series, I’ll show you how both existing and ad-hoc tools can be helpful to find the root cause of some problems. Here are also the older posts of this series in case you find them useful:

Use strace to know which config/library files are used by a program

If you’re becoming crazy supposing that the program should use some config and it seems to ignore it, just use strace to check what config files, libraries or other kind of files is the program actually using. Use the grep rules you need to refine the search:

$ strace -f -e trace=%file nano 2> >(grep 'nanorc')
access("/etc/nanorc", R_OK)             = 0
access("/usr/share/nano/javascript.nanorc", R_OK) = 0
access("/usr/share/nano/gentoo.nanorc", R_OK) = 0
...

Know which process is killing another one

First, try to strace -e trace=signal -p 1234 the killed process.

If that doesn’t work (eg: because it’s being killed with the uncatchable SIGKILL signal), then you can resort to modifying the kernel source code (signal.c) to log the calls to kill():

SYSCALL_DEFINE2(kill, pid_t, pid, int, sig)
{
    struct task_struct *tsk_p;
    ...
    /* Log SIGKILL */
    if (sig & 0x1F == 9) {
        tsk_p = find_task_by_vpid(pid);

        if (tsk_p) {
            printk(KERN_DEBUG "Sig: %d from pid: %d (%s) to pid: %d (%s)\n",
                sig, current->pid, current->comm, pid, tsk_p->comm);
        } else {
            printk(KERN_DEBUG "Sig: %d from pid: %d (%s) to pid: %d\n",
                sig, current->pid, current->comm, pid);
        }
    }
    ...
}

Wrap gcc/ld/make to tweak build parameters

If you ever find yourself with little time in front of a stubborn build system and, no matter what you try, you can’t get the right flags to the compiler, think about putting something (a wrapper) between the build system and the compiler. Example for g++:

#!/bin/bash
main() {
    # Build up arg[] array with all options to be passed
    # to subcommand.
    i=0
    for opt in "$@"; do
        case "$opt" in
        -O2) ;; # Removes this option
        *)
            arg[i]="$opt" # Keeps the others
            i=$((i+1))
            ;;
        esac
    done
    EXTRA_FLAGS="-O0" # Adds extra option
    echo "g++ ${EXTRA_FLAGS} ${arg[@]}" # >> /tmp/build.log # Logs the command
    /usr/bin/ccache g++ ${EXTRA_FLAGS} "${arg[@]}" # Runs the command
}
main "$@"

Make sure that the wrappers appear earlier than the real commands in your PATH.

The make wrapper can also call remake instead. Remake is fully compatible with make but has features to help debugging compilation and makefile errors.

Analyze the structure of MP4 data

The ISOBMFF Box Structure Viewer online tool allows you to upload an MP4 file and explore its structure.

GStreamer WebKit debugging by instrumenting source code (3/3)

This is the last post on the instrumenting source code series. I hope you to find the tricks below as useful as the previous ones.

In this post I show some more useful debugging tricks. Don’t forget to have a look at the other posts of the series:

Finding memory leaks in a RefCounted subclass

The source code shown below must be placed in the .h where the class to be debugged is defined. It’s written in a way that doesn’t need to rebuild RefCounted.h, so it saves a lot of build time. It logs all refs, unrefs and adoptPtrs, so that any anomaly in the refcounting can be traced and investigated later. To use it, just make your class inherit from LoggedRefCounted instead of RefCounted.

Example output:

void WTF::adopted(WTF::LoggedRefCounted<T>*) [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1
void WTF::adopted(WTF::LoggedRefCounted<T>*) [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1
^^^ Two adopts, this is not good.
void WTF::LoggedRefCounted<T>::ref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1 --> ...
void WTF::LoggedRefCounted<T>::ref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount ... --> 2
void WTF::LoggedRefCounted<T>::deref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 2 --> ...
void WTF::LoggedRefCounted<T>::deref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount ... --> 1
void WTF::adopted(WTF::LoggedRefCounted<T>*) [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1
void WTF::LoggedRefCounted<T>::deref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1 --> ...
void WTF::LoggedRefCounted<T>::deref() [with T = WebCore::MediaSourceClientGStreamerMSE]: this=0x673c07a4, refCount 1 --> ...
^^^ Two recursive derefs, not good either.
#include "Logging.h"

namespace WTF {

template<typename T> class LoggedRefCounted : public WTF::RefCounted<T> {
    WTF_MAKE_NONCOPYABLE(LoggedRefCounted); WTF_MAKE_FAST_ALLOCATED;
public:
    void ref() {
        printf("%s: this=%p, refCount %d --> ...\n", __PRETTY_FUNCTION__, this, WTF::RefCounted<T>::refCount()); fflush(stdout);
        WTF::RefCounted<T>::ref();
        printf("%s: this=%p, refCount ... --> %d\n", __PRETTY_FUNCTION__, this, WTF::RefCounted<T>::refCount()); fflush(stdout);
    }

    void deref() {
        printf("%s: this=%p, refCount %d --> ...\n", __PRETTY_FUNCTION__, this, WTF::RefCounted<T>::refCount()); fflush(stdout);
        WTF::RefCounted<T>::deref();
        printf("%s: this=%p, refCount ... --> %d\n", __PRETTY_FUNCTION__, this, WTF::RefCounted<T>::refCount()); fflush(stdout);
    }

protected:
    LoggedRefCounted() { }
    ~LoggedRefCounted() { }
};

template<typename T> inline void adopted(WTF::LoggedRefCounted<T>* object)
{
    printf("%s: this=%p, refCount %d\n", __PRETTY_FUNCTION__, object, (object)?object->refCount():0); fflush(stdout);
    adopted(static_cast<RefCountedBase*>(object));
}

} // Namespace WTF

Pause WebProcess on launch

WebProcessMainGtk and WebProcessMainWPE will sleep for 30 seconds if a special environment variable is defined:

export WEBKIT2_PAUSE_WEB_PROCESS_ON_LAUNCH=1

It only works #if ENABLE(DEVELOPER_MODE), so you might want to remove those ifdefs if you’re building in Release mode.

Log tracers

In big pipelines (e.g. playbin) it can be very hard to find what element is replying to a query or handling an event. Even using gdb can be extremely tedious due to the very high level of recursion. My coworker Alicia commented that using log tracers is more helpful in this case.

GST_TRACERS=log enables additional GST_TRACE() calls all accross GStreamer. The following example logs entries and exits into the query function.

GST_TRACERS=log GST_DEBUG='query:TRACE'

The names of the logging categories are somewhat inconsistent:

  • log (the log tracer itself)
  • GST_BUFFER
  • GST_BUFFER_LIST
  • GST_EVENT
  • GST_MESSAGE
  • GST_STATES
  • GST_PADS
  • GST_ELEMENT_PADS
  • GST_ELEMENT_FACTORY
  • query
  • bin

The log tracer code is in subprojects/gstreamer/plugins/tracers/gstlog.c.

GStreamer WebKit debugging by instrumenting source code (2/3)

In this post I show some more useful debugging tricks. Check also the other posts of the series:

Print current thread id

The thread id is generated by Linux and can take values higher than 1-9, just like PIDs. This thread number is useful to know which function calls are issued by the same thread, avoiding confusion between threads.

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

printf("%s [%d]\n", __PRETTY_FUNCTION__, syscall(SYS_gettid));
fflush(stdout);

Debug GStreamer thread locks

We redefine the GST_OBJECT_LOCK/UNLOCK/TRYLOCK macros to print the calls, compare locks against unlocks, and see who’s not releasing its lock:

#include "wtf/Threading.h"
#define GST_OBJECT_LOCK(obj) do { \
  printf("### [LOCK] %s [%p]\n", __PRETTY_FUNCTION__, &Thread::current()); fflush(stdout); \
  g_mutex_lock(GST_OBJECT_GET_LOCK(obj)); \
} while (0)
#define GST_OBJECT_UNLOCK(obj) do { \
  printf("### [UNLOCK] %s [%p]\n", __PRETTY_FUNCTION__, &Thread::current()); fflush(stdout); \
  g_mutex_unlock(GST_OBJECT_GET_LOCK(obj)); \
} while (0)
#define GST_OBJECT_TRYLOCK(obj) ({ \
  gboolean result = g_mutex_trylock(GST_OBJECT_GET_LOCK(obj)); \
  if (result) { \
   printf("### [LOCK] %s [%p]\n", __PRETTY_FUNCTION__, &Thread::current()); fflush(stdout); \
  } \
  result; \
})

Warning: The statement expression that allows the TRYLOCK macro to return a value will only work on GCC.

There’s a way to know which thread has taken a lock in glib/GStreamer using gdb. First locate the stalled thread:

(gdb) thread 
(gdb) bt
#2  0x74f07416 in pthread_mutex_lock ()
#3  0x7488aec6 in gst_pad_query ()
#4  0x6debebf2 in autoplug_query_allocation ()

(gdb) frame 3
#3  0x7488aec6 in gst_pad_query (pad=pad@entry=0x54a9b8, ...)
4058        GST_PAD_STREAM_LOCK (pad);

Now get the process id (PID) and use the pthread_mutex_t structure to print the Linux thread id that has acquired the lock:

(gdb) call getpid()
$30 = 6321
(gdb) p ((pthread_mutex_t*)pad.stream_rec_lock.p)->__data.__owner
$31 = 6368
(gdb) thread find 6321.6368
Thread 21 has target id 'Thread 6321.6368'

Trace function calls (poor developer version)

If you’re using C++, you can define a tracer class. This is for webkit, but you get the idea:

#define MYTRACER() MyTracer(__PRETTY_FUNCTION__);
class MyTracer {
public:
    MyTracer(const gchar* functionName)
      : m_functionName(functionName) {
      printf("### %s : begin %d\n", m_functionName.utf8().data(), currentThread()); fflush(stdout);
    }
    virtual ~MyTracer() {
        printf("### %s : end %d\n", m_functionName.utf8().data(), currentThread()); fflush(stdout);
    }
private:
    String m_functionName;
};

And use it like this in all the functions you want to trace:

void somefunction() {
  MYTRACER();
  // Some other code...
}

The constructor will log when the execution flow enters into the function and the destructor will log when the flow exits.

Setting breakpoints from C

In the C code, just call raise(SIGINT) (simulate CTRL+C, normally the program would finish).

And then, in a previously attached gdb, after breaking and having debugging all you needed, just continue the execution by ignoring the signal or just plainly continuing:

(gdb) signal 0
(gdb) continue

There’s a way to do the same but attaching gdb after the raise. Use raise(SIGSTOP) instead (simulate CTRL+Z). Then attach gdb, locate the thread calling raise and switch to it:

(gdb) thread apply all bt
[now search for "raise" in the terminal log]
Thread 36 (Thread 1977.2033): #1 0x74f5b3f2 in raise () from /home/enrique/buildroot/output2/staging/lib/libpthread.so.0
(gdb) thread 36

Now, from a terminal, send a continuation signal: kill -SIGCONT 1977. Finally instruct gdb to single-step only the current thread (IMPORTANT!) and run some steps until all the raises have been processed:

(gdb) set scheduler-locking on
(gdb) next    // Repeat several times...

Know the name of a GStreamer function stored in a pointer at runtime

Just use this macro:

GST_DEBUG_FUNCPTR_NAME(func)

Detecting memory leaks in WebKit

RefCountedLeakCounter is a tool class that can help to debug reference leaks by printing this kind of messages when WebKit exits:

  LEAK: 2 XMLHttpRequest
  LEAK: 25 CachedResource
  LEAK: 3820 WebCoreNode

To use it you have to modify the particular class you want to debug:

  • Include wtf/RefCountedLeakCounter.h
  • DEFINE_DEBUG_ONLY_GLOBAL(WTF::RefCountedLeakCounter, myClassCounter, ("MyClass"));
  • In the constructor: myClassCounter.increment()
  • In the destructor: myClassCounter.decrement()