{"id":684,"date":"2025-12-05T12:16:07","date_gmt":"2025-12-05T11:16:07","guid":{"rendered":"https:\/\/eocanha.org\/blog\/?p=684"},"modified":"2025-12-05T12:16:07","modified_gmt":"2025-12-05T11:16:07","slug":"meow-process-log-text-files-as-if-you-could-make-cat-speak","status":"publish","type":"post","link":"https:\/\/eocanha.org\/blog\/2025\/12\/05\/meow-process-log-text-files-as-if-you-could-make-cat-speak\/","title":{"rendered":"Meow: Process log text files as if you could make cat speak"},"content":{"rendered":"\n<p>Some years ago I mentioned <a href=\"https:\/\/eocanha.org\/blog\/2021\/05\/25\/gstreamer-webkit-debugging-by-using-external-tools-2-2\/\" data-type=\"URL\" data-id=\"https:\/\/eocanha.org\/blog\/2021\/05\/25\/gstreamer-webkit-debugging-by-using-external-tools-2-2\/\">some command line tools<\/a> I used to analyze and find useful information in GStreamer logs. I&#8217;ve been using them consistently throughout all these years, but some weeks ago I thought about unifying them in a single tool that could provide more flexibility in the mid term, and also as an excuse to unrust my Rust knowledge a bit. 
That&#8217;s how I wrote Meow, a tool to make <code>cat<\/code> speak (that is, to provide meaningful information).<\/p>\n\n\n\n<p>The idea is that you can <code>cat<\/code> a file through <code>meow<\/code> and apply the filters, like this:<\/p>\n\n\n\n<p><code>cat \/tmp\/log.txt | meow appsinknewsample n:V0 n:video ht: \\<br>  ft:-0:00:21.466607596 's:#([A-Za-z][A-Za-z]*\/)*#'<\/code><\/p>\n\n\n\n<p>which means &#8220;select those lines that contain <code>appsinknewsample<\/code> (with case-insensitive matching), but don&#8217;t contain <code>V0<\/code> nor <code>video<\/code> (that is, by exclusion, only those that contain audio, probably because we&#8217;ve analyzed both and realized that we should focus on audio for our specific problem), highlight the different thread ids, only show those lines with a timestamp lower than 21.46 sec, and change strings like <code>Source\/WebCore\/platform\/graphics\/gstreamer\/mse\/AppendPipeline.cpp<\/code> to become just <code>AppendPipeline.cpp<\/code>&#8221;, to get an output as shown in this terminal screenshot:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/eocanha.org\/blog\/wp-content\/uploads\/2025\/12\/image.png\"><img loading=\"lazy\" width=\"1024\" height=\"254\" src=\"https:\/\/eocanha.org\/blog\/wp-content\/uploads\/2025\/12\/image-1024x254.png\" alt=\"Screenshot of a terminal output showing multiple log lines. Some of them have the word &quot;appsinkNewSample&quot; highlighted in red. 
Some lines have the hexadecimal id of the thread that printed them highlighted (purple for one thread, brown for the other)\" class=\"wp-image-686\" srcset=\"https:\/\/eocanha.org\/blog\/wp-content\/uploads\/2025\/12\/image-1024x254.png 1024w, https:\/\/eocanha.org\/blog\/wp-content\/uploads\/2025\/12\/image-300x74.png 300w, https:\/\/eocanha.org\/blog\/wp-content\/uploads\/2025\/12\/image-768x190.png 768w, https:\/\/eocanha.org\/blog\/wp-content\/uploads\/2025\/12\/image-1536x381.png 1536w, https:\/\/eocanha.org\/blog\/wp-content\/uploads\/2025\/12\/image.png 1899w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>Cool, isn&#8217;t it? After all, I&#8217;m convinced that the answer to any GStreamer bug is always hidden in the logs (or will be, as soon as I add &#8220;<em>just a couple of log lines more, bro<\/em>&#8221; \ud83e\udd2d).<\/p>\n\n\n\n<p>Currently, meow supports this set of manipulation commands:<\/p>\n\n\n\n<ul><li><strong>Word filter and highlighting by regular expression<\/strong> (<code>fc:REGEX<\/code>, or just <code>REGEX<\/code>): Every expression will highlight its matched words in a different color.<\/li><li><strong>Filtering without highlighting<\/strong> (<code>fn:REGEX<\/code>): Same as <code>fc:<\/code>, but without highlighting the matched string. This is useful for those times when you want to match lines that have two expressions (<code>E1<\/code>, <code>E2<\/code>) but the highlighting would pollute the line too much. In those cases you can use a regex such as <code>E1.*E2<\/code> and then highlight the subexpressions manually later with an <code>h:<\/code> rule.<\/li><li><strong>Negative filter<\/strong> (<code>n:REGEX<\/code>): Selects only the lines that don&#8217;t match the regex filter. 
No highlighting.<\/li><li><strong>Highlight with no filter<\/strong> (<code>h:REGEX<\/code>): Doesn&#8217;t discard any line, just highlights the specified regex.<\/li><li><strong>Substitution<\/strong> (<code>s:\/REGEX\/REPLACE<\/code>): Replaces one pattern with another. Any other delimiter character can be used instead of \/, if that&#8217;s more convenient to the user (for instance, using # when dealing with expressions to manipulate paths).<\/li><li><strong>Time filter<\/strong> (<code>ft:TIME-TIME<\/code>): Assuming the lines start with a GStreamer log timestamp, this filter selects only the lines between the target start and end time. Any of the time arguments (or both) can be omitted, but the <code>-<\/code> delimiter must be present. Specifying multiple time filters will generate matches that fall within any of the time ranges, but overlapping ranges can trigger undefined behaviour.<\/li><li><strong>Highlight threads<\/strong> (<code>ht:<\/code>): Assuming a GStreamer log, where the thread id appears as the third word in the line, highlights each thread in a different color.<\/li><\/ul>\n\n\n\n<p>The <code>REGEX<\/code> pattern is a regular expression. All the matches are case insensitive. When used for substitutions, capture groups can be defined as <code>(?&lt;<span class=\"has-inline-color has-medium-pink-color\">CAPTURE_NAME<\/span>&gt;REGEX)<\/code>.<\/p>\n\n\n\n<p>The <code>REPLACE<\/code>ment string is the text that the <code>REGEX<\/code> will be replaced by when doing substitutions. Text captured by a named capture group can be referred to by <code>${<span class=\"has-inline-color has-medium-pink-color\">CAPTURE_NAME<\/span>}<\/code>.<\/p>\n\n\n\n<p>The <code>TIME<\/code> pattern can be any sequence of numbers, <code>:<\/code> or <code>.<\/code> . Typically, it will be a GStreamer timestamp (eg: 0:01:10.881123150), but it can actually be any other numerical sequence. 
Times are compared lexicographically, so it&#8217;s important that all of them have the same string length.<\/p>\n\n\n\n<p>The filtering algorithm has a custom set of priorities for operations, so that they get executed in an intuitive order. For instance, a sequence of filter matching expressions (<code>fc:<\/code>, <code>fn:<\/code>) will have the same priority (that is, any of them will let a text line pass if it matches, not forbidding any of the lines already allowed by sibling expressions), while a negative filter will only be applied to the results left by the sequence of filters before it. Substitutions will be applied at their specific position (not before or after), and will therefore modify the line in a way that can alter the matching of subsequent filters. In general, the user doesn&#8217;t have to worry about any of this, because the rules are designed to generate the result that you would expect.<\/p>\n\n\n\n<p>Now some practical examples:<\/p>\n\n\n\n<p><strong>Example 1<\/strong>: Select lines with the word &#8220;one&#8221;, or the word &#8220;orange&#8221;, or a number, highlighting each pattern in a different color except the number, which will have no color:<br><br><code>$ cat file.txt | meow one fc:orange 'fn:[0-9][0-9]*'<br>000 <span class=\"has-inline-color has-medium-pink-color\">one<\/span> small <span class=\"has-inline-color has-yellow-color\">orange<\/span><br>005 <span class=\"has-inline-color has-medium-pink-color\">one<\/span> big <span class=\"has-inline-color has-yellow-color\">orange<\/span><\/code><\/p>\n\n\n\n<p><strong>Example 2<\/strong>: Assuming a listing of picture filenames, select the filenames not ending in &#8220;jpg&#8221; or &#8220;jpeg&#8221;, and insert &#8220;.bak&#8221; into each name, preserving the extension at the end:<br><br><code>$ cat list.txt | meow  'n:jpe?g' 
\\<\/code><br><code>\u00a0<\/code>\u00a0\u00a0<code>'s:#^(?&lt;f>[^.]*)(?&lt;e>[.].*)$#${f}.bak${e}'<br>train.bak.png<br>sunset.bak.gif<\/code><\/p>\n\n\n\n<p><strong>Example 3<\/strong>: Only print the log lines with times between 0:00:24.787450146 and 0:00:24.790741865 or those at 0:00:30.492576587 or after, and highlight every thread in a different color:<br><br><code>$ cat log.txt | meow ft:0:00:24.787450146-0:00:24.790741865 \\<br>\u00a0<\/code>\u00a0\u00a0<code>ft:0:00:30.492576587- ht:<br>0:00:24.787450146 739 <span class=\"has-inline-color has-medium-pink-color\">0x1ee2320<\/span> DEBUG \u2026<br>0:00:24.790382735 739 <span class=\"has-inline-color has-yellow-color\">0x1f01598<\/span> INFO \u2026<br>0:00:24.790741865 739 <span class=\"has-inline-color has-medium-pink-color\">0x1ee2320<\/span> DEBUG \u2026<br>0:00:30.492576587 739 <span class=\"has-inline-color has-yellow-color\">0x1f01598<\/span> DEBUG \u2026<br>0:00:31.938743646 739 <span class=\"has-inline-color has-yellow-color\">0x1f01598<\/span> ERROR \u2026<\/code><\/p>\n\n\n\n<p>This is only the beginning. I have great ideas for this new tool (as time allows), such as support for parentheses (so the expressions can be grouped), or call stack indentation on logs generated by tracers, in a similar way to what Alicia&#8217;s <a href=\"https:\/\/github.com\/ntrrgc\/dotfiles\/blob\/master\/bin\/gst-log-indent-tracers\" data-type=\"URL\" data-id=\"https:\/\/github.com\/ntrrgc\/dotfiles\/blob\/master\/bin\/gst-log-indent-tracers\"><code>gst-log-indent-tracers<\/code> tool<\/a> does. I might also predefine some common expressions to use in regular expressions, such as the ones to match paths (so that the user doesn&#8217;t have to think about them and reinvent the wheel every time). Anyway, these are only ideas. 
Only time and hyperfocus slots will tell&#8230;<\/p>\n\n\n\n<p>For now, you can <a href=\"https:\/\/github.com\/eocanha\/meow\" data-type=\"URL\" data-id=\"https:\/\/github.com\/eocanha\/meow\">find the source code on my GitHub<\/a>. Meow! <img loading=\"lazy\" width=\"128\" height=\"128\" class=\"wp-image-698\" style=\"width: 28px;\" src=\"https:\/\/eocanha.org\/blog\/wp-content\/uploads\/2025\/12\/blobcat.png\" alt=\"\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some years ago I mentioned some command line tools I used to analyze and find useful information in GStreamer logs. I&#8217;ve been using them consistently throughout all these years, but some weeks ago I thought about unifying them in a single tool that could provide more flexibility in the mid term, and also as &hellip; <a href=\"https:\/\/eocanha.org\/blog\/2025\/12\/05\/meow-process-log-text-files-as-if-you-could-make-cat-speak\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Meow: Process log text files as if you could make cat 
speak<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[13,2,7,12],"tags":[47,20,49,51,45,44,46,50,48],"_links":{"self":[{"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/posts\/684"}],"collection":[{"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/comments?post=684"}],"version-history":[{"count":13,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/posts\/684\/revisions"}],"predecessor-version":[{"id":699,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/posts\/684\/revisions\/699"}],"wp:attachment":[{"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/media?parent=684"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/categories?post=684"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/eocanha.org\/blog\/wp-json\/wp\/v2\/tags?post=684"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}