Thursday, May 22, 2014

Tracing Erlang with Redbug - Part 1

For the past year and a half I have been programming in Erlang full time.  When it comes to complex distributed applications, I don't think any other language or framework can beat it.  And as far as I'm concerned, the biggest thing it has going for it is the ability to trace the execution of code and messages even across nodes in a cluster.

Regular stepping-through-the-code debugging is great for simple programs, but it's about impossible to make sense of code doing thousands of tasks a second on different machines with multiple cores. Tracing in Erlang lets you see what functions are called with what parameters in what process, and what messages are passed between them.   This feature comes out of the box and can be used even in production systems (if you are careful!).

Although tracing is built into Erlang, the best way to get started is by using the Redbug library that comes in eper.  I find it easier to explain redbug by example. The first and main thing to remember is that it is self documented.  Just enter this in the console:
 
  > redbug:help().


So let's get to it.  To trace function calls in any Erlang process, you would use something like this:

1> redbug:start("lists:sort").
{30,2}
2> lists:sort([1,2,3,4]).
[1,2,3,4]

% 18:59:08 <0.32.0>({erlang,apply,2})
% lists:sort([1,2,3,4])
redbug done, timeout - 1
3>


That starts the redbug process, which will run for 15 seconds and will inform you of any calls made to the sort function in the lists module. Shutting itself down promptly is one of the niceties about redbug. Tracing works by sending messages from the process running the code that matches our request to the process where you start redbug. If you specify a commonly used function in a system running thousands of transient Erlang processes, things could quickly get out of hand and even crash the node.

Other similar ways of tracing function calls are:

% Trace multiple functions at once
 1> redbug:start(["erlang:term_to_binary", "lists:sort"]).
 % Trace only the version of this function that takes 2 parameters
 2> redbug:start("erlang:term_to_binary/2"]).
 % Trace any function in the lists module
 3> redbug:start("lists").  


You can also limit tracing to functions called with certain values as arguments. For example:

% Only trace calls to lists:sort when it's passed the list [1,2,3]
 1> redbug:start("lists:sort([1,2,3]").


Erlang's pattern matching can be used to match only certain arguments, or only certain parts of an argument. Here only the call to lists:member with 3 as a first argument is traced:

10> redbug:start("lists:member(3,_)").
{30,1}
11> lists:member(3, [1,2]).

% 19:18:45 <0.32.0>({erlang,apply,2})
% lists:member(3, [1,2])
false
12> lists:member(4, [1,2]).
false


You can (and should!) get really fancy with your patterns. Alas, redbug does not currently allow matching on binaries. But you can use pretty much any guard you can use in an Erlang function:

% Only match if the only argument is a binary of size 10
1> redbug:start("erlang:term_to_binary(B) when is_binary(B), size(B) == 10").
% Only if the argument is a list where the first element is greater than 9
2> redbug:start("lists:sort([A|_]) when A > 9").
% Only if the argument is a two tuple with 'tag' as the first element
3> redbug:start("erlang:term_to_binary({tag, _}).


Sometimes you would like to know what the traced functions return, or get a stack trace at the point of execution. Or maybe both:

1> redbug:start("lists:max->return").
2> redbug:start("lists:max->stack").
3> redbug:start("lists:max->return;stack").


If you run the example above and call lists:max, you will see it is a recursive function. That means you may get a lot more messages than you expected. Luckily, redbug will shut itself down after a certain number of messages have been received for your protection. You can control both the time it stays up and the maximum number of messages it processes like this:

% Process at most 1000 messages or
% stay up for 60 seconds, whatever happens first.
1> redbug:start(["lists->return", "erlang:term_to_binary/2->stack"],
     [{msgs, 1000}, {time, 60000}]).


Other parameters let you customize redbug in many useful ways. Just check out redbug:help for details. You can redirect the output to a file, filter it with a regular expression, omit the argument values passed into traced functions, change the size limit for messages (redbug will also shut down if tracing a function call with massive arguments), etc.

In a later post we can explore more details about redbug and the underlying tracing mechanism in Erlang. As useful as redbug is, there will be plenty of times when you need full access to the tracing primitives. That usually means you are in a lot of trouble :).

No comments:

Post a Comment