TRACE: a useful debugging tool

Intro

When debugging fragment generators, as when debugging any piece of software, it's useful to print out statements, variable values, etc., to determine what the system is doing. One potential problem with this is that littering code with lots of lengthy print statements can affect the code execution time; this is of special concern in a system which depends as heavily on execution time as a DAQ. Another problem is that it can be a hassle to recompile the code whenever you want to add or remove a print statement. A package called "TRACE" has been developed at Fermilab which is designed to address these two concerns.

Enabling TRACEs already in the code

The TRACE package gets set up whenever you source a dune-artdaq source file, /path/to/your/dune-artdaq/installation/setupDUNEARTDAQ_forRunning. You'll notice that in many such source files, there are some Bash statements at the bottom of the source file which configure TRACE:

       export TRACE_FILE=/tmp/trace_example_buffer
       export TRACE_LIMIT_MS="5,1000,2000"

       toffM 15 -n CommandableFragmentGenerator
       tonM 15 -n CommandableFragmentGenerator 

Basically, TRACE is capable of sending print statements to a buffer on a disk defined by the environment variable TRACE_FILE at a much greater speed than a program can send print statements using printf, cout, etc., to stdout. Normally you'd want to take advantage of this speed and send TRACE messages to buffer; however, you also have the option of sending TRACEs to stdout, and they need not be the same TRACEs sent to buffer. There are other environment variables which can further control TRACE's behavior; the TRACE_LIMIT_MS variable, above, is essentially saying "Don't print to stdout more than 5 identical messages to stdout in a 1000 millisecond (1 second) time window, and if you hit this limit, pause for 2000 milliseconds (2 seconds) before allowing further prints". Note however that TRACE_LIMIT_MS doesn't apply to the statements being saved to the buffer, due to the buffer's speed.

The "toffM 15 -n CommandableFragmentGenerator" line tells TRACE that you want it not to print TRACE messages in artdaq's CommandableFragmentGenerator base class that are at level 15, and the following line "tonM 15 -n CommandableFragmentGenerator" tells you actually do want it to. I'll describe what levels are in a moment, but first, the reason for the apparent uselessness of the "toffM" statement given that the following "tonM" statement will override it is that if you wish to turn off the printing of the message, you must do so by explicitly calling toffM. If you wish to turn off the TRACE statement, with the syntax above all you need to do is comment out the "tonM" statement, but without the "toffM" line it's very easy to forget that you need to call toffM and simply try commenting out tonM, which won't actually make the change you want.

Concerning what "level 15" means: all TRACE statements have a name and level associated with them. The name in this case is "CommandableFragmentGenerator", and the level in this case is 15. What this means is, in the Bash code above, TRACE statements with name "CommandableFragmentGenerator" and level 15 would get saved to a buffer during a DAQ run. If you look at line 281 of the CommandableFragmentGenerator source code, https://cdcvs.fnal.gov/redmine/projects/artdaq/repository/entry/artdaq/Application/CommandableFragmentGenerator.cc?utf8=✓&rev=v3_00_03a , you'll see

        TLOG_ARB(15, "CommandableFragmentGenerator") << "getDataBufferStats: Calculating buffer size" << TLOG_ENDL;
...this is an example of such a TRACE statement. Please observe that the buffer whose size is mentioned in the body of this TRACE statement isn't the same as the buffer to which TRACE statements are sent!

If you wish to have a TRACE statement printed to stdout instead of (or in addition to) sent to the buffer, you'd want to replace the commands "tonM" and "toffM" with "tonS" and "toffS". The "M" stands for Memory, the "S" for Stdout. Also, be aware that multiple TRACEs may share the same name and level; if this is the case, then when you execute

tonM <level> -n <name>
...what you're actually doing is enabling ALL TRACEs which have that name and level to be sent to the buffer next time you run.

Of course, there are times when you may not have write access to the setupDUNEARTDAQ_forRunning file. If that's the case, then after source-ing setupDUNEARTDAQ_forRunning, you can interactively enter TRACE commands at the command line. What's particularly powerful about this is that you can start a trace even when the system is in the running state.

Examining TRACE statements sent to buffer

This raises the obvious question: if a TRACE statement is written to a buffer referred to by TRACE_FILE rather than to stdout or a logfile, how can one look at the statement? Note that the buffer isn't an ASCII file, and it can't simply be opened by a text editor. What you need to do after performing a run with the desired TRACEs enabled is to log onto the node where your fragment generator ran, and then do the following:

source <full pathname of the setupDUNEARTDAQ_forRunning file sourced for the run>
tshow | tdelta -d 1 -ct 1 | less

what you'll then see looks like the following (note that output may scroll past the limits of your browser window):

  idx                us_tod     delta   pid   tid cpu                                  name lvl r msg
-----      ---------------- --------- ----- ----- --- ------------------------------------- --- - -----------------------------
    0 03-30 21:13:59.603276         0 124131 124131   5                       BoardReaderMain nfo . Commandable Server ended, exiting...
    1 03-30 21:13:59.602884       392 124131 124131   5                      xmlrpc_commander dbg . server terminated
...etc. Basically, each TRACE statement gets its own row, and each row is composed of a set of fields, whose names are at the top of the output. The interpretation of the fields is as follows:

  1. idx: the idx-th most recent TRACE statement, counting from 0
  2. us_tod: the time of the TRACE statement, given to microsecond precision
  3. delta: the number of microseconds the TRACE was performed before the next TRACE statement
  4. pid: the process ID of the process in which the TRACE statement executed
  5. tid: the thread ID of the same
  6. cpu: the CPU # the TRACE statement executed on
  7. name: the name associated with the TRACE statement ("CommandableFragmentGenerator" in the example earlier on this page)
  8. lvl: the level of the TRACE statement (15 in the example earlier on this page)
  9. msg: the actual message contents of the TRACE statement

You can examine the buffer during a run or after you've stopped it. As you might imagine, it's very common to grep for specific rows of this output, as well as use tools such as awk to select on specific fields. Also, you only need to source setupDUNEARTDAQ_forRunning one time after you've logged into the node; once you've done this you can just use the "tshow | tdelta -d 1 -ct 1 | less" for each new run. Since the buffer is a circular buffer, and can only hold so many TRACE statements, you may want to save the buffer for a particularly interesting run before that run's TRACE statements are overridden, e.g. by executing

cp -p $TRACE_FILE /directory/you/want/to/save/the/buffer/in/trace.<run number>

Adding TRACE statements to your fragment generator

Adding TRACE statements to your fragment generator is quite simple. First, you want to be sure to add the following #include to your source file:

#include "artdaq/DAQdata/Globals.hh"
And then, in your code, you can add lines such as the following:
TLOG(an_integer_with_the_level_of_this_message, a_string_with_the_name_of_this_message) << "A string containing the contents of this message, which can include " << a_variable << " since the syntax is simply C++ streaming syntax";
a concrete example of this would look like
TLOG(40, "ToySimulator") << "About to send fragment with sequence ID " << fragment_ptr->seqID() << ", this is a test of TRACE";
When you rebuild your code, you can then edit setupDUNEARTDAQ_forRunning as described earlied to make sure that the TRACE you added is enabled so it will be sent to buffer during running. Please note that, rather than just an integer as the first argument to TLOG, you can also pass TLVL_ERROR, TLVL_WARNING, TLVL_INFO and TLVL_DEBUG (not surrounded by quotes). These actually correspond to TRACE levels 0, 1, 2 and 3, respectively, and are special in that the first three will appear in stdout and in logfiles automatically (i.e., TRACE levels 0, 1, and 2 are automatically enabled).

For further details on TRACE, you can refer to its Fermilab Redmine wiki, https://cdcvs.fnal.gov/redmine/projects/trace/wiki

-- JohnChristianFreeman - 2019-12-04

Edit | Attach | Watch | Print version | History: r5 < r4 < r3 < r2 < r1 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r5 - 2019-12-04 - JohnChristianFreeman
 
    • Cern Search Icon Cern Search
    • TWiki Search Icon TWiki Search
    • Google Search Icon Google Search

    CENF All webs login

This site is powered by the TWiki collaboration platform Powered by PerlCopyright &© 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
or Ideas, requests, problems regarding TWiki? use Discourse or Send feedback