Carp::Datum - Debugging And Tracing Ultimate Module
    # In modules
    use Carp::Datum;

    # Programming by contract
    sub routine {
        DFEATURE my $f_, "optional message"; # $f_ is a lexical lvalue here
        my ($a, $b) = @_;
        DREQUIRE $a > $b, "a > b";
        $a += 1; $b += 1;
        DASSERT $a > $b, "ordering a > b preserved";
        my $result = $b - $a;
        DENSURE $result < 0;
        return DVAL $result;
    }

    # Tracing
    DTRACE "this is a debug message";
    DTRACE TRC_NOTICE, "note: a = ", $a, " is positive";
    DTRACE {-level => TRC_NOTICE, -marker => "!!"}, "note with marker";

    # Returning
    return DVAL $scalar;     # single value
    return DARY @list;       # list of values

    # In application's main
    use Carp::Datum qw(:all on);     # turns Datum "on" or "off"

    DLOAD_CONFIG(-file => "debug.cf", -config => "config string");
The Carp::Datum module brings powerful debugging and tracing features
to development code: automatic flow tracing, returned value tracing,
assertions, and debugging traces. Its various functions may be customized
dynamically (i.e. at run time) via a configuration language allowing
selective activation on a routine, file, or object type basis. See
the Carp::Datum::Cfg manpage for configuration details.
Carp::Datum traces are implemented on top of Log::Agent and go to its
debugging channel. This lets the application have full control of the
final destination of the debugging information (logfile, syslog, etc.).
Carp::Datum can be globally turned on or off by the application. It is
off by default, which means no control flow tracing (routine entry and exit),
and no returned value tracing. However, assertions are still fully monitored,
and the DTRACE calls are redirected to Log::Agent.
The C version of Carp::Datum is implemented with macros, which may
be redefined to nothing to remove all assertions in the released
code. The Perl version cannot be handled that way, but comes with
a Carp::Datum::Strip module that will lexically remove all the
assertions, leaving only DTRACE calls. Modules using Carp::Datum
can make use of Carp::Datum::MakeMaker in their Makefile.PL to
request stripping at build time. See the Carp::Datum::MakeMaker manpage for
instructions.
Here is a small example showing what traces look like, and what happens by
default on assertion failure. Since Log::Agent is not being customized, the
debugging channel is STDERR. In real life, one would probably
customize Log::Agent with a file driver, and redirect the debug channel
to a file separate from both STDOUT and STDERR.
First, the script, with line numbers:

     1 #!/usr/bin/perl
     2
     3 use Carp::Datum qw(:all on);
     4
     5 DFEATURE my $f_;
     6
     7 show_inv(2, 0.5, 0);
     8
     9 sub show_inv {
    10     DFEATURE my $f_;
    11     foreach (@_) {
    12         print "Inverse of $_ is ", inv($_), "\n";
    13     }
    14     return DVOID;
    15 }
    16
    17 sub inv {
    18     DFEATURE my $f_;
    19     my ($x) = @_;
    20     DREQUIRE $x != 0, "x=$x not null";
    21     return DVAL 1 / $x;
    22 }
What goes to STDOUT:
    Inverse of 2 is 0.5
    Inverse of 0.5 is 2
    FATAL: PANIC: pre-condition FAILED: x=0 not null ($x != 0) [./demo:20]
The debugging output on STDERR:
       +-> global [./demo:5]
       |  +-> main::show_inv(2, 0.5, 0) from global at ./demo:7 [./demo:10]
       |  |  +-> main::inv(2) from main::show_inv() at ./demo:12 [./demo:18]
       |  |  |  Returning: (0.5) [./demo:21]
       |  |  +-< main::inv(2) from main::show_inv() at ./demo:12
       |  |  +-> main::inv(0.5) from main::show_inv() at ./demo:12 [./demo:18]
       |  |  |  Returning: (2) [./demo:21]
       |  |  +-< main::inv(0.5) from main::show_inv() at ./demo:12
       |  |  +-> main::inv(0) from main::show_inv() at ./demo:12 [./demo:18]
    !! |  |  |  pre-condition FAILED: x=0 not null ($x != 0) [./demo:20]
    !! |  |  |    main::inv(0) called at ./demo line 12
    !! |  |  |    main::show_inv(2, 0.5, 0) called at ./demo line 7
    ** |  |  |  FATAL: PANIC: pre-condition FAILED: x=0 not null ($x != 0) [./demo:20]
       |  |  +-< main::inv(0) from main::show_inv() at ./demo:12
       |  +-< main::show_inv(2, 0.5, 0) from global at ./demo:7
       +-< global
The last three lines were manually re-ordered for this manpage: because of the
pre-condition failure, Perl enters its global object destruction routine,
and the destruction order of the lexicals is not right. The $f_ in show_inv()
is destroyed before the one in inv(), resulting in the inversion. It has been
fixed here to better please the eye. The PANIC is also emitted when the
pre-condition failure is detected, but it would have messed up the trace example.
Note that the stack dump is prefixed with the ``!!'' token, and the fatal error is tagged with ``**''. This is a visual aid only, to quickly locate troubles in logfiles by catching the eye.
Routine entry and exit are tagged, returned values and parameters are
shown, and the immediate caller of each routine is also traced. The
final tags "from global at ./demo:7 [./demo:10]" refer to the file
name (here the script used was called ``demo'') and the line number
where the call to the Carp::Datum routine is made: here the
DFEATURE at line 10. They also indicate the caller origin: here, the
call is made at line 7 of file demo.
The special name ``global'' (without trailing () marker) is used to indicate that the caller is the main script, i.e. there is no calling routine.
Returned values in inv() are traced as ``(0.5)'' and ``(2)'', and not as ``0.5''
and ``2'' as one would expect, because the routine was called in non-scalar
context (within a print statement).
The Programming by Contract paradigm was introduced by Bertrand Meyer in his Object Oriented Software Construction book, and later implemented natively in the Eiffel language. It is very simple, yet extremely powerful.
Each feature (routine) of a program is viewed externally as a supplier for
some service. For instance, the sqrt() routine computes the square root
of any positive number. The computation could be verified, but
sqrt() probably provides an efficient algorithm for that, and it has already
been written and validated.
However, sqrt() is only defined for positive numbers. Giving a negative
number to it is not correct. In the old days, before Programming by Contract
was formalized, people implemented that restriction by testing the argument
x of sqrt(), and doing so in the routine itself to factorize code. Then, on
error, sqrt() would return -1 for instance (which cannot be a valid square
root for a real number), and the desired quantity otherwise. The caller then
had to check the returned value to determine whether an error had occurred.
Here it is easy, but in languages where no out-of-band value such as Perl's
undef is available, it can be quite difficult to both report an error and
return a result.
With Programming by Contract, the logic is reversed, and the code is greatly simplified:
the caller promises to call sqrt() only with a positive argument, and in return
sqrt() promises to always return the square root of its argument.
What are the benefits of such a gentlemen's agreement? The code of the sqrt()
routine is much simpler (meaning fewer bugs) because it does not have
to bother with handling the case of negative arguments, since the caller
promised to never call with such invalid values. And the code of the caller
is at worst as complex as before (one test to check that the argument is
positive, against a check for an error code) and at best less complex: if it is
known that the value is positive, it doesn't even have to be checked, for instance
if it is the result of an abs() call.
But if there's no explicit test in sqrt() to trap the case, what happens
if sqrt() is given a negative value, despite the promise never to do so?
Well, it's a bug, and it's a bug in the caller, not in the sqrt() routine.
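The two styles can be contrasted in a short pure-Perl sketch. It does not use Carp::Datum: the names checked_sqrt and contract_sqrt are hypothetical, and a plain die stands in for the pre-condition check that DREQUIRE would perform.

```perl
use strict;
use warnings;

# Old style: the routine tests its own argument and reports failure
# through an out-of-band value (undef here) that the caller must test.
sub checked_sqrt {
    my ($x) = @_;
    return undef if $x < 0;
    return sqrt($x);
}

# Contract style: the pre-condition is stated once at the top.
# A violation is a bug in the caller and is reported as such.
sub contract_sqrt {
    my ($x) = @_;
    die "pre-condition FAILED: x=$x not negative\n" unless $x >= 0;
    return sqrt($x);
}

# This caller knows its value cannot be negative, so it needs no test.
my $root = contract_sqrt(abs(-9));
print "root = $root\n";
```

A caller of checked_sqrt has to test every result for undef, whereas a caller of contract_sqrt only has to make sure the argument is valid, which is sometimes known for free, as with abs() above.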
To find those bugs, one usually monitors the assertions (pre- and post-conditions, plus any other assertion in the code, which is both a post-condition for the code above and a pre-condition for the code below, at the same time) during testing. When the product is released, assertions are no longer checked.
Each routine is equipped with a set of pre-conditions and post-conditions. A routine r is therefore defined as:

    r(x)
        pre-condition
        body
        post-condition
The pre- and post-conditions are expressions involving the parameters of r(),
here only x, and, for the post-condition, the returned value of r() as well.
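Conditions that clients are able to check are made visible to them, and become
the routine's contract, which can be written as: the caller promises to call
r() only when its pre-condition is satisfied, and r() promises in return that
its post-condition holds when it returns.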
In object-oriented programming, pre- and post-conditions can also use internal attributes of the object, but they then become debugging checks that everything happens correctly (in the proper state, the proper order, etc.) and cannot be part of the contract (for external users of the class): clients cannot check that the pre-condition is true, because they do not have access to the internal attributes.
Furthermore, in object-oriented programming, a redefined feature must weaken
the pre-condition of its parent feature and strengthen its post-condition.
It can also keep them as-is. To fully understand why, it's best to read
Meyer. Intuitively, it's easy to understand why the pre-condition cannot
be strengthened and why the post-condition cannot be weakened: because of
dynamic binding, a caller of r() only has the static type of the object, not
its dynamic type. Therefore, it cannot know in advance which of the routines
in the inheritance tree will be called.
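As a rough illustration of that rule, here is a plain-Perl sketch that does not use Carp::Datum. The Account and PremiumAccount classes and their fee() routine are invented for the example, and die stands in for the assertion routines.

```perl
use strict;
use warnings;

package Account;

sub new { my ($class) = @_; return bless {}, $class }

# Pre-condition:  amount > 0
# Post-condition: fee >= 0
sub fee {
    my ($self, $amount) = @_;
    die "pre-condition FAILED: amount > 0\n" unless $amount > 0;
    my $fee = $amount / 10;
    die "post-condition FAILED: fee >= 0\n" unless $fee >= 0;
    return $fee;
}

package PremiumAccount;
our @ISA = ('Account');

# Weaker pre-condition:    amount >= 0 (accepts all the parent accepts, plus 0)
# Stronger post-condition: 0 <= fee <= amount / 10
sub fee {
    my ($self, $amount) = @_;
    die "pre-condition FAILED: amount >= 0\n" unless $amount >= 0;
    my $fee = $amount / 20;
    die "post-condition FAILED: 0 <= fee <= amount / 10\n"
        unless $fee >= 0 && $fee <= $amount / 10;
    return $fee;
}

package main;

# The caller only knows it holds some kind of Account, so it honours the
# parent's contract; that is enough for either implementation.
for my $acct (Account->new, PremiumAccount->new) {
    my $fee = $acct->fee(100);
    print ref($acct), ": fee = $fee\n";
}
```

Had PremiumAccount required, say, amount > 50, a caller written against Account could break without having done anything wrong, which is why strengthening the pre-condition is ruled out.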
If a pre-condition is so important that it needs to always be
monitored, even within the released product, then Carp::Datum
provides VERIFY, a pre-condition that will always be checked
(i.e. never stripped by Carp::Datum::Strip). It can be used to protect
the external interface of a module against abuse.
With Carp::Datum, pre-conditions can be given using DREQUIRE or VERIFY.
Assertions are written with DASSERT and post-conditions given by DENSURE.
Although all assertions could be expressed with only DASSERT,
stating whether it's a pre-condition with DREQUIRE also has
a commentary value for the reader. Moreover, one day, there might be an
automatic tool to extract the pre- and post-conditions of all the routines
for documentation purposes, and if all assertions are called DASSERT,
the tool will have a hard time figuring out which is what.
Moreover, remember that a pre-condition failure always means a bug in the caller, whilst other assertion failures mean a bug near the place of failure. If only for that, it's worth making the distinction.
DFEATURE my $f_, "optional message"
This statement is placed at the top of a routine. Do not omit the my,
which is very important to ensure that what is going to be stored in the
lexically scoped $f_ variable will be destroyed when the routine ends.
Any name can be used for that lexical, but $f_ is recommended because it is
both unlikely to conflict with any real variable and short.
The optional comment part will be printed in the logs at routine entry time, and can be used to flag object constructors, for instance, for easier grep'ing in the logs afterwards.
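Why the lexical matters can be shown with a generic guard object, in plain Perl. This is only a sketch of the scoping mechanism, not a description of how Carp::Datum is implemented: the ScopeGuard class and its log array are hypothetical.

```perl
use strict;
use warnings;

package ScopeGuard;     # hypothetical class, for illustration only

our @log;               # collects entry and exit marks

# Creating the guard records the routine entry.
sub new {
    my ($class, $name) = @_;
    push @log, "+-> $name";
    return bless { name => $name }, $class;
}

# Perl calls DESTROY when the last reference to the guard disappears,
# which for a "my" variable happens when its enclosing scope ends.
sub DESTROY {
    my ($self) = @_;
    push @log, "+-< $self->{name}";
}

package main;

sub inv {
    my $guard = ScopeGuard->new("inv");   # destroyed when inv() returns
    my ($x) = @_;
    return 1 / $x;
}

my $r = inv(2);
print "$_\n" for @ScopeGuard::log;
print "result = $r\n";
```

If the guard were stored in a package variable instead of a lexical, it would outlive the call and the exit mark would not be recorded when the routine returns.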
return DVOID
Use this form in place of a bare return from a routine.
It allows tracing of the return statement.
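return DVAL scalar
Use this form when returning a single value. Do not put any parentheses
around the scalar, or it will be incorrectly stripped by
Carp::Datum::Strip. Examples:

    return DVAL 5;                      # OK
    return DVAL ($a == 1) ? 2 : 4;      # WRONG (has parenthesis)
    return DVAL (1, 2, 4);              # WRONG (and will return 4)

    my $x = ($a == 1) ? 2 : 4;
    return DVAL $x;                     # OK

    return DVAL &foo();                 # Will be traced as array context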
Using DVAL allows tracing of the returned value.
return DARY (list)
Use this form when returning a list of values:

    return DARY @x;
If a routine returns something different depending on its calling context, then write:

    return DARY @x if wantarray;
    return DVAL $x;
Be very careful with that, otherwise the program will behave differently
when the DARY and DVAL tokens are stripped by Carp::Datum::Strip,
thereby raising subtle bugs.
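The context test itself is ordinary Perl. The sketch below shows it without Carp::Datum, so that the behaviour expected after stripping is easy to see; the items routine is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical routine whose result depends on the calling context.
sub items {
    my @x = (10, 20, 30);
    return @x if wantarray;     # list context: every element
    return scalar @x;           # scalar context: the element count
}

my @all   = items();            # called in list context
my $count = items();            # called in scalar context

print "list: @all\n";
print "count: $count\n";
```

The first return corresponds to the DARY form and the second to the DVAL form: each branch hands back a value that is already right for its context, so removing the tokens does not change what the caller receives.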
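DREQUIRE expr, tag
Specifies a pre-condition expr, along with a tag that is printed when the
pre-condition fails. The tag can be used to dump the faulty value, for instance:

    DREQUIRE $x > 0, "x = $x positive";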
The tag is optional and may be left off.
VERIFY expr, tag
This behaves like DREQUIRE, except that it will not be stripped
by Carp::Datum::Strip and that it will always be monitored and cause a
fatal error, whatever dynamic configuration is set up.
DENSURE expr, tag
Specifies a post-condition expr.
DASSERT expr, tag
Specifies an assertion expr.
Tracing is ensured by the DTRACE routine, which is never stripped. When
Carp::Datum is off, traces are redirected to Log::Agent (the channel then
depends on the level of the trace).
The following forms can be used, from the simpler to the more complex:

    DTRACE "the variable x+1 is ", $x + 1, " and y is $y";
    DTRACE TRC_WARNING, "a warning message";
    DTRACE { -level => TRC_CRITICAL, -marker => "##" }, "very critical";
The first call emits a trace at the TRC_DEBUG level, by default. The
second call emits a warning at the TRC_WARNING level, and the last call
emits a TRC_CRITICAL message prefixed with a marker.
Markers are 2-char strings emitted in the very first columns of the
debugging output, and can be used to put emphasis on specific messages.
Internally, Carp::Datum and Log::Agent use the following markers:

    !!    assertion failure and stack trace
    **    critical errors, fatal if not trapped by eval {}
    >>    a message emitted via a Log::Agent routine, not DTRACE
The table below lists the available TRC_ levels defined by Carp::Datum,
and how they remap to Log::Agent routines when Carp::Datum is off:

    Carp::Datum      Log::Agent
    -------------    -------------
    TRC_EMERGENCY    logdie
    TRC_ALERT        logerr
    TRC_CRITICAL     logerr
    TRC_ERROR        logerr
    TRC_WARNING      logwarn
    TRC_NOTICE       logsay
    TRC_INFO         logtrc "info"
    TRC_DEBUG        logtrc "debug"
If an application does not configure Log::Agent specifically, all the calls
map nicely to perl's native routines (die, warn and print).
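equiv expr1, expr2
Returns true when expr1 and expr2 have the same truth value.
implies expr1, expr2
Returns the truth value of "expr1 implies expr2", which is the same as:

    !expr1 || expr2

It is always true except when expr1 is true and expr2 is false.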
Warning: this is a function, not a macro. That is to say, both arguments are evaluated, and there is no short-circuit when expr1 is false.
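The consequence of that warning can be observed with plain-Perl stand-ins. The routines my_implies and my_equiv below are hypothetical re-implementations written for this sketch, not the module's own code, and expensive is an invented routine that counts its calls.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two logical helpers.
sub my_implies {
    my ($p, $q) = @_;
    return (!$p || $q) ? 1 : 0;
}

sub my_equiv {
    my ($p, $q) = @_;
    return (($p ? 1 : 0) == ($q ? 1 : 0)) ? 1 : 0;
}

my $calls = 0;
sub expensive { $calls++; return 1 }

# Function form: arguments are evaluated before the call, so expensive()
# runs even though the first argument is already false.
my $r1 = my_implies(0, expensive());

# Operator form: || short-circuits, so expensive() is not called here.
my $r2 = (!0 || expensive()) ? 1 : 0;

print "r1 = $r1, r2 = $r2, calls = $calls\n";
```

Both forms give the same truth value; only the function form pays for evaluating the second operand, which matters when that operand is costly or has side effects.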
The Carp::Datum module can be turned on or off. This indication must
be given when the module is imported in the main program, as follows:

    # In application's main
    use Carp::Datum qw(:all on);     # to turn on
    use Carp::Datum qw(:all off);    # to turn off
When Carp::Datum is turned off, most of the specific functions
(DFEATURE, ...) continue to be invoked during program execution,
but they return immediately. In detail, all the tracing functions are
disconnected, while the contracts (DASSERT, DREQUIRE, DENSURE) continue to
be verified: an assertion failure will stop the program.
That leads to a tiny performance loss when running a production
release, but the delivered code can still be debugged easily. If
performance is a problem in a production release, a stripper program is
available that can remove the Carp::Datum calls from a source file,
leaving only the DTRACE calls (see the Carp::Datum::Strip manpage).
To turn debugging on or off according to an environment variable, the module can be imported as follows:

    # In application's main
    use Carp::Datum (":all", $ENV{DATUM});

    # as a preamble to the program execution
    # in your favorite shell (here /bin/ksh)
    export DATUM=on          # to turn on
    export DATUM=off         # to turn off
The dynamic configuration is loaded when the DLOAD_CONFIG function
is invoked in the main program. The function takes either a filename,
or a configuration string given directly, or both.

    DLOAD_CONFIG(-file => "./debug.cf");    # filename

    - or -

    DLOAD_CONFIG(-config => <<EOM);         # string
    routine "show_inv" {
        all(yes);
        flow(no);
        trace(no);
        return(no);
    }
    EOM
The syntax used in the file or the one of the config string is described in the Carp::Datum::Cfg manpage.
The dynamic setting makes it possible to filter the debug traces at run time. For instance, one can force a routine to be silent.
As an important note, the dynamic configuration is effective only when the global debug switch is turned on.
It's not possible to insert tracing hooks like DFEATURE or DVAL
in stringification overloading routines. For DFEATURE, that is because
the argument list might be dumped, and printing $self will re-invoke
the stringification routine recursively. For DVAL, this is implied by
the fact that there cannot be any DFEATURE in the routine, hence DVAL
cannot be used.
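The recursion can be reproduced in plain Perl. In this sketch the Point class, the trace_args routine (a stand-in for any hook that dumps its arguments) and the depth counters are all hypothetical; the depth guard exists only so that the example terminates.

```perl
use strict;
use warnings;

package Point;
use overload '""' => \&to_string;

our $depth    = 0;      # current nesting of to_string() calls
our $max_seen = 0;      # deepest nesting observed

sub new { my ($class, %args) = @_; return bless { %args }, $class }

# Hypothetical stand-in for a hook that dumps the argument list:
# interpolating each argument stringifies it.
sub trace_args {
    my @args = @_;
    return join(", ", map { "$_" } @args);
}

sub to_string {
    my ($self) = @_;
    local $depth = $depth + 1;
    $max_seen = $depth if $depth > $max_seen;
    # Dumping $self calls to_string() again. Without the depth guard
    # this would recurse until Perl gives up.
    trace_args($self) if $depth < 3;
    return "($self->{x}, $self->{y})";
}

package main;

my $p = Point->new(x => 1, y => 2);
my $s = "$p";
print "p = $s, deepest nesting = $Point::max_seen\n";
```

A single stringification of $p ends up entering to_string() three times here, once per level allowed by the guard, which is the behaviour an argument-dumping hook would trigger without bound.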
Please report any bugs to the current maintainer.
The seed of the Carp::Datum module started to grow in 1996 when
Raphael Manfredi and Christophe Dehaudt were involved in a tricky
project involving a kernel environment. It was Christophe's first experience
with Programming by Contract principles. Raphael was already familiar with
the concept due to his participation in the development of the
Eiffel compiler.
Written in C, the first release was based on pre-processor macros. It already distinguished the pre-conditions, post-conditions and assertions. Also included were the concepts of dynamic configuration and flow tracing. The benefit of this single include file was very important: the final integration was very short and, since then, no major bug has been reported on the delivered product.
Based on this first success, they leveraged the techniques for developments in C++. The debug module was upgraded with the necessary notions required for true OO programming in C++.
The Perl module was produced in 2000, when Raphael and Christophe needed
for Perl the same powerful support that they had initiated a few years prior.
Before the first official release in spring 2001, they developed
several other Perl modules and applications (mainly related to CGI
programming) that were powered by Carp::Datum. Some of them have
also been published on CPAN (for instance: CGI::Mxscreen).
Christophe Dehaudt and Raphael Manfredi are the original authors.
Send bug reports, hints, tips, suggestions to Dave Hoover at <squirrel@cpan.org>.
Carp::Datum::Cfg(3), Carp::Datum::MakeMaker(3), Carp::Datum::Strip(3), Log::Agent(3).