1N/A=head1 NAME
1N/A
1N/Aperldebguts - Guts of Perl debugging
1N/A
1N/A=head1 DESCRIPTION
1N/A
1N/AThis is not the perldebug(1) manpage, which tells you how to use
1N/Athe debugger. This manpage describes low-level details concerning
1N/Athe debugger's internals, which range from difficult to impossible
1N/Ato understand for anyone who isn't incredibly intimate with Perl's guts.
1N/ACaveat lector.
1N/A
1N/A=head1 Debugger Internals
1N/A
1N/APerl has special debugging hooks at compile-time and run-time used
1N/Ato create debugging environments. These hooks are not to be confused
1N/Awith the I<perl -Dxxx> command described in L<perlrun>, which is
1N/Ausable only if a special Perl is built per the instructions in the
1N/AF<INSTALL> podpage in the Perl source tree.
1N/A
1N/AFor example, whenever you call Perl's built-in C<caller> function
1N/Afrom the package C<DB>, the arguments that the corresponding stack
1N/Aframe was called with are copied to the C<@DB::args> array. These
1N/Amechanisms are enabled by calling Perl with the B<-d> switch.
1N/ASpecifically, the following additional features are enabled
1N/A(cf. L<perlvar/$^P>):
1N/A
1N/A=over 4
1N/A
1N/A=item *
1N/A
1N/APerl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require
1N/A'perl5db.pl'}> if not present) before the first line of your program.
1N/A
1N/A=item *
1N/A
1N/AEach array C<@{"_<$filename"}> holds the lines of $filename for a
1N/Afile compiled by Perl. The same is also true for C<eval>ed strings
1N/Athat contain subroutines, or which are currently being executed.
1N/AThe $filename for C<eval>ed strings looks like C<(eval 34)>.
1N/ACode assertions in regexes look like C<(re_eval 19)>.
1N/A
1N/AValues in this array are magical in numeric context: they compare
1N/Aequal to zero only if the line is not breakable.
1N/A
1N/A=item *
1N/A
1N/AEach hash C<%{"_<$filename"}> contains breakpoints and actions keyed
1N/Aby line number. Individual entries (as opposed to the whole hash)
1N/Aare settable. Perl only cares about Boolean true here, although
1N/Athe values used by F<perl5db.pl> have the form
1N/AC<"$break_condition\0$action">.
1N/A
1N/AThe same holds for evaluated strings that contain subroutines, or
1N/Awhich are currently being executed. The $filename for C<eval>ed strings
1N/Alooks like C<(eval 34)> or C<(re_eval 19)>.
1N/A
1N/A=item *
1N/A
1N/AEach scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
1N/Aalso the case for evaluated strings that contain subroutines, or
1N/Awhich are currently being executed. The $filename for C<eval>ed
1N/Astrings looks like C<(eval 34)> or C<(re_eval 19)>.
1N/A
1N/A=item *
1N/A
1N/AAfter each C<require>d file is compiled, but before it is executed,
1N/AC<DB::postponed(*{"_<$filename"})> is called if the subroutine
1N/AC<DB::postponed> exists. Here, the $filename is the expanded name of
1N/Athe C<require>d file, as found in the values of %INC.
1N/A
1N/A=item *
1N/A
1N/AAfter each subroutine C<subname> is compiled, the existence of
1N/AC<$DB::postponed{subname}> is checked. If this key exists,
1N/AC<DB::postponed(subname)> is called if the C<DB::postponed> subroutine
1N/Aalso exists.
1N/A
1N/A=item *
1N/A
1N/AA hash C<%DB::sub> is maintained, whose keys are subroutine names
1N/Aand whose values have the form C<filename:startline-endline>.
1N/AC<filename> has the form C<(eval 34)> for subroutines defined inside
1N/AC<eval>s, or C<(re_eval 19)> for those within regex code assertions.
1N/A
1N/A=item *
1N/A
1N/AWhen the execution of your program reaches a point that can hold a
1N/Abreakpoint, the C<DB::DB()> subroutine is called if any of the variables
1N/AC<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true. These variables
1N/Aare not C<local>izable. This feature is disabled when executing
1N/Ainside C<DB::DB()>, including functions called from it
1N/Aunless C<< $^D & (1<<30) >> is true.
1N/A
1N/A=item *
1N/A
1N/AWhen execution of the program reaches a subroutine call, a call to
1N/AC<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the
1N/Aname of the called subroutine. (This doesn't happen if the subroutine
1N/Awas compiled in the C<DB> package.)
1N/A
1N/A=back
1N/A
1N/ANote that if C<&DB::sub> needs external data for it to work, no
1N/Asubroutine call is possible without it. As an example, the standard
1N/Adebugger's C<&DB::sub> depends on the C<$DB::deep> variable
1N/A(it defines how many levels of recursion deep into the debugger you can go
1N/Abefore a mandatory break). If C<$DB::deep> is not defined, subroutine
1N/Acalls are not possible, even though C<&DB::sub> exists.
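
The value formats listed above are simple enough to take apart by hand.
What follows is a minimal sketch, not code taken from F<perl5db.pl>: the
helper names C<parse_sub_location> and C<split_breakpoint> are hypothetical,
and the sketch assumes only the C<filename:startline-endline> and
C<"$break_condition\0$action"> layouts described in this section.

```perl
use strict;
use warnings;

# Hypothetical helper: split a %DB::sub value of the form
# "filename:startline-endline".  The greedy (.*) keeps any colons
# that belong to the filename attached to the filename.
sub parse_sub_location {
    my ($value) = @_;
    my ($file, $start, $end) = $value =~ /^(.*):(\d+)-(\d+)$/s
        or return;
    return ($file, $start, $end);
}

# Hypothetical helper: split a perl5db.pl-style breakpoint entry of the
# form "$break_condition\0$action" into its two parts.  The action is
# undef when the entry holds no "\0" separator.
sub split_breakpoint {
    my ($entry) = @_;
    my ($condition, $action) = split /\0/, $entry, 2;
    return ($condition, $action);
}

my ($file, $start, $end) = parse_sub_location('(eval 34):2-5');
print "$file lines $start..$end\n";
```

Remember that Perl itself only tests the hash entries for truth; the
two-part layout is a convention of the standard debugger, so a debugger of
your own is free to store something else.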
1N/A
1N/A=head2 Writing Your Own Debugger
1N/A
1N/A=head3 Environment Variables
1N/A
1N/AThe C<PERL5DB> environment variable can be used to define a debugger.
1N/AFor example, the minimal "working" debugger (it actually doesn't do anything)
1N/Aconsists of one line:
1N/A
1N/A sub DB::DB {}
1N/A
1N/AIt can easily be defined like this:
1N/A
1N/A $ PERL5DB="sub DB::DB {}" perl -d your-script
1N/A
1N/AAnother brief debugger, slightly more useful, can be created
1N/Awith only the line:
1N/A
1N/A sub DB::DB {print ++$i; scalar <STDIN>}
1N/A
1N/AThis debugger prints a number which increments for each statement
1N/Aencountered and waits for you to hit a newline before continuing
1N/Ato the next statement.
1N/A
1N/AThe following debugger is actually useful:
1N/A
1N/A {
1N/A package DB;
1N/A sub DB {}
1N/A sub sub {print ++$i, " $sub\n"; &$sub}
1N/A }
1N/A
1N/AIt prints the sequence number of each subroutine call and the name of the
1N/Acalled subroutine. Note that C<&DB::sub> is being compiled into the
1N/Apackage C<DB> through the use of the C<package> directive.
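
As a further sketch along the same lines, the debugger below prints the
file, line number, and source text of each statement it is handed. It relies
only on the built-in C<caller> function and on the C<@{"_<$filename"}>
arrays described earlier. The C<@DB::seen> array and the file name
F<mydb.pl> mentioned afterwards are hypothetical names chosen for this
illustration; they mean nothing to Perl.

```perl
use strict;
use warnings;

package DB;

our @seen;    # hypothetical log of visited statements

sub DB {
    # Inside DB::DB, caller() reports the statement about to be run.
    my ($package, $file, $line) = caller;
    no strict 'refs';
    # Under -d, @{"::_<$file"} holds the source lines of $file;
    # without -d the array is empty, so fall back to an empty string.
    my $text = ${"::_<$file"}[$line];
    $text = '' unless defined $text;
    chomp $text;
    push @seen, "$file:$line";
    print STDERR "$file:$line: $text\n";
}

package main;

1;
```

Assuming the code is saved as F<mydb.pl> in the current directory, it could
be loaded in the same way the default debugger is, for instance with
C<PERL5DB='BEGIN { require "./mydb.pl" }' perl -d your-script>.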
1N/A
1N/AWhen it starts, the debugger reads your rc file (F<./.perldb> or
1N/AF<~/.perldb> under Unix), which can set important options.
1N/A(A subroutine (C<&afterinit>) can be defined here as well; it is executed
1N/Aafter the debugger completes its own initialization.)
1N/A
1N/AAfter the rc file is read, the debugger reads the PERLDB_OPTS
1N/Aenvironment variable and uses it to set debugger options. The
1N/Acontents of this variable are treated as if they were the argument
1N/Aof an C<o ...> debugger command (q.v. in L<perldebug/Options>).
1N/A
=head3 Debugger internal variables

In addition to the file and subroutine-related variables mentioned above,
the debugger also maintains various magical internal variables.
1N/A
1N/A=over 4
1N/A
1N/A=item *
1N/A
1N/AC<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which
1N/Aholds the lines of the currently-selected file (compiled by Perl), either
1N/Aexplicitly chosen with the debugger's C<f> command, or implicitly by flow
1N/Aof execution.
1N/A
1N/AValues in this array are magical in numeric context: they compare
1N/Aequal to zero only if the line is not breakable.
1N/A
1N/A=item *
1N/A
C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which
1N/Acontains breakpoints and actions keyed by line number in
1N/Athe currently-selected file, either explicitly chosen with the
1N/Adebugger's C<f> command, or implicitly by flow of execution.
1N/A
1N/AAs previously noted, individual entries (as opposed to the whole hash)
1N/Aare settable. Perl only cares about Boolean true here, although
1N/Athe values used by F<perl5db.pl> have the form
1N/AC<"$break_condition\0$action">.
1N/A
1N/A=back
1N/A
1N/A=head3 Debugger customization functions
1N/A
1N/ASome functions are provided to simplify customization.
1N/A
1N/A=over 4
1N/A
1N/A=item *
1N/A
C<DB::parse_options(string)> parses debugger options; see
L<perldebug/Options> for a description of the options recognized.
1N/A
1N/A=item *
1N/A
1N/AC<DB::dump_trace(skip[,count])> skips the specified number of frames
1N/Aand returns a list containing information about the calling frames (all
of them, if C<count> is missing). Each entry is a reference to a hash
1N/Awith keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
1N/Aname, or info about C<eval>), C<args> (C<undef> or a reference to
1N/Aan array), C<file>, and C<line>.
1N/A
1N/A=item *
1N/A
1N/AC<DB::print_trace(FH, skip[, count[, short]])> prints
1N/Aformatted info about caller frames. The last two functions may be
1N/Aconvenient as arguments to C<< < >>, C<< << >> commands.
1N/A
1N/A=back
1N/A
1N/ANote that any variables and functions that are not documented in
this manpage (or in L<perldebug>) are considered for internal
1N/Ause only, and as such are subject to change without notice.
1N/A
1N/A=head1 Frame Listing Output Examples
1N/A
1N/AThe C<frame> option can be used to control the output of frame
1N/Ainformation. For example, contrast this expression trace:
1N/A
1N/A $ perl -de 42
1N/A Stack dump during die enabled outside of evals.
1N/A
1N/A Loading DB routines from perl5db.pl patch level 0.94
1N/A Emacs support available.
1N/A
1N/A Enter h or `h h' for help.
1N/A
1N/A main::(-e:1): 0
1N/A DB<1> sub foo { 14 }
1N/A
1N/A DB<2> sub bar { 3 }
1N/A
1N/A DB<3> t print foo() * bar()
 main::((eval 172):3): print foo() * bar();
1N/A main::foo((eval 168):2):
1N/A main::bar((eval 170):2):
1N/A 42
1N/A
1N/Awith this one, once the C<o>ption C<frame=2> has been set:
1N/A
1N/A DB<4> o f=2
1N/A frame = '2'
1N/A DB<5> t print foo() * bar()
1N/A 3: foo() * bar()
1N/A entering main::foo
1N/A 2: sub foo { 14 };
1N/A exited main::foo
1N/A entering main::bar
1N/A 2: sub bar { 3 };
1N/A exited main::bar
1N/A 42
1N/A
1N/ABy way of demonstration, we present below a laborious listing
1N/Aresulting from setting your C<PERLDB_OPTS> environment variable to
1N/Athe value C<f=n N>, and running I<perl -d -V> from the command line.
Examples using various values of C<n> are shown to give you a feel
for the difference between settings. Long though it may be, this
is not a complete listing, but only excerpts.
1N/A
1N/A=over 4
1N/A
1N/A=item 1
1N/A
1N/A entering main::BEGIN
1N/A entering Config::BEGIN
1N/A Package lib/Exporter.pm.
1N/A Package lib/Carp.pm.
1N/A Package lib/Config.pm.
1N/A entering Config::TIEHASH
1N/A entering Exporter::import
1N/A entering Exporter::export
1N/A entering Config::myconfig
1N/A entering Config::FETCH
1N/A entering Config::FETCH
1N/A entering Config::FETCH
1N/A entering Config::FETCH
1N/A
1N/A=item 2
1N/A
1N/A entering main::BEGIN
1N/A entering Config::BEGIN
1N/A Package lib/Exporter.pm.
1N/A Package lib/Carp.pm.
1N/A exited Config::BEGIN
1N/A Package lib/Config.pm.
1N/A entering Config::TIEHASH
1N/A exited Config::TIEHASH
1N/A entering Exporter::import
1N/A entering Exporter::export
1N/A exited Exporter::export
1N/A exited Exporter::import
1N/A exited main::BEGIN
1N/A entering Config::myconfig
1N/A entering Config::FETCH
1N/A exited Config::FETCH
1N/A entering Config::FETCH
1N/A exited Config::FETCH
1N/A entering Config::FETCH
1N/A
1N/A=item 4
1N/A
1N/A in $=main::BEGIN() from /dev/null:0
1N/A in $=Config::BEGIN() from lib/Config.pm:2
1N/A Package lib/Exporter.pm.
1N/A Package lib/Carp.pm.
1N/A Package lib/Config.pm.
1N/A in $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
1N/A in @=Config::myconfig() from /dev/null:0
1N/A in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
1N/A in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
1N/A in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
1N/A in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
1N/A in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
1N/A in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574
1N/A
1N/A=item 6
1N/A
1N/A in $=main::BEGIN() from /dev/null:0
1N/A in $=Config::BEGIN() from lib/Config.pm:2
1N/A Package lib/Exporter.pm.
1N/A Package lib/Carp.pm.
1N/A out $=Config::BEGIN() from lib/Config.pm:0
1N/A Package lib/Config.pm.
1N/A in $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A out $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
1N/A out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
1N/A out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A out $=main::BEGIN() from /dev/null:0
1N/A in @=Config::myconfig() from /dev/null:0
1N/A in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
1N/A out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
1N/A in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
1N/A out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
1N/A in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
1N/A out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
1N/A in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
1N/A
1N/A=item 14
1N/A
1N/A in $=main::BEGIN() from /dev/null:0
1N/A in $=Config::BEGIN() from lib/Config.pm:2
1N/A Package lib/Exporter.pm.
1N/A Package lib/Carp.pm.
1N/A out $=Config::BEGIN() from lib/Config.pm:0
1N/A Package lib/Config.pm.
1N/A in $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A out $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
1N/A out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
1N/A out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A out $=main::BEGIN() from /dev/null:0
1N/A in @=Config::myconfig() from /dev/null:0
1N/A in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
1N/A out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
1N/A in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
1N/A out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
1N/A
1N/A=item 30
1N/A
1N/A in $=CODE(0x15eca4)() from /dev/null:0
1N/A in $=CODE(0x182528)() from lib/Config.pm:2
1N/A Package lib/Exporter.pm.
1N/A out $=CODE(0x182528)() from lib/Config.pm:0
1N/A scalar context return from CODE(0x182528): undef
1N/A Package lib/Config.pm.
1N/A in $=Config::TIEHASH('Config') from lib/Config.pm:628
1N/A out $=Config::TIEHASH('Config') from lib/Config.pm:628
1N/A scalar context return from Config::TIEHASH: empty hash
1N/A in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
1N/A out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
1N/A scalar context return from Exporter::export: ''
1N/A out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A scalar context return from Exporter::import: ''
1N/A
1N/A=back
1N/A
1N/AIn all cases shown above, the line indentation shows the call tree.
1N/AIf bit 2 of C<frame> is set, a line is printed on exit from a
1N/Asubroutine as well. If bit 4 is set, the arguments are printed
1N/Aalong with the caller info. If bit 8 is set, the arguments are
1N/Aprinted even if they are tied or references. If bit 16 is set, the
1N/Areturn value is printed, too.
1N/A
1N/AWhen a package is compiled, a line like this
1N/A
1N/A Package lib/Carp.pm.
1N/A
1N/Ais printed with proper indentation.
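
The bit values of C<frame> described above can be decoded mechanically. The
following is a small sketch under that description only: the helper name
C<describe_frame> and the wording of the labels are inventions for this
example, and only the four bits discussed in the text are covered.

```perl
use strict;
use warnings;

# Bit values of the "frame" option as described in the text above.
my @FRAME_BITS = (
    [  2 => 'print a line on subroutine exit' ],
    [  4 => 'print arguments with the caller info' ],
    [  8 => 'print arguments even if tied or references' ],
    [ 16 => 'print return values' ],
);

# Hypothetical helper: list the behaviours selected by a frame value.
sub describe_frame {
    my ($frame) = @_;
    return map { $_->[1] } grep { $frame & $_->[0] } @FRAME_BITS;
}

for my $value (2, 6, 14, 30) {
    print "frame=$value: ", join('; ', describe_frame($value)), "\n";
}
```

The values in the loop are the ones used for the listings above, so the
decoded descriptions can be compared against the excerpts directly.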
1N/A
1N/A=head1 Debugging regular expressions
1N/A
1N/AThere are two ways to enable debugging output for regular expressions.
1N/A
1N/AIf your perl is compiled with C<-DDEBUGGING>, you may use the
1N/AB<-Dr> flag on the command line.
1N/A
1N/AOtherwise, one can C<use re 'debug'>, which has effects at
1N/Acompile time and run time. It is not lexically scoped.
1N/A
1N/A=head2 Compile-time output
1N/A
1N/AThe debugging output at compile time looks like this:
1N/A
1N/A Compiling REx `[bc]d(ef*g)+h[ij]k$'
1N/A size 45 Got 364 bytes for offset annotations.
1N/A first at 1
1N/A rarest char g at 0
1N/A rarest char d at 0
1N/A 1: ANYOF[bc](12)
1N/A 12: EXACT <d>(14)
1N/A 14: CURLYX[0] {1,32767}(28)
1N/A 16: OPEN1(18)
1N/A 18: EXACT <e>(20)
1N/A 20: STAR(23)
1N/A 21: EXACT <f>(0)
1N/A 23: EXACT <g>(25)
1N/A 25: CLOSE1(27)
1N/A 27: WHILEM[1/1](0)
1N/A 28: NOTHING(29)
1N/A 29: EXACT <h>(31)
1N/A 31: ANYOF[ij](42)
1N/A 42: EXACT <k>(44)
1N/A 44: EOL(45)
1N/A 45: END(0)
1N/A anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
1N/A stclass `ANYOF[bc]' minlen 7
1N/A Offsets: [45]
1N/A 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
1N/A 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
1N/A 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
1N/A 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
1N/A Omitting $` $& $' support.
1N/A
1N/AThe first line shows the pre-compiled form of the regex. The second
1N/Ashows the size of the compiled form (in arbitrary units, usually
1N/A4-byte words) and the total number of bytes allocated for the
1N/Aoffset/length table, usually 4+C<size>*8. The next line shows the
1N/Alabel I<id> of the first node that does a match.
1N/A
1N/AThe
1N/A
1N/A anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
1N/A stclass `ANYOF[bc]' minlen 7
1N/A
1N/Aline (split into two lines above) contains optimizer
1N/Ainformation. In the example shown, the optimizer found that the match
1N/Ashould contain a substring C<de> at offset 1, plus substring C<gh>
1N/Aat some offset between 3 and infinity. Moreover, when checking for
1N/Athese substrings (to abandon impossible matches quickly), Perl will check
1N/Afor the substring C<gh> before checking for the substring C<de>. The
1N/Aoptimizer may also use the knowledge that the match starts (at the
1N/AC<first> I<id>) with a character class, and no string
1N/Ashorter than 7 characters can possibly match.
1N/A
1N/AThe fields of interest which may appear in this line are
1N/A
1N/A=over 4
1N/A
1N/A=item C<anchored> I<STRING> C<at> I<POS>
1N/A
1N/A=item C<floating> I<STRING> C<at> I<POS1..POS2>
1N/A
1N/ASee above.
1N/A
=item C<checking floating/anchored>
1N/A
1N/AWhich substring to check first.
1N/A
1N/A=item C<minlen>
1N/A
1N/AThe minimal length of the match.
1N/A
1N/A=item C<stclass> I<TYPE>
1N/A
1N/AType of first matching node.
1N/A
1N/A=item C<noscan>
1N/A
1N/ADon't scan for the found substrings.
1N/A
1N/A=item C<isall>
1N/A
1N/AMeans that the optimizer information is all that the regular
1N/Aexpression contains, and thus one does not need to enter the regex engine at
1N/Aall.
1N/A
1N/A=item C<GPOS>
1N/A
1N/ASet if the pattern contains C<\G>.
1N/A
1N/A=item C<plus>
1N/A
1N/ASet if the pattern starts with a repeated char (as in C<x+y>).
1N/A
1N/A=item C<implicit>
1N/A
1N/ASet if the pattern starts with C<.*>.
1N/A
1N/A=item C<with eval>
1N/A
Set if the pattern contains eval-groups, such as C<(?{ code })> and
1N/AC<(??{ code })>.
1N/A
1N/A=item C<anchored(TYPE)>
1N/A
If the pattern may match only at a handful of places, with C<TYPE>
being C<BOL>, C<MBOL>, or C<GPOS>. See the table below.
1N/A
1N/A=back
1N/A
1N/AIf a substring is known to match at end-of-line only, it may be
1N/Afollowed by C<$>, as in C<floating `k'$>.
1N/A
The optimizer-specific information is used to avoid entering the (slow) regex
engine on strings that will definitely not match. If the C<isall> flag
is set, a call to the regex engine may be avoided even when the optimizer
found an appropriate place for the match.
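
When many patterns need to be compared, it can help to pull these fields
out of the optimizer line programmatically. The sketch below assumes the
exact quoting style of the sample output above (a backquote before and an
apostrophe after each substring); C<parse_optimizer_info> and its hash keys
are hypothetical names, and substrings that themselves contain an
apostrophe are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the documented fields from the optimizer
# information, given as one string (the two sample lines joined).
sub parse_optimizer_info {
    my ($line) = @_;
    my %info;
    @info{qw(anchored anchored_at)} = ($1, $2)
        if $line =~ /anchored `([^']*)'\$? at (\d+)/;
    @info{qw(floating floating_min floating_max)} = ($1, $2, $3)
        if $line =~ /floating `([^']*)'\$? at (\d+)\.\.(\d+)/;
    $info{checking} = $1 if $line =~ /\(checking (floating|anchored)\)/;
    $info{stclass}  = $1 if $line =~ /stclass `([^']*)'/;
    $info{minlen}   = $1 if $line =~ /minlen (\d+)/;
    return \%info;
}

my $info = parse_optimizer_info(
    "anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating) "
  . "stclass `ANYOF[bc]' minlen 7"
);
print "$_ = $info->{$_}\n" for sort keys %$info;
```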
1N/A
Above the optimizer section is the list of I<nodes> of the compiled
form of the regex. Each line has the format
1N/A
1N/AC< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)
1N/A
1N/A=head2 Types of nodes
1N/A
1N/AHere are the possible types, with short descriptions:
1N/A
1N/A # TYPE arg-description [num-args] [longjump-len] DESCRIPTION
1N/A
1N/A # Exit points
1N/A END no End of program.
1N/A SUCCEED no Return from a subroutine, basically.
1N/A
1N/A # Anchors:
1N/A BOL no Match "" at beginning of line.
1N/A MBOL no Same, assuming multiline.
1N/A SBOL no Same, assuming singleline.
1N/A EOS no Match "" at end of string.
1N/A EOL no Match "" at end of line.
1N/A MEOL no Same, assuming multiline.
1N/A SEOL no Same, assuming singleline.
1N/A BOUND no Match "" at any word boundary
1N/A BOUNDL no Match "" at any word boundary
1N/A NBOUND no Match "" at any word non-boundary
1N/A NBOUNDL no Match "" at any word non-boundary
1N/A GPOS no Matches where last m//g left off.
1N/A
1N/A # [Special] alternatives
1N/A ANY no Match any one character (except newline).
1N/A SANY no Match any one character.
1N/A ANYOF sv Match character in (or not in) this class.
1N/A ALNUM no Match any alphanumeric character
1N/A ALNUML no Match any alphanumeric char in locale
1N/A NALNUM no Match any non-alphanumeric character
1N/A NALNUML no Match any non-alphanumeric char in locale
1N/A SPACE no Match any whitespace character
1N/A SPACEL no Match any whitespace char in locale
1N/A NSPACE no Match any non-whitespace character
1N/A NSPACEL no Match any non-whitespace char in locale
1N/A DIGIT no Match any numeric character
1N/A NDIGIT no Match any non-numeric character
1N/A
1N/A # BRANCH The set of branches constituting a single choice are hooked
1N/A # together with their "next" pointers, since precedence prevents
1N/A # anything being concatenated to any individual branch. The
1N/A # "next" pointer of the last BRANCH in a choice points to the
1N/A # thing following the whole choice. This is also where the
1N/A # final "next" pointer of each individual branch points; each
1N/A # branch starts with the operand node of a BRANCH node.
1N/A #
1N/A BRANCH node Match this alternative, or the next...
1N/A
1N/A # BACK Normal "next" pointers all implicitly point forward; BACK
1N/A # exists to make loop structures possible.
1N/A # not used
1N/A BACK no Match "", "next" ptr points backward.
1N/A
1N/A # Literals
1N/A EXACT sv Match this string (preceded by length).
1N/A EXACTF sv Match this string, folded (prec. by length).
1N/A EXACTFL sv Match this string, folded in locale (w/len).
1N/A
1N/A # Do nothing
1N/A NOTHING no Match empty string.
1N/A # A variant of above which delimits a group, thus stops optimizations
1N/A TAIL no Match empty string. Can jump here from outside.
1N/A
1N/A # STAR,PLUS '?', and complex '*' and '+', are implemented as circular
1N/A # BRANCH structures using BACK. Simple cases (one character
1N/A # per match) are implemented with STAR and PLUS for speed
1N/A # and to minimize recursive plunges.
1N/A #
1N/A STAR node Match this (simple) thing 0 or more times.
1N/A PLUS node Match this (simple) thing 1 or more times.
1N/A
1N/A CURLY sv 2 Match this simple thing {n,m} times.
1N/A CURLYN no 2 Match next-after-this simple thing
1N/A # {n,m} times, set parens.
1N/A CURLYM no 2 Match this medium-complex thing {n,m} times.
1N/A CURLYX sv 2 Match this complex thing {n,m} times.
1N/A
1N/A # This terminator creates a loop structure for CURLYX
1N/A WHILEM no Do curly processing and see if rest matches.
1N/A
1N/A # OPEN,CLOSE,GROUPP ...are numbered at compile time.
1N/A OPEN num 1 Mark this point in input as start of #n.
1N/A CLOSE num 1 Analogous to OPEN.
1N/A
1N/A REF num 1 Match some already matched string
1N/A REFF num 1 Match already matched string, folded
1N/A REFFL num 1 Match already matched string, folded in loc.
1N/A
1N/A # grouping assertions
1N/A IFMATCH off 1 2 Succeeds if the following matches.
1N/A UNLESSM off 1 2 Fails if the following matches.
1N/A SUSPEND off 1 1 "Independent" sub-regex.
 IFTHEN off 1 1 Switch, should be preceded by switcher.
1N/A GROUPP num 1 Whether the group matched.
1N/A
1N/A # Support for long regex
1N/A LONGJMP off 1 1 Jump far away.
1N/A BRANCHJ off 1 1 BRANCH with long offset.
1N/A
1N/A # The heavy worker
1N/A EVAL evl 1 Execute some Perl code.
1N/A
1N/A # Modifiers
1N/A MINMOD no Next operator is not greedy.
1N/A LOGICAL no Next opcode should set the flag only.
1N/A
1N/A # This is not used yet
1N/A RENUM off 1 1 Group with independently numbered parens.
1N/A
1N/A # This is not really a node, but an optimized away piece of a "long" node.
1N/A # To simplify debugging output, we mark it as if it were a node
1N/A OPTIMIZED off Placeholder for dump.
1N/A
1N/A=for unprinted-credits
1N/ANext section M-J. Dominus (mjd-perl-patch+@plover.com) 20010421
1N/A
1N/AFollowing the optimizer information is a dump of the offset/length
1N/Atable, here split across several lines:
1N/A
1N/A Offsets: [45]
1N/A 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
1N/A 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
1N/A 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
1N/A 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
1N/A
1N/AThe first line here indicates that the offset/length table contains 45
1N/Aentries. Each entry is a pair of integers, denoted by C<offset[length]>.
1N/AEntries are numbered starting with 1, so entry #1 here is C<1[4]> and
1N/Aentry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:>
1N/A(the C<1: ANYOF[bc]>) begins at character position 1 in the
1N/Apre-compiled form of the regex, and has a length of 4 characters.
1N/AC<5[1]> in position 12
1N/Aindicates that the node labeled C<12:>
1N/A(the C<< 12: EXACT <d> >>) begins at character position 5 in the
1N/Apre-compiled form of the regex, and has a length of 1 character.
1N/AC<12[1]> in position 14
1N/Aindicates that the node labeled C<14:>
1N/A(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the
1N/Apre-compiled form of the regex, and has a length of 1 character---that
1N/Ais, it corresponds to the C<+> symbol in the precompiled regex.
1N/A
1N/AC<0[0]> items indicate that there is no corresponding node.
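
The mapping from table entries back to the pattern text can be checked with
a few lines of Perl. This is only a sketch of the reading given above:
C<node_sources> is a hypothetical helper, and it assumes that entries are
numbered from 1, that offsets are 1-based character positions, and that
C<0[0]> marks an absent node.

```perl
use strict;
use warnings;

# Hypothetical helper: map node labels to the piece of the pattern they
# came from, given the pattern and the text of the offset/length dump.
sub node_sources {
    my ($regex, $dump) = @_;
    my %source;
    my $node = 0;
    while ($dump =~ /(\d+)\[(\d+)\]/g) {
        my ($offset, $length) = ($1, $2);
        $node++;
        next unless $offset;    # 0[0]: no corresponding node
        $source{$node} = substr($regex, $offset - 1, $length);
    }
    return \%source;
}

my $dump = <<'END';
 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
END

my $source = node_sources('[bc]d(ef*g)+h[ij]k$', $dump);
printf "%2d: '%s'\n", $_, $source->{$_} for sort { $a <=> $b } keys %$source;
```

Entries with a length of zero, such as C<12[0]>, come out as empty strings:
they mark a position in the pattern rather than a span of characters.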
1N/A
1N/A=head2 Run-time output
1N/A
1N/AFirst of all, when doing a match, one may get no run-time output even
1N/Aif debugging is enabled. This means that the regex engine was never
1N/Aentered and that all of the job was therefore done by the optimizer.
1N/A
1N/AIf the regex engine was entered, the output may look like this:
1N/A
1N/A Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__'
1N/A Setting an EVAL scope, savestack=3
1N/A 2 <ab> <cdefg__gh_> | 1: ANYOF
1N/A 3 <abc> <defg__gh_> | 11: EXACT <d>
1N/A 4 <abcd> <efg__gh_> | 13: CURLYX {1,32767}
1N/A 4 <abcd> <efg__gh_> | 26: WHILEM
1N/A 0 out of 1..32767 cc=effff31c
1N/A 4 <abcd> <efg__gh_> | 15: OPEN1
1N/A 4 <abcd> <efg__gh_> | 17: EXACT <e>
1N/A 5 <abcde> <fg__gh_> | 19: STAR
1N/A EXACT <f> can match 1 times out of 32767...
1N/A Setting an EVAL scope, savestack=3
1N/A 6 <bcdef> <g__gh__> | 22: EXACT <g>
1N/A 7 <bcdefg> <__gh__> | 24: CLOSE1
1N/A 7 <bcdefg> <__gh__> | 26: WHILEM
1N/A 1 out of 1..32767 cc=effff31c
1N/A Setting an EVAL scope, savestack=12
1N/A 7 <bcdefg> <__gh__> | 15: OPEN1
1N/A 7 <bcdefg> <__gh__> | 17: EXACT <e>
1N/A restoring \1 to 4(4)..7
1N/A failed, try continuation...
1N/A 7 <bcdefg> <__gh__> | 27: NOTHING
1N/A 7 <bcdefg> <__gh__> | 28: EXACT <h>
1N/A failed...
1N/A failed...
1N/A
1N/AThe most significant information in the output is about the particular I<node>
1N/Aof the compiled regex that is currently being tested against the target string.
1N/AThe format of these lines is
1N/A
1N/AC< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE>
1N/A
1N/AThe I<TYPE> info is indented with respect to the backtracking level.
1N/AOther incidental information appears interspersed within.
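
Because the node lines follow a fixed layout, they are easy to separate
from the incidental lines around them. The sketch below assumes the layout
given above and nothing more; C<parse_trace_line> is a hypothetical helper,
and it does not cope with a pre-string or post-string that itself contains
C<< > >>.

```perl
use strict;
use warnings;

# Hypothetical helper: split one node line of run-time regex debugging
# output into its fields.  Returns nothing for the interspersed
# informational lines, which do not follow this layout.
sub parse_trace_line {
    my ($line) = @_;
    my ($offset, $pre, $post, $id, $type) =
        $line =~ /^\s*(\d+)\s+<(.*?)>\s+<(.*?)>\s*\|\s*(\d+):\s*(.*?)\s*$/
        or return;
    return +{ offset => $offset, pre => $pre, post => $post,
              id => $id, type => $type };
}

my @lines = (
    ' 3 <abc> <defg__gh_> | 11: EXACT <d>',
    ' 0 out of 1..32767 cc=effff31c',
    ' 6 <bcdef> <g__gh__> | 22: EXACT <g>',
);
for my $line (@lines) {
    my $node = parse_trace_line($line) or next;
    print "node $node->{id} ($node->{type}) tried at offset $node->{offset}\n";
}
```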
1N/A
1N/A=head1 Debugging Perl memory usage
1N/A
1N/APerl is a profligate wastrel when it comes to memory use. There
1N/Ais a saying that to estimate memory usage of Perl, assume a reasonable
1N/Aalgorithm for memory allocation, multiply that estimate by 10, and
1N/Awhile you still may miss the mark, at least you won't be quite so
1N/Aastonished. This is not absolutely true, but may provide a good
1N/Agrasp of what happens.
1N/A
1N/AAssume that an integer cannot take less than 20 bytes of memory, a
1N/Afloat cannot take less than 24 bytes, a string cannot take less
than 32 bytes (all these examples assume 32-bit architectures; the
results are quite a bit worse on 64-bit architectures). If a variable
1N/Ais accessed in two of three different ways (which require an integer,
1N/Aa float, or a string), the memory footprint may increase yet another
1N/A20 bytes. A sloppy malloc(3) implementation can inflate these
1N/Anumbers dramatically.
1N/A
1N/AOn the opposite end of the scale, a declaration like
1N/A
1N/A sub foo;
1N/A
1N/Amay take up to 500 bytes of memory, depending on which release of Perl
1N/Ayou're running.
1N/A
1N/AAnecdotal estimates of source-to-compiled code bloat suggest an
1N/Aeightfold increase. This means that the compiled form of reasonable
1N/A(normally commented, properly indented etc.) code will take
1N/Aabout eight times more space in memory than the code took
1N/Aon disk.
1N/A
1N/AThe B<-DL> command-line switch is obsolete since circa Perl 5.6.0
1N/A(it was available only if Perl was built with C<-DDEBUGGING>).
1N/AThe switch was used to track Perl's memory allocations and possible
1N/Amemory leaks. These days the use of malloc debugging tools like
1N/AF<Purify> or F<valgrind> is suggested instead.
1N/A
1N/AOne way to find out how much memory is being used by Perl data
1N/Astructures is to install the Devel::Size module from CPAN: it gives
1N/Ayou the minimum number of bytes required to store a particular data
structure. Please be mindful of the difference between size()
and total_size().
1N/A
If Perl has been compiled using Perl's malloc, you can analyze Perl
memory usage by setting C<$ENV{PERL_DEBUG_MSTATS}>.
1N/A
1N/A=head2 Using C<$ENV{PERL_DEBUG_MSTATS}>
1N/A
1N/AIf your perl is using Perl's malloc() and was compiled with the
1N/Anecessary switches (this is the default), then it will print memory
1N/Ausage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS}
1N/A> 1 >>, and before termination of the program when C<<
1N/A$ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to
1N/Athe following example:
1N/A
1N/A $ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
1N/A Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
1N/A 14216 free: 130 117 28 7 9 0 2 2 1 0 0
1N/A 437 61 36 0 5
1N/A 60924 used: 125 137 161 55 7 8 6 16 2 0 1
1N/A 74 109 304 84 20
1N/A Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
1N/A Memory allocation statistics after execution: (buckets 4(4)..8188(8192)
1N/A 30888 free: 245 78 85 13 6 2 1 3 2 0 1
1N/A 315 162 39 42 11
1N/A 175816 used: 265 176 1112 111 26 22 11 27 2 1 1
1N/A 196 178 1066 798 39
1N/A Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
1N/A
1N/AIt is possible to ask for such a statistic at arbitrary points in
1N/Ayour execution using the mstat() function out of the standard
1N/ADevel::Peek module.
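
If such reports are collected regularly, the headline numbers can be
extracted with a short script. The following sketch assumes the layout of
the sample report above; C<parse_mstats> and its hash keys are hypothetical
names. In the sample, the free and used totals plus the four "odd ends"
happen to add up to the sbrk() total. Treating that as a general rule is an
assumption of this example, not something the report format promises.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the headline numbers from one block of
# PERL_DEBUG_MSTATS output (either the "after compilation" or the
# "after execution" block), assuming the layout of the sample report.
sub parse_mstats {
    my ($report) = @_;
    my %stats;
    $stats{free} = $1 if $report =~ /^\s*(\d+)\s+free:/m;
    $stats{used} = $1 if $report =~ /^\s*(\d+)\s+used:/m;
    if ($report =~ m{Total sbrk\(\): (\d+)/(\d+):(-?\d+)\. Odd ends: pad\+heads\+chain\+tail: (\d+)\+(\d+)\+(\d+)\+(\d+)\.}) {
        @stats{qw(sbrked sbrks continuous pad heads chain tail)}
            = ($1, $2, $3, $4, $5, $6, $7);
    }
    return \%stats;
}

my $report = <<'END';
 30888 free: 245 78 85 13 6 2 1 3 2 0 1
 315 162 39 42 11
 175816 used: 265 176 1112 111 26 22 11 27 2 1 1
 196 178 1066 798 39
 Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
END

my $s = parse_mstats($report);
my $accounted = $s->{free} + $s->{used}
              + $s->{pad} + $s->{heads} + $s->{chain} + $s->{tail};
print "sbrk()ed: $s->{sbrked}, accounted for: $accounted\n";
```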
1N/A
1N/AHere is some explanation of that format:
1N/A
1N/A=over 4
1N/A
1N/A=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>
1N/A
1N/APerl's malloc() uses bucketed allocations. Every request is rounded
1N/Aup to the closest bucket size available, and a bucket is taken from
1N/Athe pool of buckets of that size.
1N/A
1N/AThe line above describes the limits of buckets currently in use.
1N/AEach bucket has two sizes: memory footprint and the maximal size
1N/Aof user data that can fit into this bucket. Suppose in the above
1N/Aexample that the smallest bucket were size 4. The biggest bucket
1N/Awould have usable size 8188, and the memory footprint would be 8192.
1N/A
1N/AIn a Perl built for debugging, some buckets may have negative usable
1N/Asize. This means that these buckets cannot (and will not) be used.
1N/AFor larger buckets, the memory footprint may be one page greater
than a power of 2. If so, the corresponding power of two is
1N/Aprinted in the C<APPROX> field above.

=item Free/Used

The 1 or 2 rows of numbers following that correspond to the number
of buckets of each size between C<SMALLEST> and C<GREATEST>. In
the first row, the sizes (memory footprints) of buckets are powers
of two, or possibly one page greater. In the second row, if present,
the memory footprints of the buckets are between the memory footprints
of two buckets "above".

In the previous example, the memory footprints are:

     free:    8     16    32    64    128  256 512 1024 2048 4096 8192
           4     12    24    48    80

In a non-C<DEBUGGING> perl, buckets with a footprint of C<128> or more
have a 4-byte overhead, so an 8192-byte bucket can hold allocations
of up to 8188 bytes.

=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>

The first two fields give the total amount of memory perl sbrk(2)ed
(ess-broken? :-) and number of sbrk(2)s used. The third number is
what perl thinks about continuity of returned chunks. So long as
this number is positive, malloc() will assume that it is probable
that sbrk(2) will provide continuous memory.

Memory allocated by external libraries is not counted.

=item C<pad: 0>

The amount of sbrk(2)ed memory needed to keep buckets aligned.

=item C<heads: 2192>

Although memory overhead of bigger buckets is kept inside the bucket, for
smaller buckets, it is kept in separate areas. This field gives the
total size of these areas.

=item C<chain: 0>

malloc() may want to subdivide a bigger bucket into smaller buckets.
If only a part of the deceased bucket is left unsubdivided, the rest
is kept as an element of a linked list. This field gives the total
size of these chunks.

=item C<tail: 6144>

To minimize the number of sbrk(2)s, malloc() asks for more memory. This
field gives the size of the yet unused part, which is sbrk(2)ed, but
never touched.

=back
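
The fields described above can be cross-checked against the sample
report. The sketch below recomputes the C<free> and C<used> totals of
the "after compilation" report from the bucket counts. It assumes the
footprints listed under Free/Used and a 4-byte overhead for footprints
of 128 and up. That the totals count usable bytes, and that they add
up with the "odd ends" to the sbrk() total, is an observation about
this sample, not a documented guarantee. The helper names are
illustrative.

```perl
use strict;
use warnings;

# Memory footprints of the buckets in the sample report, row by row.
my @row1_footprints = (8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192);
my @row2_footprints = (4, 12, 24, 48, 80);

# Usable size: footprints from 128 up carry a 4-byte overhead
# (non-DEBUGGING perl, as described above).
sub usable {
    my ($footprint) = @_;
    return $footprint >= 128 ? $footprint - 4 : $footprint;
}

# Sum count * usable size over both rows of a "free" or "used" entry.
sub total {
    my ($row1, $row2) = @_;
    my $sum = 0;
    $sum += $row1->[$_] * usable($row1_footprints[$_]) for 0 .. $#$row1;
    $sum += $row2->[$_] * usable($row2_footprints[$_]) for 0 .. $#$row2;
    return $sum;
}

# Counts copied from the "after compilation" report.
my $free = total([130, 117, 28, 7, 9, 0, 2, 2, 1, 0, 0],
                 [437, 61, 36, 0, 5]);
my $used = total([125, 137, 161, 55, 7, 8, 6, 16, 2, 0, 1],
                 [74, 109, 304, 84, 20]);

# pad + heads + chain + tail from the "Odd ends" part of the same report.
my $odd_ends = 0 + 636 + 0 + 2048;

print "free: $free\n";                            # 14216, as reported
print "used: $used\n";                            # 60924, as reported
print "sum:  ", $free + $used + $odd_ends, "\n";  # 77824, the sbrk() total
```

The same arithmetic applied to the "after execution" report gives
30888, 175816, and 215040.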

=head2 Example of using B<-DL> switch

(Note that -DL is obsolete since circa 5.6.0, and even before that
Perl needed to be compiled with -DDEBUGGING.)

Below we show how to analyse memory usage by

  do 'lib/auto/POSIX/autosplit.ix';

The file in question contains a header and 146 lines similar to

  sub getcwd;

B<WARNING>: The discussion below assumes a 32-bit architecture. In
newer releases of Perl, memory usage of the constructs discussed
here is greatly improved, but the story discussed below is a real-life
story. This story is mercilessly terse, and assumes rather more than cursory
knowledge of Perl internals. Type space to continue, `q' to quit.
(Actually, you just want to skip to the next section.)

Here is the itemized list of Perl allocations performed during parsing
of this file:

 !!! "after" at test.pl line 3.
    Id  subtot   4   8  12  16  20  24  28  32  36  40  48  56  64  72  80 80+
  0 02   13752   .   .   .   . 294   .   .   .   .   .   .   .   .   .   .   4
  0 54    5545   .   .   8 124  16   .   .   .   1   1   .   .   .   .   .   3
  5 05      32   .   .   .   .   .   .   .   1   .   .   .   .   .   .   .   .
  6 02    7152   .   .   .   .   .   .   .   .   .   . 149   .   .   .   .   .
  7 02    3600   .   .   .   .   . 150   .   .   .   .   .   .   .   .   .   .
  7 03      64   .  -1   .   1   .   .   2   .   .   .   .   .   .   .   .   .
  7 04    7056   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   7
  7 17   38404   .   .   .   .   .   .   .   1   .   . 442 149   .   . 147   .
  9 03    2078  17 249  32   .   .   .   .   2   .   .   .   .   .   .   .   .

To see this list, insert two C<warn('!...')> statements around the call:

  warn('!');
  do 'lib/auto/POSIX/autosplit.ix';
  warn('!!! "after"');

and run it with Perl's B<-DL> option. The first warn() will print
memory allocation info before parsing the file and will memorize
the statistics at this point (we ignore what it prints). The second
warn() prints increments with respect to these memorized data. This
is the printout shown above.

Different I<Id>s on the left correspond to different subsystems of
the perl interpreter. They are just the first argument given to
the perl memory allocation API named New(). To find what C<9 03>
means, just B<grep> the perl source for C<903>. You'll find it in
F<util.c>, in the function savepvn(). (I know, you wonder why we told
you to B<grep> and then gave away the answer. That's because grepping
the source is good for the soul.) This function is used to store
a copy of an existing chunk of memory. Using a C debugger, one can
see that the function was called either directly from gv_init() or
via sv_magic(), and that gv_init() is called from gv_fetchpv(), which
was itself called from newSUB(). Please stop to catch your breath now.

B<NOTE>: To reach this point in the debugger and skip the calls to
savepvn() during the compilation of the main program, you should
set a C breakpoint in Perl_warn(), continue until this point is
reached, and I<then> set a C breakpoint in Perl_savepvn(). Note that
you may need to skip a handful of Perl_savepvn() calls that do not
correspond to mass production of CVs (there are more C<903>
allocations than 146 similar lines of
F<lib/auto/POSIX/autosplit.ix>). Note also that C<Perl_> prefixes are
added by macroization code in perl header files to avoid conflicts
with external libraries.

Anyway, we see that C<903> ids correspond to the creation of globs,
twice per glob: once for the glob name, and once for the glob
stringification magic.
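
The "twice per glob" reading can be checked against the listing with
a little arithmetic. The sketch below sums the allocation counts in
the C<9 03> row and compares them with two allocations for each of
the 146 C<sub NAME;> lines. The variable names are illustrative.

```perl
use strict;
use warnings;

# The "9 03" row copied from the listing above.
my $row = '9 03 2078 17 249 32 . . . . 2 . . . . . . . .';

# Two Id tokens, the subtotal in bytes, then one count per size column.
my ($id_hi, $id_lo, $subtotal, @counts) = split ' ', $row;

# A "." marks an empty column; everything else is an allocation count.
my $allocations = 0;
$allocations += $_ for grep { $_ ne '.' } @counts;

my $subs     = 146;         # "sub NAME;" lines in autosplit.ix
my $expected = 2 * $subs;   # glob name + stringification magic per glob

printf "Id %s%s: %d allocations, %d expected, %d extra\n",
    $id_hi, $id_lo, $allocations, $expected, $allocations - $expected;
# prints "Id 903: 300 allocations, 292 expected, 8 extra"
```

The 8 extra allocations are consistent with the handful of
Perl_savepvn() calls mentioned in the note above.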

Here are explanations for other I<Id>s above:

=over 4

=item C<717>

Creates bigger C<XPV*> structures. In the case above, it
creates 3 C<AV>s per subroutine, one for a list of lexical variable
names, one for a scratchpad (which contains lexical variables and
C<targets>), and one for the array of scratchpads needed for
recursion.

It also creates a C<GV> and a C<CV> per subroutine, all called from
start_subparse().

=item C<002>

Creates a C array corresponding to the C<AV> of scratchpads and the
scratchpad itself. The first fake entry of this scratchpad is
created even though the subroutine itself is not defined yet.

It also creates C arrays to keep data for the stash. This is one HV,
but it grows; thus, there are 4 big allocations: the big chunks are not
freed, but are kept as additional arenas for C<SV> allocations.

=item C<054>

Creates a C<HEK> for the name of the glob for the subroutine. This
name is a key in a I<stash>.

Big allocations with this I<Id> correspond to allocations of new
arenas to keep C<HE>.

=item C<602>

Creates a C<GP> for the glob for the subroutine.

=item C<702>

Creates the C<MAGIC> for the glob for the subroutine.

=item C<704>

Creates I<arenas> which keep SVs.

=back

=head2 B<-DL> details

If Perl is run with the B<-DL> option, then warn()s that start with `!'
behave specially. They print a list of I<categories> of memory
allocations, and statistics of allocations of different sizes for
these categories.

If the warn() string starts with

=over 4

=item C<!!!>

Print changed categories only, and print the differences in counts of
allocations.

=item C<!!>

Print grown categories only, and print the absolute values of counts
and totals.

=item C<!>

Print nonempty categories, and print the absolute values of counts and
totals.

=back
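
The three prefixes overlap, so the longest one has to be tested
first. The following sketch restates the rules above as a lookup. It
is not how perl implements them: C<dl_report_mode> and its return
values are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map a warn() message to the report it would trigger.
sub dl_report_mode {
    my ($message) = @_;

    # Test the longest prefix first, since "!!!" also starts with "!".
    return +{ categories => 'changed',  counts => 'differences' }
        if $message =~ /^!!!/;
    return +{ categories => 'grown',    counts => 'absolute' }
        if $message =~ /^!!/;
    return +{ categories => 'nonempty', counts => 'absolute' }
        if $message =~ /^!/;

    return;   # an ordinary warning: no statistics
}

for my $message ('!', '!! growth', '!!! "after"', 'plain warning') {
    my $mode = dl_report_mode($message);
    printf "%-15s => %s\n", $message,
        $mode ? "$mode->{categories} categories, $mode->{counts}"
              : 'no report';
}
```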

=head2 Limitations of B<-DL> statistics

If an extension or external library does not use the Perl API to
allocate memory, such allocations are not counted.

=head1 SEE ALSO

L<perldebug>,
L<perlguts>,
L<perlrun>,
L<re>,
and
L<Devel::DProf>.