distrib/pod/perldebguts.pod

1N/A=head1 NAME
1N/A
1N/Aperldebguts - Guts of Perl debugging
1N/A
1N/A=head1 DESCRIPTION
1N/A
1N/AThis is not the perldebug(1) manpage, which tells you how to use
1N/Athe debugger.  This manpage describes low-level details concerning
1N/Athe debugger's internals, which range from difficult to impossible
1N/Ato understand for anyone who isn't incredibly intimate with Perl's guts.
1N/ACaveat lector.
1N/A
1N/A=head1 Debugger Internals
1N/A
1N/APerl has special debugging hooks at compile-time and run-time used
1N/Ato create debugging environments.  These hooks are not to be confused
1N/Awith the I<perl -Dxxx> command described in L<perlrun>, which is
1N/Ausable only if a special Perl is built per the instructions in the
1N/AF<INSTALL> podpage in the Perl source tree.
1N/A
1N/AFor example, whenever you call Perl's built-in C<caller> function
1N/Afrom the package C<DB>, the arguments that the corresponding stack
1N/Aframe was called with are copied to the C<@DB::args> array.  These
1N/Amechanisms are enabled by calling Perl with the B<-d> switch.
1N/ASpecifically, the following additional features are enabled
1N/A(cf. L<perlvar/$^P>):
1N/A
1N/A=over 4
1N/A
1N/A=item *
1N/A
1N/APerl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require
1N/A'perl5db.pl'}> if not present) before the first line of your program.
1N/A
1N/A=item *
1N/A
1N/AEach array C<@{"_<$filename"}> holds the lines of $filename for a
1N/Afile compiled by Perl.  The same is also true for C<eval>ed strings
1N/Athat contain subroutines, or which are currently being executed.
1N/AThe $filename for C<eval>ed strings looks like C<(eval 34)>.
1N/ACode assertions in regexes look like C<(re_eval 19)>.
1N/A
1N/AValues in this array are magical in numeric context: they compare
1N/Aequal to zero only if the line is not breakable.
1N/A
1N/A=item *
1N/A
1N/AEach hash C<%{"_<$filename"}> contains breakpoints and actions keyed
1N/Aby line number.  Individual entries (as opposed to the whole hash)
1N/Aare settable.  Perl only cares about Boolean true here, although
1N/Athe values used by F<perl5db.pl> have the form
1N/AC<"$break_condition\0$action">.
1N/A
1N/AThe same holds for evaluated strings that contain subroutines, or
1N/Awhich are currently being executed.  The $filename for C<eval>ed strings
looks like C<(eval 34)> or C<(re_eval 19)>.
1N/A
1N/A=item *
1N/A
1N/AEach scalar C<${"_<$filename"}> contains C<"_<$filename">.  This is
1N/Aalso the case for evaluated strings that contain subroutines, or
1N/Awhich are currently being executed.  The $filename for C<eval>ed
1N/Astrings looks like C<(eval 34)> or C<(re_eval 19)>.
1N/A
1N/A=item *
1N/A
1N/AAfter each C<require>d file is compiled, but before it is executed,
1N/AC<DB::postponed(*{"_<$filename"})> is called if the subroutine
1N/AC<DB::postponed> exists.  Here, the $filename is the expanded name of
1N/Athe C<require>d file, as found in the values of %INC.
1N/A
1N/A=item *
1N/A
1N/AAfter each subroutine C<subname> is compiled, the existence of
1N/AC<$DB::postponed{subname}> is checked.  If this key exists,
1N/AC<DB::postponed(subname)> is called if the C<DB::postponed> subroutine
1N/Aalso exists.
1N/A
1N/A=item *
1N/A
1N/AA hash C<%DB::sub> is maintained, whose keys are subroutine names
1N/Aand whose values have the form C<filename:startline-endline>.
1N/AC<filename> has the form C<(eval 34)> for subroutines defined inside
1N/AC<eval>s, or C<(re_eval 19)> for those within regex code assertions.
1N/A
1N/A=item *
1N/A
1N/AWhen the execution of your program reaches a point that can hold a
1N/Abreakpoint, the C<DB::DB()> subroutine is called if any of the variables
1N/AC<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true.  These variables
1N/Aare not C<local>izable.  This feature is disabled when executing
1N/Ainside C<DB::DB()>, including functions called from it
1N/Aunless C<< $^D & (1<<30) >> is true.
1N/A
1N/A=item *
1N/A
1N/AWhen execution of the program reaches a subroutine call, a call to
1N/AC<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the
1N/Aname of the called subroutine. (This doesn't happen if the subroutine
1N/Awas compiled in the C<DB> package.)
1N/A
1N/A=back
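
As a small illustration of the C<$DB::single> hook in the list above, a
program can set the variable itself to request a stop at the next
breakable statement.  This is only a sketch: the subroutine name
C<process_record> and its threshold are hypothetical, and without B<-d>
the assignment is just a harmless store to a package variable.

  use strict;
  use warnings;

  # Hypothetical worker: ask the debugger to stop when the input looks odd.
  sub process_record {
      my ($n) = @_;
      no warnings 'once';      # $DB::single is mentioned only here
      # Under "perl -d", a true $DB::single makes Perl call DB::DB()
      # at the next breakable statement; otherwise nothing special happens.
      $DB::single = 1 if $n > 100;
      return $n * 2;
  }

  print process_record(7), "\n";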
1N/A
1N/ANote that if C<&DB::sub> needs external data for it to work, no
1N/Asubroutine call is possible without it. As an example, the standard
1N/Adebugger's C<&DB::sub> depends on the C<$DB::deep> variable
1N/A(it defines how many levels of recursion deep into the debugger you can go
1N/Abefore a mandatory break).  If C<$DB::deep> is not defined, subroutine
1N/Acalls are not possible, even though C<&DB::sub> exists.
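
The C<@DB::args> behaviour mentioned at the start of this section can be
sketched as follows.  The helper name C<args_of_caller> is hypothetical.
The sketch assumes, as modules such as L<Carp> do, that this side effect
of C<caller> is available even when Perl is not run with B<-d>.

  use strict;
  use warnings;

  # Hypothetical helper: return the arguments of the subroutine that
  # called us, using the @DB::args side effect of caller().
  sub args_of_caller {
      package DB;              # caller() must be compiled in package DB
      my @frame = caller(1);   # list context, one frame above this sub
      return @DB::args;        # filled in by the caller(1) call above
  }

  sub demo { return args_of_caller() }

  my @seen = demo('alpha', 42);
  print "demo was called with: @seen\n";

Note that C<caller> only fills in C<@DB::args> when it is called in list
context with an argument, which is why the result is assigned to an array.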
1N/A
1N/A=head2 Writing Your Own Debugger
1N/A
1N/A=head3 Environment Variables
1N/A
1N/AThe C<PERL5DB> environment variable can be used to define a debugger.
1N/AFor example, the minimal "working" debugger (it actually doesn't do anything)
1N/Aconsists of one line:
1N/A
1N/A  sub DB::DB {}
1N/A
1N/AIt can easily be defined like this:
1N/A
1N/A  $ PERL5DB="sub DB::DB {}" perl -d your-script
1N/A
1N/AAnother brief debugger, slightly more useful, can be created
1N/Awith only the line:
1N/A
1N/A  sub DB::DB {print ++$i; scalar <STDIN>}
1N/A
1N/AThis debugger prints a number which increments for each statement
1N/Aencountered and waits for you to hit a newline before continuing
1N/Ato the next statement.
1N/A
1N/AThe following debugger is actually useful:
1N/A
1N/A  {
1N/A    package DB;
1N/A    sub DB  {}
1N/A    sub sub {print ++$i, " $sub\n"; &$sub}
1N/A  }
1N/A
1N/AIt prints the sequence number of each subroutine call and the name of the
1N/Acalled subroutine.  Note that C<&DB::sub> is being compiled into the
1N/Apackage C<DB> through the use of the C<package> directive.
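
A line tracer can be sketched along the same lines.  The C<@DB::visited>
array is a hypothetical name introduced only for this illustration; the
sketch relies on C<caller>, which inside C<DB::DB> reports the statement
about to be executed.

  {
      package DB;

      our @visited;    # hypothetical log of "file:line" strings

      # Under -d, Perl calls this before each breakable statement,
      # provided $DB::single, $DB::trace or $DB::signal is true.
      sub DB {
          my ($package, $file, $line) = caller;
          push @visited, "$file:$line";
          print STDERR "$file:$line\n";
      }
  }

  1;    # so that the file can be loaded with require

Saved under a name of your choice (say F<./mytrace.pl>, a hypothetical
file name), it could be loaded by setting C<PERL5DB> to
C<BEGIN { require "./mytrace.pl" }> before running C<perl -d your-script>.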
1N/A
1N/AWhen it starts, the debugger reads your rc file (F<./.perldb> or
1N/AF<~/.perldb> under Unix), which can set important options.
1N/A(A subroutine (C<&afterinit>) can be defined here as well; it is executed
1N/Aafter the debugger completes its own initialization.)
1N/A
1N/AAfter the rc file is read, the debugger reads the PERLDB_OPTS
1N/Aenvironment variable and uses it to set debugger options. The
1N/Acontents of this variable are treated as if they were the argument
1N/Aof an C<o ...> debugger command (q.v. in L<perldebug/Options>).
1N/A
=head3 Debugger internal variables

In addition to the file and subroutine-related variables mentioned above,
the debugger also maintains various magical internal variables.
1N/A
1N/A=over 4
1N/A
1N/A=item *
1N/A
1N/AC<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which
1N/Aholds the lines of the currently-selected file (compiled by Perl), either
1N/Aexplicitly chosen with the debugger's C<f> command, or implicitly by flow
1N/Aof execution.
1N/A
1N/AValues in this array are magical in numeric context: they compare
1N/Aequal to zero only if the line is not breakable.
1N/A
1N/A=item *
1N/A
C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which
1N/Acontains breakpoints and actions keyed by line number in
1N/Athe currently-selected file, either explicitly chosen with the
1N/Adebugger's C<f> command, or implicitly by flow of execution.
1N/A
1N/AAs previously noted, individual entries (as opposed to the whole hash)
1N/Aare settable.  Perl only cares about Boolean true here, although
1N/Athe values used by F<perl5db.pl> have the form
1N/AC<"$break_condition\0$action">.
1N/A
1N/A=back
1N/A
1N/A=head3 Debugger customization functions
1N/A
1N/ASome functions are provided to simplify customization.
1N/A
1N/A=over 4
1N/A
1N/A=item *
1N/A
C<DB::parse_options(string)> parses debugger options; see
L<perldebug/Options> for a description of the options recognized.
1N/A
1N/A=item *
1N/A
1N/AC<DB::dump_trace(skip[,count])> skips the specified number of frames
1N/Aand returns a list containing information about the calling frames (all
of them, if C<count> is missing).  Each entry is a reference to a hash
1N/Awith keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
1N/Aname, or info about C<eval>), C<args> (C<undef> or a reference to
1N/Aan array), C<file>, and C<line>.
1N/A
1N/A=item *
1N/A
C<DB::print_trace(FH, skip[, count[, short]])> prints
formatted info about caller frames.  The last two functions may be
convenient as arguments to the C<< < >> and C<< << >> commands.
1N/A
1N/A=back
1N/A
Note that any variables and functions that are not documented in
this manpage (or in L<perldebug>) are considered to be for internal
use only, and as such are subject to change without notice.
1N/A
1N/A=head1 Frame Listing Output Examples
1N/A
1N/AThe C<frame> option can be used to control the output of frame
1N/Ainformation.  For example, contrast this expression trace:
1N/A
1N/A $ perl -de 42
1N/A Stack dump during die enabled outside of evals.
1N/A
1N/A Loading DB routines from perl5db.pl patch level 0.94
1N/A Emacs support available.
1N/A
1N/A Enter h or `h h' for help.
1N/A
1N/A main::(-e:1):   0
1N/A   DB<1> sub foo { 14 }
1N/A
1N/A   DB<2> sub bar { 3 }
1N/A
1N/A   DB<3> t print foo() * bar()
 main::((eval 172):3):   print foo() * bar();
1N/A main::foo((eval 168):2):
1N/A main::bar((eval 170):2):
1N/A 42
1N/A
1N/Awith this one, once the C<o>ption C<frame=2> has been set:
1N/A
1N/A   DB<4> o f=2
1N/A                frame = '2'
1N/A   DB<5> t print foo() * bar()
1N/A 3:      foo() * bar()
1N/A entering main::foo
1N/A  2:     sub foo { 14 };
1N/A exited main::foo
1N/A entering main::bar
1N/A  2:     sub bar { 3 };
1N/A exited main::bar
1N/A 42
1N/A
1N/ABy way of demonstration, we present below a laborious listing
1N/Aresulting from setting your C<PERLDB_OPTS> environment variable to
1N/Athe value C<f=n N>, and running I<perl -d -V> from the command line.
Examples using various values of C<n> are shown to give you a feel
for the difference between settings.  Long though it may be, this
is not a complete listing, but only excerpts.
1N/A
1N/A=over 4
1N/A
1N/A=item 1
1N/A
1N/A  entering main::BEGIN
1N/A   entering Config::BEGIN
1N/A    Package lib/Exporter.pm.
1N/A    Package lib/Carp.pm.
1N/A   Package lib/Config.pm.
1N/A   entering Config::TIEHASH
1N/A   entering Exporter::import
1N/A    entering Exporter::export
1N/A  entering Config::myconfig
1N/A   entering Config::FETCH
1N/A   entering Config::FETCH
1N/A   entering Config::FETCH
1N/A   entering Config::FETCH
1N/A
1N/A=item 2
1N/A
1N/A  entering main::BEGIN
1N/A   entering Config::BEGIN
1N/A    Package lib/Exporter.pm.
1N/A    Package lib/Carp.pm.
1N/A   exited Config::BEGIN
1N/A   Package lib/Config.pm.
1N/A   entering Config::TIEHASH
1N/A   exited Config::TIEHASH
1N/A   entering Exporter::import
1N/A    entering Exporter::export
1N/A    exited Exporter::export
1N/A   exited Exporter::import
1N/A  exited main::BEGIN
1N/A  entering Config::myconfig
1N/A   entering Config::FETCH
1N/A   exited Config::FETCH
1N/A   entering Config::FETCH
1N/A   exited Config::FETCH
1N/A   entering Config::FETCH
1N/A
1N/A=item 4
1N/A
1N/A  in  $=main::BEGIN() from /dev/null:0
1N/A   in  $=Config::BEGIN() from lib/Config.pm:2
1N/A    Package lib/Exporter.pm.
1N/A    Package lib/Carp.pm.
1N/A   Package lib/Config.pm.
1N/A   in  $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A   in  $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A    in  $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
1N/A  in  @=Config::myconfig() from /dev/null:0
1N/A   in  $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
1N/A   in  $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
1N/A   in  $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
1N/A   in  $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
1N/A   in  $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
1N/A   in  $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574
1N/A
1N/A=item 6
1N/A
1N/A  in  $=main::BEGIN() from /dev/null:0
1N/A   in  $=Config::BEGIN() from lib/Config.pm:2
1N/A    Package lib/Exporter.pm.
1N/A    Package lib/Carp.pm.
1N/A   out $=Config::BEGIN() from lib/Config.pm:0
1N/A   Package lib/Config.pm.
1N/A   in  $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A   out $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A   in  $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A    in  $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
1N/A    out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
1N/A   out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A  out $=main::BEGIN() from /dev/null:0
1N/A  in  @=Config::myconfig() from /dev/null:0
1N/A   in  $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
1N/A   out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
1N/A   in  $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
1N/A   out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
1N/A   in  $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
1N/A   out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
1N/A   in  $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
1N/A
1N/A=item 14
1N/A
1N/A  in  $=main::BEGIN() from /dev/null:0
1N/A   in  $=Config::BEGIN() from lib/Config.pm:2
1N/A    Package lib/Exporter.pm.
1N/A    Package lib/Carp.pm.
1N/A   out $=Config::BEGIN() from lib/Config.pm:0
1N/A   Package lib/Config.pm.
1N/A   in  $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A   out $=Config::TIEHASH('Config') from lib/Config.pm:644
1N/A   in  $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A    in  $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
1N/A    out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
1N/A   out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A  out $=main::BEGIN() from /dev/null:0
1N/A  in  @=Config::myconfig() from /dev/null:0
1N/A   in  $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
1N/A   out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
1N/A   in  $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
1N/A   out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
1N/A
1N/A=item 30
1N/A
1N/A  in  $=CODE(0x15eca4)() from /dev/null:0
1N/A   in  $=CODE(0x182528)() from lib/Config.pm:2
1N/A    Package lib/Exporter.pm.
1N/A   out $=CODE(0x182528)() from lib/Config.pm:0
1N/A   scalar context return from CODE(0x182528): undef
1N/A   Package lib/Config.pm.
1N/A   in  $=Config::TIEHASH('Config') from lib/Config.pm:628
1N/A   out $=Config::TIEHASH('Config') from lib/Config.pm:628
1N/A   scalar context return from Config::TIEHASH:   empty hash
1N/A   in  $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A    in  $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
1N/A    out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
1N/A    scalar context return from Exporter::export: ''
1N/A   out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
1N/A   scalar context return from Exporter::import: ''
1N/A
1N/A=back
1N/A
1N/AIn all cases shown above, the line indentation shows the call tree.
1N/AIf bit 2 of C<frame> is set, a line is printed on exit from a
1N/Asubroutine as well.  If bit 4 is set, the arguments are printed
1N/Aalong with the caller info.  If bit 8 is set, the arguments are
1N/Aprinted even if they are tied or references.  If bit 16 is set, the
1N/Areturn value is printed, too.
1N/A
1N/AWhen a package is compiled, a line like this
1N/A
1N/A    Package lib/Carp.pm.
1N/A
1N/Ais printed with proper indentation.
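
The way the C<frame> values combine can be sketched as a bit test.  The
subroutine name C<describe_frame_option> and the wording of its result
strings are hypothetical; only the values 2, 4, 8 and 16 come from the
description above.

  use strict;
  use warnings;

  # Translate a numeric "frame" setting into the effects described above.
  sub describe_frame_option {
      my ($frame) = @_;
      my @effects = ('call tree shown by indentation');
      push @effects, 'line printed on subroutine exit'  if $frame & 2;
      push @effects, 'arguments printed with caller'    if $frame & 4;
      push @effects, 'tied/reference arguments printed' if $frame & 8;
      push @effects, 'return value printed'             if $frame & 16;
      return @effects;
  }

  print "frame=$_: ", join('; ', describe_frame_option($_)), "\n"
      for 1, 2, 6, 14, 30;

This matches the listings above: C<frame=6> combines 2 and 4, C<frame=14>
adds 8, and C<frame=30> adds 16.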
1N/A
1N/A=head1 Debugging regular expressions
1N/A
1N/AThere are two ways to enable debugging output for regular expressions.
1N/A
1N/AIf your perl is compiled with C<-DDEBUGGING>, you may use the
1N/AB<-Dr> flag on the command line.
1N/A
1N/AOtherwise, one can C<use re 'debug'>, which has effects at
1N/Acompile time and run time.  It is not lexically scoped.
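
A minimal sketch of the second approach follows, using the pattern that
is examined in the rest of this section.  The subroutine name
C<matches_sample_pattern> is hypothetical; the debugging output itself is
written to STDERR and varies between Perl versions.

  use strict;
  use warnings;

  # Hypothetical wrapper around the sample pattern.
  sub matches_sample_pattern {
      my ($string) = @_;
      use re 'debug';          # request regex debugging output on STDERR
      return $string =~ /[bc]d(ef*g)+h[ij]k$/ ? 1 : 0;
  }

  matches_sample_pattern('abcdefg__gh__');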
1N/A
1N/A=head2 Compile-time output
1N/A
1N/AThe debugging output at compile time looks like this:
1N/A
1N/A  Compiling REx `[bc]d(ef*g)+h[ij]k$'
1N/A  size 45 Got 364 bytes for offset annotations.
1N/A  first at 1
1N/A  rarest char g at 0
1N/A  rarest char d at 0
1N/A     1: ANYOF[bc](12)
1N/A    12: EXACT <d>(14)
1N/A    14: CURLYX[0] {1,32767}(28)
1N/A    16:   OPEN1(18)
1N/A    18:     EXACT <e>(20)
1N/A    20:     STAR(23)
1N/A    21:       EXACT <f>(0)
1N/A    23:     EXACT <g>(25)
1N/A    25:   CLOSE1(27)
1N/A    27:   WHILEM[1/1](0)
1N/A    28: NOTHING(29)
1N/A    29: EXACT <h>(31)
1N/A    31: ANYOF[ij](42)
1N/A    42: EXACT <k>(44)
1N/A    44: EOL(45)
1N/A    45: END(0)
1N/A  anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
1N/A        stclass `ANYOF[bc]' minlen 7
1N/A  Offsets: [45]
1N/A    1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
1N/A    0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
1N/A    11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
1N/A    0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
1N/A  Omitting $` $& $' support.
1N/A
1N/AThe first line shows the pre-compiled form of the regex.  The second
1N/Ashows the size of the compiled form (in arbitrary units, usually
1N/A4-byte words) and the total number of bytes allocated for the
1N/Aoffset/length table, usually 4+C<size>*8.  The next line shows the
1N/Alabel I<id> of the first node that does a match.
1N/A
1N/AThe
1N/A
1N/A  anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
1N/A        stclass `ANYOF[bc]' minlen 7
1N/A
1N/Aline (split into two lines above) contains optimizer
1N/Ainformation.  In the example shown, the optimizer found that the match
1N/Ashould contain a substring C<de> at offset 1, plus substring C<gh>
1N/Aat some offset between 3 and infinity.  Moreover, when checking for
1N/Athese substrings (to abandon impossible matches quickly), Perl will check
1N/Afor the substring C<gh> before checking for the substring C<de>.  The
1N/Aoptimizer may also use the knowledge that the match starts (at the
1N/AC<first> I<id>) with a character class, and no string
1N/Ashorter than 7 characters can possibly match.
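
The kind of quick rejection this information allows can be imitated in
plain Perl.  The following is only a sketch of the idea, not Perl's
actual implementation; the subroutine name C<could_possibly_match> is
hypothetical, and the constants are taken from the optimizer line above.

  use strict;
  use warnings;

  # True if $str passes the necessary conditions reported for
  # /[bc]d(ef*g)+h[ij]k$/: "de" at offset 1 from the match start,
  # "gh" at offset 3 or later, and at least 7 characters available.
  sub could_possibly_match {
      my ($str) = @_;
      my $de = index($str, 'de', 1);    # "de" needs one char before it
      while ($de >= 0) {
          my $start = $de - 1;          # candidate start of the match
          my $gh    = index($str, 'gh', $start + 3);
          return 1 if $gh >= 0 && length($str) - $start >= 7;
          $de = index($str, 'de', $de + 1);
      }
      return 0;
  }

  print could_possibly_match('abcdefg__gh__') ? "worth trying\n"
                                              : "cannot match\n";

A true result does not mean that the pattern matches, only that the
string cannot be rejected on these grounds.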
1N/A
1N/AThe fields of interest which may appear in this line are
1N/A
1N/A=over 4
1N/A
1N/A=item C<anchored> I<STRING> C<at> I<POS>
1N/A
1N/A=item C<floating> I<STRING> C<at> I<POS1..POS2>
1N/A
1N/ASee above.
1N/A
=item C<checking floating/anchored>
1N/A
1N/AWhich substring to check first.
1N/A
1N/A=item C<minlen>
1N/A
1N/AThe minimal length of the match.
1N/A
1N/A=item C<stclass> I<TYPE>
1N/A
1N/AType of first matching node.
1N/A
1N/A=item C<noscan>
1N/A
1N/ADon't scan for the found substrings.
1N/A
1N/A=item C<isall>
1N/A
1N/AMeans that the optimizer information is all that the regular
1N/Aexpression contains, and thus one does not need to enter the regex engine at
1N/Aall.
1N/A
1N/A=item C<GPOS>
1N/A
1N/ASet if the pattern contains C<\G>.
1N/A
1N/A=item C<plus>
1N/A
1N/ASet if the pattern starts with a repeated char (as in C<x+y>).
1N/A
1N/A=item C<implicit>
1N/A
1N/ASet if the pattern starts with C<.*>.
1N/A
1N/A=item C<with eval>
1N/A
Set if the pattern contains eval-groups, such as C<(?{ code })> and
C<(??{ code })>.
1N/A
1N/A=item C<anchored(TYPE)>
1N/A
If the pattern may match only at a handful of places, with C<TYPE>
being C<BOL>, C<MBOL>, or C<GPOS>.  See the table below.
1N/A
1N/A=back
1N/A
1N/AIf a substring is known to match at end-of-line only, it may be
1N/Afollowed by C<$>, as in C<floating `k'$>.
1N/A
The optimizer-specific information is used to avoid entering the (slow) regex
engine on strings that definitely will not match.  If the C<isall> flag
is set, a call to the regex engine may be avoided even when the optimizer
found an appropriate place for the match.
1N/A
Above the optimizer section is the list of I<nodes> of the compiled
form of the regex.  Each line has the format
1N/A
1N/AC<   >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)
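
A node line in this format can be taken apart with a short pattern, for
instance when post-processing saved debugging output.  This is a sketch
under the assumption that the lines look like those in the sample dump
above; the subroutine name C<parse_node_line> and the hash keys are
hypothetical.

  use strict;
  use warnings;

  # Split "id: TYPE OPTIONAL-INFO(next-id)" into its parts.
  # Returns a hash reference, or nothing if the line does not fit.
  sub parse_node_line {
      my ($line) = @_;
      return unless $line =~ /^\s*(\d+):\s+(\w+)\s*(.*?)\s*\((\d+)\)\s*$/;
      return { id => $1, type => $2, info => $3, next => $4 };
  }

  my $node = parse_node_line('    14: CURLYX[0] {1,32767}(28)');
  print "$node->{type} at $node->{id}, next is $node->{next}\n";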
1N/A
1N/A=head2 Types of nodes
1N/A
1N/AHere are the possible types, with short descriptions:
1N/A
1N/A    # TYPE arg-description [num-args] [longjump-len] DESCRIPTION
1N/A
1N/A    # Exit points
1N/A    END     no  End of program.
1N/A    SUCCEED no  Return from a subroutine, basically.
1N/A
1N/A    # Anchors:
1N/A    BOL     no  Match "" at beginning of line.
1N/A    MBOL    no  Same, assuming multiline.
1N/A    SBOL    no  Same, assuming singleline.
1N/A    EOS     no  Match "" at end of string.
1N/A    EOL     no  Match "" at end of line.
1N/A    MEOL    no  Same, assuming multiline.
1N/A    SEOL    no  Same, assuming singleline.
1N/A    BOUND   no  Match "" at any word boundary
1N/A    BOUNDL  no  Match "" at any word boundary
1N/A    NBOUND  no  Match "" at any word non-boundary
1N/A    NBOUNDL no  Match "" at any word non-boundary
1N/A    GPOS    no  Matches where last m//g left off.
1N/A
1N/A    # [Special] alternatives
1N/A    ANY     no  Match any one character (except newline).
1N/A    SANY    no  Match any one character.
1N/A    ANYOF   sv  Match character in (or not in) this class.
1N/A    ALNUM   no  Match any alphanumeric character
1N/A    ALNUML  no  Match any alphanumeric char in locale
1N/A    NALNUM  no  Match any non-alphanumeric character
1N/A    NALNUML no  Match any non-alphanumeric char in locale
1N/A    SPACE   no  Match any whitespace character
1N/A    SPACEL  no  Match any whitespace char in locale
1N/A    NSPACE  no  Match any non-whitespace character
1N/A    NSPACEL no  Match any non-whitespace char in locale
1N/A    DIGIT   no  Match any numeric character
1N/A    NDIGIT  no  Match any non-numeric character
1N/A
1N/A    # BRANCH    The set of branches constituting a single choice are hooked
1N/A    #       together with their "next" pointers, since precedence prevents
1N/A    #       anything being concatenated to any individual branch.  The
1N/A    #       "next" pointer of the last BRANCH in a choice points to the
1N/A    #       thing following the whole choice.  This is also where the
1N/A    #       final "next" pointer of each individual branch points; each
1N/A    #       branch starts with the operand node of a BRANCH node.
1N/A    #
1N/A    BRANCH  node    Match this alternative, or the next...
1N/A
1N/A    # BACK  Normal "next" pointers all implicitly point forward; BACK
1N/A    #       exists to make loop structures possible.
1N/A    # not used
1N/A    BACK    no  Match "", "next" ptr points backward.
1N/A
1N/A    # Literals
1N/A    EXACT   sv  Match this string (preceded by length).
1N/A    EXACTF  sv  Match this string, folded (prec. by length).
1N/A    EXACTFL sv  Match this string, folded in locale (w/len).
1N/A
1N/A    # Do nothing
1N/A    NOTHING no  Match empty string.
1N/A    # A variant of above which delimits a group, thus stops optimizations
1N/A    TAIL    no  Match empty string. Can jump here from outside.
1N/A
1N/A    # STAR,PLUS '?', and complex '*' and '+', are implemented as circular
1N/A    #       BRANCH structures using BACK.  Simple cases (one character
1N/A    #       per match) are implemented with STAR and PLUS for speed
1N/A    #       and to minimize recursive plunges.
1N/A    #
1N/A    STAR    node    Match this (simple) thing 0 or more times.
1N/A    PLUS    node    Match this (simple) thing 1 or more times.
1N/A
1N/A    CURLY   sv 2    Match this simple thing {n,m} times.
1N/A    CURLYN  no 2    Match next-after-this simple thing
1N/A    #           {n,m} times, set parens.
1N/A    CURLYM  no 2    Match this medium-complex thing {n,m} times.
1N/A    CURLYX  sv 2    Match this complex thing {n,m} times.
1N/A
1N/A    # This terminator creates a loop structure for CURLYX
1N/A    WHILEM  no  Do curly processing and see if rest matches.
1N/A
1N/A    # OPEN,CLOSE,GROUPP ...are numbered at compile time.
1N/A    OPEN    num 1   Mark this point in input as start of #n.
1N/A    CLOSE   num 1   Analogous to OPEN.
1N/A
1N/A    REF     num 1   Match some already matched string
1N/A    REFF    num 1   Match already matched string, folded
1N/A    REFFL   num 1   Match already matched string, folded in loc.
1N/A
1N/A    # grouping assertions
1N/A    IFMATCH off 1 2 Succeeds if the following matches.
1N/A    UNLESSM off 1 2 Fails if the following matches.
1N/A    SUSPEND off 1 1 "Independent" sub-regex.
1N/A    IFTHEN  off 1 1 Switch, should be preceded by switcher .
1N/A    GROUPP  num 1   Whether the group matched.
1N/A
1N/A    # Support for long regex
1N/A    LONGJMP off 1 1 Jump far away.
1N/A    BRANCHJ off 1 1 BRANCH with long offset.
1N/A
1N/A    # The heavy worker
1N/A    EVAL    evl 1   Execute some Perl code.
1N/A
1N/A    # Modifiers
1N/A    MINMOD  no  Next operator is not greedy.
1N/A    LOGICAL no  Next opcode should set the flag only.
1N/A
1N/A    # This is not used yet
1N/A    RENUM   off 1 1 Group with independently numbered parens.
1N/A
1N/A    # This is not really a node, but an optimized away piece of a "long" node.
1N/A    # To simplify debugging output, we mark it as if it were a node
1N/A    OPTIMIZED   off Placeholder for dump.
1N/A
1N/A=for unprinted-credits
1N/ANext section M-J. Dominus (mjd-perl-patch+@plover.com) 20010421
1N/A
1N/AFollowing the optimizer information is a dump of the offset/length
1N/Atable, here split across several lines:
1N/A
1N/A  Offsets: [45]
1N/A    1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
1N/A    0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
1N/A    11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
1N/A    0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
1N/A
1N/AThe first line here indicates that the offset/length table contains 45
1N/Aentries.  Each entry is a pair of integers, denoted by C<offset[length]>.
1N/AEntries are numbered starting with 1, so entry #1 here is C<1[4]> and
1N/Aentry #12 is C<5[1]>.  C<1[4]> indicates that the node labeled C<1:>
1N/A(the C<1: ANYOF[bc]>) begins at character position 1 in the
1N/Apre-compiled form of the regex, and has a length of 4 characters.
1N/AC<5[1]> in position 12
1N/Aindicates that the node labeled C<12:>
1N/A(the C<< 12: EXACT <d> >>) begins at character position 5 in the
1N/Apre-compiled form of the regex, and has a length of 1 character.
1N/AC<12[1]> in position 14
1N/Aindicates that the node labeled C<14:>
1N/A(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the
1N/Apre-compiled form of the regex, and has a length of 1 character---that
1N/Ais, it corresponds to the C<+> symbol in the precompiled regex.
1N/A
1N/AC<0[0]> items indicate that there is no corresponding node.
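
The lookup described above can be sketched in a few lines.  The
subroutine name C<node_source_text> is hypothetical; the pattern and the
table are the ones shown in this section.

  use strict;
  use warnings;

  # Return the piece of the pattern source that produced node $node_id,
  # or undef if the table has no entry (0[0]) for that id.
  sub node_source_text {
      my ($pattern, $offsets, $node_id) = @_;
      return undef if $node_id < 1;
      my @pairs;
      while ($offsets =~ /(\d+)\[(\d+)\]/g) {
          push @pairs, [$1, $2];          # [offset, length]
      }
      my $entry = $pairs[$node_id - 1];   # entries are numbered from 1
      return undef unless $entry && $entry->[0] > 0;
      return substr($pattern, $entry->[0] - 1, $entry->[1]);
  }

  my $pattern = '[bc]d(ef*g)+h[ij]k$';
  my $offsets = join ' ',
      '1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]',
      '0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]',
      '11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]',
      '0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]';

  print "node 14 comes from: ", node_source_text($pattern, $offsets, 14), "\n";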
1N/A
1N/A=head2 Run-time output
1N/A
1N/AFirst of all, when doing a match, one may get no run-time output even
1N/Aif debugging is enabled.  This means that the regex engine was never
1N/Aentered and that all of the job was therefore done by the optimizer.
1N/A
1N/AIf the regex engine was entered, the output may look like this:
1N/A
1N/A  Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__'
1N/A    Setting an EVAL scope, savestack=3
1N/A     2 <ab> <cdefg__gh_>    |  1: ANYOF
1N/A     3 <abc> <defg__gh_>    | 11: EXACT <d>
1N/A     4 <abcd> <efg__gh_>    | 13: CURLYX {1,32767}
1N/A     4 <abcd> <efg__gh_>    | 26:   WHILEM
1N/A                0 out of 1..32767  cc=effff31c
1N/A     4 <abcd> <efg__gh_>    | 15:     OPEN1
1N/A     4 <abcd> <efg__gh_>    | 17:     EXACT <e>
1N/A     5 <abcde> <fg__gh_>    | 19:     STAR
1N/A                 EXACT <f> can match 1 times out of 32767...
1N/A    Setting an EVAL scope, savestack=3
1N/A     6 <bcdef> <g__gh__>    | 22:       EXACT <g>
1N/A     7 <bcdefg> <__gh__>    | 24:       CLOSE1
1N/A     7 <bcdefg> <__gh__>    | 26:       WHILEM
1N/A                    1 out of 1..32767  cc=effff31c
1N/A    Setting an EVAL scope, savestack=12
1N/A     7 <bcdefg> <__gh__>    | 15:         OPEN1
1N/A     7 <bcdefg> <__gh__>    | 17:         EXACT <e>
1N/A       restoring \1 to 4(4)..7
1N/A                    failed, try continuation...
1N/A     7 <bcdefg> <__gh__>    | 27:         NOTHING
1N/A     7 <bcdefg> <__gh__>    | 28:         EXACT <h>
1N/A                    failed...
1N/A                failed...
1N/A
1N/AThe most significant information in the output is about the particular I<node>
1N/Aof the compiled regex that is currently being tested against the target string.
1N/AThe format of these lines is
1N/A
1N/AC<    >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>>   |I<ID>:  I<TYPE>
1N/A
1N/AThe I<TYPE> info is indented with respect to the backtracking level.
1N/AOther incidental information appears interspersed within.
1N/A
1N/A=head1 Debugging Perl memory usage
1N/A
1N/APerl is a profligate wastrel when it comes to memory use.  There
1N/Ais a saying that to estimate memory usage of Perl, assume a reasonable
1N/Aalgorithm for memory allocation, multiply that estimate by 10, and
1N/Awhile you still may miss the mark, at least you won't be quite so
1N/Aastonished.  This is not absolutely true, but may provide a good
1N/Agrasp of what happens.
1N/A
1N/AAssume that an integer cannot take less than 20 bytes of memory, a
1N/Afloat cannot take less than 24 bytes, a string cannot take less
than 32 bytes (all these examples assume 32-bit architectures; the
results are quite a bit worse on 64-bit architectures).  If a variable
1N/Ais accessed in two of three different ways (which require an integer,
1N/Aa float, or a string), the memory footprint may increase yet another
1N/A20 bytes.  A sloppy malloc(3) implementation can inflate these
1N/Anumbers dramatically.
1N/A
1N/AOn the opposite end of the scale, a declaration like
1N/A
1N/A  sub foo;
1N/A
1N/Amay take up to 500 bytes of memory, depending on which release of Perl
1N/Ayou're running.
1N/A
1N/AAnecdotal estimates of source-to-compiled code bloat suggest an
1N/Aeightfold increase.  This means that the compiled form of reasonable
1N/A(normally commented, properly indented etc.) code will take
1N/Aabout eight times more space in memory than the code took
1N/Aon disk.
1N/A
1N/AThe B<-DL> command-line switch is obsolete since circa Perl 5.6.0
1N/A(it was available only if Perl was built with C<-DDEBUGGING>).
1N/AThe switch was used to track Perl's memory allocations and possible
1N/Amemory leaks.  These days the use of malloc debugging tools like
1N/AF<Purify> or F<valgrind> is suggested instead.
1N/A
1N/AOne way to find out how much memory is being used by Perl data
1N/Astructures is to install the Devel::Size module from CPAN: it gives
1N/Ayou the minimum number of bytes required to store a particular data
structure.  Please be mindful of the difference between size()
and total_size().
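
The difference between the two functions can be seen with a nested
structure.  This is a sketch that assumes Devel::Size may or may not be
installed; the subroutine name C<structure_sizes> is hypothetical, and
the numbers printed depend on your Perl build.

  use strict;
  use warnings;

  # Returns (size, total_size) for the referenced structure, or an empty
  # list if Devel::Size (a CPAN module) is not installed.
  sub structure_sizes {
      my ($ref) = @_;
      return unless eval { require Devel::Size; 1 };
      return (Devel::Size::size($ref), Devel::Size::total_size($ref));
  }

  my @nested = ([1 .. 10], [11 .. 20]);
  if (my ($shallow, $deep) = structure_sizes(\@nested)) {
      print "size: $shallow bytes, total_size: $deep bytes\n";
  }

Here size() counts only the outer array, while total_size() also follows
the references to the two inner arrays.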
1N/A
If Perl has been compiled using Perl's malloc, you can analyze Perl
memory usage by setting C<$ENV{PERL_DEBUG_MSTATS}>.

=head2 Using C<$ENV{PERL_DEBUG_MSTATS}>

If your perl is using Perl's malloc() and was compiled with the
necessary switches (this is the default), then it will print memory
usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS}
> 1 >>, and before termination of the program when C<<
$ENV{PERL_DEBUG_MSTATS} >= 1 >>.  The report format is similar to
the following example:

  $ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
  Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
     14216 free:   130   117    28     7     9   0   2     2   1 0 0
        437    61    36     0     5
     60924 used:   125   137   161    55     7   8   6    16   2 0 1
         74   109   304    84    20
  Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
  Memory allocation statistics after execution:   (buckets 4(4)..8188(8192)
     30888 free:   245    78    85    13     6   2   1     3   2 0 1
        315   162    39    42    11
    175816 used:   265   176  1112   111    26  22  11    27   2 1 1
        196   178  1066   798    39
  Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.

It is possible to ask for such statistics at arbitrary points in
your execution using the mstat() function from the standard
Devel::Peek module.
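
For instance, a minimal sketch that brackets an allocation with two
snapshots.  The labels and the C<@data> array are hypothetical; on a
perl that was not built with Perl's malloc, mstat() is expected to
report only that the statistics are unavailable.

```perl
use strict;
use warnings;
use Devel::Peek;    # exports mstat() by default

# Snapshot before the allocation; the argument is a label that is
# printed in front of the report on STDERR.
mstat("before allocation");

# Hypothetical workload whose memory cost we want to see.
my @data = (1 .. 50_000);

# Snapshot after the allocation, for comparison with the first one.
mstat("after allocation");
```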

Here is an explanation of that format:

=over 4

=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>

Perl's malloc() uses bucketed allocations.  Every request is rounded
up to the closest bucket size available, and a bucket is taken from
the pool of buckets of that size.

The line above describes the limits of buckets currently in use.
Each bucket has two sizes: memory footprint and the maximal size
of user data that can fit into this bucket.  In the example above,
the smallest bucket has size 4, while the biggest bucket has a
usable size of 8188 and a memory footprint of 8192.

In a Perl built for debugging, some buckets may have negative usable
size.  This means that these buckets cannot (and will not) be used.
For larger buckets, the memory footprint may be one page greater
than a power of 2.  If so, the corresponding power of two is
printed in the C<APPROX> field above.

=item Free/Used

The one or two rows of numbers following that correspond to the number
of buckets of each size between C<SMALLEST> and C<GREATEST>.  In
the first row, the sizes (memory footprints) of buckets are powers
of two--or possibly one page greater.  In the second row, if present,
the memory footprints of the buckets are between the memory footprints
of two buckets "above".

For example, suppose that in the previous example the memory footprints
were

     free:    8     16    32    64    128  256 512 1024 2048 4096 8192
       4     12    24    48    80

With non-C<DEBUGGING> perl, the buckets starting from C<128> have
a 4-byte overhead, and thus an 8192-long bucket may take up to
8188-byte allocations.

=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>

The first two fields give the total amount of memory perl sbrk(2)ed
(ess-broken? :-) and number of sbrk(2)s used.  The third number is
what perl thinks about continuity of returned chunks.  So long as
this number is positive, malloc() will assume that it is probable
that sbrk(2) will provide continuous memory.

Memory allocated by external libraries is not counted.

=item C<pad: 0>

The amount of sbrk(2)ed memory needed to keep buckets aligned.

=item C<heads: 2192>

Although memory overhead of bigger buckets is kept inside the bucket, for
smaller buckets, it is kept in separate areas.  This field gives the
total size of these areas.

=item C<chain: 0>

malloc() may want to subdivide a bigger bucket into smaller buckets.
If only a part of the deceased bucket is left unsubdivided, the rest
is kept as an element of a linked list.  This field gives the total
size of these chunks.

=item C<tail: 6144>

To minimize the number of sbrk(2)s, malloc() asks for more memory.  This
field gives the size of the yet unused part, which is sbrk(2)ed, but
never touched.

=back
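
The fields described above can be cross-checked against one another.
The sketch below takes the "after compilation" figures from the sample
report and recomputes the free and used totals from the bucket counts.
It assumes the memory footprints supposed in the Free/Used discussion
and the 4-byte overhead for buckets of 128 bytes and larger; the
variable and subroutine names are made up for the example.  That the
totals plus the "odd ends" add up to the sbrk() total is an observation
about this sample, not a documented guarantee.

```perl
use strict;
use warnings;

# Memory footprints as supposed in the text: powers of two in the first
# row, in-between sizes in the second row.
my @footprint_row1 = map { 2**$_ } 3 .. 13;    # 8, 16, ..., 8192
my @footprint_row2 = (4, 12, 24, 48, 80);

# Bucket counts copied from the "after compilation" part of the report.
my @free_row1 = (130, 117, 28, 7, 9, 0, 2, 2, 1, 0, 0);
my @free_row2 = (437, 61, 36, 0, 5);
my @used_row1 = (125, 137, 161, 55, 7, 8, 6, 16, 2, 0, 1);
my @used_row2 = (74, 109, 304, 84, 20);

# Usable size of a bucket: assumed to be the footprint minus a 4-byte
# overhead for footprints of 128 and up (non-DEBUGGING perl).
sub usable_size {
    my ($footprint) = @_;
    return $footprint >= 128 ? $footprint - 4 : $footprint;
}

# Sum of count * usable size over one row of the report.
sub row_bytes {
    my ($counts, $footprints) = @_;
    my $sum = 0;
    $sum += $counts->[$_] * usable_size($footprints->[$_]) for 0 .. $#$counts;
    return $sum;
}

my $free = row_bytes(\@free_row1, \@footprint_row1)
         + row_bytes(\@free_row2, \@footprint_row2);
my $used = row_bytes(\@used_row1, \@footprint_row1)
         + row_bytes(\@used_row2, \@footprint_row2);

# "Odd ends" from the same report: pad+heads+chain+tail.
my ($pad, $heads, $chain, $tail) = (0, 636, 0, 2048);
my $accounted = $free + $used + $pad + $heads + $chain + $tail;

print "free: $free, used: $used, accounted for: $accounted\n";
```

With these assumptions the script reproduces the 14216 free and 60924
used bytes printed in the report, and the grand total matches the 77824
bytes reported by C<Total sbrk()>.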

=head2 Example of using B<-DL> switch

(Note that -DL is obsolete since circa 5.6.0, and even before that
Perl needed to be compiled with -DDEBUGGING.)

Below we show how to analyse memory usage by

  do 'lib/auto/POSIX/autosplit.ix';

The file in question contains a header and 146 lines similar to

  sub getcwd;

B<WARNING>: The discussion below supposes 32-bit architecture.  In
newer releases of Perl, memory usage of the constructs discussed
here is greatly improved, but the story discussed below is a real-life
story.  This story is mercilessly terse, and assumes rather more than cursory
knowledge of Perl internals.  Type space to continue, `q' to quit.
(Actually, you just want to skip to the next section.)

Here is the itemized list of Perl allocations performed during parsing
of this file:

 !!! "after" at test.pl line 3.
    Id  subtot   4   8  12  16  20  24  28  32  36  40  48  56  64  72  80 80+
  0 02   13752   .   .   .   . 294   .   .   .   .   .   .   .   .   .   .   4
  0 54    5545   .   .   8 124  16   .   .   .   1   1   .   .   .   .   .   3
  5 05      32   .   .   .   .   .   .   .   1   .   .   .   .   .   .   .   .
  6 02    7152   .   .   .   .   .   .   .   .   .   . 149   .   .   .   .   .
  7 02    3600   .   .   .   .   . 150   .   .   .   .   .   .   .   .   .   .
  7 03      64   .  -1   .   1   .   .   2   .   .   .   .   .   .   .   .   .
  7 04    7056   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   7
  7 17   38404   .   .   .   .   .   .   .   1   .   . 442 149   .   . 147   .
  9 03    2078  17 249  32   .   .   .   .   2   .   .   .   .   .   .   .   .

To see this list, insert two C<warn('!...')> statements around the call:

  warn('!');
  do 'lib/auto/POSIX/autosplit.ix';
  warn('!!! "after"');

and run it with Perl's B<-DL> option.  The first warn() will print
memory allocation info before parsing the file and will memorize
the statistics at this point (we ignore what it prints).  The second
warn() prints increments with respect to these memorized data.  This
is the printout shown above.

Different I<Id>s on the left correspond to different subsystems of
the perl interpreter.  They are just the first argument given to
the perl memory allocation API named New().  To find what C<9 03>
means, just B<grep> the perl source for C<903>.  You'll find it in
F<util.c>, function savepvn().  (I know, you wonder why we told you
to B<grep> and then gave away the answer.  That's because grepping
the source is good for the soul.)  This function is used to store
a copy of an existing chunk of memory.  Using a C debugger, one can
see that the function was called either directly from gv_init() or
via sv_magic(), and that gv_init() is called from gv_fetchpv()--which
was itself called from newSUB().  Please stop to catch your breath now.

B<NOTE>: To reach this point in the debugger and skip the calls to
savepvn() during the compilation of the main program, you should
set a C breakpoint in Perl_warn(), continue until this point is
reached, and I<then> set a C breakpoint in Perl_savepvn().  Note that
you may need to skip a handful of Perl_savepvn() calls that do not
correspond to mass production of CVs (there are more C<903> allocations
than 146 similar lines of F<lib/auto/POSIX/autosplit.ix>).  Note also
that C<Perl_> prefixes are added by macroization code in perl header
files to avoid conflicts with external libraries.

Anyway, we see that C<903> ids correspond to creation of globs, twice
per glob: once for the glob name, and once for glob stringification magic.

Here are explanations for other I<Id>s above:

=over 4

=item C<717>

Creates bigger C<XPV*> structures.  In the case above, it
creates 3 C<AV>s per subroutine, one for a list of lexical variable
names, one for a scratchpad (which contains lexical variables and
C<targets>), and one for the array of scratchpads needed for
recursion.

It also creates a C<GV> and a C<CV> per subroutine, all called from
start_subparse().

=item C<002>

Creates a C array corresponding to the C<AV> of scratchpads and the
scratchpad itself.  The first fake entry of this scratchpad is
created even though the subroutine itself is not defined yet.

It also creates C arrays to keep data for the stash.  This is one HV,
but it grows; thus, there are 4 big allocations: the big chunks are not
freed, but are kept as additional arenas for C<SV> allocations.

=item C<054>

Creates a C<HEK> for the name of the glob for the subroutine.  This
name is a key in a I<stash>.

Big allocations with this I<Id> correspond to allocations of new
arenas to keep C<HE>.

=item C<602>

Creates a C<GP> for the glob for the subroutine.

=item C<702>

Creates the C<MAGIC> for the glob for the subroutine.

=item C<704>

Creates I<arenas> which keep SVs.

=back

=head2 B<-DL> details

If Perl is run with the B<-DL> option, then warn()s that start with `!'
behave specially.  They print a list of I<categories> of memory
allocations, and statistics of allocations of different sizes for
these categories.

If the warn() string starts with

=over 4

=item C<!!!>

print changed categories only; print the differences in counts of allocations.

=item C<!!>

print grown categories only; print the absolute values of counts, and totals.

=item C<!>

print nonempty categories; print the absolute values of counts, and totals.

=back

=head2 Limitations of B<-DL> statistics

If an extension or external library does not use the Perl API to
allocate memory, such allocations are not counted.

=head1 SEE ALSO

L<perldebug>,
L<perlguts>,
L<perlrun>,
L<re>,
and
L<Devel::DProf>.