=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs.  Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags.  Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs.  The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set.  You can also enable taint
mode explicitly by using the B<-T> command line flag.  This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script.  Once taint mode is on, it's on for
the remainder of your script.

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps.  Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these.  Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident.  All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (readdir(),
readlink(), the variable of shmread(), the messages returned by
msgrcv(), the password, gcos and shell fields returned by the
getpwxxx() calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness.  This requires extra care
unless you want external data to affect your control flow.  Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=back
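
One way to limit those symbolic values is to look the externally supplied
name up in an explicit table of permitted handlers rather than calling it
directly.  The following is only a sketch: the action names, the
C<%dispatch> table, and the C<run_action> helper are illustrative
assumptions, not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical handlers; only these may be reached from external input.
my %dispatch = (
    list   => sub { return "listing @_" },
    status => sub { return "status ok" },
);

# Look up an externally supplied name in the allow-list before calling it.
sub run_action {
    my ($name, @args) = @_;
    my $handler = $dispatch{$name}
        or die "Unknown action '$name'\n";
    return $handler->(@args);
}

print run_action('status'), "\n";
```

Because the name is used only as a hash key, a value such as
C<POSIX::system> is rejected instead of being resolved to a function.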

For efficiency reasons, Perl takes a conservative view of
whether data is tainted.  If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are never tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!

    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)

    # In Perl releases older than 5.6.0 the <*.c> and glob('*.c') would
    # have used an external program to do the filename expansion; but in
    # either case the result is tainted since the list of filenames comes
    # from outside of the program.

    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would
thus trigger an "Insecure dependency" message, you can use the
tainted() function of the Scalar::Util module, available in your
nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted.  It
would be inefficient for every operator to test every argument for
taintedness.  Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.
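
As a sketch of the first option, the snippet below calls C<tainted()>
from Scalar::Util on two values.  The C<describe> helper is an
illustrative name, not part of any module.  When taint mode is off,
C<tainted()> simply returns false for everything, so the script runs
with or without B<-T>.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: report whether a value is currently tainted.
sub describe {
    my ($label, $value) = @_;
    return sprintf "%s is %s", $label,
        tainted($value) ? "tainted" : "not tainted";
}

my $literal = 'abc';                                 # never tainted
my $env     = defined $ENV{PATH} ? $ENV{PATH} : '';  # tainted under -T

print describe('literal', $literal), "\n";
print describe('PATH',    $env),     "\n";
```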

But testing for taintedness gets you only so far.  Sometimes you have just
to clear your data's taintedness.  Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., that
you knew what you were doing when you wrote the pattern.  That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism.  It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters.  That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                 # $data now untainted
    } else {
        die "Bad data in '$data'";  # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell.  Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that.  The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

The example does not untaint $data if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program.  If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block.  See L<perllocale/SECURITY> for further discussion and examples.
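
A minimal sketch of that advice follows.  The C<launder_word> helper
and its sample input are illustrative assumptions; the pattern is the
same one used in the test above.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $data, or undef if it
# contains anything other than word characters, hyphen, at sign, or dot.
sub launder_word {
    my ($data) = @_;
    no locale;    # keep \w independent of the locale within this block
    return $data =~ /^([-\@\w.]+)$/ ? $1 : undef;
}

my $clean = launder_word('report-2.txt');
print defined $clean ? "ok: $clean\n" : "rejected\n";
```

Returning C<undef> rather than dying leaves the decision about logging
and error handling to the caller.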

=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line.  Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line.  Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems.  (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the "." directory is removed
from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
are ignored by Perl.  You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun>.  The two environment variables are ignored because
they are obscured, and a user running a program could be unaware that
they are set, whereas the C<-I> option is clearly visible and
therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

  perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automagically remove any duplicated directories, while the latter
will not.

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to a
known value, and each directory in the path must be non-writable by others
than its owner and group.  You may be surprised to get this message even
if the pathname to your executable is fully qualified.  This is I<not>
generated because you didn't supply a full path to the program; instead,
it's generated because you never set your PATH environment variable, or
you didn't set it to something that was safe.  Because Perl can't
guarantee that the executable in question isn't itself going to turn
around and execute some other program that is dependent on your PATH, it
makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses.  You may wish to add something like this to your
setid and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

It's also possible to get into trouble with other operations that don't
care whether they use tainted values.  Make judicious use of the file
tests in dealing with any user-supplied filenames.  When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges.  Perl doesn't prevent you from opening tainted filenames for
reading, so be careful what you print out.  The tainting mechanism is
intended to prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass B<system>
and B<exec> explicit parameter lists instead of strings with possible shell
wildcards in them.  Unfortunately, the B<open>, B<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you.  First, fork a child using the special
B<open> syntax that connects the parent and child by a pipe.  Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values.  Then the child process, which no longer
has any special permissions, does the B<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent.  Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely.  Notice how the B<exec> is
not called with a string that the shell could expand.  This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

        use English '-no_match_vars';
        die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
        if ($pid) {           # parent
            while (<KID>) {
                # do something
            }
            close KID;
        } else {
            my @temp     = ($EUID, $EGID);
            my $orig_uid = $UID;
            my $orig_gid = $GID;
            $EUID = $UID;
            $EGID = $GID;
            # Drop privileges
            $UID  = $orig_uid;
            $GID  = $orig_gid;
            # Make sure privs are really gone
            ($EUID, $EGID) = @temp;
            die "Can't drop privileges"
                unless $UID == $EUID  && $GID eq $EGID;
            $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
            # Consider sanitizing the environment even more.
            exec 'myprog', 'arg1', 'arg2'
                or die "can't exec myprog: $!";
        }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.
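
As a sketch of the C<readdir> alternative, the code below lists the
C<*.c> files in a directory without involving a shell.  The C<c_files>
helper, its filename pattern, and the choice of the current directory
are illustrative assumptions.  Names returned by C<readdir> are tainted,
so each one is laundered through a capturing match before it is returned.

```perl
use strict;
use warnings;

# Hypothetical helper: names of the *.c files in $dir, untainted.
sub c_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
    my @files;
    while (defined(my $name = readdir $dh)) {
        # Accept only names made of word characters, dots, and hyphens.
        push @files, $1 if $name =~ /^([\w.-]+\.c)\z/;
    }
    closedir $dh;
    return sort @files;
}

print "$_\n" for c_files('.');
```

Names that fall outside the pattern are silently skipped here; a real
program might prefer to log them.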

Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad.  This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil.  That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this."  For that kind of safety, check out the Safe module,
included standard in the Perl distribution.  This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.

=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start.  The problem is a race
condition in the kernel.  Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it.  The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternatively, it can simply ignore the set-id bits on scripts.  If the
latter is true, Perl can emulate the setuid and setgid mechanism when it
notices the otherwise useless setuid/gid bits on Perl scripts.  It does
this via a special executable called B<suidperl> that is automatically
invoked for you if it's needed.

However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure.  You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script.  A C wrapper is just a compiled program that does nothing
except call your Perl program.  Compiled programs are not subject to the
kernel bug that plagues set-id scripts.  Here's a simple wrapper, written
in C:

    #include <stdio.h>
    #include <unistd.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        execv(REAL_PATH, argv);
        perror(REAL_PATH);      /* reached only if execv() fails */
        return 1;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this
inherent security bug.  On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>.  This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit.  On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>.  The B<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself.  Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

Prior to release 5.6.1 of Perl, bugs in the code of B<suidperl> could
introduce a security hole.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted.  (That doesn't mean that a CGI script's source is
readable by people on the web, though.)  So you have to leave the
permissions at the socially friendly 0755 level.  This lets
only people on your local system see your source.

Some people mistakenly regard this as a security problem.  If your
program does insecure things, and relies on people not knowing how to
exploit those insecurities, it is not secure.  It is often possible for
someone to determine the insecure things and exploit them without viewing
the source.  Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it.  You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it.  You can try using the native-code compiler
described below, but crackers might be able to disassemble it.  These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive licence will give you
legal security.  License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah."  You should see a lawyer to be sure your licence's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls.  See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both.  This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Function - the algorithm used to "order" hash elements has been
changed several times during the development of Perl, mainly to be
reasonably fast.  In Perl 5.8.1 the security aspect was also taken
into account.

In Perls before 5.8.1 one could rather easily generate data that as
hash keys would cause Perl to consume large amounts of time because
the internal structure of hashes would badly degenerate.  In Perl 5.8.1
the hash function is randomly perturbed by a pseudorandom seed which
makes generating such naughty hash keys harder.
See L<perlrun/PERL_HASH_SEED> for more information.

The random perturbation is done by default, but if one wants for some
reason to emulate the old behaviour one can set the environment variable
PERL_HASH_SEED to zero (or any other integer).  One possible reason
for wanting to emulate the old behaviour is that in the new behaviour
consecutive runs of Perl will order hash keys differently, which may
confuse some applications (like Data::Dumper: the outputs of two
different runs are no longer identical).

B<Perl has never guaranteed any ordering of the hash keys>, and the
ordering has already changed several times during the lifetime of
Perl 5.  Also, the ordering of hash keys has always been, and
continues to be, affected by the insertion order.

Also note that while the order of the hash elements might be
randomised, this "pseudoordering" should B<not> be used for
applications like shuffling a list randomly (use List::Util::shuffle()
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module Algorithm::Numerical::Shuffle), or for generating
permutations (use e.g. the CPAN modules Algorithm::Permute or
Algorithm::FastPermute), or for any cryptographic applications.

=item *

Regular expressions - Perl's regular expression engine is a so-called
NFA (Non-deterministic Finite Automaton), which among other things means
that it can rather easily consume large amounts of both time and space
if the regular expression may match in several ways.  Careful crafting of
the regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>).  Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function is very easy to trick into misbehaving
so that it consumes a lot of time.  Nothing more is required than
resorting a list already sorted.  Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used.  Mergesort is insensitive to
its input data, so it cannot be similarly fooled.

=back
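
Code that needs stable output can avoid depending on hash order
altogether, and code that needs a random order can ask for one
explicitly.  The sketch below sorts keys before printing, sets
C<$Data::Dumper::Sortkeys> so that dumps are reproducible, and uses
C<List::Util::shuffle> for shuffling.  The C<%colour> data is purely
illustrative.

```perl
use strict;
use warnings;
use Data::Dumper;
use List::Util qw(shuffle);

my %colour = (apple => 'red', banana => 'yellow', cherry => 'dark red');

# Sort the keys whenever the order of the output matters.
print "$_ => $colour{$_}\n" for sort keys %colour;

# Ask Data::Dumper to emit hash keys in sorted order.
local $Data::Dumper::Sortkeys = 1;
my $dump = Dumper(\%colour);

# Shuffle explicitly instead of relying on hash "pseudoordering".
my @random_order = shuffle(keys %colour);
```

With the keys sorted, two runs produce the same dump regardless of the
hash seed, so there is no need to set PERL_HASH_SEED just to compare
output.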

See L<http://www.cs.rice.edu/~scrosby/hash/> for more information,
and any computer science textbook on algorithmic complexity.
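
To illustrate the point about regular expressions that "may match in
several ways", the sketch below contrasts a pattern with nested
quantifiers against a rewrite that is intended to accept the same
strings while leaving the engine only one way to consume each character.
Both patterns and the sample inputs are illustrative assumptions, and
how much backtracking the first one actually causes depends on the
input and on the Perl release.

```perl
use strict;
use warnings;

# Nested quantifiers: a run of word characters can be split between
# iterations in many ways, so a failing match may backtrack heavily.
my $ambiguous = qr/^(?:\w+\s?)+$/;

# Rewritten so that each character can be consumed in only one way.
my $linear = qr/^\w+(?:\s\w+)*\s?$/;

for my $input ('foo bar baz', 'foo  bar', 'foo bar!') {
    printf "%-14s %s\n", "'$input'",
        $input =~ $linear ? 'match' : 'no match';
}
```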

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.