http_request.c revision 6f984345bbfa9342dde1f2b7b8c35b7987d078af
/*
 * Copyright (c) 2000 The Apache Software Foundation. All rights
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Apache" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation. For more
 * information on the Apache Software Foundation, please see
 *
 * Portions of this software are based upon public domain software
 * originally written at the National Center for Supercomputing Applications,
 * University of Illinois, Urbana-Champaign.
 *
 * Thoroughly revamped by rst for Apache. NB this file reads
 * best from the bottom up.
 */

/*****************************************************************
 *
 * Getting and checking directory configuration. Also checks the
 * FollowSymlinks and FollowSymOwner stuff, since this is really the
 * only place that can happen (barring a new mid_dir_walk callout).
 *
 * We can't do it as an access_checker module function which gets
 * called with the final per_dir_config, since we could have a directory
 * with FollowSymLinks disabled, which contains a symlink to another
 * with a .htaccess file which turns FollowSymLinks back on --- and
 * access in such a case must be denied. So, whatever it is that
 * checks FollowSymLinks needs to know the state of the options as
 * they change, all the way down.
 */

/*
 * We don't want people able to serve up pipes, or unix sockets, or other
 * scary things. Note that symlink tests are performed later.
 */
"object is not a file, directory or symlink: %s",
/* OS/2 doesn't have symlinks */

/*
 * Strip trailing '/', if any, off what we're checking; trailing slashes
 * make some systems follow symlinks to directories even in lstat().
 * After we've done the lstat, put it back. Also, don't bother checking
 *
 * Note that we don't have to worry about multiple slashes here because of
 */
return OK;
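The strip-then-restore step described in the comment above can be sketched as follows. This is a minimal illustration, not the code from this file: the helper name `lstat_without_trailing_slash` is hypothetical, and it assumes a POSIX system where `lstat()` is available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for the lstat() declaration */
#endif

#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: lstat() a path with one trailing '/' temporarily
 * removed, then put the slash back. The root directory "/" is passed
 * through untouched, since stripping it would leave an empty string.
 */
int lstat_without_trailing_slash(char *path, struct stat *st)
{
    size_t len;
    int had_slash;
    int rc;

    assert(path != NULL && st != NULL);

    len = strlen(path);
    had_slash = (len > 1 && path[len - 1] == '/');

    if (had_slash)
        path[len - 1] = '\0';   /* strip so lstat() examines the link itself */

    rc = lstat(path, st);

    if (had_slash)
        path[len - 1] = '/';    /* restore the caller's string */

    return rc;
}
```

The caller's buffer is left unchanged whether or not the `lstat()` call succeeds, so the same string can be reused for later checks.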
/* Root directory, '/' */

/*
 * Note that we don't reject accesses to nonexistent files (multiviews or
 * the like may cons up a way to run the transaction anyway)...
 */

/* OK, it's a symlink. May still be OK with OPT_SYM_OWNER */

/* Dealing with the file system to get PATH_INFO */

/* assume path_info already set */

/* If the directory is x:\, then we don't want to strip
 * the trailing slash since x: is not a valid directory.
 * advance over the trailing slash. Any other
 * UNC name is OK to strip the slash.
 */
path[2] != '/' && cp[-1] == '/') {
/* Advance over trailing slashes ... NOT part of filename
 * if file is not a UNC name (Win32 only).
 */

/* See if the pathname ending here exists... */

/* ### We no longer need the test ap_os_is_filename_valid() here
 * since apr_stat isn't a posix thing - it's apr_stat's responsibility
 * to handle whatever path string arrives at its door - by platform
 * and volume restrictions as applicable...
 * TODO: This code becomes even simpler if apr_stat grows
 * an APR_PATHINCOMPLETE result to indicate that we are staring at
 */

/*
 * Aha! Found something. If it was a directory, we will search
 * contents of that directory for a multi_match, so the PATH_INFO
 * argument starts with the component after that.
 */
"access to %s denied", r->uri);
"access to %s failed", r->uri);
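The PATH_INFO search described in the comments above tests ever shorter prefixes of the path until one exists, and treats the remainder as PATH_INFO. A minimal sketch of that idea follows. The names `split_path_info`, `exists_fn`, and `demo_exists` are hypothetical, and the existence check is a caller-supplied predicate rather than a real `stat()` call so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate type: returns nonzero if `path` exists. */
typedef int (*exists_fn)(const char *path);

/*
 * Sketch only: find the longest leading part of `path` that exists and
 * return a pointer to the remainder (the PATH_INFO). `path` is modified
 * only temporarily and is restored before returning.
 * Returns NULL if no prefix exists.
 */
char *split_path_info(char *path, exists_fn exists)
{
    char *end;

    assert(path != NULL && exists != NULL);

    end = path + strlen(path);          /* current candidate end */

    while (end > path) {
        char saved = *end;
        int found;

        *end = '\0';                    /* test the prefix path[0..end) */
        found = exists(path);
        *end = saved;                   /* undo the temporary cut */

        if (found)
            return end;                 /* the remainder is the PATH_INFO */

        /* back up to the previous '/' and try a shorter prefix */
        do {
            --end;
        } while (end > path && *end != '/');
    }
    return NULL;
}

/* Demo predicate for illustration: pretends only one script exists. */
int demo_exists(const char *path)
{
    return strcmp(path, "/www/cgi-bin/script") == 0;
}
```

A real implementation would also have to distinguish "does not exist" from "access denied", which is what the two log messages above report; this sketch ignores that distinction.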
/*
 * Are we dealing with a file? If not, we can (hopefully) safely assume we
 * have a handler that doesn't require one, but for safety's sake, and so
 * we have something find_types() can get something out of, fake one. But
 * don't run through the directory entries.
 */

/*
 * Go down the directory hierarchy. Where we have to check for symlinks,
 * do so. Where a .htaccess file has permission to override anything,
 * try to find one. If either of these things fails, we could poke
 * around, see why, and adjust the lookup_rec accordingly --- this might
 * save us a call to get_path_info (with the attendant stat()s); however,
 * for the moment, that's not worth the trouble.
 */

/*
 * Fake filenames (i.e. proxy:) only match Directory sections.
 */

/* XXX This needs to be rolled into APR, the APR function will not
 * be allowed to fold the case of any nonexistent segment of the path:
 */

/* TODO This is rather silly right here, we should simply be setting
 * filename and path_info at the end of our directory_walk
 */

/* XXX This becomes moot, and will already happen above for elements
 */

/* XXX This becomes moot, since the APR canonical parsing will handle
 * 2slash and dot directory issues:
 */

/* XXX This needs to be rolled into APR: */

/*
 * We will use test_dirname as scratch space while we build directory
 * names during the walk. Profiling shows directory_walk to be a busy
 * function so we try to avoid allocating lots of extra memory here.
 * We need 2 extra bytes, one for trailing \0 and one because
 * make_dirstr_prefix will add potentially one extra /.
 */
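To see why the scratch buffer needs two extra bytes, here is a minimal sketch of a prefix builder with the behaviour the comment above attributes to make_dirstr_prefix. The name `dir_prefix` is hypothetical, and this is an illustration under that assumption, not the function from src/main/util.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch: copy the first n '/'-terminated components of src into dst,
 * always ending the result with '/'.
 * dst must hold at least strlen(src) + 2 bytes: one for the trailing
 * NUL and one for a '/' that may be appended when src runs out first.
 */
char *dir_prefix(char *dst, const char *src, int n)
{
    assert(dst != NULL && src != NULL && n > 0);

    for (;;) {
        *dst = *src;
        if (*dst == '\0') {         /* ran out of input: add the extra '/' */
            *dst = '/';
            break;
        }
        if (*dst == '/' && --n == 0)
            break;                  /* just copied the n-th slash */
        ++dst;
        ++src;
    }
    *++dst = '\0';
    return dst;                     /* points at the terminating NUL */
}
```

With `src` set to "/usr/local/www", asking for 2 components yields "/usr/", while asking for 4 yields "/usr/local/www/", which is one byte longer than the input. That case is why the buffer is sized `strlen(src) + 2`.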
/* XXX These exception cases go away if apr_stat() returns the
 * APR_PATHINCOMPLETE status, so we skip hard filesystem testing
 * of the initial 'pseudo' elements:
 */

/* If the name is a UNC name, then do not perform any true file test
 * This is optimized to use the normal walk (skips the redundant '/' root)
 */

/* If the name is a fully qualified volume name, then do not perform any
 * true file test on the machine name (start at machine/share:/)
 * XXX: The implementation eludes me at this moment...
 * Does this make sense? Please test!
 */

/* Should match <Directory> sections starting from '/', not 'e:/'
 * they have one for each filesystem. Traditionally, Apache has treated
 * <Directory /> permissions as the base for the whole server, and this
 * tradition should probably be preserved.
 * NOTE: MUST SYNC WITH ap_make_dirstr_prefix() CHANGE IN src/main/util.c
 */

/* Normal File Systems are rooted at / */
#endif /* def HAVE_DRIVE_LETTERS || NETWARE */

/* j keeps track of which section we're on, see core_reorder_directories */

/*
 * XXX: this could be made faster by only copying the next component
 * rather than copying the entire thing all over.
 */

/*
 * Do symlink checks first, because they are done with the
 * permissions appropriate to the *parent* directory...
 */

/* Test only legal names against the real filesystem */

/*
 * Begin *this* level by looking for matching <Directory> sections
 */

/* To account for the top-level "/" directory when i == 0
 * XXX: The net test may be wrong... may fail ap_os_is_path_absolute
 */
#endif /* def HAVE_DRIVE_LETTERS || NETWARE */

/* So that other top-level directory sections (e.g. "e:/") aren't
 * XXX: I don't get you here, Tim... That's a level 1 section, but
 * we are at level 0. Did you mean fast-forward to the next?
 */
#endif /* def HAVE_DRIVE_LETTERS || NETWARE */

/* If .htaccess files are enabled, check for one. */

/* Test only legal names against the real filesystem */

/*
 * There are two types of IS_SPECIAL sections (see http_core.c), and we've
 * already handled the proxy:-style stuff. Now we'll deal with the
 */

/*
 * Symlink permissions are determined by the parent. If the request is
 * for a directory then applying the symlink test here would use the
 * permissions of the directory as opposed to its parent. Consider a
 * symlink pointing to a dir with a .htaccess disallowing symlinks. If
 * you access /symlink (or /symlink/) you would get a 403 without this
 * you would *not* get the 403.
 */
"Symbolic link not allowed: %s", r->filename);
return OK;
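The location walk whose comments follow notes that Location matches multiple slashes with a single slash. One way to get that effect is to collapse runs of '/' in a copy of the URI before comparing. The sketch below shows only that normalisation step; the name `collapse_slashes` is hypothetical and the matching itself is not shown.

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch: rewrite s in place so that every run of consecutive '/'
 * characters becomes a single '/'. All other characters are kept.
 */
void collapse_slashes(char *s)
{
    char *dst;
    const char *src;

    assert(s != NULL);

    dst = s;
    src = s;
    while (*src != '\0') {
        *dst++ = *src;              /* keep this character */
        if (*src == '/') {
            while (*src == '/')     /* skip the rest of the run */
                ++src;
        } else {
            ++src;
        }
    }
    *dst = '\0';
}
```

Because the result is never longer than the input, the rewrite is safe to do in place on a private copy of the URI.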
/* Can only "fail" if access denied by the */

/* Location and LocationMatch differ on their behaviour w.r.t. multiple
 * slashes. Location matches multiple slashes with a single slash,
 * LocationMatch doesn't. An exception, for backwards brokenness is
 * absoluteURIs... in which case neither match multiple slashes.
 */

/* Go through the location entries, and check for matches. */

/* we apply the directive sections in some order;
 * should really try them with the most general first.
 */

/* Go through the file entries, and check for matches. */

/* we apply the directive sections in some order;
 * should really try them with the most general first.
 */

/*****************************************************************
 *
 * The sub_request mechanism.
 *
 * Fns to look up a relative URI from, e.g., a map file or SSI document.
 * These do all access checks, etc., but don't actually run the transaction
 * ... use run_sub_req below for that. Also, be sure to use destroy_sub_req
 * as appropriate if you're likely to be creating more than a few of these.
 * (An early Apache version didn't destroy the sub_reqs used in directory
 * indexing. The result, when indexing a directory with 800-odd files in
 * it, was massively excessive storage allocation).
 *
 * Note more manipulation of protocol-specific vars in the request
 */

/* make a copy of the allowed-methods list */

/* start with the same set of output filters */

/* no input filters for a subrequest */

/* would be nicer to pass "method" to ap_set_sub_req_protocol */

/*
 * We could be clever at this point, and avoid calling directory_walk,
 * etc. However, we'd need to test that the old and new filenames contain
 * the same directory components, so it would require duplicating the
 * start of translate_name. Instead we rely on the cache of .htaccess
 */

/*
 * NB: directory_walk() clears the per_dir_config, so we don't inherit
 * from location_walk() above
 */

/* make a copy of the allowed-methods list */

/* start with the same set of output filters */

/* no input filters for a subrequest */

/*
 * Check for a special case... if there are no '/' characters in new_file
 * at all, then we are looking at a relative lookup in the same
 * directory. That means we won't have to redo directory_walk, and we may
 * not even have to redo access checks.
 */

/*
 * no matter what, if it's a subdirectory, we need to re-run
 */

/*
 * do a file_walk, if it doesn't change the per_dir_config then
 * we know that we don't have to redo all the access checks
 */

/* XXX: @@@: What should be done with the parsed_uri values? */

/*
 * XXX: this should be set properly like it is in the same-dir case
 * but it's actually sometimes impossible to do it... because the
 * file may not have a uri associated with it -djg
 */
rnew->uri = "INTERNALLY GENERATED file-relative req";
/* see comments in process_request_internal() */

/*****************************************************************
 *
 * Mainline request processing...
 */

/*
 * The following takes care of Apache redirects to custom response URLs
 * Note that if we are already dealing with the response to some other
 * error condition, we just report on the original error, and give up on
 * any attempt to handle the other thing "intelligently"...
 */
r = r->prev; /* Get back to original error */

/*
 * This test is done here so that none of the auth modules needs to know
 * about proxy authentication. They treat it like normal auth, and then
 */

/*
 * If we want to keep the connection, be sure that the request body
 * (if any) has been read.
 */

/*
 * Two types of custom redirects --- plain text, and URLs. Plain text has
 * a leading '"', so the URL code, here, is triggered on its absence
 */

/*
 * The URL isn't local, so let's drop through the rest of this
 * apache code, and continue with the usual REDIRECT handler.
 * But note that the client will ultimately see the wrong
 */

/*
 * This redirect needs to be a GET no matter what the original
 */

/*
 * Provide a special method for modules to communicate
 * more informative (than the plain canned) messages to us.
 * Propagate them to ErrorDocuments via the ERROR_NOTES variable:
 */

/*
 * Dumb user has given us a bad url to redirect to --- fake up
 * dying with a recursive server error...
 */
"Invalid error redirection directive: %s",
"configuration error: couldn't %s: %s", phase, r->uri);
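The custom-response comments above distinguish plain text (leading '"') from URLs, and local URLs from non-local ones, with a fallback for a bad directive. A minimal sketch of that classification follows. The enum, the function name, and the two tests used here (a leading '/' means local, a "://" substring means non-local) are simplifying assumptions for illustration, not the checks this file performs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of a custom error response string. */
enum custom_kind {
    CUSTOM_TEXT,        /* leading '"': literal message text */
    CUSTOM_LOCAL_URL,   /* assumed: begins with '/' */
    CUSTOM_REMOTE_URL,  /* assumed: contains "://" */
    CUSTOM_INVALID      /* anything else: bad redirection directive */
};

enum custom_kind classify_custom_response(const char *custom)
{
    assert(custom != NULL);

    if (custom[0] == '"')
        return CUSTOM_TEXT;         /* URL handling is skipped entirely */
    if (custom[0] == '/')
        return CUSTOM_LOCAL_URL;    /* candidate for an internal redirect */
    if (strstr(custom, "://") != NULL)
        return CUSTOM_REMOTE_URL;   /* falls through to a normal redirect */
    return CUSTOM_INVALID;
}
```

Checking the leading quote first mirrors the comment's point that the URL code is triggered only by the quote's absence.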
/* Is there a require line configured for the type of *this* req? */

/* Ignore embedded %2F's in path for proxy requests */

/*
 * We don't want TRACE to run through the normal handler set, we
 */

/*
 * NB: directory_walk() clears the per_dir_config, so we don't inherit
 * from location_walk() above
 */
? "check user. No user file?" : "perform authentication. AuthType not set!", r);
? "check access. No groups file?" : "perform authentication. AuthType not set!", r);
: "perform authentication. AuthType not set!", r);
? "check user. No user file?" : "perform authentication. AuthType not set!", r);
? "check access. No groups file?" : "perform authentication. AuthType not set!", r);
/* The new insert_filter stage makes sense here IMHO. We are sure that
 * we are going to run the request now, so we may as well insert filters
 * if any are available. Since the goal of this phase is to allow all
 * modules to insert a filter if they want to, this filter returns
 * void. I just can't see any way that this filter can reasonably
 * fail, either your module inserts something or it doesn't. rbb
 */

/* Take care of little things that need to happen when we're done */

/* We just send directly to the connection based filters, because at
 * this point, we know that we have seen all of the data, so we just
 * want to flush the buckets if something hasn't been sent to the
 */

/*
 * We want to flush the last packet if this isn't a pipelining connection
 * *before* we start into logging. Suppose that the logging causes a DNS
 * lookup to occur, which may have a high latency. If we hold off on
 * this packet, then it'll appear like the link is stalled when really
 * it's the application that's stalled.
 */

/*
 * another missing cleanup. It's particularly inappropriate to be
 * setting header_only, etc., here.
 */

/* Inherit the rest of the protocol info... */

/*
 * XXX: hmm. This is because mod_setenvif and mod_unique_id really need
 * to do their thing on internal redirects as well. Perhaps this is a
 */

/* This function is designed for things like actions or CGI scripts, when
 * using AddHandler, and you want to preserve the content type across
 */

/*
 * Is it the initial main request, which we only get *once* per HTTP request?
 */
(r->main == NULL) /* otherwise, this is a sub-request */
(r->prev == NULL); /* otherwise, this is an internal redirect */

/*
 * Function to set the r->mtime field to the specified value if it's later
 * than what's already there.
 */

/*
 * Get rid of any current settings if requested; not just the
 * well-known methods but any extensions as well.
 */
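The two small helpers described in the comments above (the initial-request test and the "only move mtime forward" update) can be sketched as below. The `struct req` record is a hypothetical, pared-down stand-in for the real request structure; only the field names `main`, `prev`, and `mtime` come from the comments and fragments in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, minimal request record for illustration only. */
struct req {
    struct req *main;   /* non-NULL for a sub-request */
    struct req *prev;   /* non-NULL after an internal redirect */
    time_t mtime;       /* latest modification time seen so far */
};

/* Nonzero only for the first, top-level request of an HTTP exchange. */
int is_initial_req(const struct req *r)
{
    assert(r != NULL);
    return r->main == NULL      /* otherwise, this is a sub-request */
        && r->prev == NULL;     /* otherwise, this is an internal redirect */
}

/* Raise r->mtime to dependency_mtime if that is later; never lower it. */
void update_mtime(struct req *r, time_t dependency_mtime)
{
    assert(r != NULL);
    if (r->mtime < dependency_mtime)
        r->mtime = dependency_mtime;
}
```

Because `update_mtime` only ever raises the stored value, calling it once per dependency in any order leaves `mtime` at the latest of them.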