request.c revision 38bcc87d9a06e8ba81165421403f275eca4e313b
/* ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2000-2001 The Apache Software Foundation. All rights
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Apache" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation. For more
 * information on the Apache Software Foundation, please see
 *
 * Portions of this software are based upon public domain software
 * originally written at the National Center for Supercomputing Applications,
 * University of Illinois, Urbana-Champaign.
 */

/*
 * Thoroughly revamped by rst for Apache. NB this file reads
 * best from the bottom up.
 */

"configuration error: couldn't %s: %s",
phase, r->uri);
/* This is the master logic for processing requests. Do NOT duplicate
 * this logic elsewhere, or the security model will be broken by future
 * API changes. Each phase must be individually optimized to pick up
 */

/* Ignore embedded %2F's in path for proxy requests */

/* File-specific requests with no 'true' URI are a huge pain... they
 * cannot bubble through the next several steps. Only subrequests may
 * have an empty uri, otherwise let translate_name kill the request.
 */

/* Reset to the server default config prior to running map_to_storage
 */

/* This request wasn't in storage (e.g. TRACE) */

/* Excluding file-specific requests with no 'true' URI...
 */

/* Rerun the location walk, which overrides any map_to_storage config.
 */

/* Only on the main request! */

/*
 * and that configuration didn't change (this requires optimized _walk()
 * functions in map_to_storage that use the same merge results given
 * identical input.) If the config changes, we must re-auth.
 */
?
"check user. No user file?" :
"perform authentication. AuthType not set!", r);
?
"check access. No groups file?" :
"perform authentication. AuthType not set!", r);
:
"perform authentication. AuthType not set!", r);
?
"check user. No user file?" :
"perform authentication. AuthType not set!", r);
?
"check access. No groups file?" :
"perform authentication. AuthType not set!", r);
/* XXX Must make certain the ap_run_type_checker short circuits mime
 */

/* The new insert_filter stage makes sense here IMHO. We are sure that
 * we are going to run the request now, so we may as well insert filters
 * if any are available. Since the goal of this phase is to allow all
 * modules to insert a filter if they want to, this filter returns
 * void. I just can't see any way that this filter can reasonably
 * fail, either your module inserts something or it doesn't. rbb
 */

/* Useful caching structures to repeat _walk/merge sequences as required
 * when a subrequest or redirect reuses substantially the same config.
 *
 * Directive order in the httpd.conf file and its Includes significantly
 * impacts this optimization. Grouping common blocks at the front of the
 * config that are less likely to change between a request and
 * its subrequests, or between a request and its redirects reduces
 * the work of these functions significantly.
 */
const char *cached;
/* The identifier we matched */

/* Find the most relevant, recent entry to work from. That would be
 * this request (on the second call), or the parent request of a
 * subrequest, or the prior request of an internal redirect. Provide
 * this _walk()er with a copy it is allowed to munge. If there is no
 * parent or prior cached request, then create a new walk cache.
 */

/*****************************************************************
 *
 * Getting and checking directory configuration. Also checks the
 * FollowSymlinks and FollowSymOwner stuff, since this is really the
 * only place that can happen (barring a new mid_dir_walk callout).
 *
 * We can't do it as an access_checker module function which gets
 * called with the final per_dir_config, since we could have a directory
 * with FollowSymLinks disabled, which contains a symlink to another
 * with a .htaccess file which turns FollowSymLinks back on --- and
 * access in such a case must be denied. So, whatever it is that
 * checks FollowSymLinks needs to know the state of the options as
 * they change, all the way down.
 */

/*
 * We don't want people able to serve up pipes, or unix sockets, or other
 * scary things. Note that symlink tests are performed later.
 */
"object is not a file, directory or symlink: %s",
/*
 * resolve_symlink must _always_ be called on an APR_LNK file type!
 * It will resolve the actual target file type, modification date, etc,
 * and provide any processing required for symlink evaluation.
 * Path must already be cleaned, no trailing slash, no multi-slashes,
 * and don't call this on the root!
 *
 * Simply, the number of times we deref a symlink is minimal compared
 * to the number of times we had an extra lstat() since we 'weren't sure'.
 *
 * To optimize, we stat() anything when given (opts & OPT_SYM_LINKS), otherwise
 * we start off with an lstat(). Every lstat() must be dereferenced in case
 * it points at a 'nasty' - we must always rerun check_safe_file (or similar.)
 */

/* OPT_SYM_OWNER only works if we can get the owner of
 * both the file and symlink. First fill in a missing
 * owner of the symlink, then get the info of the target.
 */

/* Give back the target */

/*****************************************************************
 *
 * Getting and checking directory configuration. Also checks the
 * FollowSymlinks and FollowSymOwner stuff, since this is really the
 * only place that can happen (barring a new mid_dir_walk callout).
 *
 * We can't do it as an access_checker module function which gets
 * called with the final per_dir_config, since we could have a directory
 * with FollowSymLinks disabled, which contains a symlink to another
 * with a .htaccess file which turns FollowSymLinks back on --- and
 * access in such a case must be denied. So, whatever it is that
 * checks FollowSymLinks needs to know the state of the options as
 * they change, all the way down.
 */

/* XXX: Better (faster) tests needed!!!
 *
 * "OK" as a response to a real problem is not _OK_, but to allow broken
 * modules to proceed, we will permit the not-a-path filename to pass here.
 * We must catch it later if it's heading for the core handler. Leave an
 * INFO note here for module debugging.
 */
"Module bug? Request filename path %s is missing or "
"not absolute for uri %s",
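The symlink rules in the resolve_symlink comment above (follow everything under OPT_SYM_LINKS; under OPT_SYM_OWNER, require the owners of both link and target to be known and equal) can be written as a small pure decision function. This is a sketch, not the code from this file: the flag values, the function name and the use of plain integers for owners are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical flag values; the real option bits live elsewhere. */
#define SKETCH_OPT_SYM_LINKS 0x1
#define SKETCH_OPT_SYM_OWNER 0x2

/*
 * Decide whether a symlink may be followed.
 *   opts               - bitmask of the SKETCH_OPT_* flags above
 *   link_owner_known   - nonzero if the link's owner could be determined
 *   target_owner_known - nonzero if the target's owner could be determined
 * Returns 1 if following the link is permitted, 0 otherwise.
 */
static int symlink_permitted(int opts,
                             int link_owner_known, unsigned link_uid,
                             int target_owner_known, unsigned target_uid)
{
    if (opts & SKETCH_OPT_SYM_LINKS) {
        return 1;               /* all symlinks allowed, no owner test */
    }
    if (!(opts & SKETCH_OPT_SYM_OWNER)) {
        return 0;               /* neither option set: never follow */
    }
    /* Owner matching only works when both owners are available. */
    if (!link_owner_known || !target_owner_known) {
        return 0;
    }
    return link_uid == target_uid;
}
```

Keeping the decision separate from the `lstat()`/`stat()` calls makes the policy easy to test without touching the filesystem.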
/*
 * r->path_info tracks the remaining source path.
 * r->filename tracks the path as we build it.
 * we begin our adventure at the root...
 */
"Module bug? Request filename path %s is invalid or "
"not absolute for uri %s",
/*
 * Go down the directory hierarchy. Where we have to check for symlinks,
 * do so. Where a .htaccess file has permission to override anything,
 */

/* If we have a file that already matches the path of r->filename,
 * and the vhost's list of directory sections hasn't changed,
 * we can skip rewalking the directory_walk entries.
 */

/* Well this looks really familiar! If our end-result (per_dir_result)
 * didn't change, we have absolutely nothing to do :)
 * Otherwise (as is the case with most dir_merged requests)
 * we must merge our dir_conf_merged onto this new r->per_dir_config.
 */

/* We start now_merged from NULL since we want to build
 * a locations list that can be merged to any vhost.
 */

/*
 * We must play our own mini-merge game here, for the few
 * running dir_config values we care about within dir_walk.
 * We didn't start the merge from r->per_dir_config, so we
 * accumulate opts and override as we merge, from the globals.
 */

/*
 * seg keeps track of which segment we've copied.
 * sec_idx keeps track of which section we're on, since sections are
 * ordered by number of segments. See core_reorder_directories
 */

/* We have no trailing slash, but we sure would appreciate one...
 */

/* Begin *this* level by looking for matching <Directory> sections
 * from the server config.
 */

/* No more possible matches for this many segments?
 */

/* We will never skip '0' element components, e.g. plain old
 * <Directory >, and <Directory "/"> are classified as zero
 * Otherwise, skip over the mismatches.
 */

/* If we merged this same section last time, reuse it
 */

/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */

/* Do a mini-merge to our globally-based running calculations of
 * core_dir->override and core_dir->opts, since now_merged
 * never considered the global config. Of course, if there is no
 * core config at this level, continue without a thought.
 * See core.c::merge_core_dir_configs() for explanation.
 */

/* If .htaccess files are enabled, check for one, provided we
 * have reached a real path.
 */
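The comments above describe descending the path one segment at a time, with a temporary trailing slash at each level and sections ordered by segment count. A minimal standalone sketch of that bookkeeping follows. The helper names `count_segments` and `prefix_at_depth` are hypothetical, and the code assumes an already-cleaned absolute path (no doubled slashes), as the comments in this file require.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the non-empty components of a path: "/" gives 0, "/a/b" gives 2. */
static size_t count_segments(const char *path)
{
    size_t n = 0;
    const char *p = path;

    while (*p) {
        while (*p == '/') {
            p++;                        /* skip separators */
        }
        if (*p) {
            n++;
            while (*p && *p != '/') {
                p++;                    /* skip one component */
            }
        }
    }
    return n;
}

/*
 * Copy the prefix of `path` covering its first `depth` components into
 * `out`, always ending in a slash. Returns 0 on success, or -1 if the
 * path is not absolute, has fewer components, or `out` is too small.
 */
static int prefix_at_depth(const char *path, size_t depth,
                           char *out, size_t outlen)
{
    const char *p = path;
    size_t seen = 0;
    size_t len;
    size_t need_slash;

    if (*p != '/') {
        return -1;
    }
    p++;                                /* step past the root slash */
    while (seen < depth) {
        if (!*p) {
            return -1;                  /* ran out of components */
        }
        while (*p && *p != '/') {
            p++;
        }
        seen++;
        if (*p == '/') {
            p++;
        }
    }
    len = (size_t)(p - path);
    need_slash = (path[len - 1] != '/') ? 1 : 0;
    if (len + need_slash + 1 > outlen) {
        return -1;
    }
    memcpy(out, path, len);
    if (need_slash) {
        out[len++] = '/';               /* the temporary trailing slash */
    }
    out[len] = '\0';
    return 0;
}
```

A walker could loop `depth` from 0 to `count_segments(path)`, testing each prefix against the configured sections for that segment count before moving one level deeper.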
/* If we merged this same htaccess last time, reuse it...
 * this wouldn't work except that we cache the htaccess
 * sections for the lifetime of the request, so we match
 * the same conf. Good planning (no, pure luck ;)
 */

/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */

/* Do a mini-merge to our globally-based running calculations of
 * core_dir->override and core_dir->opts, since now_merged
 * never considered the global config. Of course, if there is no
 * core config at this level, continue without a thought.
 * See core.c::merge_core_dir_configs() for explanation.
 */

/* That temporary trailing slash was useful, now drop it.
 */

/* Time for all good things to come to an end?
 */

/* Now it's time for the next segment...
 * We will assume the next element is an end node, and fix it up
 */

/* If nothing remained but a '/' string, we are finished
 */

/* XXX: Optimization required:
 * If...we have allowed symlinks, and
 * if...we find the segment exists in the directory list
 * skip the lstat and dummy up an APR_DIR value for r->finfo
 * this means case sensitive platforms go quite quickly.
 * Case insensitive platforms might be given the wrong path,
 * but if it's not found in the cache, then we know we have
 * something to test (the misspelling is never cached.)
 */

/* We choose apr_lstat here, rather than apr_stat, so that we
 * capture this path object rather than its target. We will
 * replace the info with our target's info below. We especially
 * want the name of this 'link' object, not the name of its
 * target, if we are fixing case.
 */

/* Nothing? That could be nice. But our directory walk is done.
 */
"access to %s denied", r->uri);
/* If we hit ENOTDIR, we must have over-optimized, deny
 * rather than assume not found.
 */
"access to %s failed", r->uri);
/* Fix up the path now if we have a name, and they don't agree
 */

/* redirect is required here?
 */

/* Is this a possibly acceptable symlink?
 */
"Symbolic link not allowed: %s", r->filename);
/* Ok, we are done with the link's info, test the real target
 */

/* That was fun, nothing left for us here
 */
"symlink doesn't point to a file or directory: %s",
/*
 * Now we'll deal with the regexes, note we pick up sec_idx
 * where we left off (we gave up after we hit entry_core->r)
 */

/* If we merged this same section last time, reuse it
 */

/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */

/* Whoops - everything matched in sequence, but the original walk
 * found some additional matches. Truncate them.
 */

/* It seems this shouldn't be needed anymore. We translated the symlink above
 x into a real resource, and should have died up there. Even if we keep this,
 x it needs more thought (maybe an r->file_is_symlink) perhaps it should actually
 x happen in file_walk, so we catch more obscure cases in autoindex sub requests, etc.
 x
 x * Symlink permissions are determined by the parent. If the request is
 x * for a directory then applying the symlink test here would use the
 x * permissions of the directory as opposed to its parent. Consider a
 x * symlink pointing to a dir with a .htaccess disallowing symlinks. If
 x * you access /symlink (or /symlink/) you would get a 403 without this
 x * you would *not* get the 403.
 x
 x if (r->finfo.filetype != APR_DIR
 x     && (res = resolve_symlink(r->filename, r->info, ap_allow_options(r), r->pool))) {
 x     ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, 0, r,
 x                   "Symbolic link not allowed: %s", r->filename);
 */

/* Merge our cache->dir_conf_merged construct with the r->per_dir_configs,
 * and note the end result to (potentially) skip this step next time.
 */

/* No tricks here, there are no <Locations > to parse in this vhost.
 * We won't destroy the cache, just in case _this_ redirect is later
 * redirected again to a vhost with <Location > blocks to optimize.
 */

/* Location and LocationMatch differ on their behaviour w.r.t. multiple
 * slashes. Location matches multiple slashes with a single slash,
 * LocationMatch doesn't. An exception, for backwards brokenness is
 * absoluteURIs... in which case neither match multiple slashes.
 */
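The last comment above says that Location treats a run of slashes as a single slash while LocationMatch does not. One way to get that behaviour is to collapse slash runs in a copy of the URI before the plain comparison and to test the regex against the untouched URI. The helper below is a sketch with a hypothetical name (`collapse_slashes`); it is not claimed to be the routine this file calls.

```c
#include <assert.h>
#include <string.h>

/*
 * Collapse every run of '/' characters in `s` to a single '/',
 * rewriting the string in place. The result is never longer than
 * the input, so no extra buffer is needed.
 */
static void collapse_slashes(char *s)
{
    char *src = s;
    char *dst = s;

    while (*src) {
        *dst++ = *src;
        if (*src == '/') {
            while (*src == '/') {
                src++;          /* swallow the rest of the run */
            }
        }
        else {
            src++;
        }
    }
    *dst = '\0';
}
```

Because the function modifies its argument, a caller that still needs the original URI (for the LocationMatch case, say) would run it on a copy.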
/* If we have a cache->cached location that matches r->uri,
 * and the vhost's list of locations hasn't changed, we can skip
 * rewalking the location_walk entries.
 */

/* Well this looks really familiar! If our end-result (per_dir_result)
 * didn't change, we have absolutely nothing to do :)
 * Otherwise (as is the case with most dir_merged requests)
 * we must merge our dir_conf_merged onto this new r->per_dir_config.
 */

/* We start now_merged from NULL since we want to build
 * a locations list that can be merged to any vhost.
 */

/* Go through the location entries, and check for matches.
 * We apply the directive sections in given order, we should
 * really try them with the most general first.
 */

/* ### const strlen can be optimized in location config parsing */

/* Test the regex, fnmatch or string as appropriate.
 * If it's a strcmp, and the <Location > pattern was
 * not slash terminated, then this uri must be slash
 * terminated (or at the end of the string) to match.
 */

/* If we merged this same section last time, reuse it
 */

/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */

/* Whoops - everything matched in sequence, but the original walk
 * found some additional matches. Truncate them.
 */

/* Merge our cache->dir_conf_merged construct with the r->per_dir_configs,
 * and note the end result to (potentially) skip this step next time.
 */

/* To allow broken modules to proceed, we allow missing filenames to pass.
 * We will catch it later if it's heading for the core handler.
 * directory_walk already posted an INFO note for module debugging.
 */

/* No tricks here, there are just no <Files > to parse in this context.
 * We won't destroy the cache, just in case _this_ redirect is later
 * redirected again to a context containing the same or similar <Files >.
 */

/* Get the basename ..
and copy for the cache just
 * in case r->filename is munged by another module
 */

/* If we have a cache->cached file name that matches test_file,
 * and the directory's list of file sections hasn't changed, we
 * can skip rewalking the file_walk entries.
 */

/* Well this looks really familiar! If our end-result (per_dir_result)
 * didn't change, we have absolutely nothing to do :)
 * Otherwise (as is the case with most dir_merged requests)
 * we must merge our dir_conf_merged onto this new r->per_dir_config.
 */

/* We start now_merged from NULL since we want to build
 * a file section list that can be merged to any dir_walk.
 */

/* Go through the location entries, and check for matches.
 * We apply the directive sections in given order, we should
 * really try them with the most general first.
 */

/* If we merged this same section last time, reuse it
 */

/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */

/* Whoops - everything matched in sequence, but the original walk
 * found some additional matches. Truncate them.
 */

/* Merge our cache->dir_conf_merged construct with the r->per_dir_configs,
 * and note the end result to (potentially) skip this step next time.
 */

/*****************************************************************
 *
 * The sub_request mechanism.
 *
 * Fns to look up a relative URI from, e.g., a map file or SSI document.
 * These do all access checks, etc., but don't actually run the transaction
 * ... use run_sub_req below for that. Also, be sure to use destroy_sub_req
 * as appropriate if you're likely to be creating more than a few of these.
 * (An early Apache version didn't destroy the sub_reqs used in directory
 * indexing. The result, when indexing a directory with 800-odd files in
 * it, was massively excessive storage allocation).
 *
 * Note more manipulation of protocol-specific vars in the request
 */

/* Start a clean config from this subrequest's vhost.
Optimization in
 * config blocks of the subrequest match the parent request, no merges
 * will actually occur (and generally a minimal number of merges are
 * required, even if the parent and subrequest aren't quite identical.)
 */

/* make a copy of the allowed-methods list */

/* start with the same set of output filters */

/* no input filters for a subrequest */

/* Is there a require line configured for the type of *this* req? */

/* We have to run this after fill_in_sub_req_vars, or the r->main
 */

/* would be nicer to pass "method" to ap_set_sub_req_protocol */

/* We have to run this after fill_in_sub_req_vars, or the r->main
 */

/*
 * Special case: we are looking at a relative lookup in the same directory.
 * That means we won't have to redo directory_walk, and we may
 * not even have to redo access checks.
 */

/* This is 100% safe, since dirent->name just came from the filesystem */

/* Preserve the apr_stat results, and perhaps we also tag that
 * symlinks were tested and/or found for r->filename.
 */

/*
 * apr_dir_read isn't very complete on this platform, so
 * we need another apr_lstat (or simply apr_stat if we allow
 * all symlinks here.) If this is an APR_LNK that resolves
 * to an APR_DIR, then we will rerun everything anyways...
 */

/* We have to run this after fill_in_sub_req_vars, or the r->main
 */

/* Translate r->filename, if it was canonical, it stays canonical
 */

/*
 * Check for a special case... if there are no '/' characters in new_file
 * at all, and the path was the same, then we are looking at a relative
 * lookup in the same directory. Fixup the URI to match.
 */

/* XXX: @@@: What should be done with the parsed_uri values?
 */

/* We would be better off stripping down to the 'common' elements
 * of the path, then reassembling the URI as best as we can.
 */

/*
 * XXX: this should be set properly like it is in the same-dir case
 * but it's actually sometimes impossible to do it...
because the
 * file may not have a uri associated with it -djg
 */

/*
 * Function to set the r->mtime field to the specified value if it's later
 * than what's already there.
 */

/*
 * Is it the initial main request, which we only get *once* per HTTP request?
 */
(r->main == NULL)       /* otherwise, this is a sub-request */
(r->prev == NULL);      /* otherwise, this is an internal redirect */
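The two small helpers described by the closing comments can be sketched in isolation. The `struct sketch_request` below is a hypothetical stand-in that carries only the three fields the comments mention (`main`, `prev`, `mtime`); the real request record is far larger, and the use of `long` for the timestamp is an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down request record for illustration only. */
struct sketch_request {
    struct sketch_request *main;  /* parent request, if this is a sub-request */
    struct sketch_request *prev;  /* prior request, if internally redirected */
    long mtime;                   /* latest modification time seen so far */
};

/* A request is "initial" only if it is neither a sub-request nor a redirect. */
static int is_initial_req(const struct sketch_request *r)
{
    return r->main == NULL && r->prev == NULL;
}

/* Raise r->mtime to dependency_mtime if that is later; never lower it. */
static void update_mtime(struct sketch_request *r, long dependency_mtime)
{
    if (r->mtime < dependency_mtime) {
        r->mtime = dependency_mtime;
    }
}
```

Only the top-level request of an HTTP exchange passes `is_initial_req`, and `update_mtime` can be called once per contributing file so that `mtime` ends up as the newest of them.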