request.c revision 4a13940dc2990df0a798718d3a3f9cf1566c2217
"(decoded='%s'), returning 404",
/* All file subrequests are a huge pain... they cannot bubble through the
 * next several steps.  Only file subrequests are allowed an empty uri,
 * otherwise let translate_name kill the request.
 */
/* Reset to the server default config prior to running map_to_storage
 */
/* This request wasn't in storage (e.g. TRACE) */
/* Rerun the location walk, which overrides any map_to_storage config.
 */
/* Only on the main request! */
/* and that configuration didn't change (this requires optimized _walk()
 * functions in map_to_storage that use the same merge results given
 * identical input.)  If the config changes, we must re-auth.
 */
/* XXX Must make certain the ap_run_type_checker short circuits mime
 */
/* Useful caching structures to repeat _walk/merge sequences as required
 * when a subrequest or redirect reuses substantially the same config.
 *
 * Directive order in the httpd.conf file and its Includes significantly
 * impacts this optimization.  Grouping common blocks at the front of the
 * config that are less likely to change between a request and
 * its subrequests, or between a request and its redirects reduces
 * the work of these functions significantly.
 */
const char *cached;          /* The identifier we matched */
int count;
/* Number of prev invocations of same call in this (sub)req */ /* Find the most relevant, recent walk cache to work from and provide * a copy the caller is allowed to munge. In the case of a sub-request * or internal redirect, this is the cache corresponding to the equivalent * invocation of the same function call in the "parent" request, if such * a cache exists. Otherwise it is the walk cache of the previous * invocation of the same function call in the current request, if * that exists; if not, then create a new walk cache. /***************************************************************** * Getting and checking directory configuration. Also checks the * FollowSymlinks and FollowSymOwner stuff, since this is really the * only place that can happen (barring a new mid_dir_walk callout). * We can't do it as an access_checker module function which gets * called with the final per_dir_config, since we could have a directory * with FollowSymLinks disabled, which contains a symlink to another * with a .htaccess file which turns FollowSymLinks back on --- and * access in such a case must be denied. So, whatever it is that * checks FollowSymLinks needs to know the state of the options as * they change, all the way down. * resolve_symlink must _always_ be called on an APR_LNK file type! * It will resolve the actual target file type, modification date, etc, * and provide any processing required for symlink evaluation. * Path must already be cleaned, no trailing slash, no multi-slashes, * and don't call this on the root! * Simply, the number of times we deref a symlink are minimal compared * to the number of times we had an extra lstat() since we 'weren't sure'. * To optimize, we stat() anything when given (opts & OPT_SYM_LINKS), otherwise * we start off with an lstat(). Every lstat() must be dereferenced in case * it points at a 'nasty' - we must always rerun check_safe_file (or similar.) /* Save the name from the valid bits. 
*/ /* if OPT_SYM_OWNER is unset, we only need to check target accessible */ /* Give back the target */ /* OPT_SYM_OWNER only works if we can get the owner of * both the file and symlink. First fill in a missing * owner of the symlink, then get the info of the target. /* Give back the target */ * As we walk the directory configuration, the merged config won't * be 'rooted' to a specific vhost until the very end of the merge. * We need a very fast mini-merge to a real, vhost-rooted merge * of core.opts and core.override, the only options tested within * See core.c::merge_core_dir_configs() for explanation. /***************************************************************** * Getting and checking directory configuration. Also checks the * FollowSymlinks and FollowSymOwner stuff, since this is really the * only place that can happen (barring a new mid_dir_walk callout). * We can't do it as an access_checker module function which gets * called with the final per_dir_config, since we could have a directory * with FollowSymLinks disabled, which contains a symlink to another * with a .htaccess file which turns FollowSymLinks back on --- and * access in such a case must be denied. So, whatever it is that * checks FollowSymLinks needs to know the state of the options as * they change, all the way down. /* XXX: Better (faster) tests needed!!! * "OK" as a response to a real problem is not _OK_, but to allow broken * modules to proceed, we will permit the not-a-path filename to pass the * following two tests. This behavior may be revoked in future versions * of Apache. We still must catch it later if it's heading for the core * handler. Leave INFO notes here for module debugging. "Module bug? Request filename is missing for URI %s",
/* Canonicalize the file path without resolving filename case or aliases
 * so we can begin by checking the cache for a recent directory walk.
 * This call will ensure we have an absolute path in the same pass.
 */
"Module bug?  Request filename path %s is invalid "
"or not absolute for uri %s",
/* XXX Notice that this forces path_info to be canonical.  That might
 * not be desired by all apps.  However, some of those same apps likely
 * have significant security holes.
 */
/* If this is not a dirent subrequest with a preconstructed
 * r->finfo value, then we can simply stat the filename to
 * save burning mega-cycles with unneeded stats - if this is
 * an exact file match.  We don't care about failure... we
 * will stat by component failing this meager attempt.
 *
 * It would be nice to distinguish APR_ENOENT from other
 * types of failure, such as APR_ENOTDIR.  We can do something
 * with APR_ENOENT, knowing that the path is good.
 */
/* a regular file but we have '/' at the end of the name;
 *
 * other OSs will return APR_ENOTDIR for that situation;
 * handle it the same everywhere by simulating a failure
 * if it looks like a directory but really isn't
 *
 * Also reset if the stat failed, just for safety.
 */
/* If we have a file that already matches the path of r->filename,
 * and the vhost's list of directory sections hasn't changed,
 * we can skip rewalking the directory_walk entries.
 */
/* Well this looks really familiar!  If our end-result (per_dir_result)
 * didn't change, we have absolutely nothing to do :)
 */
/* we must merge our dir_conf_merged onto this new r->per_dir_config.
 */
/* If Symlinks are allowed in general we do not need the following
 */
/* APR_INCOMPLETE is as fine a result as APR_SUCCESS as we
 * have added APR_FINFO_NAME to the wanted parameter of
 * apr_stat above.  On Unix platforms this means that apr_stat
 * is always going to return APR_INCOMPLETE in the case that
 * the call to the native stat / lstat did not fail.
 */
/* This should never happen, because we did a stat on the
 * same file, resolving a possible symlink several lines
 * above.  Therefore do not make a detailed analysis of rv
 * in this case for the reason of the failure, just bail out
 * with a HTTP_FORBIDDEN in case we hit a race condition
 */
"access to %s failed; stat of '%s' failed.",
/* Is this a possibly acceptable symlink? */ "Symbolic link not allowed " "or link target not accessible: %s",
/* We start now_merged from NULL since we want to build
 * a locations list that can be merged to any vhost.
 */
/* Invariant: from the first time filename_len is set until
 * it goes out of scope, filename_len==strlen(r->filename)
 */
/* We must play our own mini-merge game here, for the few
 * running dir_config values we care about within dir_walk.
 * We didn't start the merge from r->per_dir_config, so we
 * accumulate opts and override as we merge, from the globals.
 */
/* Set aside path_info to merge back onto path_info later.
 * If r->filename is a directory, we must remerge the path_info,
 * before we continue!  [Directories cannot, by definition, have
 * path info.  Either the next segment is not-found, or a file.]
 *
 * r->path_info tracks the unconsumed source path.
 * r->filename  tracks the path as we process it
 */
"dir_walk error, path_info %s is not relative "
"to the filename path %s for uri %s",
/* Now build r->filename component by component, starting
 * with the root (on Unix, simply "/").  We will make a huge
 * assumption here for efficiency, that any canonical path
 * already given included a canonical root.
 */
/* Bad assumption above?  If the root's length is longer
 * than the canonical length, then it cannot be trusted as
 * a truename.  So try again, this time more seriously.
 */
#else /* ndef CASE_BLIND_FILESYSTEM, really this simple for Unix today; */
"dir_walk error, could not determine the root "
"path of filename %s%s for uri %s",
/* Working space for terminating null and an extra / is required. * seg keeps track of which segment we've copied. * sec_idx keeps track of which section we're on, since sections are * ordered by number of segments. See core_reorder_directories * startseg tells us how many segments describe the root path * e.g. the complete path "//host/foo/" to a UNC share (4) * Go down the directory hierarchy. Where we have to check for * symlinks, do so. Where a .htaccess file has permission to * override anything, try to find one. /* We have no trailing slash, but we sure would appreciate one. * However, we don't want to append a / our first time through. /* Begin *this* level by looking for matching <Directory> sections * from the server config. /* No more possible matches for this many segments? /* We will never skip '0' element components, e.g. plain old * <Directory >, and <Directory "/"> are classified as zero * Otherwise, skip over the mismatches. /* If we haven't continue'd above, we have a match. * Calculate our full-context core opts & override. /* If we merged this same section last time, reuse it /* We fell out of sync. This is our own copy of walked, * so truncate the remaining matches and reset remaining. /* If .htaccess files are enabled, check for one, provided we * have reached a real path. do {
/* Not really a loop, just a break'able code block */ /* No htaccess in an incomplete root path, /* If we are still here, we found our htaccess. * Calculate our full-context core opts & override. /* If we merged this same htaccess last time, reuse it... * this wouldn't work except that we cache the htaccess * sections for the lifetime of the request, so we match * the same conf. Good planning (no, pure luck ;) /* We fell out of sync. This is our own copy of walked, * so truncate the remaining matches and reset }
while (0);
/* Only one htaccess, not a real loop */
/* That temporary trailing slash was useful, now drop it.
 */
/* Time for all good things to come to an end?
 */
/* Now it's time for the next segment...
 * We will assume the next element is an end node, and fix it up
 */
/* If nothing remained but a '/' string, we are finished
 * XXX: NO WE ARE NOT!!!  Now process this puppy!!! */
/* If...we knew r->filename was a file, and
 * if...we have strict (case-sensitive) filenames, or
 *      we know the canonical_filename matches to _this_ name, and
 * if...we have allowed symlinks
 * skip the lstat and dummy up an APR_DIR value for thisinfo.
 */
/* We choose apr_stat with flag APR_FINFO_LINK here, rather than
 * plain apr_stat, so that we capture this path object rather than
 * its target.  We will replace the info with our target's info
 * below.  We especially want the name of this 'link' object, not
 * the name of its target, if we are fixing the filename
 */
/* Nothing?  That could be nice.  But our directory
 */
"access to %s denied", r->uri);
/* If we hit ENOTDIR, we must have over-optimized, deny
 * rather than assume not found.
 */
"access to %s failed", r->uri);
/* Fix up the path now if we have a name, and they don't agree * redirect is required here? We need to walk the URI and * filename in tandem to properly correlate these. /* Is this a possibly acceptable symlink? "Symbolic link not allowed " "or link target not accessible: %s",
/* Ok, we are done with the link's info, test the real target /* That was fun, nothing left for us here "Forbidden: %s doesn't point to " /* If we have _not_ optimized, this is the time to recover /* Now splice the saved path_info back onto any new path_info * Now we'll deal with the regexes, note we pick up sec_idx * where we left off (we gave up after we hit entry_core->r) /* If we haven't already continue'd above, we have a match. * Calculate our full-context core opts & override. /* If we merged this same section last time, reuse it /* We fell out of sync. This is our own copy of walked, * so truncate the remaining matches and reset remaining. /* Whoops - everything matched in sequence, but either the original * walk found some additional matches (which we need to truncate), or * this walk found some additional matches. /* It seems this shouldn't be needed anymore. We translated the x symlink above into a real resource, and should have died up there. x Even if we keep this, it needs more thought (maybe an r->file_is_symlink) x perhaps it should actually happen in file_walk, so we catch more x obscure cases in autoindex subrequests, etc. x * Symlink permissions are determined by the parent. If the request is x * for a directory then applying the symlink test here would use the x * permissions of the directory as opposed to its parent. Consider a x * symlink pointing to a dir with a .htaccess disallowing symlinks. If x * you access /symlink (or /symlink/) you would get a 403 without this x * you would *not* get the 403. x if (r->finfo.filetype != APR_DIR x && (res = resolve_symlink(r->filename, r->info, ap_allow_options(r), x ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, x "Symbolic link not allowed: %s", r->filename); /* Save future sub-requestors much angst in processing * this subrequest. If dir_walk couldn't canonicalize * the file path, nothing can. 
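The "reuse it" / "fell out of sync" comments describe replaying a previous walk and only re-merging from the first point of divergence. A minimal sketch of that idea follows. The `walk_cache` type, `rewalk` function and integer section ids are all hypothetical stand-ins; the real code tracks matched section pointers and merged config vectors in an APR array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SECTIONS 16

/* Hypothetical stand-in for the cached list of sections matched last time. */
typedef struct {
    int    matched[MAX_SECTIONS];  /* section ids, in match order */
    size_t count;                  /* number of valid entries */
} walk_cache;

/* Replay a walk over `sections` (the ids matched this time, in order).
 * Entries that agree with the cache are reused; at the first mismatch the
 * cache is truncated and every later entry is merged afresh.
 * Returns the number of merges that had to be performed.
 */
size_t rewalk(walk_cache *cache, const int *sections, size_t n)
{
    size_t merges = 0;
    size_t i;

    assert(cache != NULL && n <= MAX_SECTIONS);

    for (i = 0; i < n; i++) {
        if (i < cache->count && cache->matched[i] == sections[i]) {
            continue;              /* merged this same section last time */
        }
        /* Fell out of sync: drop the stale tail, then record the new match. */
        cache->matched[i] = sections[i];
        cache->count = i + 1;
        merges++;
    }

    /* The earlier walk may have matched more sections than this one. */
    cache->count = n;
    return merges;
}
```

Because agreement is checked position by position, sections that rarely change are cheapest to reuse when they sort early, which is the point the comments make about directive order.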
/* Merge our cache->dir_conf_merged construct with the r->per_dir_configs,
 * and note the end result to (potentially) skip this step next time.
 */
/* No tricks here, there are no <Locations > to parse in this vhost.
 * We won't destroy the cache, just in case _this_ redirect is later
 * redirected again to a vhost with <Location > blocks to optimize.
 */
/* Location and LocationMatch differ on their behaviour w.r.t. multiple
 * slashes.  Location matches multiple slashes with a single slash,
 * LocationMatch doesn't.  An exception, for backwards brokenness is
 * absoluteURIs... in which case neither match multiple slashes.
 */
/* If we have a cache->cached location that matches r->uri,
 * and the vhost's list of locations hasn't changed, we can skip
 * rewalking the location_walk entries.
 */
/* Well this looks really familiar!  If our end-result (per_dir_result)
 * didn't change, we have absolutely nothing to do :)
 */
/* we must merge our dir_conf_merged onto this new r->per_dir_config.
 */
/* We start now_merged from NULL since we want to build
 * a locations list that can be merged to any vhost.
 */
/* Go through the location entries, and check for matches.
 * We apply the directive sections in given order, we should
 * really try them with the most general first.
 */
/* ### const strlen can be optimized in location config parsing */
/* Test the regex, fnmatch or string as appropriate.
 * If it's a strcmp, and the <Location > pattern was
 * not slash terminated, then this uri must be slash
 * terminated (or at the end of the string) to match.
 */
/* If we merged this same section last time, reuse it
 */
/* We fell out of sync.  This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */
/* Whoops - everything matched in sequence, but either the original
 * walk found some additional matches (which we need to truncate), or
 * this walk found some additional matches.
 */
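The slash rule for plain-string <Location > patterns can be shown in isolation. The function below is a sketch written for illustration: `location_prefix_matches` is a hypothetical name, and it covers only the string-compare case, not the regex or fnmatch branches or the multiple-slash handling mentioned above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of httpd).
 * Returns 1 if `uri` falls under the plain-string `pattern`, else 0.
 * A pattern without a trailing slash only matches at a path-segment
 * boundary: the uri must end there or continue with '/'.
 */
int location_prefix_matches(const char *pattern, const char *uri)
{
    size_t len;

    assert(pattern != NULL && uri != NULL);
    len = strlen(pattern);

    /* The uri must begin with the whole pattern. */
    if (strncmp(uri, pattern, len) != 0) {
        return 0;
    }

    /* A slash-terminated pattern already ends on a segment boundary. */
    if (len > 0 && pattern[len - 1] == '/') {
        return 1;
    }

    /* Otherwise insist on a boundary right after the matched prefix.
     * uri[len] is safe to read: the prefix compare succeeded, so the
     * uri holds at least len characters plus its terminator. */
    return uri[len] == '/' || uri[len] == '\0';
}
```

Under this rule a pattern of "/foo" covers "/foo" and "/foo/bar" but not "/foobar", while "/foo/" covers "/foo/bar" but not "/foo".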
/* Merge our cache->dir_conf_merged construct with the r->per_dir_configs, * and note the end result to (potentially) skip this step next time. /* To allow broken modules to proceed, we allow missing filenames to pass. * We will catch it later if it's heading for the core handler. * directory_walk already posted an INFO note for module debugging. /* No tricks here, there are just no <Files > to parse in this context. * We won't destroy the cache, just in case _this_ redirect is later * redirected again to a context containing the same or similar <Files >. /* Get the basename .. and copy for the cache just * in case r->filename is munged by another module /* If we have an cache->cached file name that matches test_file, * and the directory's list of file sections hasn't changed, we * can skip rewalking the file_walk entries. /* Well this looks really familiar! If our end-result (per_dir_result) * didn't change, we have absolutely nothing to do :) * Otherwise (as is the case with most dir_merged requests) * we must merge our dir_conf_merged onto this new r->per_dir_config. /* We start now_merged from NULL since we want to build * a file section list that can be merged to any dir_walk. /* Go through the location entries, and check for matches. * We apply the directive sections in given order, we should * really try them with the most general first. /* If we merged this same section last time, reuse it /* We fell out of sync. This is our own copy of walked, * so truncate the remaining matches and reset remaining. /* Whoops - everything matched in sequence, but either the original * walk found some additional matches (which we need to truncate), or * this walk found some additional matches. /* Merge our cache->dir_conf_merged construct with the r->per_dir_configs, * and note the end result to (potentially) skip this step next time. /***************************************************************** * The sub_request mechanism. 
* Fns to look up a relative URI from, e.g., a map file or SSI document. * These do all access checks, etc., but don't actually run the transaction * ... use run_sub_req below for that. Also, be sure to use destroy_sub_req * as appropriate if you're likely to be creating more than a few of these. * (An early Apache version didn't destroy the sub_reqs used in directory * indexing. The result, when indexing a directory with 800-odd files in * it, was massively excessive storage allocation). * Note more manipulation of protocol-specific vars in the request /* Start a clean config from this subrequest's vhost. Optimization in * config blocks of the subrequest match the parent request, no merges * will actually occur (and generally a minimal number of merges are * required, even if the parent and subrequest aren't quite identical.) /* make a copy of the allowed-methods list */ /* start with the same set of output filters */ /* while there are no input filters for a subrequest, we will * try to insert some, so if we don't have valid data, the code /* If NULL - we are expecting to be internal_fast_redirect'ed * to this subrequest - or this request will never be invoked. * Ignore the original request filter stack entirely, and * drill the input and output stacks back to the connection. /* no input filters for a subrequest */ /* We have to run this after we fill in sub req vars, * or the r->main pointer won't be setup /* Begin by presuming any module can make its own path_info assumptions, * until some module interjects and changes the value. /* Pass on the kept body (if any) into the new request. */ /* Is there a require line configured for the type of *this* req? */ /* Initialise res, to avoid a gcc warning */ /* would be nicer to pass "method" to ap_set_sub_req_protocol */ /* We cannot return NULL without violating the API. So just turn this * subrequest into a 500 to indicate the failure. 
*/ * If the content can be served by the quick_handler, we can * safely bypass request_internal processing. * If next_filter is NULL we are expecting to be * internal_fast_redirect'ed to the subrequest, or the subrequest will * never be invoked. We need to make sure that the quickhandler is not * invoked by any lookups. Since an internal_fast_redirect will always * occur too late for the quickhandler to handle the request. /* Special case: we are looking at a relative lookup in the same directory. * This is 100% safe, since dirent->name just came from the filesystem. /* strip path_info off the end of the uri to keep it in sync * with r->filename, which has already been stripped by directory_walk, * merge the dirent->name, and then, if the caller wants us to remerge * the original path info, do so. Note we never fix the path_info back * to r->filename, since dir_walk would do so (but we don't expect it * to happen in the usual cases) /* XXX This is now less relevant; we will do a full location walk * these days for this case. Preserve the apr_stat results, and * perhaps we also tag that symlinks were tested and/or found for * apr_dir_read isn't very complete on this platform, so * we need another apr_stat (with or without APR_FINFO_LINK * depending on whether we allow all symlinks here.) If this * is an APR_LNK that resolves to an APR_DIR, then we will rerun * everything anyways... this should be safe. * Resolve this symlink. We should tie this back to dir_walk's cache /* ap_make_full_path overallocated the buffers * by one character to help us out here. /* fill in parsed_uri values /* We cannot return NULL without violating the API. So just turn this * subrequest into a 500. */ /* Translate r->filename, if it was canonical, it stays canonical * Check for a special case... if there are no '/' characters in new_file * at all, and the path was the same, then we are looking at a relative * lookup in the same directory. Fixup the URI to match. 
/* XXX: @@@: What should be done with the parsed_uri values?
 * We would be better off stripping down to the 'common' elements
 * of the path, then reassembling the URI as best as we can.
 */
/* XXX: this should be set properly like it is in the same-dir case
 * but it's actually sometimes impossible to do it... because the
 * file may not have a uri associated with it -djg
 */
/* We cannot return NULL without violating the API. So just turn this
 * subrequest into a 500. */
/* Run the quick handler if the subrequest is not a dirent or file
 */
/* Function to set the r->mtime field to the specified value if it's later
 * than what's already there.
 */
/* Is it the initial main request, which we only get *once* per HTTP request?
 */
return (r->main == NULL)       /* otherwise, this is a sub-request */
       && (r->prev == NULL);   /* otherwise, this is an internal redirect */
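The last two comments (the mtime update and the initial-request test) can be exercised with a mock request structure. This is a sketch, not the httpd definitions: `mock_request`, `is_initial_req` and `update_mtime` are hypothetical names, and `time_t` stands in for whatever time type the real `r->mtime` field uses.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, cut-down stand-in for the request record. */
typedef struct mock_request {
    struct mock_request *main;  /* non-NULL when this is a sub-request */
    struct mock_request *prev;  /* non-NULL after an internal redirect */
    time_t               mtime; /* latest modification time seen so far */
} mock_request;

/* Initial main request: neither a sub-request nor a redirect target. */
int is_initial_req(const mock_request *r)
{
    assert(r != NULL);
    return (r->main == NULL) && (r->prev == NULL);
}

/* Raise r->mtime to dependency_mtime, but never move it backwards. */
void update_mtime(mock_request *r, time_t dependency_mtime)
{
    assert(r != NULL);
    if (r->mtime < dependency_mtime) {
        r->mtime = dependency_mtime;
    }
}
```

Keeping the update monotonic means a response assembled from several resources ends up carrying the newest modification time among them.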