request.c revision 666d616b7986a83bb0eac453694fbf4d6f05b98c
"configuration error: couldn't %s: %s",
phase, r->
uri);
/* This is the master logic for processing requests. Do NOT duplicate
 * this logic elsewhere, or the security model will be broken by future
 * API changes. Each phase must be individually optimized to pick up
 */
/* Ignore embedded %2F's in path for proxy requests */
"found %%2f (encoded '/') in URI "
"(decoded='%s'), returning 404",
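/* Illustration only: a minimal sketch of the encoded-slash check that the
 * log message above reports. The function name, its parameters and the use
 * of a bare 404 return value are assumptions made for this example; this is
 * not the httpd implementation, which is not shown in this file. Proxy
 * requests are skipped, as the comment above states.
 */

```c
#include <stddef.h>

/* Hypothetical helper: returns 404 when the still-encoded path contains
 * "%2F" or "%2f" and the request is not a proxy request, otherwise 0.
 */
int sketch_check_encoded_slash(const char *path, int is_proxy_request)
{
    const char *p;

    if (path == NULL || is_proxy_request) {
        return 0;               /* nothing to check, or check is skipped */
    }
    for (p = path; *p != '\0'; ++p) {
        /* Short-circuit evaluation keeps p[1] and p[2] inside the string:
         * p[1] is read only if p[0] is '%', p[2] only if p[1] is '2'.
         */
        if (p[0] == '%' && p[1] == '2' && (p[2] == 'F' || p[2] == 'f')) {
            return 404;
        }
    }
    return 0;
}
```

/* Usage: sketch_check_encoded_slash(uri, 0) for an ordinary request; a
 * non-zero result would be returned to the client as the status code.
 */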
/* All file subrequests are a huge pain... they cannot bubble through the
 * next several steps. Only file subrequests are allowed an empty uri,
 * otherwise let translate_name kill the request.
 */
/* Reset to the server default config prior to running map_to_storage */
/* This request wasn't in storage (e.g. TRACE) */
/* Excluding file-specific requests with no 'true' URI... */
/* Rerun the location walk, which overrides any map_to_storage config. */
/* Only on the main request! */
/*
 * and that configuration didn't change (this requires optimized _walk()
 * functions in map_to_storage that use the same merge results given
 * identical input.) If the config changes, we must re-auth.
 */
?
"check user. No user file?" :
"perform authentication. AuthType not set!",
?
"check access. No groups file?" :
"perform authentication. AuthType not set!",
?
"check user. No user file?" :
"perform authentication. AuthType not set!",
?
"check access. No groups file?" :
"perform authentication. AuthType not set!",
/* XXX Must make certain the ap_run_type_checker short circuits mime
 */
/* Useful caching structures to repeat _walk/merge sequences as required
 * when a subrequest or redirect reuses substantially the same config.
 *
 * Directive order in the httpd.conf file and its Includes significantly
 * impacts this optimization. Grouping common blocks at the front of the
 * config that are less likely to change between a request and
 * its subrequests, or between a request and its redirects, reduces
 * the work of these functions significantly.
 */
const char *
cached;
/* The identifier we matched */
/* Find the most relevant, recent entry to work from. That would be
 * this request (on the second call), or the parent request of a
 * subrequest, or the prior request of an internal redirect. Provide
 * this _walk()er with a copy it is allowed to munge. If there is no
 * parent or prior cached request, then create a new walk cache.
 */
/*****************************************************************
 *
 * Getting and checking directory configuration. Also checks the
 * FollowSymlinks and FollowSymOwner stuff, since this is really the
 * only place that can happen (barring a new mid_dir_walk callout).
 *
 * We can't do it as an access_checker module function which gets
 * called with the final per_dir_config, since we could have a directory
 * with FollowSymLinks disabled, which contains a symlink to another
 * with a .htaccess file which turns FollowSymLinks back on --- and
 * access in such a case must be denied. So, whatever it is that
 * checks FollowSymLinks needs to know the state of the options as
 * they change, all the way down.
 */
/*
 * resolve_symlink must _always_ be called on an APR_LNK file type!
 * It will resolve the actual target file type, modification date, etc,
 * and provide any processing required for symlink evaluation.
 * Path must already be cleaned, no trailing slash, no multi-slashes,
 * and don't call this on the root!
 *
 * Simply, the number of times we deref a symlink are minimal compared
 * to the number of times we had an extra lstat() since we 'weren't sure'.
 *
 * To optimize, we stat() anything when given (opts & OPT_SYM_LINKS), otherwise
 * we start off with an lstat(). Every lstat() must be dereferenced in case
 * it points at a 'nasty' - we must always rerun check_safe_file (or similar.)
 */
/* Save the name from the valid bits. */
/* Give back the target */
/* OPT_SYM_OWNER only works if we can get the owner of
 * both the file and symlink. First fill in a missing
 * owner of the symlink, then get the info of the target.
 */
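/* Illustration only: a minimal sketch of the symlink policy described in
 * the comments above, reduced to a pure decision function. The names, the
 * option bit values and the parameter layout are placeholders invented for
 * this example and are not taken from the httpd headers. It assumes that
 * the "follow all symlinks" option permits any link, and that the owner
 * option permits a link only when both owners are known and equal.
 */

```c
/* Placeholder option bits for this sketch only. */
#define SKETCH_OPT_SYM_LINKS 0x1
#define SKETCH_OPT_SYM_OWNER 0x2

/* Returns 1 if following the link is permitted, 0 if it must be denied.
 * The *_known flags are 0 when the corresponding owner could not be read.
 */
int sketch_symlink_allowed(int opts,
                           int link_owner_known, long link_uid,
                           int target_owner_known, long target_uid)
{
    if (opts & SKETCH_OPT_SYM_LINKS) {
        return 1;       /* all symlinks allowed, no owner test needed */
    }
    if (!(opts & SKETCH_OPT_SYM_OWNER)) {
        return 0;       /* neither option set: symlinks are not allowed */
    }
    if (!link_owner_known || !target_owner_known) {
        return 0;       /* the owner test needs both owners */
    }
    return link_uid == target_uid;
}
```

/* The decision is kept free of any filesystem calls so the caller decides
 * how the owners are obtained (for example via lstat() on the link and
 * stat() on its target).
 */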
/* Give back the target */
/*
 * As we walk the directory configuration, the merged config won't
 * be 'rooted' to a specific vhost until the very end of the merge.
 *
 * We need a very fast mini-merge to a real, vhost-rooted merge
 * of core.opts and core.override, the only options tested within
 *
 * See core.c::merge_core_dir_configs() for explanation.
 */
/*****************************************************************
 *
 * Getting and checking directory configuration. Also checks the
 * FollowSymlinks and FollowSymOwner stuff, since this is really the
 * only place that can happen (barring a new mid_dir_walk callout).
 *
 * We can't do it as an access_checker module function which gets
 * called with the final per_dir_config, since we could have a directory
 * with FollowSymLinks disabled, which contains a symlink to another
 * with a .htaccess file which turns FollowSymLinks back on --- and
 * access in such a case must be denied. So, whatever it is that
 * checks FollowSymLinks needs to know the state of the options as
 * they change, all the way down.
 */
/* XXX: Better (faster) tests needed!!!
 *
 * "OK" as a response to a real problem is not _OK_, but to allow broken
 * modules to proceed, we will permit the not-a-path filename to pass the
 * following two tests. This behavior may be revoked in future versions
 * of Apache. We still must catch it later if it's heading for the core
 * handler. Leave INFO notes here for module debugging.
 */
"Module bug? Request filename is missing for URI %s",
/* Canonicalize the file path without resolving filename case or aliases
 * so we can begin by checking the cache for a recent directory walk.
 * This call will ensure we have an absolute path in the same pass.
 */
"Module bug? Request filename path %s is invalid or "
"not absolute for uri %s",
/* XXX Notice that this forces path_info to be canonical. That might
 * not be desired by all apps. However, some of those same apps likely
 * have significant security holes.
 */
/* If this is not a dirent subrequest with a preconstructed
 * r->finfo value, then we can simply stat the filename to
 * save burning mega-cycles with unneeded stats - if this is
 * an exact file match. We don't care about failure... we
 * will stat by component failing this meager attempt.
 *
 * It would be nice to distinguish APR_ENOENT from other
 * types of failure, such as APR_ENOTDIR. We can do something
 * with APR_ENOENT, knowing that the path is good.
 */
/*
 * a regular file but we have '/' at the end of the name;
 *
 * other OSs will return APR_ENOTDIR for that situation;
 *
 * handle it the same everywhere by simulating a failure
 * if it looks like a directory but really isn't
 *
 * Also reset if the stat failed, just for safety.
 */
/* If we have a file that already matches the path of r->filename,
 * and the vhost's list of directory sections hasn't changed,
 * we can skip rewalking the directory_walk entries.
 */
/* Well this looks really familiar! If our end-result (per_dir_result)
 * didn't change, we have absolutely nothing to do :)
 * Otherwise (as is the case with most dir_merged requests)
 * we must merge our dir_conf_merged onto this new r->per_dir_config.
 */
/* We start now_merged from NULL since we want to build
 * a locations list that can be merged to any vhost.
 */
/* Invariant: from the first time filename_len is set until
 * it goes out of scope, filename_len==strlen(r->filename)
 */
/*
 * We must play our own mini-merge game here, for the few
 * running dir_config values we care about within dir_walk.
 * We didn't start the merge from r->per_dir_config, so we
 * accumulate opts and override as we merge, from the globals.
 */
/* Set aside path_info to merge back onto path_info later.
 * If r->filename is a directory, we must remerge the path_info,
 * before we continue! [Directories cannot, by definition, have
 * path info. Either the next segment is not-found, or a file.]
 *
 * r->path_info tracks the unconsumed source path.
 * r->filename tracks the path as we process it
 */
"dir_walk error, path_info %s is not relative "
"to the filename path %s for uri %s",
/*
 * Now build r->filename component by component, starting
 * with the root (on Unix, simply "/"). We will make a huge
 * assumption here for efficiency, that any canonical path
 * already given included a canonical root.
 */
/*
 * Bad assumption above? If the root's length is longer
 * than the canonical length, then it cannot be trusted as
 * a truename. So try again, this time more seriously.
 */
#else /* ndef CASE_BLIND_FILESYSTEM, really this simple for Unix today; */
"dir_walk error, could not determine the root "
"path of filename %s%s for uri %s",
/* Working space for terminating null and an extra / is required.
 */
/*
 * seg keeps track of which segment we've copied.
 * sec_idx keeps track of which section we're on, since sections are
 * ordered by number of segments. See core_reorder_directories
 * startseg tells us how many segments describe the root path
 * e.g. the complete path "//host/foo/" to a UNC share (4)
 */
/*
 * Go down the directory hierarchy. Where we have to check for
 * symlinks, do so. Where a .htaccess file has permission to
 * override anything, try to find one.
 */
/* We have no trailing slash, but we sure would appreciate one.
 * However, we don't want to append a / our first time through.
 */
/* Begin *this* level by looking for matching <Directory> sections
 * from the server config.
 */
/* No more possible matches for this many segments?
 */
/* We will never skip '0' element components, e.g. plain old
 * <Directory >, and <Directory "/"> are classified as zero
 *
 * Otherwise, skip over the mismatches.
 */
/* If we haven't continue'd above, we have a match.
 * Calculate our full-context core opts & override.
 */
/* If we merged this same section last time, reuse it
 */
/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */
/* If .htaccess files are enabled, check for one, provided we
 * have reached a real path.
 */
do {  /* Not really a loop, just a break'able code block */
    /* No htaccess in an incomplete root path,
     */
    /* If we are still here, we found our htaccess.
     * Calculate our full-context core opts & override.
     */
    /* If we merged this same htaccess last time, reuse it...
     * this wouldn't work except that we cache the htaccess
     * sections for the lifetime of the request, so we match
     * the same conf. Good planning (no, pure luck ;)
     */
    /* We fell out of sync. This is our own copy of walked,
     * so truncate the remaining matches and reset
     */
} while (0); /* Only one htaccess, not a real loop */
/* That temporary trailing slash was useful, now drop it.
 */
/* Time for all good things to come to an end?
 */
/* Now it's time for the next segment...
 * We will assume the next element is an end node, and fix it up
 */
/* If nothing remained but a '/' string, we are finished
 * XXX: NO WE ARE NOT!!! Now process this puppy!!! */
/*
 * If...we knew r->filename was a file, and
 * if...we have strict (case-sensitive) filenames, or
 *      we know the canonical_filename matches to _this_ name, and
 * if...we have allowed symlinks
 * skip the lstat and dummy up an APR_DIR value for thisinfo.
 */
/* We choose apr_stat with flag APR_FINFO_LINK here, rather than
 * plain apr_stat, so that we capture this path object rather than
 * its target. We will replace the info with our target's info
 * below. We especially want the name of this 'link' object, not
 * the name of its target, if we are fixing the filename
 */
/* Nothing? That could be nice. But our directory
 */
"access to %s denied", r->uri);
/* If we hit ENOTDIR, we must have over-optimized, deny
 * rather than assume not found.
 */
"access to %s failed", r->uri);
/* Fix up the path now if we have a name, and they don't agree
 *
 * redirect is required here? We need to walk the URI and
 * filename in tandem to properly correlate these.
 */
/* Is this a possibly acceptable symlink?
 */
"Symbolic link not allowed: %s",
/* Ok, we are done with the link's info, test the real target
 */
/* That was fun, nothing left for us here
 */
"Forbidden: %s doesn't point to "
/* If we have _not_ optimized, this is the time to recover
 */
/* Now splice the saved path_info back onto any new path_info
 */
/*
 * Now we'll deal with the regexes, note we pick up sec_idx
 * where we left off (we gave up after we hit entry_core->r)
 */
/* If we haven't already continue'd above, we have a match.
 * Calculate our full-context core opts & override.
 */
/* If we merged this same section last time, reuse it
 */
/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */
/* Whoops - everything matched in sequence, but the original walk
 * found some additional matches. Truncate them.
 */
/* It seems this shouldn't be needed anymore. We translated the
 x symlink above into a real resource, and should have died up there.
 x Even if we keep this, it needs more thought (maybe an r->file_is_symlink)
 x perhaps it should actually happen in file_walk, so we catch more
 x obscure cases in autoindex subrequests, etc.
 x
 x    * Symlink permissions are determined by the parent. If the request is
 x    * for a directory then applying the symlink test here would use the
 x    * permissions of the directory as opposed to its parent. Consider a
 x    * symlink pointing to a dir with a .htaccess disallowing symlinks. If
 x    * you access /symlink (or /symlink/) you would get a 403 without this
 x    * you would *not* get the 403.
 x
 x   if (r->finfo.filetype != APR_DIR
 x       && (res = resolve_symlink(r->filename, r->info, ap_allow_options(r),
 x       ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
 x                     "Symbolic link not allowed: %s", r->filename);
 */
/* Save future sub-requestors much angst in processing
 * this subrequest. If dir_walk couldn't canonicalize
 * the file path, nothing can.
 */
/* Merge our cache->dir_conf_merged construct with the r->per_dir_configs,
 * and note the end result to (potentially) skip this step next time.
 */
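/* Illustration only: the location walk that follows matches plain-string
 * <Location > patterns with a prefix comparison plus a boundary rule: if
 * the pattern is not slash terminated, the uri must have a slash (or end)
 * right after the matched prefix. The function below is a minimal sketch
 * of that rule; its name and signature are assumptions for this example,
 * and it covers neither the regex nor the fnmatch case.
 */

```c
#include <string.h>

/* Returns 1 when `uri` falls under the plain-string `pattern`, else 0. */
int sketch_location_prefix_match(const char *pattern, const char *uri)
{
    size_t len = strlen(pattern);

    if (strncmp(pattern, uri, len) != 0) {
        return 0;               /* uri does not start with the pattern */
    }
    /* After a successful strncmp, uri holds at least len characters, so
     * uri[len] is either the next character or the terminating null.
     */
    if (len > 0
        && pattern[len - 1] != '/'
        && uri[len] != '/'
        && uri[len] != '\0') {
        return 0;               /* e.g. pattern "/foo" vs uri "/foobar" */
    }
    return 1;
}
```

/* With this rule "/foo" covers "/foo" and "/foo/bar" but not "/foobar",
 * while "/foo/" covers "/foo/bar" but not "/foo".
 */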
/* No tricks here, there are no <Locations > to parse in this vhost.
 * We won't destroy the cache, just in case _this_ redirect is later
 * redirected again to a vhost with <Location > blocks to optimize.
 */
/* Location and LocationMatch differ on their behaviour w.r.t. multiple
 * slashes. Location matches multiple slashes with a single slash,
 * LocationMatch doesn't. An exception, for backwards brokenness is
 * absoluteURIs... in which case neither match multiple slashes.
 */
/* If we have a cache->cached location that matches r->uri,
 * and the vhost's list of locations hasn't changed, we can skip
 * rewalking the location_walk entries.
 */
/* Well this looks really familiar! If our end-result (per_dir_result)
 * didn't change, we have absolutely nothing to do :)
 * Otherwise (as is the case with most dir_merged requests)
 * we must merge our dir_conf_merged onto this new r->per_dir_config.
 */
/* We start now_merged from NULL since we want to build
 * a locations list that can be merged to any vhost.
 */
/* Go through the location entries, and check for matches.
 * We apply the directive sections in given order, we should
 * really try them with the most general first.
 */
/* ### const strlen can be optimized in location config parsing */
/* Test the regex, fnmatch or string as appropriate.
 * If it's a strcmp, and the <Location > pattern was
 * not slash terminated, then this uri must be slash
 * terminated (or at the end of the string) to match.
 */
/* If we merged this same section last time, reuse it
 */
/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */
/* Whoops - everything matched in sequence, but the original walk
 * found some additional matches. Truncate them.
 */
/* Merge our cache->dir_conf_merged construct with the r->per_dir_configs,
 * and note the end result to (potentially) skip this step next time.
 */
/* To allow broken modules to proceed, we allow missing filenames to pass.
 * We will catch it later if it's heading for the core handler.
 * directory_walk already posted an INFO note for module debugging.
 */
/* No tricks here, there are just no <Files > to parse in this context.
 * We won't destroy the cache, just in case _this_ redirect is later
 * redirected again to a context containing the same or similar <Files >.
 */
/* Get the basename .. and copy for the cache just
 * in case r->filename is munged by another module
 */
/* If we have a cache->cached file name that matches test_file,
 * and the directory's list of file sections hasn't changed, we
 * can skip rewalking the file_walk entries.
 */
/* Well this looks really familiar! If our end-result (per_dir_result)
 * didn't change, we have absolutely nothing to do :)
 * Otherwise (as is the case with most dir_merged requests)
 * we must merge our dir_conf_merged onto this new r->per_dir_config.
 */
/* We start now_merged from NULL since we want to build
 * a file section list that can be merged to any dir_walk.
 */
/* Go through the location entries, and check for matches.
 * We apply the directive sections in given order, we should
 * really try them with the most general first.
 */
/* If we merged this same section last time, reuse it
 */
/* We fell out of sync. This is our own copy of walked,
 * so truncate the remaining matches and reset remaining.
 */
/* Whoops - everything matched in sequence, but the original walk
 * found some additional matches. Truncate them.
 */
/* Merge our cache->dir_conf_merged construct with the r->per_dir_configs,
 * and note the end result to (potentially) skip this step next time.
 */
/*****************************************************************
 *
 * The sub_request mechanism.
 *
 * Fns to look up a relative URI from, e.g., a map file or SSI document.
 * These do all access checks, etc., but don't actually run the transaction
 * ... use run_sub_req below for that. Also, be sure to use destroy_sub_req
 * as appropriate if you're likely to be creating more than a few of these.
 * (An early Apache version didn't destroy the sub_reqs used in directory
 * indexing. The result, when indexing a directory with 800-odd files in
 * it, was massively excessive storage allocation).
 *
 * Note more manipulation of protocol-specific vars in the request
 */
/* Start a clean config from this subrequest's vhost. Optimization in
 * config blocks of the subrequest match the parent request, no merges
 * will actually occur (and generally a minimal number of merges are
 * required, even if the parent and subrequest aren't quite identical.)
 */
/* make a copy of the allowed-methods list */
/* start with the same set of output filters */
/* while there are no input filters for a subrequest, we will
 * try to insert some, so if we don't have valid data, the code
 */
/* If NULL - we are expecting to be internal_fast_redirect'ed
 * to this subrequest - or this request will never be invoked.
 * Ignore the original request filter stack entirely, and
 * drill the input and output stacks back to the connection.
 */
/* no input filters for a subrequest */
/* We have to run this after we fill in sub req vars,
 * or the r->main pointer won't be setup
 */
/* Is there a require line configured for the type of *this* req? */
/* would be nicer to pass "method" to ap_set_sub_req_protocol */
/* We cannot return NULL without violating the API. So just turn this
 * subrequest into a 500 to indicate the failure. */
/*
 * If the content can be served by the quick_handler, we can
 * safely bypass request_internal processing.
 */
/* Special case: we are looking at a relative lookup in the same directory.
 * This is 100% safe, since dirent->name just came from the filesystem.
 */
/* strip path_info off the end of the uri to keep it in sync
 * with r->filename, which has already been stripped by directory_walk,
 * merge the dirent->name, and then, if the caller wants us to remerge
 * the original path info, do so. Note we never fix the path_info back
 * to r->filename, since dir_walk would do so (but we don't expect it
 * to happen in the usual cases)
 */
/* XXX This is now less relevant; we will do a full location walk
 * these days for this case. Preserve the apr_stat results, and
 * perhaps we also tag that symlinks were tested and/or found for
 */
/*
 * apr_dir_read isn't very complete on this platform, so
 * we need another apr_stat (with or without APR_FINFO_LINK
 * depending on whether we allow all symlinks here.) If this
 * is an APR_LNK that resolves to an APR_DIR, then we will rerun
 * everything anyways... this should be safe.
 */
/*
 * Resolve this symlink. We should tie this back to dir_walk's cache
 */
/* ap_make_full_path overallocated the buffers
 * by one character to help us out here.
 */
/* fill in parsed_uri values
 */
/* We cannot return NULL without violating the API. So just turn this
 * subrequest into a 500. */
/* Translate r->filename, if it was canonical, it stays canonical
 */
/*
 * Check for a special case... if there are no '/' characters in new_file
 * at all, and the path was the same, then we are looking at a relative
 * lookup in the same directory. Fixup the URI to match.
 */
/* XXX: @@@: What should be done with the parsed_uri values?
 * We would be better off stripping down to the 'common' elements
 * of the path, then reassembling the URI as best as we can.
 */
/*
 * XXX: this should be set properly like it is in the same-dir case
 * but it's actually sometimes impossible to do it... because the
 * file may not have a uri associated with it -djg
 */
/* We cannot return NULL without violating the API. So just turn this
 * subrequest into a 500. */
/* Run the quick handler if the subrequest is not a dirent or file
 */
/*
 * Function to set the r->mtime field to the specified value if it's later
 * than what's already there.
 */
/*
 * Is it the initial main request, which we only get *once* per HTTP request?
 */
return (r->main == NULL)     /* otherwise, this is a sub-request */
       && (r->prev == NULL); /* otherwise, this is an internal redirect */
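/* Illustration only: the two small helpers described just above, written
 * against a stand-in structure so the example compiles on its own. The
 * struct, its field types and the sketch_ function names are assumptions
 * for this example; only the main/prev test and the "keep the later
 * mtime" rule come from the comments and code above.
 */

```c
#include <stddef.h>

/* Hypothetical stand-in for the request record; only the fields used. */
struct sketch_request {
    struct sketch_request *main;  /* non-NULL for a sub-request */
    struct sketch_request *prev;  /* non-NULL after an internal redirect */
    long mtime;                   /* latest modification time seen so far */
};

/* Returns 1 only for the initial main request. */
int sketch_is_initial_req(const struct sketch_request *r)
{
    return (r->main == NULL)      /* otherwise, this is a sub-request */
        && (r->prev == NULL);     /* otherwise, this is an internal redirect */
}

/* Raises r->mtime to dependency_mtime if that value is later. */
void sketch_update_mtime(struct sketch_request *r, long dependency_mtime)
{
    if (r->mtime < dependency_mtime) {
        r->mtime = dependency_mtime;
    }
}
```

/* A sub-request points back at its parent through main, and a redirected
 * request points at its predecessor through prev, so either pointer being
 * set is enough to rule out the initial request.
 */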