request.html revision 7d9f2bf92f3b96b3d651365013e5da845849571b
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org" />
<title>Request Processing in Apache 2.0</title>
</head>
<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
vlink="#000080" alink="#FF0000">
<!--#include virtual="header.html" -->
<h1>Request Processing in Apache 2.0</h1>
<p>Warning - this is a first (fast) draft that needs further
revision!</p>
<p>Several changes in Apache 2.0 affect the internal request
processing mechanics. Module authors need to be aware of these
changes so they may take advantage of the optimizations and
security enhancements.</p>
<p>The first major change is to the subrequest and redirect
mechanisms. There were a number of different code paths in
Apache 1.3 to attempt to optimize subrequest or redirect
behavior. As patches were introduced to 2.0, these
optimizations (and the server behavior) were quickly broken due
to this duplication of code. All duplicate code has been folded
back into <code>ap_process_internal_request()</code> to prevent
the code from falling out of sync again.</p>
<p>This means that much of the existing code was 'unoptimized'.
It is the Apache HTTP Project's first goal to create a robust
and correct implementation of the HTTP server RFC. Additional
goals include security, scalability and optimization. New
methods were sought to optimize the server (beyond the
performance of Apache 1.3) without introducing fragile or
insecure code.</p>
<h2>The Request Processing Cycle</h2>
<p>All requests pass through
<code>ap_process_request_internal()</code> in request.c,
including subrequests and redirects. If a module doesn't pass
generated requests through this code, the author is cautioned
that the module may be broken by future changes to request
processing.</p>
<p>To streamline requests, the module author can take advantage
of the hooks offered to drop out of the request cycle early, or
to bypass core Apache hooks which are irrelevant (and costly in
terms of CPU.)</p>
<h2>The Request Parsing Phase</h2>
<h3>Unescapes the URL</h3>
<p>The request's parsed_uri path is unescaped, once and only
once, at the beginning of internal request processing.</p>
<p>This step is bypassed if the proxyreq flag is set, or the
parsed_uri.path element is unset. The module has no further
control of this one-time unescape operation, either failing to
unescape or multiply unescaping the URL leads to security
reprecussions.</p>
<h3>Strips Parent and This Elements from the URI</h3>
<p>All <code>/../</code> and <code>/./</code> elements are
removed by <code>ap_getparents()</code>. This helps to ensure
the path is (nearly) absolute before the request processing
continues.</p>
<p>This step cannot be bypassed.</p>
<h3>Initial URI Location Walk</h3>
<p>Every request is subject to an
<code>ap_location_walk()</code> call. This ensures that
&lt;Location &gt; sections are consistently enforced for all
requests. If the request is an internal redirect or a
sub-request, it may borrow some or all of the processing from
the previous or parent request's ap_location_walk, so this step
is generally very efficient after processing the main
request.</p>
<h3>Hook: translate_name</h3>
<p>Modules can determine the file name, or alter the given URI
in this step. For example, mod_vhost_alias will translate the
URI's path into the configured virtual host, mod_alias will
translate the path to an alias path, and if the request falls
back on the core, the DocumentRoot is prepended to the request
resource.</p>
<p>If all modules DECLINE this phase, an error 500 is returned
to the browser, and a "couldn't translate name" error is logged
automatically.</p>
<h3>Hook: map_to_storage</h3>
<p>After the file or correct URI was determined, the
appropriate per-dir configurations are merged together. For
example, mod_proxy compares and merges the appropriate
&lt;Proxy &gt; sections. If the URI is nothing more than a
local (non-proxy) TRACE request, the core handles the request
and returns DONE. If no module answers this hook with OK or
DONE, the core will run the request filename against the
&lt;Directory &gt; and &lt;Files &gt; sections. If the request
'filename' isn't an absolute, legal filename, a note is set for
later termination.</p>
<h3>Initial URI Location Walk</h3>
<p>Every request is hardened by a second
<code>ap_location_walk()</code> call. This reassures that a
translated request is still subjected to the configured
&lt;Location &gt; sections. The request again borrows some or
all of the processing from it's previous location_walk above,
so this step is almost always very efficient unless the
translated URI mapped to a substantially different path or
Virtual Host.</p>
<h3>Hook: header_parser</h3>
<p>The main request then parses the client's headers. This
prepares the remaining request processing steps to better serve
the client's request.</p>
<h2>The Security Phase</h2>
<p>Needs Documentation. Code is;</p>
<pre>
switch (ap_satisfies(r)) {
case SATISFY_ALL:
case SATISFY_NOSPEC:
if ((access_status = ap_run_access_checker(r)) != 0) {
return decl_die(access_status, "check access", r);
}
if (ap_some_auth_required(r)) {
if (((access_status = ap_run_check_user_id(r)) != 0) || !ap_auth_type(r)) {
return decl_die(access_status, ap_auth_type(r)
? "check user. No user file?"
: "perform authentication. AuthType not set!", r);
}
if (((access_status = ap_run_auth_checker(r)) != 0) || !ap_auth_type(r)) {
return decl_die(access_status, ap_auth_type(r)
? "check access. No groups file?"
: "perform authentication. AuthType not set!", r);
}
}
break;
case SATISFY_ANY:
if (((access_status = ap_run_access_checker(r)) != 0) || !ap_auth_type(r)) {
if (!ap_some_auth_required(r)) {
return decl_die(access_status, ap_auth_type(r)
? "check access"
: "perform authentication. AuthType not set!", r);
}
if (((access_status = ap_run_check_user_id(r)) != 0) || !ap_auth_type(r)) {
return decl_die(access_status, ap_auth_type(r)
? "check user. No user file?"
: "perform authentication. AuthType not set!", r);
}
if (((access_status = ap_run_auth_checker(r)) != 0) || !ap_auth_type(r)) {
return decl_die(access_status, ap_auth_type(r)
? "check access. No groups file?"
: "perform authentication. AuthType not set!", r);
}
}
break;
}
</pre>
<h2>The Preparation Phase</h2>
<h3>Hook: type_checker</h3>
<p>The modules have an opportunity to test the URI or filename
against the target resource, and set mime information for the
request. Both mod_mime and mod_mime_magic use this phase to
compare the file name or contents against the administrator's
configuration and set the content type, language, character set
and request handler. Some modules may set up their filters or
other request handling parameters at this time.</p>
<p>If all modules DECLINE this phase, an error 500 is returned
to the browser, and a "couldn't find types" error is logged
automatically.</p>
<h3>Hook: fixups</h3>
<p>Many modules are 'trounced' by some phase above. The fixups
phase is used by modules to 'reassert' their ownership or force
the request's fields to their appropriate values. It isn't
always the cleanest mechanism, but occasionally it's the only
option.</p>
<h3>Hook: insert_filter</h3>
<p>Modules that transform the content in some way can insert
their values and override existing filters, such that if the
user configured a more advanced filter out-of-order, then the
module can move it's order as need be.</p>
<h2>The Handler Phase</h2>
<p>This phase is <strong><em>not</em></strong> part of the
processing in <code>ap_process_request_internal()</code>. Many
modules prepare one or more subrequests prior to creating any
content at all. After the core, or a module calls
<code>ap_process_request_internal()</code> it then calls
<code>ap_invoke_handler()</code> to generate the request.</p>
<h3>Hook: handler</h3>
<p>The module finally has a chance to serve the request in it's
handler hook. Note that not every prepared request is sent to
the handler hook. Many modules, such as mod_autoindex, will
create subrequests for a given URI, and then never serve the
subrequest, but simply lists it for the user. Remember not to
put required teardown from the hooks above into this module,
but register pool cleanups against the request pool to free
resources as required.</p>
<!--#include virtual="footer.html" -->
</body>
</html>