<html><head>
<title>Apache API notes</title>
</head>
<body>
<!--#include virtual="header.html" -->
<h1>Apache API notes</h1>

These are some notes on the Apache API and the data structures you
have to deal with, etc. They are not yet nearly complete, but
hopefully, they will help you get your bearings. Keep in mind that
the API is still subject to change as we gain experience with it.
(See the TODO file for what <em>might</em> be coming.) However,
it should be easy to adapt modules to any changes that are made.
(We have more modules to adapt than you do.)
<p>

A few notes on general pedagogical style here. In the interest of
conciseness, all structure declarations here are incomplete --- the
real ones have more slots that I'm not telling you about. For the
most part, these are reserved to one component of the server core or
another, and should be altered by modules with caution. However, in
some cases, they really are things I just haven't gotten around to
yet. Welcome to the bleeding edge.<p>

Finally, here's an outline, to give you some bare idea of what's
coming up, and in what order:

<ul>
<li> <a href="#basics">Basic concepts.</a>
<menu>
 <li> <a href="#HMR">Handlers, Modules, and Requests</a>
 <li> <a href="#moduletour">A brief tour of a module</a>
</menu>
<li> <a href="#handlers">How handlers work</a>
<menu>
 <li> <a href="#req_tour">A brief tour of the <code>request_rec</code></a>
 <li> <a href="#req_orig">Where request_rec structures come from</a>
 <li> <a href="#req_return">Handling requests, declining, and returning error codes</a>
 <li> <a href="#resp_handlers">Special considerations for response handlers</a>
 <li> <a href="#auth_handlers">Special considerations for authentication handlers</a>
 <li> <a href="#log_handlers">Special considerations for logging handlers</a>
</menu>
<li> <a href="#pools">Resource allocation and resource pools</a>
<li> <a href="#config">Configuration, commands and the like</a>
<menu>
 <li> <a href="#per-dir">Per-directory configuration structures</a>
 <li> <a href="#commands">Command handling</a>
 <li> <a href="#servconf">Side notes --- per-server configuration, virtual servers, etc.</a>
</menu>
</ul>

<h2><a name="basics">Basic concepts.</a></h2>

We begin with an overview of the basic concepts behind the
API, and how they are manifested in the code.

<h3><a name="HMR">Handlers, Modules, and Requests</a></h3>

Apache breaks down request handling into a series of steps, more or
less the same way the Netscape server API does (although this API has
a few more stages than NetSite does, as hooks for stuff I thought
might be useful in the future). These are:

<ul>
 <li> URI -&gt; Filename translation
 <li> Auth ID checking [is the user who they say they are?]
 <li> Auth access checking [is the user authorized <em>here</em>?]
 <li> Access checking other than auth
 <li> Determining MIME type of the object requested
 <li> `Fixups' --- there aren't any of these yet, but the phase is
 intended as a hook for possible extensions like
 <code>SetEnv</code>, which don't really fit well elsewhere.
 <li> Actually sending a response back to the client.
 <li> Logging the request
</ul>

These phases are handled by looking at each of a succession of
<em>modules</em>, looking to see if each of them has a handler for the
phase, and attempting to invoke it if so. The handler can typically do
one of three things:

<ul>
 <li> <em>Handle</em> the request, and indicate that it has done so
 by returning the magic constant <code>OK</code>.
 <li> <em>Decline</em> to handle the request, by returning the magic
 integer constant <code>DECLINED</code>. In this case, the
 server behaves in all respects as if the handler simply hadn't
 been there.
 <li> Signal an error, by returning one of the HTTP error codes.
 This terminates normal handling of the request, although an
 ErrorDocument may be invoked to try to mop up, and it will be
 logged in any case.
</ul>

Most phases are terminated by the first module that handles them;
however, for logging, `fixups', and non-access authentication
checking, all handlers always run (barring an error). Also, the
response phase is unique in that modules may declare multiple handlers
for it, via a dispatch table keyed on the MIME type of the requested
object. Modules may declare a response-phase handler which can handle
<em>any</em> request, by giving it the key <code>*/*</code> (i.e., a
wildcard MIME type specification). However, wildcard handlers are
only invoked if the server has already tried and failed to find a more
specific response handler for the MIME type of the requested object
(either none existed, or they all declined).<p>

The handlers themselves are functions of one argument (a
<code>request_rec</code> structure; <em>vide infra</em>), which return an
integer, as above.<p>
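To make the calling convention concrete, here is a minimal sketch of how
one phase might be run for the common case where the first handler that
does not decline terminates the phase. It is not the server's actual
dispatch code. The cut-down <code>request_rec</code>, the
<code>handler_fn</code> type, the numeric values given to <code>OK</code>
and <code>DECLINED</code>, and the names <code>run_phase</code>,
<code>translate_nothing</code> and <code>translate_fixed</code> are all
stand-ins invented for this example.<p>

```c
#include <stddef.h>

/* Stand-in values and types for illustration only; the real
 * definitions live in the server's own headers. */
#define OK        0
#define DECLINED -1

typedef struct request_rec {
    const char *uri;        /* as supplied by the client */
    const char *filename;   /* filled in by a translation handler */
} request_rec;

typedef int (*handler_fn)(request_rec *);

/* Hypothetical translation handler: has no opinion, so it declines. */
int translate_nothing(request_rec *r)
{
    (void)r;
    return DECLINED;
}

/* Hypothetical translation handler: maps every URI to one file. */
int translate_fixed(request_rec *r)
{
    r->filename = "/usr/local/htdocs/index.html";
    return OK;
}

/* Run one phase: try each module's handler in turn and stop at the
 * first one that does not decline.  Returns OK, an HTTP error code,
 * or DECLINED if no handler took the request. */
int run_phase(handler_fn *handlers, size_t n, request_rec *r)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int result;

        if (handlers[i] == NULL)
            continue;           /* module has no handler for this phase */
        result = handlers[i](r);
        if (result != DECLINED)
            return result;
    }
    return DECLINED;
}
```

Phases in which every handler always runs (logging, `fixups', and so on)
would need a loop that keeps going after <code>OK</code> and stops only
on an error; that variant is not shown here.<p>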

<h3><a name="moduletour">A brief tour of a module</a></h3>

At this point, we need to explain the structure of a module. Our
candidate will be one of the messier ones, the CGI module --- this
handles both CGI scripts and the <code>ScriptAlias</code> config file
command. It's actually a great deal more complicated than most
modules, but if we're going to have only one example, it might as well
be the one with its fingers in every place.<p>

Let's begin with handlers. In order to handle the CGI scripts, the
module declares a response handler for them. Because of
<code>ScriptAlias</code>, it also has handlers for the name
translation phase (to recognize <code>ScriptAlias</code>ed URIs), and the
type-checking phase (any <code>ScriptAlias</code>ed request is typed
as a CGI script).<p>

The module needs to maintain some per (virtual)
server information, namely, the <code>ScriptAlias</code>es in effect;
the module structure therefore contains pointers to a function which
builds these structures, and to another which combines two of them (in
case the main server and a virtual server both have
<code>ScriptAlias</code>es declared).<p>

Finally, this module contains code to handle the
<code>ScriptAlias</code> command itself. This particular module only
declares one command, but there could be more, so modules have
<em>command tables</em> which declare their commands, and describe
where they are permitted, and how they are to be invoked.<p>

A final note on the declared types of the arguments of some of these
commands: a <code>pool</code> is a pointer to a <em>resource pool</em>
structure; these are used by the server to keep track of the memory
which has been allocated, files opened, etc., either to service a
particular request, or to handle the process of configuring itself.
That way, when the request is over (or, for the configuration pool,
when the server is restarting), the memory can be freed, and the files
closed, <i>en masse</i>, without anyone having to write explicit code to
track them all down and dispose of them. Also, a
<code>cmd_parms</code> structure contains various information about
the config file being read, and other status information, which is
sometimes of use to the function which processes a config-file command
(such as <code>ScriptAlias</code>).

With no further ado, the module itself:

<pre>
/* Declarations of handlers. */

int translate_scriptalias (request_rec *);
int type_scriptalias (request_rec *);
int cgi_handler (request_rec *);

/* Subsidiary dispatch table for response-phase handlers, by MIME type */

handler_rec cgi_handlers[] = {
{ "application/x-httpd-cgi", cgi_handler },
{ NULL }
};

/* Declarations of routines to manipulate the module's configuration
 * info. Note that these are returned, and passed in, as void *'s;
 * the server core keeps track of them, but it doesn't, and can't,
 * know their internal structure.
 */

void *make_cgi_server_config (pool *);
void *merge_cgi_server_config (pool *, void *, void *);

/* Declarations of routines to handle config-file commands */

extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake,
                          char *real);

command_rec cgi_cmds[] = {
{ "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
    "a fakename and a realname"},
{ NULL }
};

module cgi_module = {
   STANDARD_MODULE_STUFF,
   NULL,                     /* initializer */
   NULL,                     /* dir config creator */
   NULL,                     /* dir merger --- default is to override */
   make_cgi_server_config,   /* server config */
   merge_cgi_server_config,  /* merge server config */
   cgi_cmds,                 /* command table */
   cgi_handlers,             /* handlers */
   translate_scriptalias,    /* filename translation */
   NULL,                     /* check_user_id */
   NULL,                     /* check auth */
   NULL,                     /* check access */
   type_scriptalias,         /* type_checker */
   NULL,                     /* fixups */
   NULL,                     /* logger */
   NULL                      /* header parser */
};
</pre>

<h2><a name="handlers">How handlers work</a></h2>

The sole argument to handlers is a <code>request_rec</code> structure.
This structure describes a particular request which has been made to
the server, on behalf of a client. In most cases, each connection to
the client generates only one <code>request_rec</code> structure.<p>

<h3><a name="req_tour">A brief tour of the <code>request_rec</code></a></h3>

The <code>request_rec</code> contains a pointer to a resource pool
which will be cleared when the server is finished handling the
request, pointers to structures containing per-server and per-connection
information, and, most importantly, information on the request itself.<p>

The most important such information is a small set of character
strings describing attributes of the object being requested, including
its URI, filename, content-type and content-encoding (the filename is
filled in by the translation handler which handles the request, and the
content type and encoding by the type-check handler).<p>

Other commonly used data items are tables giving the MIME headers in
the client's original request, MIME headers to be sent back with the
response (which modules can add to at will), and environment variables
for any subprocesses which are spawned off in the course of servicing
the request. These tables are manipulated using the
<code>table_get</code> and <code>table_set</code> routines.<p>
<BLOCKQUOTE>
 Note that the <SAMP>Content-type</SAMP> header value <EM>cannot</EM> be
 set by module content-handlers using the <SAMP>table_*()</SAMP>
 routines. Rather, it is set by pointing the <SAMP>content_type</SAMP>
 field in the <SAMP>request_rec</SAMP> structure to an appropriate
 string. <EM>E.g.</EM>,
 <PRE>
  r-&gt;content_type = "text/html";
 </PRE>
</BLOCKQUOTE>
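As a rough illustration of the two mechanisms side by side, the sketch
below shows a handler reading a request header, adding a response header
through the table routines, and setting the content type through the
<code>content_type</code> field. The <code>table</code> type and the
bodies of <code>table_get</code> and <code>table_set</code> here are toy
re-implementations written for this example (the real ones are
pool-allocated and belong to the server); <code>hello_handler</code>,
the <code>X-Seen-Agent</code> header, and the cut-down
<code>request_rec</code> are likewise hypothetical.<p>

```c
#include <ctype.h>
#include <stddef.h>

#define OK 0                /* stand-in value for this sketch */
#define MAX_ENTRIES 16

/* Toy stand-in for the server's table type: a small fixed-size list
 * of key/value pairs. */
typedef struct {
    const char *key[MAX_ENTRIES];
    const char *val[MAX_ENTRIES];
    int n;
} table;

/* Header names are compared without regard to case. */
static int same_key(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;        /* true only if both strings ended together */
}

const char *table_get(const table *t, const char *key)
{
    int i;

    for (i = 0; i < t->n; i++)
        if (same_key(t->key[i], key))
            return t->val[i];
    return NULL;
}

void table_set(table *t, const char *key, const char *val)
{
    int i;

    for (i = 0; i < t->n; i++) {
        if (same_key(t->key[i], key)) {
            t->val[i] = val;        /* replace an existing entry */
            return;
        }
    }
    if (t->n < MAX_ENTRIES) {       /* toy table: drops entries when full */
        t->key[t->n] = key;
        t->val[t->n] = val;
        t->n++;
    }
}

/* Only the fields this example touches. */
typedef struct request_rec {
    const char *content_type;
    table *headers_in;
    table *headers_out;
} request_rec;

/* Hypothetical response handler. */
int hello_handler(request_rec *r)
{
    const char *agent = table_get(r->headers_in, "User-Agent");

    r->content_type = "text/html";  /* set the field, not a table entry */
    table_set(r->headers_out, "X-Seen-Agent", agent ? agent : "unknown");
    return OK;
}
```

Note that nothing in this sketch ever puts a <code>Content-type</code>
entry into <code>headers_out</code>; the content type travels only in
the <code>content_type</code> field.<p>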
Finally, there are pointers to two data structures which, in turn,
point to per-module configuration structures. Specifically, these
hold pointers to the data structures which the module has built to
describe the way it has been configured to operate in a given
directory (via <code>.htaccess</code> files or
<code>&lt;Directory&gt;</code> sections), and to private data it has
built in the course of servicing the request (so modules' handlers for
one phase can pass `notes' to their handlers for other phases). There
is another such configuration vector in the <code>server_rec</code>
data structure pointed to by the <code>request_rec</code>, which
contains per (virtual) server configuration data.<p>
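One way to picture such a configuration vector is as an array of
<code>void *</code> with one slot per module, each module knowing its own
slot number. The sketch below is written under that assumption and is
not the server's own code: the cut-down <code>module</code> structure
with its <code>module_index</code> field, the accessor bodies,
<code>NUM_MODULES</code>, and <code>cgi_server_conf</code> are all
stand-ins for illustration.<p>

```c
#include <stddef.h>

#define NUM_MODULES 4       /* arbitrary size for this sketch */

/* Stand-in for the module structure: only the slot number matters here. */
typedef struct module {
    int module_index;
} module;

/* Fetch one module's entry from a configuration vector. */
void *get_module_config(void *conf_vector, module *m)
{
    void **confv = (void **)conf_vector;

    return confv[m->module_index];
}

/* Store one module's entry in a configuration vector. */
void set_module_config(void *conf_vector, module *m, void *val)
{
    void **confv = (void **)conf_vector;

    confv[m->module_index] = val;
}

/* Hypothetical private structure; its layout is the owning module's
 * business, and the code holding the vector sees only a void *. */
typedef struct {
    int num_aliases;
} cgi_server_conf;
```

A handler would then cast what it gets back to its own type, along the
lines of <code>(cgi_server_conf *)get_module_config(vec, &amp;cgi_module)</code>,
where <code>vec</code> is whichever vector (per-directory, per-request,
or per-server) holds the data it wants.<p>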

Here is an abridged declaration, giving the fields most commonly used:<p>

<pre>
struct request_rec {

  pool *pool;
  conn_rec *connection;
  server_rec *server;

  /* What object is being requested */

  char *uri;
  char *filename;
  char *path_info;
  char *args;           /* QUERY_ARGS, if any */
  struct stat finfo;    /* Set by server core;
                         * st_mode set to zero if no such file */

  char *content_type;
  char *content_encoding;

  /* MIME header environments, in and out. Also, an array containing
   * environment variables to be passed to subprocesses, so people can
   * write modules to add to that environment.
   *
   * The difference between headers_out and err_headers_out is that
   * the latter are printed even on error, and persist across internal
   * redirects (so the headers printed for ErrorDocument handlers will
   * have them).
   */

  table *headers_in;
  table *headers_out;
  table *err_headers_out;
  table *subprocess_env;

  /* Info about the request itself... */

  int header_only;      /* HEAD request, as opposed to GET */
  char *protocol;       /* Protocol, as given to us, or HTTP/0.9 */
  char *method;         /* GET, HEAD, POST, etc. */
  int method_number;    /* M_GET, M_POST, etc. */

  /* Info for logging */

  char *the_request;
  int bytes_sent;

  /* A flag which modules can set, to indicate that the data being
   * returned is volatile, and clients should be told not to cache it.
   */

  int no_cache;

  /* Various other config info which may change with .htaccess files
   * These are config vectors, with one void* pointer for each module
   * (the thing pointed to being the module's business).
   */

  void *per_dir_config;   /* Options set in config files, etc. */
  void *request_config;   /* Notes on *this* request */

};

</pre>

<h3><a name="req_orig">Where request_rec structures come from</a></h3>

Most <code>request_rec</code> structures are built by reading an HTTP
request from a client, and filling in the fields. However, there are
a few exceptions:

7f667e74610492ddbce8ce60f52ece95d2401949jose borrego<ul>
7f667e74610492ddbce8ce60f52ece95d2401949jose borrego <li> If the request is to an imagemap, a type map (i.e., a
7f667e74610492ddbce8ce60f52ece95d2401949jose borrego <code>*.var</code> file), or a CGI script which returned a
da6c28aaf62fa55f0fdb8004aa40f88f23bf53f0amw local `Location:', then the resource which the user requested
7f667e74610492ddbce8ce60f52ece95d2401949jose borrego is going to be ultimately located by some URI other than what
7f667e74610492ddbce8ce60f52ece95d2401949jose borrego the client originally supplied. In this case, the server does
89dc44ce9705974a8bc4a39f1e878a0491a5be61jose borrego an <em>internal redirect</em>, constructing a new
dc20a3024900c47dd2ee44b9707e6df38f7d62a5as <code>request_rec</code> for the new URI, and processing it
da6c28aaf62fa55f0fdb8004aa40f88f23bf53f0amw almost exactly as if the client had requested the new URI
dc20a3024900c47dd2ee44b9707e6df38f7d62a5as directly. <p>
da6c28aaf62fa55f0fdb8004aa40f88f23bf53f0amw
da6c28aaf62fa55f0fdb8004aa40f88f23bf53f0amw <li> If some handler signaled an error, and an
      <code>ErrorDocument</code> is in scope, the same internal
      redirect machinery comes into play.<p>

 <li> Finally, a handler occasionally needs to investigate `what
      would happen if' some other request were run.  For instance,
      the directory indexing module needs to know what MIME type
      would be assigned to a request for each directory entry, in
      order to figure out what icon to use.<p>

      Such handlers can construct a <em>sub-request</em>, using the
      functions <code>sub_req_lookup_file</code> and
      <code>sub_req_lookup_uri</code>; each of these constructs a new
      <code>request_rec</code> structure and processes it as you
      would expect, up to but not including the point of actually
      sending a response.  (These functions skip over the access
      checks if the sub-request is for a file in the same directory
      as the original request).<p>

      (Server-side includes work by building sub-requests and then
      actually invoking the response handler for them, via the
      function <code>run_sub_request</code>).
</ul>

<h3><a name="req_return">Handling requests, declining, and returning error codes</a></h3>

As discussed above, each handler, when invoked to handle a particular
<code>request_rec</code>, has to return an <code>int</code> to
indicate what happened.  That can be one of the following:

<ul>
  <li> OK --- the request was handled successfully.  This may or may
       not terminate the phase.
  <li> DECLINED --- no erroneous condition exists, but the module
       declines to handle the phase; the server tries to find another.
  <li> an HTTP error code, which aborts handling of the request.
</ul>

Note that if the error code returned is <code>REDIRECT</code>, then
the module should put a <code>Location</code> in the request's
<code>headers_out</code>, to indicate where the client should be
redirected <em>to</em>. <p>
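The three outcomes above can be pictured with a small, self-contained
sketch.  Nothing in it is the server's real code:
<code>toy_request</code>, <code>toy_run_phase</code>, the sample
handlers and the numeric values of the <code>TOY_</code> codes are all
made up for illustration, and it assumes the simple case in which the
first handler that does not decline settles the phase. <p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the server's return codes; the values are assumptions. */
#define TOY_OK          0
#define TOY_DECLINED  (-1)
#define TOY_NOT_FOUND 404

typedef struct { const char *uri; } toy_request;
typedef int (*toy_handler)(toy_request *r);

/* Try each handler in turn.  The first one that does not decline
 * settles the phase, either with TOY_OK or with an error code. */
int toy_run_phase(toy_handler *handlers, size_t n, toy_request *r)
{
    size_t i;

    assert(handlers != NULL && r != NULL);
    for (i = 0; i < n; ++i) {
        int status = handlers[i](r);
        if (status != TOY_DECLINED)
            return status;
    }
    return TOY_DECLINED;        /* nobody wanted the request */
}

/* Sample handler: never interested. */
int toy_decliner(toy_request *r)
{
    (void)r;
    return TOY_DECLINED;
}

/* Sample handler: serves "/ok" and reports an error for anything else. */
int toy_picky(toy_request *r)
{
    return strcmp(r->uri, "/ok") == 0 ? TOY_OK : TOY_NOT_FOUND;
}
```

With the handlers tried in the order <code>toy_decliner</code>,
<code>toy_picky</code>, the first one is skipped over and the second
one decides the outcome. <p>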

<h3><a name="resp_handlers">Special considerations for response handlers</a></h3>

Handlers for most phases do their work by simply setting a few fields
in the <code>request_rec</code> structure (or, in the case of access
checkers, simply by returning the correct error code).  However,
response handlers have to actually send a response back to the client. <p>

They should begin by sending an HTTP response header, using the
function <code>send_http_header</code>.  (You don't have to do
anything special to skip sending the header for HTTP/0.9 requests; the
function figures out on its own that it shouldn't do anything).  If
the request is marked <code>header_only</code>, that's all they should
do; they should return after that, without attempting any further
output.  <p>

Otherwise, they should produce a response body appropriate to the
client's request.  The primitives for this are <code>rputc</code>
and <code>rprintf</code>, for internally generated output, and
<code>send_fd</code>, to copy the contents of some <code>FILE *</code>
straight to the client.  <p>
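As a rough illustration of that ordering (header first, then the body
unless the request is <code>header_only</code>), here is a toy handler
which writes into a buffer instead of talking to a client.  The
<code>toy_request</code> structure and both functions are hypothetical
stand-ins, not the real <code>request_rec</code>,
<code>send_http_header</code> or <code>rprintf</code>. <p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    int  header_only;   /* nonzero when only the header is wanted */
    char out[256];      /* stands in for the connection to the client */
} toy_request;

/* Append a string to the request's output buffer (truncating if full). */
static void toy_append(toy_request *r, const char *s)
{
    size_t used = strlen(r->out);
    snprintf(r->out + used, sizeof(r->out) - used, "%s", s);
}

/* Toy response handler; returns 0 for success. */
int toy_response_handler(toy_request *r)
{
    assert(r != NULL);

    toy_append(r, "Content-type: text/plain\r\n\r\n");   /* header first */
    if (r->header_only)
        return 0;                   /* header only: no further output */

    toy_append(r, "hello\n");       /* otherwise, the body follows */
    return 0;
}
```

The only difference between the two cases is whether the body is
appended after the header. <p>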

At this point, you should more or less understand the following piece
of code, which is the handler for <code>GET</code> requests
that have no more specific handler; it also shows how conditional
<code>GET</code>s can be handled, if it's desirable to do so in a
particular response handler --- <code>set_last_modified</code> checks
against the <code>If-modified-since</code> value supplied by the
client, if any, and returns an appropriate code (which will, if
nonzero, be USE_LOCAL_COPY).   No similar considerations apply for
<code>set_content_length</code>, but it returns an error code for
symmetry.<p>

<pre>
int default_handler (request_rec *r)
{
    int errstatus;
    FILE *f;

    if (r-&gt;method_number != M_GET) return DECLINED;
    if (r-&gt;finfo.st_mode == 0) return NOT_FOUND;

    if ((errstatus = set_content_length (r, r-&gt;finfo.st_size))
        || (errstatus = set_last_modified (r, r-&gt;finfo.st_mtime)))
        return errstatus;

    /* Open through the pool, so that the pfclose below (or the pool
       clean-up, should we never get that far) closes the file. */
    f = pfopen (r-&gt;pool, r-&gt;filename, "r");

    if (f == NULL) {
        log_reason("file permissions deny server access",
                   r-&gt;filename, r);
        return FORBIDDEN;
    }

    register_timeout ("send", r);
    send_http_header (r);

    if (!r-&gt;header_only) send_fd (f, r);
    pfclose (r-&gt;pool, f);
    return OK;
}
</pre>

Finally, if all of this is too much of a challenge, there are a few
ways out of it.  First off, as shown above, a response handler which
has not yet produced any output can simply return an error code, in
which case the server will automatically produce an error response.
Secondly, it can punt to some other handler by invoking
<code>internal_redirect</code>, which is how the internal redirection
machinery discussed above is invoked.  A response handler which has
internally redirected should always return <code>OK</code>. <p>

(Invoking <code>internal_redirect</code> from handlers which are
<em>not</em> response handlers will lead to serious confusion).

<h3><a name="auth_handlers">Special considerations for authentication handlers</a></h3>

Stuff that should be discussed here in detail:

<ul>
  <li> Authentication-phase handlers are not invoked unless auth is
       configured for the directory.
  <li> Common auth configuration is stored in the core per-dir
       configuration; it has accessors <code>auth_type</code>,
       <code>auth_name</code>, and <code>requires</code>.
  <li> Common routines, to handle the protocol end of things, at least
       for HTTP basic authentication (<code>get_basic_auth_pw</code>,
       which sets the <code>connection-&gt;user</code> structure field
       automatically, and <code>note_basic_auth_failure</code>, which
       arranges for the proper <code>WWW-Authenticate:</code> header
       to be sent back).
</ul>

<h3><a name="log_handlers">Special considerations for logging handlers</a></h3>

When a request has internally redirected, there is the question of
what to log.  Apache handles this by bundling the entire chain of
redirects into a list of <code>request_rec</code> structures which are
threaded through the <code>r-&gt;prev</code> and <code>r-&gt;next</code>
pointers.  The <code>request_rec</code> which is passed to the logging
handlers in such cases is the one which was originally built for the
initial request from the client; note that the <code>bytes_sent</code>
field will only be correct in the last request in the chain (the one
for which a response was actually sent).
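<p>
A logging handler that wants the byte count therefore has to walk to
the end of the chain first.  A minimal sketch, assuming a cut-down
<code>toy_request</code> structure with only the three fields of
interest (the real <code>request_rec</code> has many more, and
<code>toy_last_in_chain</code> is a made-up helper):

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_request {
    struct toy_request *prev;   /* request we were redirected from */
    struct toy_request *next;   /* request we were redirected to */
    long bytes_sent;            /* only meaningful on the last request */
} toy_request;

/* Follow the next pointers to the request whose response was sent. */
toy_request *toy_last_in_chain(toy_request *r)
{
    assert(r != NULL);
    while (r->next != NULL)
        r = r->next;
    return r;
}
```

A request which was never redirected is its own last link, so the
helper can be called unconditionally.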

<h2><a name="pools">Resource allocation and resource pools</a></h2>

One of the problems of writing and designing a server-pool server is
that of preventing leakage, that is, allocating resources (memory,
open files, etc.), without subsequently releasing them.  The resource
pool machinery is designed to make it easy to prevent this from
happening, by allowing resources to be allocated in such a way that
they are <em>automatically</em> released when the server is done with
them.  <p>

The way this works is as follows:  the memory which is allocated, files
opened, etc., to deal with a particular request are tied to a
<em>resource pool</em> which is allocated for the request.  The pool
is a data structure which itself tracks the resources in question. <p>

When the request has been processed, the pool is <em>cleared</em>.  At
that point, all the memory associated with it is released for reuse,
all files associated with it are closed, and any other clean-up
functions which are associated with the pool are run.  When this is
over, we can be confident that all the resources tied to the pool have
been released, and that none of them have leaked. <p>

Server restarts, and allocation of memory and resources for per-server
configuration, are handled in a similar way.  There is a
<em>configuration pool</em>, which keeps track of resources which were
allocated while reading the server configuration files, and handling
the commands therein (for instance, the memory that was allocated for
per-server module configuration, log files and other files that were
opened, and so forth).  When the server restarts, and has to reread
the configuration files, the configuration pool is cleared, and so the
memory and file descriptors which were taken up by reading them the
last time are made available for reuse. <p>

It should be noted that use of the pool machinery isn't generally
obligatory.  The exceptions are situations like logging handlers, where
you really need to register cleanups to make sure that the log file gets
closed when the server restarts (this is most easily done by using the
function <code><a href="#pool-files">pfopen</a></code>, which also
arranges for the underlying file descriptor to be closed before any
child processes, such as for CGI scripts, are <code>exec</code>ed), and
cases where you are using the timeout machinery (which isn't yet even
documented here).  However, there are two benefits to using it:
resources allocated to a pool never leak (even if you allocate a
scratch string, and just forget about it); also, for memory
allocation, <code>palloc</code> is generally faster than
<code>malloc</code>.<p>

We begin here by describing how memory is allocated to pools, and then
discuss how other resources are tracked by the resource pool
machinery.

<h3>Allocation of memory in pools</h3>

Memory is allocated to pools by calling the function
<code>palloc</code>, which takes two arguments, one being a pointer to
a resource pool structure, and the other being the amount of memory to
allocate (in <code>char</code>s).  Within request handlers, the most
common way of getting a resource pool structure is
by looking at the <code>pool</code> slot of the relevant
<code>request_rec</code>; hence the repeated appearance of the
following idiom in module code:

<pre>
int my_handler(request_rec *r)
{
    struct my_structure *foo;
    ...

    foo = (struct my_structure *)palloc (r-&gt;pool, sizeof(struct my_structure));
}
</pre>

Note that <em>there is no <code>pfree</code></em> ---
<code>palloc</code>ed memory is freed only when the associated
resource pool is cleared.  This means that <code>palloc</code> does not
have to do as much accounting as <code>malloc()</code>; all it does in
the typical case is to round up the size, bump a pointer, and do a
range check.<p>
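To make `round up the size, bump a pointer, and do a range check'
concrete, here is a toy allocator in that style.  It is only a sketch
under simplifying assumptions: a single fixed-size block, an assumed
8-byte alignment, and a <code>NULL</code> return when the block is full
(what the real <code>palloc</code> does when it runs out of room is not
covered here).  All the <code>toy_</code> names are invented. <p>

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_SIZE 1024
#define TOY_ALIGN     8         /* assumed alignment, for illustration */

typedef struct {
    /* The union only serves to give the byte array a strict alignment. */
    union { char bytes[TOY_POOL_SIZE]; double d; void *p; } block;
    size_t used;                /* bytes handed out so far */
} toy_pool;

/* Hand out nbytes from the pool, or NULL if they do not fit. */
void *toy_palloc(toy_pool *p, size_t nbytes)
{
    size_t rounded;
    void *result;

    assert(p != NULL);
    if (nbytes > TOY_POOL_SIZE)
        return NULL;                                /* cannot ever fit */

    rounded = (nbytes + TOY_ALIGN - 1) / TOY_ALIGN * TOY_ALIGN; /* round up */
    if (rounded > TOY_POOL_SIZE - p->used)          /* range check */
        return NULL;

    result = p->block.bytes + p->used;
    p->used += rounded;                             /* bump the pointer */
    return result;
}

/* Clearing releases everything at once; there is no per-object free. */
void toy_clear_pool(toy_pool *p)
{
    assert(p != NULL);
    p->used = 0;
}
```

Note that no per-allocation bookkeeping is kept at all, which is why
individual allocations cannot be handed back. <p>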

(It also raises the possibility that heavy use of <code>palloc</code>
could cause a server process to grow excessively large.  There are
two ways to deal with this; briefly, you
can use <code>malloc</code>, and try to be sure that all of the memory
gets explicitly <code>free</code>d, or you can allocate a sub-pool of
the main pool, allocate your memory in the sub-pool, and clear it out
periodically.  The latter technique is discussed in the section on
sub-pools below, and is used in the directory-indexing code, in order
to avoid excessive storage allocation when listing directories with
thousands of files).

<h3>Allocating initialized memory</h3>

There are functions which allocate initialized memory, and are
frequently useful.  The function <code>pcalloc</code> has the same
interface as <code>palloc</code>, but clears out the memory it
allocates before it returns it.  The function <code>pstrdup</code>
takes a resource pool and a <code>char *</code> as arguments, and
allocates memory for a copy of the string the pointer points to,
returning a pointer to the copy.  Finally <code>pstrcat</code> is a
varargs-style function, which takes a pointer to a resource pool, and
at least two <code>char *</code> arguments, the last of which must be
<code>NULL</code>.  It allocates enough memory to fit copies of each
of the strings, as a unit; for instance:

<pre>
     pstrcat (r-&gt;pool, "foo", "/", "bar", NULL);
</pre>

returns a pointer to 8 bytes worth of memory, initialized to
<code>"foo/bar"</code>.
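<p>
The two-pass idea behind such a function (add up the lengths, allocate
once, then copy) can be sketched in plain C.  This version is an
assumption-laden stand-in: it takes no pool and uses
<code>malloc</code>, so the caller must <code>free</code> the result,
and <code>toy_strcat_all</code> is an invented name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate a list of strings, terminated by a (char *)NULL
 * argument, into one freshly malloc'ed buffer. */
char *toy_strcat_all(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t total = 1;           /* room for the trailing '\0' */
    char *result;

    assert(first != NULL);

    /* First pass: add up the lengths. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    result = malloc(total);
    if (result == NULL)
        return NULL;

    /* Second pass: copy the pieces in order. */
    result[0] = '\0';
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        strcat(result, s);
    va_end(ap);

    return result;
}
```

Called as <code>toy_strcat_all("foo", "/", "bar", (char *)NULL)</code>,
it needs seven characters plus the terminator, matching the 8 bytes
mentioned above.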

<h3><a name="pool-files">Tracking open files, etc.</a></h3>
As indicated above, resource pools are also used to track other sorts
of resources besides memory. The most common are open files. The
routine which is typically used for this is <code>pfopen</code>, which
takes a resource pool and two strings as arguments; the strings are
the same as the typical arguments to <code>fopen</code>, e.g.,
<pre>
     ...
     FILE *f = pfopen (r-&gt;pool, r-&gt;filename, "r");

     if (f == NULL) { ... } else { ... }
</pre>
There is also a <code>popenf</code> routine, which parallels the
lower-level <code>open</code> system call. Both of these routines
arrange for the file to be closed when the resource pool in question
is cleared. <p>
Unlike the case for memory, there <em>are</em> functions to close
files allocated with <code>pfopen</code>, and <code>popenf</code>,
namely <code>pfclose</code> and <code>pclosef</code>. (This is
because, on many systems, the number of files which a single process
can have open is quite limited). It is important to use these
functions to close files allocated with <code>pfopen</code> and
<code>popenf</code>, since to do otherwise could cause fatal errors on
systems such as Linux, which react badly if the same
<code>FILE*</code> is closed more than once. <p>
(Using the <code>close</code> functions is not mandatory, since the
file will eventually be closed regardless, but you should consider it
in cases where your module is opening, or could open, a lot of files).
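<p>
One way to picture why the matching close function matters: suppose the
pool keeps a list of pending cleanups, and closing a file early both
runs its cleanup and removes it from the list.  The sketch below is
built on that assumption; the <code>toy_</code> names are invented, and
a counter stands in for an open file so that the example needs no real
files.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_cleanup {
    void (*fn)(void *);             /* what to run */
    void *data;                     /* what to run it on */
    struct toy_cleanup *next;
} toy_cleanup;

typedef struct { toy_cleanup *cleanups; } toy_pool;

/* Arrange for fn(data) to run when the pool is cleared; 0 on success. */
int toy_register_cleanup(toy_pool *p, void *data, void (*fn)(void *))
{
    toy_cleanup *c;

    assert(p != NULL && fn != NULL);
    c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
    return 0;
}

/* Run one cleanup early and forget it, so it cannot run a second time. */
void toy_run_cleanup_now(toy_pool *p, void *data, void (*fn)(void *))
{
    toy_cleanup **link;

    assert(p != NULL);
    for (link = &p->cleanups; *link != NULL; link = &(*link)->next) {
        toy_cleanup *c = *link;
        if (c->data == data && c->fn == fn) {
            *link = c->next;        /* unlink before running */
            c->fn(c->data);
            free(c);
            return;
        }
    }
}

/* Run and discard every cleanup which is still registered. */
void toy_clear_pool(toy_pool *p)
{
    assert(p != NULL);
    while (p->cleanups != NULL) {
        toy_cleanup *c = p->cleanups;
        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
}

/* Sample cleanup: count how often a pretend file was closed. */
void toy_close(void *data)
{
    ++*(int *)data;
}
```

A file closed behind the pool's back would still be on the list, and
would be closed a second time when the pool is cleared.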
<h3>Other sorts of resources --- cleanup functions</h3>
More text goes here.  Describe the cleanup primitives in terms of
which the file stuff is implemented; also, <code>spawn_process</code>.
<h3>Fine control --- creating and dealing with sub-pools, with a note
on sub-requests</h3>
On rare occasions, too-free use of <code>palloc()</code> and the
associated primitives may result in undesirably profligate resource
allocation. You can deal with such a case by creating a
<em>sub-pool</em>, allocating within the sub-pool rather than the main
pool, and clearing or destroying the sub-pool, which releases the
resources which were associated with it. (This really <em>is</em> a
rare situation; the only case in which it comes up in the standard
module set is in case of listing directories, and then only with
<em>very</em> large directories. Unnecessary use of the primitives
discussed here can hair up your code quite a bit, with very little
gain). <p>
The primitive for creating a sub-pool is <code>make_sub_pool</code>,
which takes another pool (the parent pool) as an argument. When the
main pool is cleared, the sub-pool will be destroyed. The sub-pool
may also be cleared or destroyed at any time, by calling the functions
<code>clear_pool</code> and <code>destroy_pool</code>, respectively.
(The difference is that <code>clear_pool</code> frees resources
associated with the pool, while <code>destroy_pool</code> also
deallocates the pool itself. In the former case, you can allocate new
resources within the pool, and clear it again, and so forth; in the
latter case, it is simply gone). <p>
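The difference between clearing and destroying can be sketched with a
toy pool tree.  This is an illustration under stated assumptions, not
the real implementation: the <code>toy_</code> functions are invented,
pools come from <code>calloc</code>, and a plain counter stands in for
the resources a pool holds. <p>

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_pool {
    struct toy_pool *parent;
    struct toy_pool *first_child;
    struct toy_pool *sibling;       /* next child of the same parent */
    int live_resources;             /* stands in for memory, files, ... */
} toy_pool;

/* Create a pool; pass NULL as parent to create a top-level pool. */
toy_pool *toy_make_sub_pool(toy_pool *parent)
{
    toy_pool *p = calloc(1, sizeof *p);

    if (p == NULL)
        return NULL;
    p->parent = parent;
    if (parent != NULL) {
        p->sibling = parent->first_child;
        parent->first_child = p;
    }
    return p;
}

void toy_destroy_pool(toy_pool *p);

/* Release the pool's resources and destroy its sub-pools.
 * The pool itself stays usable afterwards. */
void toy_clear_pool(toy_pool *p)
{
    assert(p != NULL);
    while (p->first_child != NULL)
        toy_destroy_pool(p->first_child);
    p->live_resources = 0;
}

/* Clear the pool, unlink it from its parent, and free the pool itself. */
void toy_destroy_pool(toy_pool *p)
{
    assert(p != NULL);
    toy_clear_pool(p);
    if (p->parent != NULL) {
        toy_pool **link = &p->parent->first_child;
        while (*link != p)
            link = &(*link)->sibling;
        *link = p->sibling;
    }
    free(p);
}
```

After <code>toy_clear_pool</code> the pointer is still good for further
allocation; after <code>toy_destroy_pool</code> it must not be used
again. <p>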
One final note --- sub-requests have their own resource pools, which
are sub-pools of the resource pool for the main request. The polite
way to reclaim the resources associated with a sub request which you
have allocated (using the <code>sub_req_lookup_...</code> functions)
is <code>destroy_sub_request</code>, which frees the resource pool.
Before calling this function, be sure to copy anything that you care
about which might be allocated in the sub-request's resource pool into
someplace a little less volatile (for instance, the filename in its
<code>request_rec</code> structure). <p>
(Again, under most circumstances, you shouldn't feel obliged to call
this function; only 2K of memory or so are allocated for a typical sub
request, and it will be freed anyway when the main request pool is
cleared. It is only when you are allocating many, many sub-requests
for a single main request that you should seriously consider the
<code>destroy...</code> functions).
<h2><a name="config">Configuration, commands and the like</a></h2>
One of the design goals for this server was to maintain external
compatibility with the NCSA 1.3 server --- that is, to read the same
configuration files, to process all the directives therein correctly,
and in general to be a drop-in replacement for NCSA. On the other
hand, another design goal was to move as much of the server's
functionality into modules which have as little as possible to do with
the monolithic server core. The only way to reconcile these goals is
to move the handling of most commands from the central server into the
modules. <p>
However, just giving the modules command tables is not enough to
divorce them completely from the server core. The server has to
remember the commands in order to act on them later. That involves
maintaining data which is private to the modules, and which can be
either per-server, or per-directory. Most things are per-directory,
including in particular access control and authorization information,
but also information on how to determine file types from suffixes,
which can be modified by <code>AddType</code> and
<code>DefaultType</code> directives, and so forth. In general, the
governing philosophy is that anything which <em>can</em> be made
configurable by directory should be; per-server information is
generally used in the standard set of modules for information like
<code>Alias</code>es and <code>Redirect</code>s which come into play
before the request is tied to a particular place in the underlying
file system. <p>
Another requirement for emulating the NCSA server is being able to
handle the per-directory configuration files, generally called
<code>.htaccess</code> files, though even in the NCSA server they can
contain directives which have nothing at all to do with access
control. Accordingly, after URI -&gt; filename translation, but before
performing any other phase, the server walks down the directory
hierarchy of the underlying filesystem, following the translated
pathname, to read any <code>.htaccess</code> files which might be
present. The information which is read in then has to be
<em>merged</em> with the applicable information from the server's own
config files (either from the <code>&lt;Directory&gt;</code> sections
in <code>access.conf</code>, or from defaults in
<code>srm.conf</code>, which actually behaves for most purposes almost
exactly like <code>&lt;Directory /&gt;</code>).<p>
Finally, after having served a request which involved reading
<code>.htaccess</code> files, we need to discard the storage allocated
for handling them. That is solved the same way it is solved wherever
else similar problems come up, by tying those structures to the
per-transaction resource pool. <p>
<h3><a name="per-dir">Per-directory configuration structures</a></h3>
Let's look at how all of this plays out in <code>mod_mime.c</code>,
which defines the file typing handler which emulates the NCSA server's
behavior of determining file types from suffixes. What we'll be
looking at, here, is the code which implements the
<code>AddType</code> and <code>AddEncoding</code> commands. These
commands can appear in <code>.htaccess</code> files, so they must be
handled in the module's private per-directory data, which in fact,
consists of two separate <code>table</code>s for MIME types and
encoding information, and is declared as follows:
<pre>
typedef struct {
    table *forced_types;      /* Additional AddTyped stuff */
    table *encoding_types;    /* Added with AddEncoding... */
} mime_dir_config;
</pre>
When the server is reading a configuration file, or
<code>&lt;Directory&gt;</code> section, which includes one of the MIME
module's commands, it needs to create a <code>mime_dir_config</code>
structure, so those commands have something to act on. It does this
by invoking the function it finds in the module's `create per-dir
config slot', with two arguments: the name of the directory to which
this configuration information applies (or <code>NULL</code> for
<code>srm.conf</code>), and a pointer to a resource pool in which the
allocation should happen. <p>
(If we are reading a <code>.htaccess</code> file, that resource pool
is the per-request resource pool for the request; otherwise it is a
resource pool which is used for configuration data, and cleared on
restarts. Either way, it is important for the structure being created
to vanish when the pool is cleared, by registering a cleanup on the
pool if necessary). <p>
For the MIME module, the per-dir config creation function just
<code>palloc</code>s the structure above, and creates a couple of
<code>table</code>s to fill it.  That looks like this:
<pre>
void *create_mime_dir_config (pool *p, char *dummy)
{
    mime_dir_config *new =
      (mime_dir_config *) palloc (p, sizeof(mime_dir_config));

    new-&gt;forced_types = make_table (p, 4);
    new-&gt;encoding_types = make_table (p, 4);

    return new;
}
</pre>
Now, suppose we've just read in a <code>.htaccess</code> file. We
already have the per-directory configuration structure for the next
directory up in the hierarchy. If the <code>.htaccess</code> file we
just read in didn't have any <code>AddType</code> or
<code>AddEncoding</code> commands, its per-directory config structure
for the MIME module is still valid, and we can just use it.
Otherwise, we need to merge the two structures somehow. <p>
To do that, the server invokes the module's per-directory config merge
function, if one is present. That function takes three arguments:
the two structures being merged, and a resource pool in which to
allocate the result. For the MIME module, all that needs to be done
is overlay the tables from the new per-directory config structure with
those from the parent:
<pre>
void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
{
    mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
    mime_dir_config *subdir = (mime_dir_config *)subdirv;
    mime_dir_config *new =
      (mime_dir_config *)palloc (p, sizeof(mime_dir_config));

    new-&gt;forced_types = overlay_tables (p, subdir-&gt;forced_types,
                                        parent_dir-&gt;forced_types);
    new-&gt;encoding_types = overlay_tables (p, subdir-&gt;encoding_types,
                                          parent_dir-&gt;encoding_types);

    return new;
}
</pre>
As a note --- if there is no per-directory merge function present, the
server will just use the subdirectory's configuration info, and ignore
the parent's. For some modules, that works just fine (e.g., for the
includes module, whose per-directory configuration information
consists solely of the state of the <code>XBITHACK</code>), and for
those modules, you can just not declare one, and leave the
corresponding structure slot in the module itself <code>NULL</code>.<p>
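To make the precedence rule in the merge concrete, here is a minimal, self-contained sketch. It does not use the server's own <code>table</code> type or <code>overlay_tables</code>; the names <code>toy_table</code>, <code>toy_add</code>, <code>toy_get</code> and <code>toy_overlay</code> are hypothetical stand-ins, and the sketch assumes that a lookup returns the first matching entry and that the overlay's entries come first in the merged result. It also compares keys case-sensitively, which the real tables may not do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the server's table type; NOT the real Apache structure. */
#define TOY_MAX 16

typedef struct {
    const char *key[TOY_MAX];
    const char *val[TOY_MAX];
    int n;
} toy_table;

/* Append one key/value pair (hypothetical helper). */
static void toy_add(toy_table *t, const char *k, const char *v)
{
    assert(t->n < TOY_MAX);
    t->key[t->n] = k;
    t->val[t->n] = v;
    t->n++;
}

/* Return the first value stored under k, or NULL if there is none. */
static const char *toy_get(const toy_table *t, const char *k)
{
    int i;
    for (i = 0; i < t->n; i++)
        if (strcmp(t->key[i], k) == 0)
            return t->val[i];
    return NULL;
}

/* Merge two tables: overlay entries first, then base entries, so that
 * a first-match lookup prefers the overlay (the subdirectory). */
static toy_table toy_overlay(const toy_table *overlay, const toy_table *base)
{
    toy_table out = { {0}, {0}, 0 };
    int i;
    for (i = 0; i < overlay->n; i++)
        toy_add(&out, overlay->key[i], overlay->val[i]);
    for (i = 0; i < base->n; i++)
        toy_add(&out, base->key[i], base->val[i]);
    return out;
}
```

With a parent that maps <code>txt</code> and <code>html</code>, and a subdirectory that remaps only <code>txt</code>, a lookup of <code>txt</code> in the merged table yields the subdirectory's value, while <code>html</code> still falls through to the parent's.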
<h3><a name="commands">Command handling</a></h3>
Now that we have these structures, we need to be able to figure out
how to fill them. That involves processing the actual
<code>AddType</code> and <code>AddEncoding</code> commands. To find
commands, the server looks in the module's <code>command table</code>.
That table contains information on how many arguments the commands
take, and in what formats, where they are permitted, and so forth. That
information is sufficient to allow the server to invoke most
command-handling functions with pre-parsed arguments. Without further
ado, let's look at the <code>AddType</code> command handler, which
looks like this (the <code>AddEncoding</code> command looks basically
the same, and won't be shown here):
<pre>
char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
{
    if (*ext == '.') ++ext;
    table_set (m-&gt;forced_types, ext, ct);
    return NULL;
}
</pre>
This command handler is unusually simple. As you can see, it takes
four arguments: first a pointer to a <code>cmd_parms</code> structure,
then the per-directory configuration structure for the module in
question, and finally the two pre-parsed arguments.
The <code>cmd_parms</code> structure contains a bunch of arguments which are frequently of
use to some, but not all, commands, including a resource pool (from
which memory can be allocated, and to which cleanups should be tied),
and the (virtual) server being configured, from which the module's
per-server configuration data can be obtained if required.<p>
Another way in which this particular command handler is unusually
simple is that there are no error conditions which it can encounter.
If there were, it could return an error message instead of
<code>NULL</code>; this causes an error to be printed out on the
server's <code>stderr</code>, followed by a quick exit, if it is in
the main config files; for a <code>.htaccess</code> file, the syntax
error is logged in the server error log (along with an indication of
where it came from), and the request is bounced with a server error
response (HTTP error status, code 500). <p>
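As an illustration of the return convention only, here is a minimal sketch of a handler that can fail. It is not the MIME module's code: <code>toy_mapping</code> and <code>toy_add_type</code> are hypothetical names, and the two checks (a content type must contain a slash, an extension must not be empty) are assumptions invented for the example. The point is simply that <code>NULL</code> means success and any other return value is the message to report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical one-entry "configuration" for the sketch. */
typedef struct {
    const char *ext;
    const char *ct;
} toy_mapping;

/* Returns NULL on success, or an error message describing the problem.
 * The configuration is only modified when every check has passed. */
static const char *toy_add_type(toy_mapping *m, const char *ct, const char *ext)
{
    assert(m != NULL);
    if (strchr(ct, '/') == NULL)
        return "content type must look like type/subtype";
    if (*ext == '.') ++ext;            /* same leading-dot rule as add_type */
    if (*ext == '\0')
        return "empty file extension";
    m->ext = ext;
    m->ct = ct;
    return NULL;
}
```

Checking everything before touching the configuration means a rejected command leaves the structure exactly as it was. <p>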
The MIME module's command table has entries for these commands, which
look like this:
<pre>
command_rec mime_cmds[] = {
    { "AddType", add_type, NULL, OR_FILEINFO, TAKE2,
      "a mime type followed by a file extension" },
    { "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,
      "an encoding (e.g., gzip), followed by a file extension" },
    { NULL }
};
</pre>
The entries in these tables are:
<ul>
<li> The name of the command
<li> The function which handles it
<li> a <code>(void *)</code> pointer, which is passed in the
<code>cmd_parms</code> structure to the command handler ---
this is useful in case many similar commands are handled by the
same function.
<li> A bit mask indicating where the command may appear. There are
mask bits corresponding to each <code>AllowOverride</code>
option, and an additional mask bit, <code>RSRC_CONF</code>,
indicating that the command may appear in the server's own
config files, but <em>not</em> in any <code>.htaccess</code>
file.
<li> A flag indicating how many arguments the command handler wants
pre-parsed, and how they should be passed in.
<code>TAKE2</code> indicates two pre-parsed arguments. Other
options are <code>TAKE1</code>, which indicates one pre-parsed
argument, <code>FLAG</code>, which indicates that the argument
should be <code>On</code> or <code>Off</code>, and is passed in
as a boolean flag, and <code>RAW_ARGS</code>, which causes the
server to give the command the raw, unparsed arguments
(everything but the command name itself). There is also
<code>ITERATE</code>, which means that the handler looks the
same as <code>TAKE1</code>, but that if multiple arguments are
present, it should be called multiple times, and finally
<code>ITERATE2</code>, which indicates that the command handler
looks like a <code>TAKE2</code>, but if more arguments are
present, then it should be called multiple times, holding the
first argument constant.
<li> Finally, we have a string which describes the arguments that
should be present. If the arguments in the actual config file
are not as required, this string will be used to help give a
more specific error message. (You can safely leave this
<code>NULL</code>).
</ul>
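The <code>ITERATE2</code> calling pattern described above can be sketched as follows. This is not the server's dispatch code; <code>toy_take2_fn</code>, <code>toy_iterate2</code>, <code>toy_log</code> and <code>toy_record</code> are hypothetical names, the handler signature is simplified (no <code>cmd_parms</code>), and stopping at the first error is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical handler shape: NULL on success, else a message. */
typedef const char *(*toy_take2_fn)(void *conf, const char *a, const char *b);

/* Call fn once for each argument after the first, always passing
 * argv[0] as the first argument. Stops at the first error (assumed). */
static const char *toy_iterate2(toy_take2_fn fn, void *conf,
                                int argc, const char **argv)
{
    int i;
    if (argc < 2)
        return "needs at least two arguments";
    for (i = 1; i < argc; i++) {
        const char *err = fn(conf, argv[0], argv[i]);
        if (err != NULL)
            return err;
    }
    return NULL;
}

/* Example handler that just records each pair it is given. */
typedef struct {
    const char *first[8];
    const char *second[8];
    int n;
} toy_log;

static const char *toy_record(void *conf, const char *a, const char *b)
{
    toy_log *log = (toy_log *)conf;
    assert(log->n < 8);
    log->first[log->n] = a;
    log->second[log->n] = b;
    log->n++;
    return NULL;
}
```

Given the three arguments <code>text/html</code>, <code>html</code> and <code>htm</code>, the handler is called twice, once with each extension, and sees <code>text/html</code> as its first argument both times. <p>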
Finally, having set this all up, we have to use it. This is
ultimately done in the module's handlers, specifically in its
file-typing handler, which looks more or less like this; note that the
per-directory configuration structure is extracted from the
<code>request_rec</code>'s per-directory configuration vector by using
the <code>get_module_config</code> function.
<pre>
int find_ct(request_rec *r)
{
    int i;
    char *fn = pstrdup (r-&gt;pool, r-&gt;filename);
    mime_dir_config *conf = (mime_dir_config *)
        get_module_config(r-&gt;per_dir_config, &amp;mime_module);
    char *type;
    if (S_ISDIR(r-&gt;finfo.st_mode)) {
        r-&gt;content_type = DIR_MAGIC_TYPE;
        return OK;
    }
    if((i=rind(fn,'.')) &lt; 0) return DECLINED;
    ++i;
    if ((type = table_get (conf-&gt;encoding_types, &amp;fn[i])))
    {
        r-&gt;content_encoding = type;
        /* go back to previous extension to try to use it as a type */
        fn[i-1] = '\0';
        if((i=rind(fn,'.')) &lt; 0) return OK;
        ++i;
    }
    if ((type = table_get (conf-&gt;forced_types, &amp;fn[i])))
    {
        r-&gt;content_type = type;
    }
    return OK;
}
</pre>
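The way <code>find_ct</code> walks backwards through the extensions can be tried out in isolation with a small sketch. It uses the standard <code>strrchr</code> in place of <code>rind</code>, and a hard-coded predicate in place of the lookup in <code>conf-&gt;encoding_types</code>; <code>toy_is_encoding</code> and <code>toy_split</code> are hypothetical names, and the two encodings recognised are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a lookup in the encoding table. */
static int toy_is_encoding(const char *ext)
{
    return strcmp(ext, "gz") == 0 || strcmp(ext, "Z") == 0;
}

/* Split a file name the way find_ct does. fn is modified in place.
 * On return, *enc_ext is the encoding extension (or NULL), and
 * *type_ext is the extension to look up as a type (or NULL). */
static void toy_split(char *fn, const char **type_ext, const char **enc_ext)
{
    char *dot;
    assert(fn != NULL);
    *type_ext = NULL;
    *enc_ext = NULL;
    dot = strrchr(fn, '.');
    if (dot == NULL)
        return;                 /* no extension at all */
    if (toy_is_encoding(dot + 1)) {
        *enc_ext = dot + 1;
        *dot = '\0';            /* cut the name, like fn[i-1] = '\0' */
        dot = strrchr(fn, '.');
        if (dot == NULL)
            return;             /* encoding only, no type extension */
    }
    *type_ext = dot + 1;
}
```

For <code>index.html.gz</code> this reports <code>gz</code> as the encoding extension and <code>html</code> as the type extension; for <code>archive.gz</code> it reports the encoding and no type extension, which corresponds to the early <code>return OK</code> inside the encoding branch of <code>find_ct</code>.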
<h3><a name="servconf">Side notes --- per-server configuration, virtual servers, etc.</a></h3>
The ideas behind per-server module configuration are basically
the same as those for per-directory configuration; there is a creation
function and a merge function, the latter being invoked where a
virtual server has partially overridden the base server configuration,
and a combined structure must be computed. (As with per-directory
configuration, the default if no merge function is specified, and a
module is configured in some virtual server, is that the base
configuration is simply ignored). <p>
The only substantial difference is that when a command needs to
configure the per-server private module data, it needs to go to the
<code>cmd_parms</code> data to get at it. Here's an example, from the
alias module, which also indicates how a syntax error can be returned
(note that the per-directory configuration argument to the command
handler is declared as a dummy, since the module doesn't actually have
per-directory config data):
<pre>
char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)
{
    server_rec *s = cmd-&gt;server;
    alias_server_conf *conf = (alias_server_conf *)
        get_module_config(s-&gt;module_config,&amp;alias_module);
    alias_entry *new;
    /* check the URL first, so a bad one leaves no empty entry behind */
    if (!is_url (url)) return "Redirect to non-URL";
    new = push_array (conf-&gt;redirects);
    new-&gt;fake = f; new-&gt;real = url;
    return NULL;
}
</pre>
<!--#include virtual="footer.html" -->
</body></html>