<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
<title>Apache 2.0 Layered I/O</title>
</head>
<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<BODY
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#000080"
ALINK="#FF0000"
>
<H1 align="center">Apache Layered I/O</H1>
<P>Layered I/O has been the holy grail of Apache module writers for years.
With Apache 2.0, module writers can finally take advantage of layered I/O
in their modules.
<P>In all previous versions of Apache, only one handler was allowed to modify
the data stream that was sent to the client. With Apache 2.0, one module
can modify the data and then indicate that other modules may modify it
in turn.
<H2>Taking advantage of layered I/O</H2>
<P>A module needs a few modifications before it can use layered I/O.
A new return value, RERUN_HANDLERS, has been added for modules.
When a handler returns this value, the core searches through the list of
handlers for another module that wants to handle the request.
<P>When a module returns RERUN_HANDLERS, it must modify two fields of the
request_rec: the handler and content_type fields. Most modules will
set the handler field to NULL and allow the core to choose which
module gets run next. If these two fields are not modified, the server
will loop forever calling the same module's handler.
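<P>The rerun mechanism described above can be sketched as a small dispatch
loop. This is an illustrative model only, not the Apache source: every name
prefixed with sk_ or SK_ is hypothetical, the "text/plain" content type is an
assumed value, and the max_passes guard is an addition that exists purely to
make the "loop forever" hazard visible in a test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real Apache declarations. */
enum { SK_OK = 0, SK_DECLINED = -1, SK_RERUN_HANDLERS = -2 };

typedef struct sk_request {
    const char *handler;       /* explicit handler name, or NULL */
    const char *content_type;  /* used for lookup when handler is NULL */
    int passes;                /* number of handlers that have run */
} sk_request;

typedef struct sk_handler {
    const char *name;          /* matched against handler or content_type */
    int (*fn)(sk_request *r);
} sk_handler;

/* First stage: produces data, then asks the core to pick another handler. */
static int sk_generator(sk_request *r)
{
    r->passes++;
    r->handler = NULL;              /* let the core choose the next module */
    r->content_type = "text/plain"; /* assumed type of the generated data */
    return SK_RERUN_HANDLERS;
}

/* Final stage: finishes the request. */
static int sk_sender(sk_request *r)
{
    r->passes++;
    return SK_OK;
}

static const sk_handler sk_table[] = {
    { "generator-script", sk_generator },
    { "text/plain",       sk_sender },
};

/* Core loop: restart the search whenever a handler returns RERUN_HANDLERS. */
int sk_invoke_handlers(sk_request *r, int max_passes)
{
    assert(r != NULL);
    while (r->passes < max_passes) {
        const char *key = r->handler ? r->handler : r->content_type;
        int status = SK_DECLINED;
        size_t i;

        for (i = 0; i < sizeof sk_table / sizeof sk_table[0]; i++) {
            if (key != NULL && strcmp(key, sk_table[i].name) == 0) {
                status = sk_table[i].fn(r);
                break;
            }
        }
        if (status != SK_RERUN_HANDLERS)
            return status;
    }
    return SK_DECLINED; /* guard: a handler kept asking for a rerun */
}
```

<P>Calling sk_invoke_handlers on a request whose handler is
"generator-script" runs the generator first and then, because the handler
field is now NULL, selects the second entry by content type.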
<P>Most modules should not write directly to the network if they want to take
advantage of layered I/O. Two BUFF structures have been added to the
request_rec, one for input and one for output. The module should read from
and write to these BUFFs. The module will also have to set up the input field
for the next module in the list. A new function has been added, ap_setup_input,
which all modules should call before they read any data to modify.
This function checks whether the previous module set the input field.
If so, that input is used; if not, the file is opened and used as the data
source. The output field is used in much the same way. The module must
set this field before it calls ap_r* in order to take advantage of
layered I/O. If this field is not set, ap_r* will write directly to the
client. Usually at the end of a handler, the input (for the next module)
will be the read side of a pipe, and the output will be the write side of
the same pipe.
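<P>The input and output selection rules above can be modelled in a few lines.
This is a rough sketch under stated assumptions: FILE * stands in for the BUFF
structure, and sk_io_request, sk_setup_input and sk_rputs are hypothetical
names that only mimic the behaviour the text attributes to ap_setup_input and
the ap_r* functions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical request record; FILE * plays the role of BUFF here. */
typedef struct sk_io_request {
    const char *filename; /* resource on disk, used when no input is set */
    FILE *input;          /* set by the previous module, or NULL */
    FILE *output;         /* set by this module for the next one, or NULL */
    FILE *client;         /* stand-in for the network connection */
} sk_io_request;

/* Prefer the input left by the previous module; otherwise open the file. */
FILE *sk_setup_input(sk_io_request *r)
{
    assert(r != NULL);
    if (r->input == NULL && r->filename != NULL)
        r->input = fopen(r->filename, "rb");
    return r->input; /* NULL if no input was set and the open failed */
}

/* Write to the layered output if one is set, else straight to the client. */
int sk_rputs(const char *s, sk_io_request *r)
{
    FILE *dest;

    assert(r != NULL && s != NULL);
    dest = (r->output != NULL) ? r->output : r->client;
    assert(dest != NULL);
    return fputs(s, dest);
}
```

<P>In this model a module that leaves output as NULL behaves like a
traditional handler, while a module that sets it diverts everything written
through sk_rputs to the next layer.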
<H3>An Example of Layered I/O</H3>
<P>This example is the most basic layered I/O example possible: CGI output
generated by mod_cgi and sent to the network via http_core.
<P>mod_cgi executes the CGI script, and then sets request_rec->input to
the output pipe of the CGI. It then sets request_rec->handler to NULL, and
sets request_rec->content_type to whatever the CGI writes out (in this case,
<P>ap_invoke_handlers() then loops back to the top of the handler list
and searches for a handler that can deal with this content_type. In this case
the correct module is the default_handler from http_core.
<P>When default_handler starts, it calls ap_setup_input, which finds
a valid request_rec->input and uses it for all input. The output field
in the request_rec is NULL, so when default_handler calls an output primitive,
the data is sent out over the network.</P>
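<P>The two stages of this example can be imitated with an ordinary POSIX pipe.
The sketch below is not mod_cgi or http_core code. It assumes a POSIX system,
uses FILE * in place of BUFF, replaces the CGI process with a string passed in
by the caller, and uses the hypothetical names sk_chain_request, sk_cgi_stage
and sk_default_stage.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical request record; FILE * stands in for BUFF. */
typedef struct sk_chain_request {
    FILE *input;              /* read side handed to the next handler */
    const char *handler;      /* NULL once the first stage is done */
    const char *content_type; /* set from what the script reports */
} sk_chain_request;

/* Stage 1 (plays mod_cgi): the script's output lands in a pipe. */
int sk_cgi_stage(sk_chain_request *r, const char *script_output,
                 const char *script_type)
{
    int fds[2];
    FILE *w;

    assert(r != NULL && script_output != NULL);
    if (pipe(fds) != 0)
        return -1;
    w = fdopen(fds[1], "w");
    if (w == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* Small payloads fit in the pipe buffer, so no reader is needed yet. */
    fputs(script_output, w);
    fclose(w);                      /* closing the write side signals EOF */
    r->input = fdopen(fds[0], "r"); /* the next handler reads from here */
    if (r->input == NULL) {
        close(fds[0]);
        return -1;
    }
    r->handler = NULL;              /* let the core pick the next handler */
    r->content_type = script_type;
    return 0;
}

/* Stage 2 (plays default_handler): copy r->input to the client buffer. */
size_t sk_default_stage(sk_chain_request *r, char *client, size_t cap)
{
    size_t n;

    assert(r != NULL && r->input != NULL && client != NULL && cap > 0);
    n = fread(client, 1, cap - 1, r->input);
    client[n] = '\0';
    fclose(r->input);
    r->input = NULL;
    return n;
}
```

<P>Running sk_cgi_stage followed by sk_default_stage leaves the script's text
in the client buffer, which mirrors how the second handler consumes whatever
the first one left in the input field.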
<I>Ryan Bloom, 25th March 2000</I>
</body>
</html>