dso.html revision fc3990cf558f26cfac81bf57c1b8d9274e8dcdec
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org" />
<title>Dynamic Shared Object (DSO) support</title>
</head>
<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
vlink="#000080" alink="#FF0000">
<!--#include virtual="header.html" -->
<h1 align="center">Dynamic Shared Object (DSO) Support</h1>
<p>The Apache HTTP Server is a modular program where the
administrator can choose the functionality to include in the
server by selecting a set of modules. The modules can be
statically compiled into the <code>httpd</code> binary when the
server is built. Alternatively, modules can be compiled as
Dynamic Shared Objects (DSOs) that exist separately from the
main <code>httpd</code> binary file. DSO modules may be
compiled at the time the server is built, or they may be
compiled and added at a later time using the Apache Extension
Tool (<a href="programs/apxs.html">apxs</a>).</p>
<p>This document describes how to use DSO modules as well as
the theory behind their use.</p>
<ul>
<li><a href="#implementation">Implementation</a></li>
<li><a href="#usage">Usage Summary</a></li>
<li><a href="#background">Background</a></li>
<li><a href="#advantages">Advantages and
Disadvantages</a></li>
</ul>
<hr />
<table border="1">
<tr>
<td valign="top"><strong>Related Modules</strong><br />
<br />
<a href="mod/mod_so.html">mod_so</a><br />
</td>
<td valign="top"><strong>Related Directives</strong><br />
<br />
<a href="mod/mod_so.html#loadmodule">LoadModule</a><br />
</td>
</tr>
</table>
<h2><a id="implementation"
name="implementation">Implementation</a></h2>
<p>The DSO support for loading individual Apache modules is
based on a module named <a
href="mod/mod_so.html"><code>mod_so.c</code></a> which must be
statically compiled into the Apache core. It is the only module
besides <code>core.c</code> which cannot be put into a DSO
itself. Practically all other distributed Apache modules
can then be placed into a DSO by individually enabling the DSO
build for them via <code>configure</code>'s
<code>--enable-<i>module</i>=shared</code> option as disucussed
in the <a href="install.html">install documentation</a>. After
a module is compiled into a DSO named <code>mod_foo.so</code>
you can use <a href="mod/mod_so.html"><code>mod_so</code></a>'s
<a
href="mod/mod_so.html#loadmodule"><code>LoadModule</code></a>
command in your <code>httpd.conf</code> file to load this
module at server startup or restart.</p>
<p>To simplify this creation of DSO files for Apache modules
(especially for third-party modules) a new support program
named <a href="programs/apxs.html">apxs</a> (<em>APache
eXtenSion</em>) is available. It can be used to build DSO based
modules <em>outside of</em> the Apache source tree. The idea is
simple: When installing Apache the <code>configure</code>'s
<code>make install</code> procedure installs the Apache C
header files and puts the platform-dependent compiler and
linker flags for building DSO files into the <code>apxs</code>
program. This way the user can use <code>apxs</code> to compile
his Apache module sources without the Apache distribution
source tree and without having to fiddle with the
platform-dependent compiler and linker flags for DSO
support.</p>
<h2><a id="usage" name="usage">Usage Summary</a></h2>
<p>To give you an overview of the DSO features of Apache 2.0,
here is a short and concise summary:</p>
<ol>
<li>
Build and install a <em>distributed</em> Apache module, say
<code>mod_foo.c</code>, into its own DSO
<code>mod_foo.so</code>:
<table bgcolor="#f0f0f0" cellpadding="10">
<tr>
<td>
<pre>
$ /configure --prefix=/path/to/install
--enable-foo=shared
$ make install
</pre>
</td>
</tr>
</table>
</li>
<li>
Build and install a <em>third-party</em> Apache module, say
<code>mod_foo.c</code>, into its own DSO
<code>mod_foo.so</code>:
<table bgcolor="#f0f0f0" cellpadding="10">
<tr>
<td>
<pre>
$ /configure --add-module=module_type:/path/to/3rdparty/mod_foo.c
--enable-foo=shared
$ make install
</pre>
</td>
</tr>
</table>
</li>
<li>
Configure Apache for <em>later installation</em> of shared
modules:
<table bgcolor="#f0f0f0" cellpadding="10">
<tr>
<td>
<pre>
$ /configure --enable-so
$ make install
</pre>
</td>
</tr>
</table>
</li>
<li>
Build and install a <em>third-party</em> Apache module, say
<code>mod_foo.c</code>, into its own DSO
<code>mod_foo.so</code> <em>outside of</em> the Apache
source tree using <a href="programs/apxs.html">apxs</a>:
<table bgcolor="#f0f0f0" cellpadding="10">
<tr>
<td>
<pre>
$ cd /path/to/3rdparty
$ apxs -c mod_foo.c
$ apxs -i -a -n foo mod_foo.so
</pre>
</td>
</tr>
</table>
</li>
</ol>
<p>In all cases, once the shared module is compiled, you must
use a <a
href="mod/mod_so.html#loadmodule"><code>LoadModule</code></a>
directive in <code>httpd.conf</code> to tell Apache to activate
the module.</p>
<h2><a id="background" name="background">Background</a></h2>
<p>On modern Unix derivatives there exists a nifty mechanism
usually called dynamic linking/loading of <em>Dynamic Shared
Objects</em> (DSO) which provides a way to build a piece of
program code in a special format for loading it at run-time
into the address space of an executable program.</p>
<p>This loading can usually be done in two ways: Automatically
by a system program called <code>ld.so</code> when an
executable program is started or manually from within the
executing program via a programmatic system interface to the
Unix loader through the system calls
<code>dlopen()/dlsym()</code>.</p>
<p>In the first way the DSO's are usually called <em>shared
libraries</em> or <em>DSO libraries</em> and named
<code>libfoo.so</code> or <code>libfoo.so.1.2</code>. They
reside in a system directory (usually <code>/usr/lib</code>)
and the link to the executable program is established at
build-time by specifying <code>-lfoo</code> to the linker
command. This hard-codes library references into the executable
program file so that at start-time the Unix loader is able to
locate <code>libfoo.so</code> in <code>/usr/lib</code>, in
paths hard-coded via linker-options like <code>-R</code> or in
paths configured via the environment variable
<code>LD_LIBRARY_PATH</code>. It then resolves any (yet
unresolved) symbols in the executable program which are
available in the DSO.</p>
<p>Symbols in the executable program are usually not referenced
by the DSO (because it's a reusable library of general code)
and hence no further resolving has to be done. The executable
program has no need to do anything on its own to use the
symbols from the DSO because the complete resolving is done by
the Unix loader. (In fact, the code to invoke
<code>ld.so</code> is part of the run-time startup code which
is linked into every executable program which has been bound
non-static). The advantage of dynamic loading of common library
code is obvious: the library code needs to be stored only once,
in a system library like <code>libc.so</code>, saving disk
space for every program.</p>
<p>In the second way the DSO's are usually called <em>shared
objects</em> or <em>DSO files</em> and can be named with an
arbitrary extension (although the canonical name is
<code>foo.so</code>). These files usually stay inside a
program-specific directory and there is no automatically
established link to the executable program where they are used.
Instead the executable program manually loads the DSO at
run-time into its address space via <code>dlopen()</code>. At
this time no resolving of symbols from the DSO for the
executable program is done. But instead the Unix loader
automatically resolves any (yet unresolved) symbols in the DSO
from the set of symbols exported by the executable program and
its already loaded DSO libraries (especially all symbols from
the ubiquitous <code>libc.so</code>). This way the DSO gets
knowledge of the executable program's symbol set as if it had
been statically linked with it in the first place.</p>
<p>Finally, to take advantage of the DSO's API the executable
program has to resolve particular symbols from the DSO via
<code>dlsym()</code> for later use inside dispatch tables
<em>etc.</em> In other words: The executable program has to
manually resolve every symbol it needs to be able to use it.
The advantage of such a mechanism is that optional program
parts need not be loaded (and thus do not spend memory) until
they are needed by the program in question. When required,
these program parts can be loaded dynamically to extend the
base program's functionality.</p>
<p>Although this DSO mechanism sounds straightforward there is
at least one difficult step here: The resolving of symbols from
the executable program for the DSO when using a DSO to extend a
program (the second way). Why? Because "reverse resolving" DSO
symbols from the executable program's symbol set is against the
library design (where the library has no knowledge about the
programs it is used by) and is neither available under all
platforms nor standardized. In practice the executable
program's global symbols are often not re-exported and thus not
available for use in a DSO. Finding a way to force the linker
to export all global symbols is the main problem one has to
solve when using DSO for extending a program at run-time.</p>
<p>The shared library approach is the typical one, because it
is what the DSO mechanism was designed for, hence it is used
for nearly all types of libraries the operating system
provides. On the other hand using shared objects for extending
a program is not used by a lot of programs.</p>
<p>As of 1998 there are only a few software packages available
which use the DSO mechanism to actually extend their
functionality at run-time: Perl 5 (via its XS mechanism and the
DynaLoader module), Netscape Server, <em>etc.</em> Starting
with version 1.3, Apache joined the crew, because Apache
already uses a module concept to extend its functionality and
internally uses a dispatch-list-based approach to link external
modules into the Apache core functionality. So, Apache is
really predestined for using DSO to load its modules at
run-time.</p>
<h2><a id="advantages" name="advantages">Advantages and
Disadvantages</a></h2>
<p>The above DSO based features have the following
advantages:</p>
<ul>
<li>The server package is more flexible at run-time because
the actual server process can be assembled at run-time via <a
href="mod/mod_so.html#loadmodule"><code>LoadModule</code></a>
<code>httpd.conf</code> configuration commands instead of
<code>configure</code> options at build-time. For instance
this way one is able to run different server instances
(standard &amp; SSL version, minimalistic &amp; powered up
version [mod_perl, PHP3], <em>etc.</em>) with only one Apache
installation.</li>
<li>The server package can be easily extended with
third-party modules even after installation. This is at least
a great benefit for vendor package maintainers who can create
a Apache core package and additional packages containing
extensions like PHP3, mod_perl, mod_fastcgi,
<em>etc.</em></li>
<li>Easier Apache module prototyping because with the
DSO/<code>apxs</code> pair you can both work outside the
Apache source tree and only need an <code>apxs -i</code>
command followed by an <code>apachectl restart</code> to
bring a new version of your currently developed module into
the running Apache server.</li>
</ul>
<p>DSO has the following disadvantages:</p>
<ul>
<li>The DSO mechanism cannot be used on every platform
because not all operating systems support dynamic loading of
code into the address space of a program.</li>
<li>The server is approximately 20% slower at startup time
because of the symbol resolving overhead the Unix loader now
has to do.</li>
<li>The server is approximately 5% slower at execution time
under some platforms because position independent code (PIC)
sometimes needs complicated assembler tricks for relative
addressing which are not necessarily as fast as absolute
addressing.</li>
<li>Because DSO modules cannot be linked against other
DSO-based libraries (<code>ld -lfoo</code>) on all platforms
(for instance a.out-based platforms usually don't provide
this functionality while ELF-based platforms do) you cannot
use the DSO mechanism for all types of modules. Or in other
words, modules compiled as DSO files are restricted to only
use symbols from the Apache core, from the C library
(<code>libc</code>) and all other dynamic or static libraries
used by the Apache core, or from static library archives
(<code>libfoo.a</code>) containing position independent code.
The only chances to use other code is to either make sure the
Apache core itself already contains a reference to it or
loading the code yourself via <code>dlopen()</code>.</li>
</ul>
<!--#include virtual="footer.html" -->
</body>
</html>