<!-- perf-tuning.html revision 29fa5989cb9c0722cd802f0b4ac48c81b3c48d88 -->
<body bgcolor="#ffffff" text="#000000" link="#0000ff" vlink="#000080" alink="#ff0000">
<p>Author: Dean Gaudet
<p>Apache is a general webserver, which is designed to be correct first, and
fast second. Even so, its performance is quite satisfactory. Most
sites have less than 10Mbits of outgoing bandwidth, which Apache can
fill using only a low-end Pentium-based webserver. In practice, sites
with more bandwidth require more than one machine to fill the bandwidth
due to other constraints (such as CGI or database transaction overhead).
For these reasons the development focus has been mostly on correctness
and configurability.
<p>Unfortunately many folks overlook these facts and cite raw performance
numbers as if they were some indication of the quality of a web server
product. There is a bare minimum performance that is acceptable; beyond
that, extra speed only caters to a much smaller segment of the market.
But in order to avoid this hurdle to the acceptance of Apache in some
markets, effort was put into Apache 1.3 to bring performance up to a
point where the difference with other high-end webservers is minimal.
<p>Finally there are the folks who just plain want to see how fast something
can go. The author falls into this category. The rest of this document
is dedicated to these folks who want to squeeze every last bit of
performance out of Apache's current model, and want to understand why
it does some things which slow it down.
<p>Note that this is tailored towards Apache 1.3 on Unix. Some of it applies
to Apache on NT. Apache on NT has not been tuned for performance yet;
in fact it probably performs very poorly because NT performance requires
a different programming model.
<p>The single biggest hardware issue affecting webserver performance
is RAM. A webserver should never ever have to swap: swapping increases
the latency of each request beyond a point that users consider "fast
enough". This causes users to hit stop and reload, further increasing
the load. You can, and should, control the <code>MaxClients</code>
setting so that your server does not spawn so many children it starts
swapping.
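<p>One rough way to pick a starting value is to divide the RAM you can
spare for Apache by the resident size of a single child. The sketch below
does only that arithmetic. The function name and the figures in the usage
note are illustrative assumptions, not measurements; measure your own
children (for example with <code>ps</code> or <code>top</code>) and verify
the result under load.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, per_child_mb):
    """Return how many children fit in RAM without swapping.

    total_ram_mb  -- physical memory in the machine
    reserved_mb   -- memory set aside for the kernel and other daemons
                     (an assumption you must supply yourself)
    per_child_mb  -- measured resident size of one Apache child
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Never return a negative count when the reservation exceeds the RAM.
    return max(0, int(available_mb // per_child_mb))
```

<p>For a hypothetical 128 MB machine with 32 MB reserved and 2 MB
children, <code>estimate_max_clients(128, 32, 2)</code> suggests a
<code>MaxClients</code> of at most 48.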
<p>Beyond that the rest is mundane: get a fast enough CPU, a fast enough
network card, and fast enough disks, where "fast enough" is something
that needs to be determined by experimentation.
<p>Operating system choice is largely a matter of local concerns. But
a general guideline is to always apply the latest vendor TCP/IP patches.
HTTP serving completely breaks many of the assumptions built into Unix
kernels up through 1994 and even 1995. Good choices include
recent FreeBSD, and Linux.
<p>Prior to Apache 1.3, <code>HostnameLookups</code> defaulted to On.
This adds latency to every request because it requires a DNS lookup to
complete before the request is finished. In Apache 1.3 this setting
defaults to Off. However, in 1.3 or later, if you use any
<code>allow from domain</code> or <code>deny from domain</code>
directives then you will pay for a double reverse DNS lookup (a reverse,
followed by a forward to make sure that the reverse is not being
spoofed). So for the highest performance avoid using these directives
(it's fine to use IP addresses rather than domain names).
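<p>To make the cost concrete, here is a minimal sketch of a double reverse
lookup. It is not Apache's code: the function names are hypothetical, and
the resolvers are passed in as parameters so the two separate lookups are
easy to see (and to replace with stubs).

```python
import socket


def system_reverse(ip):
    """Lookup 1: address -> name, or None if there is no reverse entry."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def system_forward(name):
    """Lookup 2: name -> list of addresses, empty if resolution fails."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def double_reverse_lookup(ip, reverse=system_reverse, forward=system_forward):
    """Return the verified hostname for ip, or None if it cannot be trusted."""
    name = reverse(ip)
    if not name:
        return None
    # The reverse answer is only trusted if the name maps back to the
    # same address; otherwise whoever controls the reverse zone could
    # claim to be any host they like.
    return name if ip in forward(name) else None
```

<p>Both lookups can block on the network, which is why an access rule
written with an IP address is so much cheaper than one written with a
domain name.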
<p>Note that it's possible to scope the directives, such as within
a <code>&lt;Location /server-status&gt;</code> section. In this
case the DNS lookups are only performed on requests matching the
criteria. Here's an example which disables
lookups except for .html and .cgi files:
<blockquote><pre>
HostnameLookups off
&lt;Files ~ "\.(html|cgi)$"&gt;
    HostnameLookups on
&lt;/Files&gt;
</pre></blockquote>
But even still, if you just need DNS names
in some CGIs you could consider doing the
<code>gethostbyname</code> call in the specific CGIs that need it.
<p>Wherever in your URL-space you do not have an
<code>Options FollowSymLinks</code>, or you do have an
<code>Options SymLinksIfOwnerMatch</code>, Apache will have to
issue extra system calls to check up on symlinks: one extra call per
filename component. For example, if you had:
<blockquote><pre>
DocumentRoot /www/htdocs
&lt;Directory /&gt;
    Options SymLinksIfOwnerMatch
&lt;/Directory&gt;
</pre></blockquote>
and a request is made for the URI <code>/index.html</code>,
then Apache will perform <code>lstat(2)</code> on <code>/www</code>,
<code>/www/htdocs</code>, and <code>/www/htdocs/index.html</code>. The
results of these <code>lstat(2)</code> calls are never cached,
so they will occur on every single request. If you really desire the
symlinks security checking you can do something like this:
<blockquote><pre>
&lt;Directory /&gt;
    Options FollowSymLinks
&lt;/Directory&gt;
&lt;Directory /www/htdocs&gt;
    Options -FollowSymLinks +SymLinksIfOwnerMatch
&lt;/Directory&gt;
</pre></blockquote>
This at least avoids the extra checks for the <code>DocumentRoot</code>
path. Note that you'll need to add similar sections if you have any
<code>Alias</code> or <code>RewriteRule</code> paths outside of your
document root. For highest performance, and no symlink protection,
set <code>FollowSymLinks</code> everywhere, and never set
<code>SymLinksIfOwnerMatch</code>.
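<p>The per-component cost is easy to underestimate, so here is a small
sketch that lists the paths that would be checked for a request when
symlink checking applies from the root down. The function name is
hypothetical and the code only models the path arithmetic; it does not
call <code>lstat(2)</code> or reproduce Apache's directory walk.

```python
import posixpath


def lstat_targets(document_root, uri):
    """List every path prefix of the file that the URI maps to.

    Each returned path stands for one lstat(2) call per request when
    symlink checking is in force for the whole filesystem.
    """
    full_path = posixpath.normpath(
        posixpath.join(document_root, uri.lstrip("/")))
    parts = [p for p in full_path.split("/") if p]
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]
```

<p>With a document root of <code>/www/htdocs</code>, a request for
<code>/index.html</code> yields three paths, and every extra directory
level in the URI adds one more.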
<p>Wherever in your URL-space you allow overrides (typically
<code>.htaccess</code> files) Apache will attempt to open
<code>.htaccess</code> for each filename component. For example,
if you had:
<blockquote><pre>
DocumentRoot /www/htdocs
&lt;Directory /&gt;
    AllowOverride all
&lt;/Directory&gt;
</pre></blockquote>
and a request is made for the URI <code>/index.html</code>, then
Apache will attempt to open <code>/.htaccess</code>,
<code>/www/.htaccess</code>, and <code>/www/htdocs/.htaccess</code>.
The solutions are similar to the previous case of <code>Options
FollowSymLinks</code>. For highest performance use
<code>AllowOverride None</code> everywhere in your filesystem.
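<p>As an illustration, the sketch below lists the override files that
would be probed for a request when overrides are allowed from the root
down. The function name is hypothetical, and the code only computes the
candidate paths; it does not open any files.

```python
import posixpath


def htaccess_candidates(document_root, uri, filename=".htaccess"):
    """List the override files probed for the file that the URI maps to."""
    full_path = posixpath.normpath(
        posixpath.join(document_root, uri.lstrip("/")))
    # Overrides live in directories, so walk the directory part only.
    directory = posixpath.dirname(full_path)
    parts = [p for p in directory.split("/") if p]
    dirs = ["/"] + ["/" + "/".join(parts[:i])
                    for i in range(1, len(parts) + 1)]
    return [posixpath.join(d, filename) for d in dirs]
```

<p>Most of these files usually do not exist, but each failed
<code>open(2)</code> is still a system call paid on every request.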
<p>If at all possible, avoid content-negotiation if you're really
interested in every last ounce of performance. In practice the
benefits of negotiation outweigh the performance penalties. There's
one case where you can speed up the server. Instead of using
a wildcard such as:
<blockquote><pre>
DirectoryIndex index
</pre></blockquote>
use a complete list of options:
<blockquote><pre>
DirectoryIndex index.cgi index.pl index.shtml index.html
</pre></blockquote>
where you list the most common choice first.
<p>Prior to Apache 1.3 the <code>MinSpareServers</code>,
<code>MaxSpareServers</code>, and <code>StartServers</code> settings
all had drastic effects on benchmark results. In particular, Apache
required a "ramp-up" period in order to reach a number of children
sufficient to serve the load being applied. After the initial
spawning of <code>StartServers</code> children, only one child per
second would be created to satisfy the <code>MinSpareServers</code>
setting. So a server being accessed by 100 simultaneous clients,
using the default <code>StartServers</code> of 5, would take on
the order of 95 seconds to spawn enough children to handle the load. This
works fine in practice on real-life servers, because they aren't restarted
frequently. But it does really poorly on benchmarks which might only run
for ten minutes.
<p>The one-per-second rule was implemented in an effort to avoid
swamping the machine with the startup of new children. If the machine
is busy spawning children it can't service requests. But it has such
a drastic effect on the perceived performance of Apache that it had
to be replaced. As of Apache 1.3,
the code will relax the one-per-second rule. It
will spawn one, wait a second, then spawn two, wait a second, then spawn
four, and it will continue exponentially until it is spawning 32 children
per second. It will stop whenever it satisfies the
<code>MinSpareServers</code> setting.
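<p>The difference between the two rules is easiest to see with numbers.
The following is a simplified simulation, not Apache's scheduling code:
the function name is hypothetical, and it assumes one spawn round per
second and a fixed target number of extra children instead of tracking
idle servers.

```python
def seconds_to_spawn(needed, max_per_second=32, exponential=True):
    """Count one-second spawn rounds needed to create `needed` children.

    exponential=False models the old rule (one child per second).
    exponential=True models the relaxed rule: 1, 2, 4, ... children per
    second, capped at max_per_second.
    """
    spawned = 0
    seconds = 0
    rate = 1
    while spawned < needed:
        # Never spawn more than are still required.
        spawned += min(rate, needed - spawned)
        seconds += 1
        if exponential:
            rate = min(rate * 2, max_per_second)
    return seconds
```

<p>For the 95 extra children in the earlier example,
<code>seconds_to_spawn(95, exponential=False)</code> gives 95 rounds,
while <code>seconds_to_spawn(95)</code> gives 7 (1 + 2 + 4 + 8 + 16 + 32
+ 32 = 95).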
<p>This appears to be responsive enough that it's
almost unnecessary to twiddle the <code>MinSpareServers</code>,
<code>MaxSpareServers</code> and <code>StartServers</code> knobs. When
more than 4 children are spawned per second, a message will be emitted
to the <code>ErrorLog</code>. If you see a lot of these errors then
consider tuning these settings. Use the <code>mod_status</code> output
as a guide.
<p>Related to process creation is process death induced by the
<code>MaxRequestsPerChild</code> setting. By default this is 30, which
is probably far too low unless your server is using a module such as
<code>mod_perl</code> which causes children to have bloated memory
images. If your server is serving mostly static pages then consider
raising this value to something like 10000. The code is robust enough
that this shouldn't be a problem.
<p>When keep-alives are in use, children will be kept busy
doing nothing waiting for more requests on the already open
connection. The default <code>KeepAliveTimeout</code> of
15 seconds attempts to minimize this effect. The tradeoff
here is between network bandwidth and server resources.
In no event should you raise this above about 60 seconds, as
<a href="http://www.research.digital.com/wrl/techreports/abstracts/95.4.html">
most of the benefits are lost</a>.
<p>If you include <code>mod_status</code>
and you also set <code>Rule STATUS=yes</code> when building
Apache, then on every request Apache will perform two calls to
<code>gettimeofday(2)</code> (or <code>times(2)</code> depending
on your operating system), and (pre-1.3) several extra calls to
<code>time(2)</code>. This is all done so that the status report
contains timing indications. For highest performance, set <code>Rule
STATUS=no</code>.
<p>This discusses a shortcoming in the Unix socket API.
Suppose your web server uses multiple <code>Listen</code> statements to
listen on either multiple ports or multiple addresses. In order to test each
socket to see if a connection is ready Apache uses <code>select(2)</code>.
<code>select(2)</code> indicates that a socket has <i>none</i> or
<i>at least one</i> connection waiting on it. Apache's model includes
multiple children, and all the idle ones test for new connections at the
same time. A naive implementation looks something like this
(these examples do not match the code, they're contrived for