<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Connections in FIN_WAIT_2 and Apache</title>
<link rev="made" href="mailto:marc@apache.org" />
</head>
<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
vlink="#000080" alink="#FF0000">
<!--#include virtual="header.html" -->
<h1 align="CENTER">Connections in the FIN_WAIT_2 state and
Apache</h1>
<ol>
<li>
<h2>What is the FIN_WAIT_2 state?</h2>
Starting with the Apache 1.2 betas, people are reporting
many more connections in the FIN_WAIT_2 state (as reported
by <code>netstat</code>) than they saw using older
versions. When the server closes a TCP connection, it sends
a packet with the FIN bit set to the client, which then
responds with a packet with the ACK bit set. The client
then sends a packet with the FIN bit set to the server,
which responds with an ACK and the connection is closed.
The state that the connection is in during the period
between when the server gets the ACK from the client and
the server gets the FIN from the client is known as
FIN_WAIT_2. See the <a
href="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</a> for
the technical details of the state transitions.
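<p>The close sequence described above can be sketched as a small
transition table. This is an illustrative simplification, not code
from Apache or from any TCP stack: the event names and the
<code>walk</code> helper are hypothetical, simultaneous close is
omitted, and the final TIME_WAIT state is taken from the TCP RFC
rather than from the summary above.</p>

```python
# Simplified transitions for the side that closes first (the server here).
ACTIVE_CLOSE = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",    # server sends its FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",  # client ACKs that FIN
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",   # client finally sends its own FIN
}


def walk(events, state="ESTABLISHED"):
    """Apply events in order; stop at the first one with no transition."""
    for event in events:
        next_state = ACTIVE_CLOSE.get((state, event))
        if next_state is None:
            break
        state = next_state
    return state


# A client that ACKs the server's FIN but never sends its own FIN
# leaves the server side parked in FIN_WAIT_2.
print(walk(["close", "recv_ack"]))              # FIN_WAIT_2
print(walk(["close", "recv_ack", "recv_fin"]))  # TIME_WAIT
```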
<p>The FIN_WAIT_2 state is somewhat unusual in that there
is no timeout defined in the standard for it. This means
that on many operating systems, a connection in the
FIN_WAIT_2 state will stay around until the system is
rebooted. If the system does not have a timeout and too
many FIN_WAIT_2 connections build up, it can fill up the
space allocated for storing information about the
connections and crash the kernel. The connections in
FIN_WAIT_2 do not tie up an httpd process.</p>
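<p>To watch for such a build-up, one option is to count connection
states in <code>netstat</code> output. The sketch below is only an
illustration: the sample lines are made up, the column layout is
assumed to put the state in the last field, and the function name is
hypothetical. Some systems print <code>FIN_WAIT_2</code> and others
<code>FIN_WAIT2</code>, so underscores are stripped before
counting.</p>

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connection states, assuming the state is the last column."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            # Normalize FIN_WAIT_2 and FIN_WAIT2 to the same key.
            counts[fields[-1].replace("_", "")] += 1
    return counts


# Hypothetical sample output; real output varies between systems.
SAMPLE = """\
tcp 0 0 192.0.2.1.80 198.51.100.7.1042 FIN_WAIT_2
tcp 0 0 192.0.2.1.80 198.51.100.8.1311 ESTABLISHED
tcp 0 0 192.0.2.1.80 198.51.100.9.2200 FIN_WAIT_2
"""

print(count_tcp_states(SAMPLE)["FINWAIT2"])  # 2
```

<p>In practice the text passed in would be captured from a command
such as <code>netstat -an</code>, and the count compared against
whatever limit is known to be safe for the system in question.</p>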
</li>
<li>
<h2>But why does it happen?</h2>
There are numerous reasons for it happening, some of which
may not yet be fully clear. What is known follows.
<h3>Buggy clients and persistent connections</h3>
Several clients have a bug which pops up when dealing with
persistent connections (aka keepalives). When the connection
is idle and the server closes the connection (based on the
<code>KeepAliveTimeout</code>), the client is programmed so
that it does not send back a FIN and ACK to the server. This
means that the
connection stays in the FIN_WAIT_2 state until one of the
following happens:
<ul>
<li>The client opens a new connection to the same or a
different site, which causes it to fully close the older
connection on that socket.</li>
<li>The user exits the client, which on some (most?)
clients causes the OS to fully shut down the
connection.</li>
<li>The FIN_WAIT_2 times out, on servers that have a
timeout for this state.</li>
</ul>
<p>If you are lucky, this means that the buggy client will
fully close the connection and release the resources on
your server. However, there are some cases where the socket
is never fully closed, such as a dialup client
disconnecting from their provider before closing the
client. In addition, a client might sit idle for days
without making another connection, and thus may hold its
end of the socket open for days even though it has no
further use for it. <strong>This is a bug in the browser or
in its operating system's TCP implementation.</strong></p>
<p>The clients on which this problem has been verified to
exist:</p>
<ul>
<li>... i386)</li>
<li>... i386)</li>
<li>MSIE 3.01 on the Macintosh</li>
<li>MSIE 3.01 on Windows 95</li>
</ul>
<p>This does not appear to be a problem on:</p>
<ul>
</ul>
<p>It is expected that many other clients have the same
problem. What a client <strong>should do</strong> is
periodically check its open socket(s) to see if they have
been closed by the server, and close their side of the
connection if the server has closed. This check need only
occur once every few seconds, and on some systems the closure
may even be detected by the operating system (some clients
have this capability, but they seem to be ignoring it).</p>
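<p>The periodic check described above might look like the following
on the client side. This is a minimal sketch, not taken from any
browser: the function name is hypothetical, and it assumes a
connected stream socket on a platform where <code>select</code> and
<code>MSG_PEEK</code> are available.</p>

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if the other side has closed an idle connection.

    A closed peer makes the socket readable with zero bytes available.
    MSG_PEEK looks at the input without consuming any pending data.
    """
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False          # nothing pending, connection still idle
    try:
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionResetError:
        return True           # a reset also means the connection is gone
```

<p>A client could call this every few seconds for each idle
persistent connection and close its own side as soon as it returns
true, which lets the server's connection leave FIN_WAIT_2.</p>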
<p>Apache <strong>cannot</strong> avoid these FIN_WAIT_2
states unless it disables persistent connections for the
buggy clients, just like we recommend doing for Navigator
2.x clients due to other bugs. However, non-persistent
connections increase the total number of connections needed
per client and slow retrieval of an image-laden web page.
Since non-persistent connections have their own resource
consumptions and a short waiting period after each closure,
a busy server may need persistence in order to best serve
its clients.</p>
<p>As far as we know, the client-caused FIN_WAIT_2 problem
is present for all servers that support persistent
connections, including Apache 1.1.x and 1.2.</p>
<h3>A necessary bit of code introduced in 1.2</h3>
While the above bug is a problem, it is not the whole
problem. Some users have observed no FIN_WAIT_2 problems
with Apache 1.1.x, but with 1.2b enough connections build
up in the FIN_WAIT_2 state to crash their server. The most
likely source for additional FIN_WAIT_2 states is a
function called <code>lingering_close()</code> which was
added between 1.1 and 1.2. This function is necessary for
the proper handling of persistent connections and any
request which includes content in the message body (such as
PUTs and POSTs). It reads any data sent by the client for a
certain time after the server
closes the connection. The exact reasons for doing this are
somewhat complicated, but involve what happens if the
client is making a request at the same time the server
sends a response and closes the connection. Without
lingering, the client might be forced to reset its TCP
input buffer before it has a chance to read the server's
response, and thus understand why the connection has
closed. See the <a href="#appendix">appendix</a> for more
details.
<p>The code in <code>lingering_close()</code> appears to
cause problems for a number of reasons, including the
change in traffic patterns that it causes. The code has
been thoroughly reviewed and we are not aware of any bugs
in it. It is possible that there is some problem in the BSD
TCP stack, aside from the lack of a timeout for the
FIN_WAIT_2 state, exposed by the
<code>lingering_close</code> code that causes the observed
problems.</p>
</li>
<li>
<h2>What can I do about it?</h2>
There are several possible
workarounds to the problem, some of which work better than
others.
<h3>Add a timeout for FIN_WAIT_2</h3>
The obvious workaround is to simply have a timeout for the
FIN_WAIT_2 state. This is not specified by the RFC, and
could be claimed to be a violation of the RFC, but it is
widely recognized as being necessary. The following systems
are known to have a timeout:
<ul>
<li>... versions starting at 2.0 or possibly earlier.</li>
<li>... 1.2(?)</li>
<li>... versions(?)</li>
<li>... the K210-027 patch installed.</li>
<li>Solaris, as of
around version 2.2. The timeout can be tuned by using
<code>ndd</code> to modify
<code>tcp_fin_wait_2_flush_interval</code>, but the
default should be appropriate for most servers and
improper tuning can have negative impacts.</li>
<li>... earlier(?)</li>
<li>HP-UX defaults
to terminating connections in the FIN_WAIT_2 state after
the normal keepalive timeouts. This does not refer to the
persistent connection or HTTP keepalive timeouts, but the
<code>SO_LINGER</code> socket option which is enabled by
Apache. This parameter can be adjusted by using
<code>nettune</code> to modify parameters such as
<code>tcp_keepstart</code> and <code>tcp_keepstop</code>.
In later revisions, there is an explicit timer for
connections in FIN_WAIT_2 that can be modified; contact
HP support for details.</li>
<li>IRIX can be
patched to support a timeout. For IRIX 5.3, 6.2, and 6.3,
use patches 1654, 1703 and 1778 respectively. If you have
trouble locating these patches, please contact your SGI
support channel for help.</li>
<li>... is non-tunable at 600 seconds, while in 3.xx it defaults
to 600 seconds and is calculated based on the tunable
"max keep alive probes" (default of 8) multiplied by the
"keep alive interval" (default 75 seconds).</li>
<li>... around release 4.1 in mid-1994.</li>
</ul>
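<p>The 600-second default quoted in the 3.xx entry above is simply
the product of the two tunables it names. As a worked example (the
function and parameter names are hypothetical, chosen only to mirror
the wording of that entry):</p>

```python
def fin_wait_2_timeout(max_keepalive_probes=8, keepalive_interval_secs=75):
    """Timeout in seconds: number of probes times the interval between them."""
    return max_keepalive_probes * keepalive_interval_secs


print(fin_wait_2_timeout())  # 600
```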
<p>The following systems are known to not have a
timeout:</p>
<ul>
<li>... and almost certainly never will have one because it is at
the very end of its development cycle for Sun. If you
have kernel source, it should be easy to patch.</li>
</ul>
<p>There is a patch available for adding a timeout to the
FIN_WAIT_2 state; it should be
adaptable to most systems using BSD networking code. You
need kernel source code to be able to use it. If you do
adapt it to work for any other systems, please drop me a
note at <a
href="mailto:marc@apache.org">marc@apache.org</a>.</p>
<h3>Compile without using
<code>lingering_close()</code></h3>
It is possible to compile Apache 1.2 without using the
<code>lingering_close()</code> function. This will result
in that section of code being similar to that which was in
1.1. If you do this, be aware that it can cause problems
with PUTs, POSTs and persistent connections, especially if
the client uses pipelining. That said, it is no worse than
on 1.1, and we understand that keeping your server running
is quite important.
<p>To compile without the <code>lingering_close()</code>
function, add <code>-DNO_LINGCLOSE</code> to the end of the
<code>EXTRA_CFLAGS</code> line in your
<code>Configuration</code> file, rerun
<code>Configure</code> and rebuild the server.</p>
<h3>Use <code>SO_LINGER</code> as an alternative to
<code>lingering_close()</code></h3>
On most systems, there is an option called
<code>SO_LINGER</code> that can be set with
<code>setsockopt(2)</code>. It does something very similar
to <code>lingering_close()</code>, except that it is broken
on many systems so that it causes far more problems than
<code>lingering_close</code>. On some systems, it could
possibly work better so it may be worth a try if you have
no other alternatives.
<p>To try it, add <code>-DUSE_SO_LINGER
-DNO_LINGCLOSE</code> to the end of the
<code>EXTRA_CFLAGS</code> line in your
<code>Configuration</code> file, rerun
<code>Configure</code> and rebuild the server.</p>
<p><strong>NOTE:</strong> Attempting to use
<code>SO_LINGER</code> and <code>lingering_close()</code>
at the same time is very likely to do very bad things, so
don't.</p>
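<p>For readers unfamiliar with the option, this is roughly what
setting <code>SO_LINGER</code> looks like at the socket API level.
It is a sketch only and is not Apache's code: the helper name is
hypothetical, and the packing format assumes a Unix-like system
where <code>struct linger</code> is two C ints.</p>

```python
import socket
import struct


def enable_so_linger(sock, seconds):
    """Turn on SO_LINGER and return the (l_onoff, l_linger) pair read back."""
    # struct linger on common Unix systems: int l_onoff, int l_linger.
    value = struct.pack("ii", 1, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize("ii"))
    return struct.unpack("ii", raw)
```

<p>With the option enabled, the kernel rather than the application
decides how long a closing socket lingers, which is why the result
depends so heavily on the operating system.</p>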
<h3>Increase the amount of memory used for storing
connection state</h3>
<dl>
<dt>BSD based networking code:</dt>
<dd>
BSD stores network data, such as connection states, in
something called an mbuf. When you get so many
connections that the kernel does not have enough mbufs
to put them all in, your kernel will likely crash. You
can reduce the effects of the problem by increasing the
number of mbufs that are available; this will not
prevent the problem, it will just make the server go
longer before crashing.
<p>The exact way to increase them may depend on your
OS; look for some reference to the number of "mbufs" or
"mbuf clusters". On many systems, this can be done by
adding the line <code>NMBCLUSTERS="n"</code>, where
<code>n</code> is the number of mbuf clusters you want,
to your kernel config file and rebuilding your
kernel.</p>
</dd>
</dl>
<h3>Disable KeepAlive</h3>
<p>If you are unable to do any of the above then you
should, as a last resort, disable KeepAlive. Edit your
httpd.conf and change "KeepAlive On" to "KeepAlive
Off".</p>
</li>
<li>
<h2>Feedback</h2>
If you have any information to add to this page,
please contact me at <a
href="mailto:marc@apache.org">marc@apache.org</a>.
<h2><a id="appendix" name="appendix"></a></h2>
</li>
<li>
<h2>Appendix</h2>
<p>Below is a message from Roy Fielding, one of the authors
of HTTP/1.1.</p>
<h3>Why the lingering close functionality is necessary with
HTTP</h3>
The need for a server to linger on a socket after a close
is noted a couple times in the HTTP specs, but not
explained. This explanation is based on discussions between
myself, Henrik Frystyk, Robert S. Thau, Dave Raggett, and
John C. Mallery in the hallways of MIT while I was at W3C.
<p>If a server closes the input side of the connection
while the client is sending data (or is planning to send
data), then the server's TCP stack will signal an RST
(reset) back to the client. Upon receipt of the RST, the
client will flush its own incoming TCP buffer back to the
un-ACKed packet indicated by the RST packet argument. If
the server has sent a message, usually an error response,
to the client just before the close, and the client
receives the RST packet before its application code has
read the error message from its incoming TCP buffer and
before the server has received the ACK sent by the client
upon receipt of that buffer, then the RST will flush the
error message before the client application has a chance to
see it. The result is that the client is left thinking that
the connection failed for no apparent reason.</p>
<p>There are two conditions under which this is likely to
occur:</p>
<ol>
<li>sending POST or PUT data without proper
authorization</li>
<li>sending multiple requests before each response
(pipelining) and one of the middle requests resulting in
an error or other break-the-connection result.</li>
</ol>
<p>The solution in all cases is to send the response, close
only the write half of the connection (what shutdown is
supposed to do), and continue reading on the socket until
it is either closed by the client (signifying it has
finally read the response) or a timeout occurs. That is
what the kernel is supposed to do if SO_LINGER is set.
Unfortunately, SO_LINGER has no effect on some systems; on
some other systems, it does not have its own timeout and
thus the TCP memory segments just pile-up until the next
reboot (planned or not).</p>
<p>Please note that simply removing the linger code will
not solve the problem -- it only moves it to a different
and much harder one to detect.</p>
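<p>The approach described in the message (send the response, close
only the write half, then keep reading until the client closes or a
timeout occurs) can be sketched as follows. This is an illustration
of the technique and not Apache's <code>lingering_close()</code>
itself: the function name is reused only for familiarity, and the
two-second limit and 4096-byte read size are arbitrary
assumptions.</p>

```python
import socket
import time


def lingering_close(sock, total_timeout=2.0):
    """Half-close, then drain input until the peer closes or time runs out.

    Returns the number of bytes read and discarded.
    """
    deadline = time.monotonic() + total_timeout
    drained = 0
    try:
        sock.shutdown(socket.SHUT_WR)   # send our FIN, keep the read side open
        while True:
            remaining = deadline - time.monotonic()
            if not remaining > 0:
                break                   # overall time limit reached
            sock.settimeout(remaining)
            chunk = sock.recv(4096)
            if not chunk:
                break                   # peer closed: it has seen our response
            drained += len(chunk)
    except OSError:
        pass                            # timeout or reset: give up lingering
    finally:
        sock.close()
    return drained
```

<p>Because the read side stays open while late request data arrives,
the server's TCP stack has no reason to send an RST, so the response
already sent is not flushed from the client's buffer.</p>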
</li>
</ol>
<!--#include virtual="footer.html" -->
</body>
</html>