event.c revision 95817edd05387a5276f51fcd5db79fc21b89b55b
/* Copyright 2001-2005 The Apache Software Foundation or its licensors, as
 * applicable.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * This MPM tries to fix the 'keep alive problem' in HTTP.
 *
 * After a client completes the first request, the client can keep the
 * connection open to send more requests with the same socket. This can save
 * significant overhead in creating TCP connections. However, the major
 * disadvantage is that Apache traditionally keeps an entire child
 * process/thread waiting for data from the client. To solve this problem,
 * this MPM has a dedicated thread for handling both the Listening sockets,
 * and all sockets that are in a Keep Alive status.
 *
 * The MPM assumes the underlying apr_pollset implementation is somewhat
 * threadsafe. This currently is only compatible with KQueue and EPoll. This
 * enables the MPM to avoid extra high level locking or having to wake up the
 * listener thread when a keep-alive socket needs to be sent to it.
 *
 * This MPM does not perform well on older platforms that do not have very good
 * threading, like Linux with a 2.4 kernel, but this does not matter, since we
 * require EPoll or KQueue.
 *
 * For FreeBSD, use 5.3. It is possible to run this MPM on FreeBSD 5.2.1, if
 *
 * For NetBSD, use at least 2.0.
 *
 * For Linux, you should use a 2.6 kernel, and make sure your glibc has epoll
 */

/* Limit on the total --- clients will be locked out if more servers than
 * this are needed. It is intended solely to keep the server from crashing
 * when things get out of hand.
 *
 * We keep a hard maximum number of servers, for two reasons --- first off,
 * in case something goes seriously wrong, we want to stop the fork bomb
 * short of actually crashing the machine we're running on by filling some
 * kernel table. Secondly, it keeps the size of the scoreboard file small
 * enough that we can read the whole thing without worrying too much about
 */

/* Admin can't tune ServerLimit beyond MAX_SERVER_LIMIT. We want
 * some sort of compile-time limit to help catch typos.
 */

/* Limit on the threads per process. Clients will be locked out if more than
 *
 * We keep this for one reason: it keeps the size of the scoreboard file small
 * enough that we can read the whole thing without worrying too much about
 */

/* Admin can't tune ThreadLimit beyond MAX_THREAD_LIMIT. We want
 * some sort of compile-time limit to help catch typos.
 */

/*
 * Actual definitions of config globals
 */

/* The structure used to pass unique initialization info to each thread */

/* Structure used to pass information to the thread responsible for
 * creating the rest of the threads.
 */

int status; /*XXX what is this for? 0 and 1 don't make it clear */

/*
 * The max child slot ever assigned, preserved across restarts. Necessary
 * to deal with MaxClients changes across AP_SIG_GRACEFUL restarts. We
 * use this value to optimize routines that have to scan the entire
 */

/* *Non*-shared http_main globals... */

/* The worker MPM respects a couple of runtime flags that can aid
 * in debugging. Setting the -DNO_DETACH flag will prevent the root process
 * from detaching from its controlling terminal. Additionally, setting
 * the -DONE_PROCESS flag (which implies -DNO_DETACH) will get you the
 * child_main loop running in the process which originally started up.
 * This gives you a pretty nice debugging environment. (You'll get a SIGHUP
 * early in standalone_main; just continue through. This is the server
 * trying to kill off any child processes which it might have lying
 * around --- Apache doesn't keep track of their pids, it just sends
 * SIGHUP to the process group, ignoring it in the root process.
 * Continue through and you'll be fine.).
 */

thread. Use this instead */

/* The LISTENER_SIGNAL signal will be sent from the main thread to the
 * listener thread to wake it up for graceful termination (what a child
 * process from an old generation does when the admin does "apachectl
 * graceful"). This signal will be blocked in all threads of a child
 * process except for the listener thread.
 */

/* An array of socket descriptors in use by each thread used to
 * perform a non-graceful (forced) shutdown of the server.
 */
/* XXX there is an obscure path that this doesn't handle perfectly:
 *     right after listener thread is created but before
 *     listener_os_thread is set, the first worker thread hits an
 *     error and starts graceful termination
 */

/*
 * we should just be able to "kill(ap_my_pid, LISTENER_SIGNAL)" on all
 * platforms and wake up the listener thread since it is the only thread
 * with SIGHUP unblocked, but that doesn't work on Linux
 */

/* in case we weren't called from the listener thread, wake up the
 */

/* for ungraceful termination, let the workers exit now;
 * for graceful termination, the listener thread will notify the
 * workers to exit once it has stopped accepting new connections
 */

/* a clean exit from a child with proper cleanup */

/*****************************************************************
 * Connection structures and accounting...
 */

/* volatile just in case */

/*
 * ap_start_shutdown() and ap_start_restart(), below, are a first stab at
 * functions to initiate shutdown or restart without relying on signals.
 * Previously this was initiated in sig_term() and restart() signal handlers,
 * e.g. on Win32, from the service manager. Now the service manager can
 * call ap_start_shutdown() or ap_start_restart() as appropriate. Note that
 * these functions can also be called by the child processes, since global
 * variables are no longer used to pass on the required action to the parent.
 *
 * These should only be called from the parent process itself, since the
 * parent process will use the shutdown_pending and restart_pending variables
 * to determine whether to shutdown or restart. The child process should
 * call signal_parent() directly to tell the parent to die -- this will
 * cause neither of those variables to be set, which the parent will
 * assume means something serious is wrong (which it will be, for the
 * child to force an exit) and so do an exit anyway.
 */

/* Um, is this _probably_ not an error, if the user has
 * tried to do a shutdown twice quickly, so we won't
 * worry about reporting it.
 */
/* do a graceful restart if graceful == 1 */

/* Probably not an error - don't bother reporting it */

/* we want to ignore HUPs and AP_SIG_GRACEFUL while we're busy
 */
#endif /* AP_SIG_GRACEFUL */
#endif /* AP_SIG_GRACEFUL_STOP */

/*****************************************************************
 * Here follows a long bunch of generic server bookkeeping stuff...
 */

/* XXX this is really a bad confusing obsolete name
 * maybe it should be ap_mpm_process_exiting?
 */

/* note: for a graceful termination, listener_may_exit will be set before
 *       workers_may_exit, so check listener_may_exit
 */

/*****************************************************************
 * Child process main loop.
 */

if (cs == NULL) { /* This is a new connection */

"process_socket: connection aborted");
/*
 * XXX If the platform does not have a usable way of bundling
 * accept() with a socket readability check, like Win32,
 * and there are measurable delays before the
 * socket is readable due to the first data packet arriving,
 * it might be better to create the cs on the listener thread
 * with the state set to CONN_STATE_CHECK_REQUEST_LINE_READABLE
 *
 * FreeBSD users will want to enable the HTTP accept filter
 * module in their kernel for the highest performance
 * When the accept filter is active, sockets are kept in the
 * kernel until a HTTP request is received.
 */

/* state will be updated upon return
 */

"network write failure in core output filter");

/* Still in WRITE_COMPLETION_STATE:
 * Set a write timeout for this connection, and let the
 * event thread poll for writeability.
 */

/* It greatly simplifies the logic to use a single timeout value here
 * because the new element can just be added to the end of the list and
 * it will stay sorted in expiration time sequence. If brand new
 * sockets are sent to the event thread for a readability check, this
 * will be a slight behavior change - they use the non-keepalive
 * timeout today. With a normal client, the socket will be readable in
 * a few milliseconds anyway.
 */

/* Add work to pollset. */

"process_socket: apr_pollset_add failure");
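The single-timeout comment above describes a list that stays sorted without any sorting step. A minimal sketch of that idea follows, assuming a monotonically non-decreasing clock. The names `timeout_entry`, `timeout_list`, `timeout_list_append` and `timeout_list_pop_expired` are hypothetical and are not the MPM's own structures or APR calls.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical connection record for illustration only. */
typedef struct timeout_entry {
    int fd;                      /* socket being watched */
    time_t expires;              /* absolute expiration time */
    struct timeout_entry *next;
} timeout_entry;

typedef struct {
    timeout_entry *head;         /* oldest entry, expires first */
    timeout_entry *tail;         /* newest entry, expires last */
} timeout_list;

/* Append with one shared timeout value. Because "now" never goes
 * backwards and every entry gets the same offset, appending at the
 * tail keeps the list ordered by expiration time. */
void timeout_list_append(timeout_list *l, timeout_entry *e,
                         time_t now, time_t timeout)
{
    e->expires = now + timeout;
    e->next = NULL;
    if (l->tail != NULL) {
        l->tail->next = e;
    }
    else {
        l->head = e;
    }
    l->tail = e;
}

/* Remove and return the head if it has expired, else NULL. The caller
 * can loop until NULL; it never needs to look past the first live entry. */
timeout_entry *timeout_list_pop_expired(timeout_list *l, time_t now)
{
    timeout_entry *e = l->head;
    if (e == NULL || e->expires > now) {
        return NULL;
    }
    l->head = e->next;
    if (l->head == NULL) {
        l->tail = NULL;
    }
    e->next = NULL;
    return e;
}
```

With two different timeout values in one list, a later append could expire earlier than the current tail, and the early exit in the pop routine would no longer be valid.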
/* requests_this_child has gone to zero or below. See if the admin coded
   "MaxRequestsPerChild 0", and keep going in that case. Doing it this way
   simplifies the hot path in worker_thread */

/* wow! if you're executing this code, you may have set a record.
 * either this child process has served over 2 billion requests, or
 * you're running a threaded 2.0 on a 16 bit machine.
 *
 * I'll buy pizza and beers at Apachecon for the first person to do
 * the former without cheating (dorking with INT_MAX, or running with
 * uncommitted performance patches, for example).
 *
 * for the latter case, you probably deserve a beer too. Greg Ames
 */

/* XXX If specifying SIG_IGN is guaranteed to unblock a syscall,
 *     then we don't need this goofy function.
 */

"creation of the timeout mutex failed.");
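The requests_this_child comment above implies a countdown that is only inspected once it reaches zero. The sketch below illustrates that pattern under the assumption that a configured limit of 0 means "no limit". The function name `check_request_budget` and its parameters are hypothetical, and the real code signals its threads rather than returning a flag.

```c
#include <limits.h>

/* Called only when the per-child request counter has dropped to zero or
 * below, so the common path pays for a single comparison.
 * Returns 1 if the child should start a graceful exit, 0 to keep going. */
int check_request_budget(int *requests_this_child, int max_requests_per_child)
{
    if (*requests_this_child > 0) {
        return 0;                       /* budget left, nothing to do */
    }
    if (max_requests_per_child == 0) {
        /* "MaxRequestsPerChild 0": unlimited, so refill the counter */
        *requests_this_child = INT_MAX;
        return 0;
    }
    return 1;                           /* a real limit was reached */
}
```

Refilling with `INT_MAX` is what makes the "over 2 billion requests" remark possible: the slow path runs again only after the refilled counter is exhausted.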
/* Create the main pollset */

"apr_pollset_create with Thread Safety failed.");

/*
 * Some of the pollset backends, like KQueue or Epoll
 * automagically remove the FD if the socket is closed,
 * therefore, we can accept _SUCCESS or _NOTFOUND,
 * and we still want to keep going
 */

/* trash the connection; we couldn't queue the connected
 */

/*
 * reserve a worker thread, block if all are currently busy.
 * this prevents the worker queue from overflowing and lets
 * other processes accept new connections in the mean time.
 */

"ap_queue_info_wait_for_idler failed. "
"Attempting to shutdown process gracefully");

/* already reserved a worker thread - must have hit a
 * transient error on a previous pass
 */

/* We set this to force apr_pollset to wakeup if there hasn't been any IO
 * on any of its sockets. This allows sockets to have been added
 * when no other keepalive operations were going on.
 *
 * current value is 1 second
 */

/* the following times out events that are really close in the future
 * to prevent extra poll calls
 *
 * current value is .1 second
 */

"failed to initialize pollset, "
"attempting to shutdown process gracefully");
/* Unblock the signal used to wake this thread up, and set a handler for
 */

"apr_pollset_poll failed. Attempting to "
"shutdown process gracefully");

/* one of the sockets is readable */

"event_loop: unexpected state %d",

/* A Listener Socket is ready for an accept() */

/* create a new transaction pool for each accepted socket */

"Failed to create transaction pool");

/* later we trash rv and rely on csd to indicate
 */

/* E[NM]FILE, ENOMEM, etc */

/* trash the connection; we couldn't queue the connected
 */

} /* if:else on pt->type */
} /* while for processing poll */

/* XXX possible optimization: stash the current time for use as
 * r->request_time for new requests
 */

/* handle timed out sockets */

/* Step 1: keepalive timeouts */

/* XXX return NULL looks wrong - not an init failure
 * that bypasses all the cleanup outside the main loop
 * break seems more like it
 * need to evaluate seriousness of push2worker failures
 */

/* Step 2: write completion timeouts */

} /* listener main loop */

/* wake up the main thread */

/*
 * wait for active connections to finish but we may want to wait
 * for idle workers to get out of the queue code and release mutexes,
 * since those mutexes are cleaned up pretty soon and some systems
 * may not react favorably (i.e., segfault) if operations are attempted
 */

"ap_queue_info_set_idle failed. Attempting to "
"shutdown process gracefully.");
/* We get APR_EOF during a graceful shutdown once all the
 * connections accepted by this server process have been handled.
 */

/* We get APR_EINTR whenever ap_queue_pop() has been interrupted
 * from an explicit call to ap_queue_interrupt_all(). This allows
 * us to unblock threads stuck in ap_queue_pop() when a shutdown
 *
 * If workers_may_exit is set and this is ungraceful termination/
 * restart, we are bound to get an error on some systems (e.g.,
 * AIX, which sanity-checks mutex operations) since the queue
 * may have already been cleaned up. Don't log the "error" if
 * workers_may_exit is set.
 */

/* We got some other error. */

my_info->tid = -1; /* listener thread doesn't have a thread slot */

"apr_thread_create: unable to create listener thread");
/* let the parent decide how bad this really is */

/* XXX under some circumstances not understood, children can get stuck
 *     in start_threads forever trying to take over slots which will
 *     never be cleaned up; for now there is an APLOG_DEBUG message issued
 *     every so often when this condition occurs
 */

/* We must create the fd queues before we start up the listener
 */

"ap_queue_init() failed");
"ap_queue_info_create() failed");

/* ap_threads_per_child does not include the listener thread */

"malloc: out of memory");

/* We are creating threads right now */

/* We let each thread update its own scoreboard entry. This is
 * done because it lets us deal with tid better.
 */

"apr_thread_create: unable to create worker thread");

/* let the parent decide how bad this really is */

/* Start the listener only when there are workers available */

/* wait for previous generation to clean up an entry */

if (loops % 120 == 0) { /* every couple of minutes */

"slots very quickly (%d of %d)",
/* What state should this child_main process be listed as in the
 *
 *  ap_update_child_status_from_indexes(my_child_num, i, SERVER_STARTING,
 *
 *  This state should be listed separately in the scoreboard, in some kind
 *  of process_status, not mixed in with the worker threads' status.
 *  "life_status" is almost right, but it's in the worker's structure, and
 *  the name could be clearer. gla
 */

/* deal with a rare timing window which affects waking up the
 * listener thread... if the signal sent to the listener thread
 * is delivered between the time it verifies that the
 * listener_may_exit flag is clear and the time it enters a
 * blocking syscall, the signal didn't do any good... work around
 * that by sleeping briefly and sending it again
 */

/* listener not dead yet */

"the listener thread didn't exit");
"apr_thread_join: unable to join listener thread");

if (threads[i]) { /* if we ever created this thread */

"apr_thread_join: unable to join worker "

/*
 * trying to take over slots from a
 */

"apr_thread_join: unable to join the start "
"thread");
/*stuff to do before we switch id's, so we have permissions. */

/* done with init critical section */

/* Just use the standard apr_setup_signal_thread to block all signals
 * from being received. The child processes no longer use signals for
 * any communication with the parent process.
 */

"Couldn't initialize signal thread");

/* coding a value of zero means infinity */

/* Setup worker threads */

/* clear the storage; we may not create all our threads immediately,
 * and we want a 0 entry to indicate a thread which was not created
 */

"malloc: out of memory");

/* 0 means PTHREAD_CREATE_JOINABLE */

"apr_thread_create: unable to create worker thread");

/* let the parent decide how bad this really is */

/* If we are only running in one_process mode, we will want to
 * still handle signals. */

/* Block until we get a terminating signal. */

/* make sure the start thread has finished; signal_threads()
 * and join_workers() depend on that
 */

/* XXX join_start_thread() won't be awakened if one of our
 *     threads encounters a critical error and attempts to
 */

/* helps us terminate a little more quickly than the dispatch of the
 * signal thread; beats the Pipe of Death and the browsers
 */

/* A terminating signal was received. Now join each of the
 * workers to clean them up.
 *   If the worker already exited, then the join frees
 *   their resources and returns.
 *   If the worker hasn't exited, then this blocks until
 *   they have (then cleans up).
 */

else { /* !one_process */

/* remove SIGTERM from the set of blocked signals... if one of
 * the other threads in the process needs to take us down
 * (e.g., for MaxRequestsPerChild) it will send us SIGTERM
 */

/* Watch for any messages from the parent over the POD */

/* see if termination was triggered while we slept */

/* make sure the start thread has finished;
 * signal_threads() and join_workers depend on that
 */

/* A terminating signal was received. Now join each of the
 * workers to clean them up.
 *   If the worker already exited, then the join frees
 *   their resources and returns.
 *   If the worker hasn't exited, then this blocks until
 *   they have (then cleans up).
 */

"fork: Unable to fork new process");
/* fork didn't succeed. Fix the scoreboard or else
 * it will say SERVER_STARTING forever and ever
 */

/* In case system resources are maxed out, we don't want
   Apache running away with the CPU trying to fork over and
 */

/* By default, AIX binds to a single processor. This bit unbinds
 * children which will then bind to another CPU.
 */

"processor unbind failed %d", status);
/* start up a bunch of children */

/*
 * idle_spawn_rate is the number of children that will be spawned on the
 * next maintenance cycle if there aren't enough idle servers. It is
 * doubled up to MAX_SPAWN_RATE, and reset only when a cycle goes by
 * without the need to spawn.
 */

/* initialize the free_list */

/* Initialization to satisfy the compiler. It doesn't know
 * that ap_threads_per_child is always > 0 */

/* XXX any_dying_threads is probably no longer needed GLA */

/* We consider a starting server as idle because we started it
 * at least a cycle ago, and if it still hasn't finished starting
 * then we're just going to swamp things worse by forking more.
 * So we hopefully won't need to fork more if we count it.
 * This depends on the ordering of SERVER_READY and SERVER_STARTING.
 */

if (ps->pid != 0) { /* XXX just set all_dead_threads in outer for loop if no pid? not much else matters */

&& (!ps->pid              /* no process in the slot */
    || ps->quiescing)) {  /* or at least one is going away */

/* great! we prefer these, because the new process can
 * start more threads sooner. So prioritize this slot
 * by putting it ahead of any slots with active threads.
 *
 * first, make room by moving a slot that's potentially still
 * in use to the end of the array
 */

/* slot is still in use - back of the bus
 */

/* XXX if (!ps->quiescing) is probably more reliable GLA */

/* some child processes appear to be working. don't kill the
 */

/* looks like a basket case. give up.
 */

"No active workers found..."

/* the child already logged the failure details */

/* terminate the free list */

/* only report this condition once */

"server reached MaxClients setting, consider"
" raising the MaxClients setting");

"server seems busy, (you may need "
"to increase StartServers, ThreadsPerChild "
"spawning %d children, there are around %d idle "

/* the next time around we want to spawn twice as many if this
 * wasn't good enough, but not if we've just done a graceful
 */

/* tell perform_idle_server_maintenance to check into this
 */

/* non-fatal death... note that it's gone in the scoreboard. */

/* resource shortage, minimize the fork rate */

/* we're still doing a 1-for-1 replacement of dead
 * children with new children
 */

/* Great, we've probably just lost a slot in the
 * scoreboard. Somehow we don't know about this child.
 */

"long lost child came home! (pid %ld)",
/* Don't perform idle maintenance when a child dies,
 * only do it when there's a timeout. Remember only a
 * finite number of children can die, and it's pretty
 * pathological for a lot to die suddenly.
 */

/* we hit a 1 second timeout in which none of the previous
 * generation of children needed to be reaped... so assume
 * they're all done, and pick up the slack if any is left.
 */

/* In any event we really shouldn't do the code below because
 * few of the servers we just started are in the IDLE state
 * yet, so we'd mistakenly create an extra server.
 */

"WARNING: Attempt to change ServerLimit or ThreadLimit "
"ignored during restart");
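The idle maintenance comments earlier in this listing describe idle_spawn_rate as doubling up to MAX_SPAWN_RATE and resetting after a cycle in which nothing had to be spawned. A minimal sketch of that policy is below. The cap of 32 and the reset value of 1 are assumptions made for the example, since neither value appears in this listing, and `next_spawn_rate` is a hypothetical helper rather than a function from the MPM.

```c
/* Assumed cap for this sketch; the real MAX_SPAWN_RATE is not shown here. */
#define SKETCH_MAX_SPAWN_RATE 32

/* Compute the spawn rate for the next maintenance cycle.
 * current_rate:       rate used in the cycle that just finished
 * spawned_this_cycle: nonzero if that cycle had to start new children */
int next_spawn_rate(int current_rate, int spawned_this_cycle)
{
    if (!spawned_this_cycle) {
        return 1;               /* quiet cycle: reset (assumed reset value) */
    }
    if (current_rate < SKETCH_MAX_SPAWN_RATE) {
        current_rate *= 2;      /* still short of idle servers: double */
        if (current_rate > SKETCH_MAX_SPAWN_RATE) {
            current_rate = SKETCH_MAX_SPAWN_RATE;
        }
    }
    return current_rate;
}
```

Starting from 1, sustained demand yields 2, 4, 8, 16, 32 and then stays at the cap, which limits how hard the parent can fork when the machine is already under pressure.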
/* fix the generation number in the global score; we just got a new,
 */

/* If we're doing a graceful_restart then we're going to see a lot
 * of children exiting immediately when we get into the main loop
 * below (because we just sent them AP_SIG_GRACEFUL). This happens pretty
 * rapidly... and for each one that exits we'll start a new one until
 * we reach at least daemons_min_free. But we may be permitted to
 * start more than that, so we'll just keep track of how many we're
 * supposed to start up without the 1 second penalty between each fork.
 */

/* give the system some time to recover before kicking into
 */

"%s configured -- resuming normal operations",

/*
 * Kill child processes, tell them to call child_exit, etc...
 */

/* cleanup pid file on normal shutdown */

"removed PID file %s (pid=%ld)",

/* Time to gracefully shut down:
 * Kill child processes, tell them to call child_exit, etc...
 */

/* Close our listeners, and then ask our children to do same */

/* cleanup pid file on normal shutdown */

"removed PID file %s (pid=%ld)",

", shutting down gracefully");

/* Don't really exit until each child has finished */

/* Relieve any children which have now exited */

/* Having just one child is enough to stay around */

/* We might be here because we received SIGTERM, either
 * way, try and make sure that all of our processes are
 */

/* we've been told to restart */

/* not worth thinking about */

/* advance to the next generation */

/* XXX: we really need to make sure this new generation number isn't in
 * use by any of the children.
 */

" received. Doing graceful restart");

/* wake up the children...time to die. But we'll have more soon */

/* This is mostly for debugging... so that we know what is still
 * gracefully dealing with existing request.
 */

/* Kill 'em all. Since the child acts the same on the parent's SIGTERM
 * and a SIGHUP, we may as well use the same signal, because some user
 * pthreads are stealing signals from us left and right.
 */

"SIGHUP received. Attempting to restart");
/* This really should be a post_config hook, but the error log is already
 * redirected by that point, so we need to do this in the open_logs phase.
 */

NULL, "no listening sockets available, shutting down");

"Could not open pipe-of-death.");

/* make sure that "ThreadsPerChild" gets set before "MaxClients" */

/* we're in the clear, got ThreadsPerChild first */

/* now to swap the data */

/* Make sure you don't change 'next', or you may get loops! */

/* XXX: first_child, parent, and data can never be set
 * for these directives, right? -aaron */

/* sigh, want this only the second time around */

"Couldn't create a Thread Safe Pollset. "
"Is it supported on your platform?");

"apr_proc_detach failed");

/* The worker open_logs phase must run before the core's, or stderr
 * will be redirected to a file, and the messages won't print to the
 */

/* we need to set the MPM state before other pre-config hooks use MPM query
 * to retrieve it, so register as REALLY_FIRST
 */

"WARNING: detected MinSpareThreads set to non-positive.");
"Resetting to 1 to avoid almost certain Apache failure.");
"Please read the documentation.");
/* It is ok to use ap_threads_per_child here because we are
 * sure that it gets set before MaxClients in the pre_config stage. */

"WARNING: MaxClients (%d) must be at least as large",
" as ThreadsPerChild (%d). Automatically",
"WARNING: MaxClients (%d) is not an integer multiple",
" of ThreadsPerChild (%d), lowering MaxClients to %d",
" for a maximum of %d child processes,",
"WARNING: MaxClients of %d would require %d servers,",
" and would exceed the ServerLimit value of %d.",
" Automatically lowering MaxClients to %d. To increase,",
" please see the ServerLimit directive.");
"WARNING: Require MaxClients > 0, setting to 1");
"WARNING: ThreadsPerChild of %d exceeds ThreadLimit " "threads, lowering ThreadsPerChild to %d. To increase, " " ThreadLimit directive.");
"WARNING: Require ThreadsPerChild > 0, setting to 1");
/* you cannot change ServerLimit across a restart; ignore
 */

/* how do we log a message? the error log is a bit bucket at this
 * point; we'll just have to set a flag so that ap_mpm_run()
 */

"WARNING: ServerLimit of %d exceeds compile time limit "

"WARNING: Require ServerLimit > 0, setting to 1");

/* you cannot change ThreadLimit across a restart; ignore
 */

/* how do we log a message? the error log is a bit bucket at this
 * point; we'll just have to set a flag so that ap_mpm_run()
 */

"WARNING: ThreadLimit of %d exceeds compile time limit "

"WARNING: Require ThreadLimit > 0, setting to 1");
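The MaxClients warnings above describe three adjustments: raise MaxClients to at least ThreadsPerChild, lower it to an integer multiple of ThreadsPerChild, and cap the resulting number of child processes at ServerLimit. The sketch below shows one way those rules combine, reconstructed from the warning texts only. The type `client_limits` and the function `normalize_max_clients` are hypothetical, and both `threads_per_child` and `server_limit` are assumed to have already been validated as greater than zero.

```c
/* Result of normalizing a configured MaxClients value. */
typedef struct {
    int max_clients;    /* effective MaxClients after adjustment */
    int daemons_limit;  /* number of child processes this allows */
} client_limits;

/* Assumes threads_per_child > 0 and server_limit > 0. */
client_limits normalize_max_clients(int max_clients, int threads_per_child,
                                    int server_limit)
{
    client_limits out;

    /* MaxClients must be at least as large as ThreadsPerChild */
    if (max_clients < threads_per_child) {
        max_clients = threads_per_child;
    }

    /* Lower MaxClients to an integer multiple of ThreadsPerChild */
    out.daemons_limit = max_clients / threads_per_child;
    max_clients = out.daemons_limit * threads_per_child;

    /* The number of children may not exceed ServerLimit */
    if (out.daemons_limit > server_limit) {
        out.daemons_limit = server_limit;
        max_clients = server_limit * threads_per_child;
    }

    out.max_clients = max_clients;
    return out;
}
```

For example, a configured value of 160 with 25 threads per child and a server limit of 16 comes out as 150 clients across 6 children. The real directive handlers also log each adjustment, which this sketch omits.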
"Number of child processes launched at server startup"),
"Maximum number of child processes for this run of Apache"),
"Minimum number of idle threads, to handle request spikes"),
"Maximum number of idle threads"),
"Maximum number of threads alive at the same time"),
"Number of threads each child creates"),
"Maximum number of worker threads per child process for this " "run of Apache - Upper limit for ThreadsPerChild"),
NULL, /* create per-directory config structure */
NULL, /* merge per-directory config structures */
NULL, /* create per-server config structure */
NULL, /* merge per-server config structures */