event.c revision f58a640f5fcb90f5baf1814479a60cb2179c2ce6
* limitations under the License. * This MPM tries to fix the 'keep alive problem' in HTTP. * After a client completes the first request, the client can keep the * connection open to send more requests with the same socket. This can save * significant overhead in creating TCP connections. However, the major * disadvantage is that Apache traditionally keeps an entire child * process/thread waiting for data from the client. To solve this problem, * this MPM has a dedicated thread for handling both the Listening sockets, * and all sockets that are in a Keep Alive status. * The MPM assumes the underlying apr_pollset implementation is somewhat * threadsafe. This currently is only compatible with KQueue and EPoll. This * enables the MPM to avoid extra high level locking or having to wake up the * listener thread when a keep-alive socket needs to be sent to it. * This MPM does not perform well on older platforms that do not have very good * threading, like Linux with a 2.4 kernel, but this does not matter, since we * require EPoll or KQueue. * For FreeBSD, use 5.3. It is possible to run this MPM on FreeBSD 5.2.1, if * For NetBSD, use at least 2.0. * For Linux, you should use a 2.6 kernel, and make sure your glibc has epoll /* Limit on the total --- clients will be locked out if more servers than * this are needed. It is intended solely to keep the server from crashing * when things get out of hand. * We keep a hard maximum number of servers, for two reasons --- first off, * in case something goes seriously wrong, we want to stop the fork bomb * short of actually crashing the machine we're running on by filling some * kernel table. Secondly, it keeps the size of the scoreboard file small * enough that we can read the whole thing without worrying too much about /* Admin can't tune ServerLimit beyond MAX_SERVER_LIMIT. We want * some sort of compile-time limit to help catch typos. /* Limit on the threads per process. 
Clients will be locked out if more than * We keep this for one reason --- it keeps the size of the scoreboard file small * enough that we can read the whole thing without worrying too much about /* Admin can't tune ThreadLimit beyond MAX_THREAD_LIMIT. We want * some sort of compile-time limit to help catch typos. * Actual definitions of config globals /* The structure used to pass unique initialization info to each thread */ /* Structure used to pass information to the thread responsible for * creating the rest of the threads. /* data retained by event across load/unload of the module * allocated on first call to pre-config hook; located on * subsequent calls to pre-config hook * The max child slot ever assigned, preserved across restarts. Necessary * to deal with MaxClients changes across AP_SIG_GRACEFUL restarts. We * use this value to optimize routines that have to scan the entire /* The event MPM respects a couple of runtime flags that can aid * in debugging. Setting the -DNO_DETACH flag will prevent the root process * from detaching from its controlling terminal. Additionally, setting * the -DONE_PROCESS flag (which implies -DNO_DETACH) will get you the * child_main loop running in the process which originally started up. * This gives you a pretty nice debugging environment. (You'll get a SIGHUP * early in standalone_main; just continue through. This is the server * trying to kill off any child processes which it might have lying * around --- Apache doesn't keep track of their pids, it just sends * SIGHUP to the process group, ignoring it in the root process. * Continue through and you'll be fine.). thread. Use this instead */ /* The LISTENER_SIGNAL signal will be sent from the main thread to the * listener thread to wake it up for graceful termination (what a child * process from an old generation does when the admin does "apachectl * graceful"). This signal will be blocked in all threads of a child * process except for the listener thread. 
/* An array of socket descriptors in use by each thread used to * perform a non-graceful (forced) shutdown of the server. /* XXX there is an obscure path that this doesn't handle perfectly: * right after listener thread is created but before * listener_os_thread is set, the first worker thread hits an * error and starts graceful termination /* unblock the listener if it's waiting for a worker */ * we should just be able to "kill(ap_my_pid, LISTENER_SIGNAL)" on all * platforms and wake up the listener thread since it is the only thread * with SIGHUP unblocked, but that doesn't work on Linux /* in case we weren't called from the listener thread, wake up the /* for ungraceful termination, let the workers exit now; * for graceful termination, the listener thread will notify the * workers to exit once it has stopped accepting new connections /* a clean exit from a child with proper cleanup */ /***************************************************************** * Connection structures and accounting... /* volatile just in case */ * ap_start_shutdown() and ap_start_restart(), below, are a first stab at * functions to initiate shutdown or restart without relying on signals. * Previously this was initiated in sig_term() and restart() signal handlers, * e.g. on Win32, from the service manager. Now the service manager can * call ap_start_shutdown() or ap_start_restart() as appropriate. Note that * these functions can also be called by the child processes, since global * variables are no longer used to pass on the required action to the parent. * These should only be called from the parent process itself, since the * parent process will use the shutdown_pending and restart_pending variables * to determine whether to shutdown or restart. 
The child process should * call signal_parent() directly to tell the parent to die -- this will * cause neither of those variables to be set, which the parent will * assume means something serious is wrong (which it will be, for the * child to force an exit) and so do an exit anyway. /* Um, this is _probably_ not an error, if the user has * tried to do a shutdown twice quickly, so we won't * worry about reporting it. /* do a graceful restart if graceful == 1 */ /* Probably not an error - don't bother reporting it */ /* we want to ignore HUPs and AP_SIG_GRACEFUL while we're busy
#endif /* AP_SIG_GRACEFUL */
#endif /* AP_SIG_GRACEFUL_STOP */
/***************************************************************** * Child process main loop.
if (cs == NULL) { /* This is a new connection */
"process_socket: connection aborted");
* XXX If the platform does not have a usable way of bundling * accept() with a socket readability check, like Win32, * and there are measurable delays before the * socket is readable due to the first data packet arriving, * it might be better to create the cs on the listener thread * with the state set to CONN_STATE_CHECK_REQUEST_LINE_READABLE * FreeBSD users will want to enable the HTTP accept filter * module in their kernel for the highest performance * When the accept filter is active, sockets are kept in the * kernel until an HTTP request is received. /* Since we have an input filter which 'clogs' the input stream, * like mod_ssl, let's just do the normal read from input filters, * like the Worker MPM does. /* state will be updated upon return "network write failure in core output filter");
/* Still in WRITE_COMPLETION_STATE: * Set a write timeout for this connection, and let the * event thread poll for writeability. /* It greatly simplifies the logic to use a single timeout value here * because the new element can just be added to the end of the list and * it will stay sorted in expiration time sequence. If brand new * sockets are sent to the event thread for a readability check, this * will be a slight behavior change - they use the non-keepalive * timeout today. With a normal client, the socket will be readable in * a few milliseconds anyway. /* Add work to pollset. */ "process_socket: apr_pollset_add failure");
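/* Illustrative sketch, not taken from event.c: the comment above argues that
 * a single timeout value lets new entries be appended to the tail while the
 * list stays sorted by expiration time. The types and function names below
 * (timeout_entry, timeout_queue, tq_append, tq_expire) are hypothetical, and
 * the sketch assumes the clock passed in never goes backwards. */

```c
#include <stddef.h>

/* Hypothetical list node; expires_at is an absolute time in milliseconds. */
typedef struct timeout_entry {
    long expires_at;
    struct timeout_entry *next;
} timeout_entry;

/* Hypothetical queue sharing one timeout value for every entry. */
typedef struct {
    timeout_entry *head;   /* earliest expiry */
    timeout_entry *tail;   /* latest expiry */
    long timeout_ms;       /* the single shared timeout */
} timeout_queue;

static void tq_init(timeout_queue *q, long timeout_ms)
{
    q->head = NULL;
    q->tail = NULL;
    q->timeout_ms = timeout_ms;
}

/* O(1) append: because every entry gets now + the same timeout, and now is
 * non-decreasing, the tail always holds the latest expiry. */
static void tq_append(timeout_queue *q, timeout_entry *e, long now_ms)
{
    e->expires_at = now_ms + q->timeout_ms;
    e->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = e;
    }
    else {
        q->head = e;
    }
    q->tail = e;
}

/* Unlink every entry that has expired at now_ms and return how many were
 * removed. The scan stops at the first entry that is still live. */
static int tq_expire(timeout_queue *q, long now_ms)
{
    int expired = 0;
    while (q->head != NULL && q->head->expires_at <= now_ms) {
        timeout_entry *e = q->head;
        q->head = e->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        e->next = NULL;
        expired++;
    }
    return expired;
}
```

/* With two different timeout values this invariant would break, and each
 * insert would need a sorted walk or a second list. */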
/* requests_this_child has gone to zero or below. See if the admin coded "MaxRequestsPerChild 0", and keep going in that case. Doing it this way simplifies the hot path in worker_thread */ /* XXX If specifying SIG_IGN is guaranteed to unblock a syscall, * then we don't need this goofy function. /* XXXXX: recycle listener_poll_types */ "creation of the timeout mutex failed.");
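/* Illustrative sketch, not taken from event.c: one way to keep the hot path
 * down to a decrement and a compare, deferring the "MaxRequestsPerChild 0"
 * check to a slow path that only runs when the counter reaches zero. The
 * names request_budget, budget_consume and budget_exhausted are hypothetical. */

```c
#include <limits.h>

/* Hypothetical per-child request accounting. */
typedef struct {
    int max_requests;   /* configured limit; 0 means unlimited */
    int remaining;      /* counts down as requests are handled */
} request_budget;

static void budget_init(request_budget *b, int max_requests)
{
    b->max_requests = max_requests;
    b->remaining = max_requests;
}

/* Slow path, reached only when the counter is at zero or below.
 * Returns 1 if the child should stop taking work, 0 if it should go on. */
static int budget_exhausted(request_budget *b)
{
    if (b->max_requests == 0) {
        /* unlimited: refill the counter and keep going */
        b->remaining = INT_MAX;
        return 0;
    }
    return 1;
}

/* Hot path: called once per request. */
static int budget_consume(request_budget *b)
{
    if (--b->remaining <= 0) {
        return budget_exhausted(b);
    }
    return 0;
}
```

/* The unlimited case pays for the slow path once per INT_MAX requests
 * instead of testing the configuration on every request. */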
/* Create the main pollset */ "apr_pollset_create with Thread Safety failed.");
/* TODO: subpools, threads, reuse, etc. -- currently use malloc() inside :( */ * Some of the pollset backends, like KQueue or Epoll * automagically remove the FD if the socket is closed, * therefore, we can accept _SUCCESS or _NOTFOUND, * and we still want to keep going /* trash the connection; we couldn't queue the connected * reserve a worker thread, block if all are currently busy. * this prevents the worker queue from overflowing and lets * other processes accept new connections in the meantime. "ap_queue_info_wait_for_idler failed. " "Attempting to shutdown process gracefully");
/* already reserved a worker thread - must have hit a * transient error on a previous pass /* XXXXXX: Convert to skiplist or other better data structure * (yes, this is VERY VERY VERY VERY BAD) /* Structures to reuse */ /* XXXXX: lol, pool allocation without a context from any thread. Yeah. Right. MPMs Suck. */ /* Okay, insert sorted by when.. */ /* the following times out events that are really close in the future * to prevent extra poll calls * current value is .1 second "failed to initialize pollset, " "attempting to shutdown process gracefully");
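/* Illustrative sketch, not taken from event.c: a linear "insert sorted by
 * when" into a singly linked list, plus a pass that fires every event due
 * within a 0.1 second window so that near-future events do not each cost an
 * extra poll call. The names timer_event, timer_insert, timer_run_due and
 * TIMER_FUDGE_US are hypothetical; only the 0.1 second value comes from the
 * comment above. The O(n) insert is the cost the skiplist note refers to. */

```c
#include <stddef.h>

/* Hypothetical timer record; when_us is an absolute time in microseconds. */
typedef struct timer_event {
    long when_us;
    struct timer_event *next;
} timer_event;

/* 0.1 second expressed in microseconds. */
#define TIMER_FUDGE_US 100000L

/* Keep the list ascending by when_us; events with equal times keep their
 * insertion order. Walks the list, so the cost is O(n) per insert. */
static void timer_insert(timer_event **head, timer_event *ev)
{
    timer_event **pp = head;
    while (*pp != NULL && (*pp)->when_us <= ev->when_us) {
        pp = &(*pp)->next;
    }
    ev->next = *pp;
    *pp = ev;
}

/* Unlink every event due before now + fudge and return how many fired.
 * A real implementation would hand each event to a worker here. */
static int timer_run_due(timer_event **head, long now_us)
{
    int fired = 0;
    long cutoff = now_us + TIMER_FUDGE_US;
    while (*head != NULL && (*head)->when_us < cutoff) {
        timer_event *ev = *head;
        *head = ev->next;
        ev->next = NULL;
        fired++;
    }
    return fired;
}
```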
/* Unblock the signal used to wake this thread up, and set a handler for /* TODO: what should we do here? ugh. */ "apr_pollset_poll failed. Attempting to " "shutdown process gracefully");
/* one of the sockets is readable */ "event_loop: unexpected state %d",
/* A Listener Socket is ready for an accept() */ /* create a new transaction pool for each accepted socket */ "Failed to create transaction pool");
/* later we trash rv and rely on csd to indicate /* E[NM]FILE, ENOMEM, etc */ /* trash the connection; we couldn't queue the connected }
/* if:else on pt->type */ /* send socket to serf. */ /* XXXX: this doesn't require get_worker(&have_idle_worker) */ }
/* while for processing poll */ /* XXX possible optimization: stash the current time for use as * r->request_time for new requests /* handle timed out sockets */ /* Step 1: keepalive timeouts */ /* XXX return NULL looks wrong - not an init failure * that bypasses all the cleanup outside the main loop * break seems more like it * need to evaluate seriousness of push2worker failures /* Step 2: write completion timeouts */ }
/* listener main loop */ /* wake up the main thread */ * wait for active connections to finish but we may want to wait * for idle workers to get out of the queue code and release mutexes, * since those mutexes are cleaned up pretty soon and some systems * may not react favorably (i.e., segfault) if operations are attempted "ap_queue_info_set_idle failed. Attempting to " "shutdown process gracefully.");
/* We get APR_EOF during a graceful shutdown once all the * connections accepted by this server process have been handled. /* We get APR_EINTR whenever ap_queue_pop() has been interrupted * from an explicit call to ap_queue_interrupt_all(). This allows * us to unblock threads stuck in ap_queue_pop() when a shutdown * If workers_may_exit is set and this is ungraceful termination/ * restart, we are bound to get an error on some systems (e.g., * AIX, which sanity-checks mutex operations) since the queue * may have already been cleaned up. Don't log the "error" if * workers_may_exit is set. /* We got some other error. */ my_info->
tid = -1; /* listener thread doesn't have a thread slot */
"apr_thread_create: unable to create listener thread");
/* let the parent decide how bad this really is */ /* XXX under some circumstances not understood, children can get stuck * in start_threads forever trying to take over slots which will * never be cleaned up; for now there is an APLOG_DEBUG message issued * every so often when this condition occurs /* We must create the fd queues before we start up the listener "ap_queue_init() failed");
"ap_queue_info_create() failed");
/* threads_per_child does not include the listener thread */ "malloc: out of memory");
/* We are creating threads right now */ /* We let each thread update its own scoreboard entry. This is * done because it lets us deal with tid better. "apr_thread_create: unable to create worker thread");
/* let the parent decide how bad this really is */ /* Start the listener only when there are workers available */ /* wait for previous generation to clean up an entry */ if (
loops % 120 == 0) { /* every couple of minutes */
"slots very quickly (%d of %d)",
/* What state should this child_main process be listed as in the * ap_update_child_status_from_indexes(my_child_num, i, SERVER_STARTING, * This state should be listed separately in the scoreboard, in some kind * of process_status, not mixed in with the worker threads' status. * "life_status" is almost right, but it's in the worker's structure, and * the name could be clearer. gla /* deal with a rare timing window which affects waking up the * listener thread... if the signal sent to the listener thread * is delivered between the time it verifies that the * listener_may_exit flag is clear and the time it enters a * blocking syscall, the signal didn't do any good... work around * that by sleeping briefly and sending it again /* listener not dead yet */ "the listener thread didn't exit");
"apr_thread_join: unable to join listener thread");
if (threads[i]) { /* if we ever created this thread */
"apr_thread_join: unable to join worker " * trying to take over slots from a "apr_thread_join: unable to join the start " "thread");
/* stuff to do before we switch ids, so we have permissions. */ /* done with init critical section */ /* Just use the standard apr_setup_signal_thread to block all signals * from being received. The child processes no longer use signals for * any communication with the parent process. "Couldn't initialize signal thread");
/* coding a value of zero means infinity */ /* Setup worker threads */ /* clear the storage; we may not create all our threads immediately, * and we want a 0 entry to indicate a thread which was not created "malloc: out of memory");
/* 0 means PTHREAD_CREATE_JOINABLE */ "apr_thread_create: unable to create worker thread");
/* let the parent decide how bad this really is */ /* If we are only running in one_process mode, we will want to * still handle signals. */ /* Block until we get a terminating signal. */ /* make sure the start thread has finished; signal_threads() * and join_workers() depend on that /* XXX join_start_thread() won't be awakened if one of our * threads encounters a critical error and attempts to /* helps us terminate a little more quickly than the dispatch of the * signal thread; beats the Pipe of Death and the browsers /* A terminating signal was received. Now join each of the * workers to clean them up. * If the worker already exited, then the join frees * their resources and returns. * If the worker hasn't exited, then this blocks until * they have (then cleans up). else {
/* !one_process */ /* remove SIGTERM from the set of blocked signals... if one of * the other threads in the process needs to take us down * (e.g., for MaxRequestsPerChild) it will send us SIGTERM /* Watch for any messages from the parent over the POD */ /* see if termination was triggered while we slept */ /* make sure the start thread has finished; * signal_threads() and join_workers depend on that /* A terminating signal was received. Now join each of the * workers to clean them up. * If the worker already exited, then the join frees * their resources and returns. * If the worker hasn't exited, then this blocks until * they have (then cleans up). "fork: Unable to fork new process");
/* fork didn't succeed. There's no need to touch the scoreboard; * if we were trying to replace a failed child process, then * server_main_loop() marked its workers SERVER_DEAD, and if * we were trying to replace a child process that exited normally, * its worker_thread()s left SERVER_DEAD or SERVER_GRACEFUL behind. /* In case system resources are maxed out, we don't want Apache running away with the CPU trying to fork over and /* By default, AIX binds to a single processor. This bit unbinds * children which will then bind to another CPU. "processor unbind failed %d", status);
/* This new child process is squatting on the scoreboard * entry owned by an exiting child process, which cannot * exit until all active requests complete. * Don't forget about this exiting child process, or we * won't be able to kill it if it doesn't exit by the * time the server is shut down. /* start up a bunch of children */ * idle_spawn_rate is the number of children that will be spawned on the * next maintenance cycle if there aren't enough idle servers. It is * doubled up to MAX_SPAWN_RATE, and reset only when a cycle goes by * without the need to spawn. /* initialize the free_list */ /* Initialization to satisfy the compiler. It doesn't know * that threads_per_child is always > 0 */ /* short cut if all active processes have been examined and * enough empty scoreboard slots have been found /* XXX any_dying_threads is probably no longer needed GLA */ /* We consider a starting server as idle because we started it * at least a cycle ago, and if it still hasn't finished starting * then we're just going to swamp things worse by forking more. * So we hopefully won't need to fork more if we count it. * This depends on the ordering of SERVER_READY and SERVER_STARTING. if (
ps->pid != 0) { /* XXX just set all_dead_threads in outer for loop if no pid? not much else matters */
&& (!ps->pid /* no process in the slot */ || ps->quiescing)) {
/* or at least one is going away */ /* great! we prefer these, because the new process can * start more threads sooner. So prioritize this slot * by putting it ahead of any slots with active threads. * first, make room by moving a slot that's potentially still * in use to the end of the array /* slot is still in use - back of the bus /* XXX if (!ps->quiescing) is probably more reliable GLA */ /* some child processes appear to be working. don't kill the /* looks like a basket case. give up. "No active workers found..." /* the child already logged the failure details */ /* terminate the free list */ if (
free_length == 0) {
/* scoreboard is full, can't fork */ /* only report this condition once */ "server reached MaxClients setting, consider" " raising the MaxClients setting");
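/* Illustrative sketch, not taken from event.c: the idle_spawn_rate comment
 * earlier in this file describes a rate that doubles up to MAX_SPAWN_RATE
 * and resets only when a maintenance cycle needs no spawn. The struct and
 * function names are hypothetical, and the value 32 for the cap is an
 * assumption made for this sketch. */

```c
/* Assumed cap for this sketch; the real constant is defined elsewhere. */
#define SKETCH_MAX_SPAWN_RATE 32

/* Hypothetical holder for the rate carried between maintenance cycles. */
typedef struct {
    int idle_spawn_rate;
} spawn_state;

/* Run one maintenance cycle. Returns how many children to spawn now and
 * updates the rate to use on the next cycle. */
static int spawn_cycle(spawn_state *s, int need_more_idle)
{
    int to_spawn;

    if (!need_more_idle) {
        /* a quiet cycle: fall back to the slowest rate */
        s->idle_spawn_rate = 1;
        return 0;
    }

    to_spawn = s->idle_spawn_rate;

    /* next time spawn twice as many, but never more than the cap */
    s->idle_spawn_rate *= 2;
    if (s->idle_spawn_rate > SKETCH_MAX_SPAWN_RATE) {
        s->idle_spawn_rate = SKETCH_MAX_SPAWN_RATE;
    }
    return to_spawn;
}
```

/* Under sustained load this yields 1, 2, 4, 8, 16, 32, 32, ... children per
 * cycle, so a short spike does not trigger a burst of forks. */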
"scoreboard is full, not at MaxClients");
"server seems busy, (you may need " "to increase StartServers, ThreadsPerChild " "spawning %d children, there are around %d idle " /* the next time around we want to spawn twice as many if this * wasn't good enough, but not if we've just done a graceful /* tell perform_idle_server_maintenance to check into this /* non-fatal death... note that it's gone in the scoreboard. */ /* resource shortage, minimize the fork rate */ /* we're still doing a 1-for-1 replacement of dead * children with new children /* Great, we've probably just lost a slot in the * scoreboard. Somehow we don't know about this child. "long lost child came home! (pid %ld)",
/* Don't perform idle maintenance when a child dies, * only do it when there's a timeout. Remember only a * finite number of children can die, and it's pretty * pathological for a lot to die suddenly. /* we hit a 1 second timeout in which none of the previous * generation of children needed to be reaped... so assume * they're all done, and pick up the slack if any is left. /* In any event we really shouldn't do the code below because * few of the servers we just started are in the IDLE state * yet, so we'd mistakenly create an extra server. /* fix the generation number in the global score; we just got a new, /* If we're doing a graceful_restart then we're going to see a lot * of children exiting immediately when we get into the main loop * below (because we just sent them AP_SIG_GRACEFUL). This happens pretty * rapidly... and for each one that exits we may start a new one, until * there are at least min_spare_threads idle threads, counting across * all children. But we may be permitted to start more children than * that, so we'll just keep track of how many we're * supposed to start up without the 1 second penalty between each fork. /* give the system some time to recover before kicking into "%s configured -- resuming normal operations",
* Kill child processes, tell them to call child_exit, etc... /* cleanup pid file on normal shutdown */ "removed PID file %s (pid=%ld)",
/* Time to gracefully shut down: * Kill child processes, tell them to call child_exit, etc... /* Close our listeners, and then ask our children to do same */ /* cleanup pid file on normal shutdown */ "removed PID file %s (pid=%ld)",
", shutting down gracefully");
/* Don't really exit until each child has finished */ /* Relieve any children which have now exited */ /* Having just one child is enough to stay around */ /* We might be here because we received SIGTERM, either * way, try and make sure that all of our processes are /* we've been told to restart */ /* not worth thinking about */ /* advance to the next generation */ /* XXX: we really need to make sure this new generation number isn't in * use by any of the children. " received. Doing graceful restart");
/* wake up the children...time to die. But we'll have more soon */ /* This is mostly for debugging... so that we know what is still * gracefully dealing with existing requests. /* Kill 'em all. Since the child acts the same on the parent's SIGTERM * and a SIGHUP, we may as well use the same signal, because some user * pthreads are stealing signals from us left and right. "SIGHUP received. Attempting to restart");
/* This really should be a post_config hook, but the error log is already * redirected by that point, so we need to do this in the open_logs phase. /* the reverse of pre_config, we want this only the first time around */ "no listening sockets available, shutting down");
"could not open pipe-of-death");
/* sigh, want this only the second time around */ "Couldn't create a Thread Safe Pollset. " "Is it supported on your platform?" "Also check system or user limits!");
"apr_proc_detach failed");
/* the reverse of pre_config, we want this only the first time around */ "WARNING: ServerLimit of %d exceeds compile-time " " %d servers, decreasing to %d.",
"ServerLimit of %d exceeds compile-time limit " "of %d, decreasing to match",
"WARNING: ServerLimit of %d not allowed, " "ServerLimit of %d not allowed, increasing to 1",
/* you cannot change ServerLimit across a restart; ignore /* don't need a startup console version here */ "changing ServerLimit to %d from original value of %d " "not allowed during restart",
"WARNING: ThreadLimit of %d exceeds compile-time " " %d threads, decreasing to %d.",
"ThreadLimit of %d exceeds compile-time limit " "of %d, decreasing to match",
"WARNING: ThreadLimit of %d not allowed, " "ThreadLimit of %d not allowed, increasing to 1",
/* you cannot change ThreadLimit across a restart; ignore /* don't need a startup console version here */ "changing ThreadLimit to %d from original value of %d " "not allowed during restart",
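/* Illustrative sketch, not taken from event.c: the ServerLimit and
 * ThreadLimit messages above describe the same three rules: a value above
 * the compile-time limit is decreased to match, a value below 1 is raised to
 * 1, and a change attempted during a restart is ignored in favour of the
 * original value. The function names are hypothetical, and first_value == 0
 * is this sketch's own convention for "no earlier configuration pass". */

```c
/* Clamp a requested limit into the range [1, compile_time_max]. */
static int clamp_limit(int requested, int compile_time_max)
{
    if (requested > compile_time_max) {
        return compile_time_max;   /* "exceeds compile-time limit" */
    }
    if (requested < 1) {
        return 1;                  /* "not allowed, increasing to 1" */
    }
    return requested;
}

/* Apply the clamp, then refuse to change the limit across a restart.
 * first_value is the limit chosen on the first pass, or 0 if none yet. */
static int effective_limit(int requested, int compile_time_max,
                           int first_value)
{
    int value = clamp_limit(requested, compile_time_max);
    if (first_value != 0 && value != first_value) {
        return first_value;        /* "not allowed during restart" */
    }
    return value;
}
```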
"WARNING: ThreadsPerChild of %d exceeds ThreadLimit " " %d threads, decreasing to %d.",
" To increase, please see the ThreadLimit " "ThreadsPerChild of %d exceeds ThreadLimit " "of %d, decreasing to match",
"WARNING: ThreadsPerChild of %d not allowed, " "ThreadsPerChild of %d not allowed, increasing to 1",
"WARNING: MaxClients of %d is less than " " %d, increasing to %d. MaxClients must be at " " as the number of threads in a single server.");
"MaxClients of %d is less than ThreadsPerChild " "of %d, increasing to match",
"WARNING: MaxClients of %d is not an integer " " ThreadsPerChild of %d, decreasing to nearest " " for a maximum of %d servers.",
"MaxClients of %d is not an integer multiple of " "ThreadsPerChild of %d, decreasing to nearest " "WARNING: MaxClients of %d would require %d " " would exceed ServerLimit of %d, decreasing to %d.",
" To increase, please see the ServerLimit " "MaxClients of %d would require %d servers and " "exceed ServerLimit of %d, decreasing to %d",
/* ap_daemons_to_start > ap_daemons_limit checked in ap_mpm_run() */ "WARNING: StartServers of %d not allowed, " "StartServers of %d not allowed, increasing to 1",
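/* Illustrative sketch, not taken from event.c: the MaxClients messages above
 * imply three adjustments applied in order: raise MaxClients to at least
 * ThreadsPerChild, round it down to a multiple of ThreadsPerChild, and cap
 * the resulting server count at ServerLimit. The struct and function names
 * are hypothetical, and the sketch assumes threads_per_child and
 * server_limit have already been forced to be at least 1. */

```c
/* Hypothetical bundle of the settings involved. */
typedef struct {
    int threads_per_child;  /* ThreadsPerChild, assumed >= 1 */
    int server_limit;       /* ServerLimit, assumed >= 1 */
    int max_clients;        /* MaxClients, adjusted in place */
    int daemons_limit;      /* derived number of child processes */
} mpm_sizes;

static void normalize_max_clients(mpm_sizes *s)
{
    /* MaxClients must cover at least one full child process */
    if (s->max_clients < s->threads_per_child) {
        s->max_clients = s->threads_per_child;
    }

    /* round down to an integer multiple of ThreadsPerChild */
    s->daemons_limit = s->max_clients / s->threads_per_child;
    if (s->max_clients % s->threads_per_child != 0) {
        s->max_clients = s->daemons_limit * s->threads_per_child;
    }

    /* never require more servers than ServerLimit allows */
    if (s->daemons_limit > s->server_limit) {
        s->daemons_limit = s->server_limit;
        s->max_clients = s->server_limit * s->threads_per_child;
    }
}
```

/* Example: ThreadsPerChild 25, ServerLimit 16 and MaxClients 110 end up as
 * MaxClients 100 served by 4 child processes. */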
"WARNING: MinSpareThreads of %d not allowed, " " to avoid almost certain server failure.");
" Please read the documentation.");
"MinSpareThreads of %d not allowed, increasing to 1",
/* max_spare_threads < min_spare_threads + threads_per_child * checked in ap_mpm_run() /* Our open_logs hook function must run before the core's, or stderr * will be redirected to a file, and the messages won't print to the /* we need to set the MPM state before other pre-config hooks use MPM query * to retrieve it, so register as REALLY_FIRST "Number of child processes launched at server startup"),
"Maximum number of child processes for this run of Apache"),
"Minimum number of idle threads, to handle request spikes"),
"Maximum number of idle threads"),
"Maximum number of threads alive at the same time"),
"Number of threads each child creates"),
"Maximum number of worker threads per child process for this " "run of Apache - Upper limit for ThreadsPerChild"),
NULL, /* hook to run before apache parses args */
NULL, /* create per-directory config structure */
NULL, /* merge per-directory config structures */
NULL, /* create per-server config structure */
NULL, /* merge per-server config structures */