proxy_http.c revision c1bf42dc465137de1fdb8f3d9d1c3e4d2db5c003
/* ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2000-2002 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Apache" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation.  For more
 * information on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 *
 * Portions of this software are based upon public domain software
 * originally written at the National Center for Supercomputing Applications,
 * University of Illinois, Urbana-Champaign.
 */

/* HTTP routines for Apache proxy */

/*
 * Canonicalise http-like URLs.
 *  scheme is the scheme for the URL
 *  url    is the URL starting with the first '/'
 *  def_port is the default port for this scheme.
 */
/* ap_port_of_scheme() */
"proxy: HTTP: canonicalising URL %s",
url);
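The canonicalisation step is described as a syntactic check that breaks the URL into host, port, path and search. The following is a minimal standalone sketch of that split in plain C, not the module's own code: `struct url_parts` and `split_http_url` are hypothetical names, the buffer sizes are arbitrary, and bracketed IPv6 literals and URL escaping are deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four pieces named in the comment. */
struct url_parts {
    char host[256];
    int  port;
    char path[1024];
    char search[1024];
};

/* Split "//host[:port][/path][?search]" into its parts.
 * An empty or missing port falls back to def_port.
 * Returns 0 on success, -1 on a syntax error. */
static int split_http_url(const char *url, int def_port,
                          struct url_parts *out)
{
    const char *host, *end, *colon, *query;
    size_t host_len, path_len;

    assert(url != NULL && out != NULL);
    if (strncmp(url, "//", 2) != 0)
        return -1;

    host = url + 2;
    end = host + strcspn(host, "/?");          /* end of host[:port] */
    colon = memchr(host, ':', (size_t)(end - host));
    host_len = (size_t)((colon ? colon : end) - host);
    if (host_len == 0 || host_len >= sizeof(out->host))
        return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    out->port = def_port;
    if (colon && colon + 1 < end) {
        char *stop;
        long p = strtol(colon + 1, &stop, 10);
        if (stop != end || p < 1 || p > 65535)
            return -1;                         /* junk after the digits */
        out->port = (int)p;
    }

    query = strchr(end, '?');
    path_len = query ? (size_t)(query - end) : strlen(end);
    if (path_len >= sizeof(out->path))
        return -1;
    memcpy(out->path, end, path_len);
    out->path[path_len] = '\0';
    if (path_len == 0)
        strcpy(out->path, "/");                /* empty path means "/" */

    out->search[0] = '\0';
    if (query) {
        if (strlen(query + 1) >= sizeof(out->search))
            return -1;
        strcpy(out->search, query + 1);
    }
    return 0;
}
```

Treating `host:` with an empty port the same as no port at all is one possible reading of the ambiguous `:80` case; it is an assumption of this sketch.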
/* do syntactic check.
 * We break the URL into host, port, path, search
 */
/* N.B. if this isn't a true proxy request, then the URL _path_
 * has already been decoded. True proxy requests have r->uri
 * == r->unparsed_uri, and no others have that property.
 */
/* XXX FIXME: Make sure this handled the ambiguous case of the :80 */
/* Clear all connection-based headers from the incoming headers table */
/*
 * Break up the URL to determine the host to connect to
 */
/* we break the URL into host, port, uri */
/* do a DNS lookup for the destination host */
/* see memory note above */
/* allocate these out of the connection pool - the check on
 * r->connection->id makes sure that this string does not get accessed
 * past the connection lifetime */
/* are we connecting directly, or via a proxy? */
/* see memory note above */
/* Get the server port for the Via headers */
/* check if ProxyBlock directive on this host */
"Connect to remote machine blocked");
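Clearing connection-based headers means dropping both the fixed hop-by-hop set from RFC 2616 section 13.5.1 and any header named as a token in the Connection header. The sketch below shows only the decision ("should this header be dropped?") in standalone C. The function names are hypothetical, and the module itself operates on APR tables rather than on plain strings.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive match of the n-byte token at `a` against string `b`. */
static int token_equals(const char *a, size_t n, const char *b)
{
    size_t i;
    if (strlen(b) != n)
        return 0;
    for (i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* Return 1 if `name` is a token of a Connection header value such as
 * "close, X-Foo". `connection` may be NULL when the header is absent. */
static int listed_in_connection(const char *connection, const char *name)
{
    const char *p = connection;
    assert(name != NULL);
    while (p && *p) {
        size_t len;
        while (*p == ',' || *p == ' ' || *p == '\t')
            p++;                               /* skip separators */
        len = strcspn(p, ", \t");
        if (len > 0 && token_equals(p, len, name))
            return 1;
        p += len;
    }
    return 0;
}

/* Return 1 if the header must not be forwarded to the next hop. */
static int is_hop_by_hop(const char *name, const char *connection)
{
    /* Fixed list from RFC 2616 section 13.5.1. */
    static const char *const fixed[] = {
        "Connection", "Keep-Alive", "Proxy-Authenticate",
        "Proxy-Authorization", "TE", "Trailers",
        "Transfer-Encoding", "Upgrade"
    };
    size_t i;
    assert(name != NULL);
    for (i = 0; i < sizeof(fixed) / sizeof(fixed[0]); i++) {
        if (token_equals(name, strlen(name), fixed[i]))
            return 1;
    }
    return listed_in_connection(connection, name);
}
```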
/* We have determined who to connect to. Now make the connection, supporting
 * a KeepAlive connection.
 */
/* get all the possible IP addresses for the destname and loop through them
 * until we get a successful connection
 */
/* if a keepalive socket is already open, check whether it must stay
 * open, or whether it should be closed and a new socket created.
 */
/* see memory note above */
"proxy: keepalive address match (keep original socket)");
"proxy: keepalive address mismatch / connection has" " changed (close old socket (%s/%s, %d/%d))",
/* get a socket - either a keepalive one, or a new one */
/* use previous keepalive socket */
/* put back old timeout */
"proxy: HTTP: previous connection is closed");
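The keepalive check above decides between keeping the original socket and closing it when the address or the connection has changed. A minimal sketch of that decision follows, assuming the comparison is on hostname, port and the id of the client connection that opened the socket. The struct and function names are hypothetical, and `strcasecmp` comes from POSIX `<strings.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical record of the backend a kept-alive socket is attached to. */
struct backend_conn {
    int         has_socket;  /* non-zero if a socket is currently open */
    const char *hostname;    /* host the socket was opened to */
    int         port;        /* port the socket was opened to */
    long        client_id;   /* id of the client connection that opened it */
};

/* Return 1 to keep the existing socket, 0 to close it and open a new one. */
static int can_reuse_socket(const struct backend_conn *c,
                            const char *hostname, int port, long client_id)
{
    assert(c != NULL && hostname != NULL);
    if (!c->has_socket || c->hostname == NULL)
        return 0;                 /* nothing to reuse */
    if (c->client_id != client_id)
        return 0;                 /* socket belongs to another connection */
    if (c->port != port)
        return 0;                 /* address mismatch */
    return strcasecmp(c->hostname, hostname) == 0;
}
```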
/* create a new socket */
/*
 * At this point we have a list of one or more IP addresses of
 * the machine to connect to. If configured, reorder this
 * list so that the "best candidate" is first try. "best
 * candidate" could mean the least loaded server, the fastest
 * responding server, whatever.
 *
 * For now we do nothing, ie we get DNS round robin.
 */
/* handle a permanent error on the connect */
"proxy: socket is connected");
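The connection step walks the list of resolved addresses in order and stops at the first one that connects, which is why an unordered list amounts to DNS round robin. The loop can be sketched on its own, with the actual socket work hidden behind a callback so the control flow is visible. `try_connect_fn`, `connect_first_reachable` and `fake_connect` are all hypothetical names for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: attempt one address, return 0 on success. */
typedef int (*try_connect_fn)(const char *addr, void *ctx);

/* Try each candidate in list order and stop at the first success.
 * Returns the index of the address that connected, or -1 if all failed. */
static int connect_first_reachable(const char *const *addrs, size_t count,
                                   try_connect_fn try_connect, void *ctx)
{
    size_t i;
    assert(try_connect != NULL);
    for (i = 0; i < count; i++) {
        if (try_connect(addrs[i], ctx) == 0)
            return (int)i;
    }
    return -1;
}

/* Stand-in for a real socket connect: only "10.0.0.2" is reachable.
 * Each attempt is counted through ctx. */
static int fake_connect(const char *addr, void *ctx)
{
    int *attempts = ctx;
    (*attempts)++;
    return strcmp(addr, "10.0.0.2") == 0 ? 0 : -1;
}
```

A "best candidate" policy would be a reordering of `addrs` before this loop runs; the loop itself would not need to change.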
/* the socket is now open, create a new backend server connection */
/* the peer reset the connection already; ap_run_create_connection() */
r->
server,
"proxy: an error occurred creating a " r->
server,
"proxy: failed to enable ssl support " "proxy: connection complete to %pI (%s)",
/* set up the connection filters */
/*
 * Send the HTTP/1.1 request to the remote server
 */
/* strip connection listed hop-by-hop headers from the request */
/* even though in theory a connection: close coming from the client
 * should not affect the connection to the server, it's unlikely
 * that subsequent client requests will hit this thread/process, so
 * we cancel server keepalive if the client does.
 */
/* sub-requests never use keepalives */
/* don't want to use r->hostname, as the incoming header might have a */
"proxy: no HTTP 0.9 request (with no host line) "
"on incoming request and preserve host set "
"forcing hostname to be %s for uri %s",
/* Block all outgoing Via: headers */
/* Create a "Via:" request header entry and merge it */
/* Generate outgoing Via: header with/without server comment: */
/* X-Forwarded-*: handling
 *
 * These request headers are only really useful when the mod_proxy
 * is used in a reverse proxy configuration, so that useful info
 * about the client can be passed through the reverse proxy and on
 * to the backend server, which may require the information to
 *
 * In a forward proxy situation, these options are a potential
 * privacy violation, as information about clients behind the proxy
 * is revealed to arbitrary servers out there on the internet.
 *
 * The HTTP/1.1 Via: header is designed for passing client
 * information through proxies to a server, and should be used in
 * a forward proxy configuration instead of X-Forwarded-*. See the
 * ProxyVia option for details.
 */
/* Add X-Forwarded-For: so that the upstream has a chance to
 * determine where the original request came from.
 */
/* Add X-Forwarded-Host: so that upstream knows what the
 * original request hostname was.
 */
/* Add X-Forwarded-Server: so that upstream knows what the
 * name of this proxy server is (if there is more than one)
 * XXX: This duplicates Via: - do we strictly need it?
 */
/* send request headers */
/* Clear out hop-by-hop request headers not to send
 * RFC2616 13.5.1 says we should strip these headers
 */
/* XXX: @@@ FIXME: "Proxy-Authorization" should *only* be
 * suppressed if THIS server requested the authentication,
 * not when a frontend proxy requested it!
 *
 * The solution to this problem is probably to strip out
 * the Proxy-Authorization header in the authorisation
 * code itself, not here. This saves us having to signal
 * somehow whether this request was authenticated or not.
 */
/* If you POST to a page that gets server-side parsed
 * by mod_include, and the parsing results in a reverse
 * proxy call, the proxied request will be a GET, but
 * its request_rec will have inherited the Content-Length
 * of the original request (the POST for the enclosing
 * page). We can't send the original POST's request body
 * as part of the proxied subrequest, so we need to avoid
 * sending the corresponding content length. Otherwise,
 * the server to which we're proxying will sit there
 * forever, waiting for a request body that will never
 */
/* add empty line at the end of the headers */
"proxy: request failed to %pI (%s)",
/* send the request data, if any. */
/* If this brigade contains EOS, either stop or remove it. */
/* As a shortcut, if this brigade is simply an EOS bucket,
 * don't send anything down the filter chain.
 */
/* We can't pass this EOS to the output_filters. */
"proxy: pass request data failed to %pI (%s)",
/*
 * loop over response parsing logic
 * in the case that the origin told us
 */
/* Get response from the remote server, and pass it up the */
/* In case anyone needs to know, this is a fake request that is really a */
/* handle one potential stray CRLF */
"proxy: error reading status line from remote "
"Error reading from remote server");
/* Is it an HTTP/1 response?
 * This is buggy if we ever see an HTTP/1.10
 */
/* If not an HTTP/1 message or
 * if the status line was > 8192 bytes
 */
apr_pstrcat(p,
"Corrupt status line returned by remote " r->
server,
"proxy: bad HTTP/%d.%d status line "
/* N.B. for HTTP/1.0 clients, we have to fold line-wrapped headers */
/* Also, take care with headers with multiple occurrences. */
r->
server,
"proxy: bad HTTP/%d.%d header "
/*
 * ap_send_error relies on a headers_out to be present. we
 * are in a bad position here.. so force everything we send out
 * to have nothing to do with the incoming packet
 */
/* strip connection listed hop-by-hop headers from response */
/* handle Via header in response */
/* create a "Via:" response header entry and merge it */
/* cancel keepalive if HTTP/1.0 or less */
/* an http/0.9 response */
"proxy: HTTP: received 100 CONTINUE");
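The response path has to recognise an HTTP/1 status line, cope with a multi-digit minor version such as HTTP/1.10, and cancel keepalive for HTTP/1.0 or less. The sketch below uses `sscanf` instead of a fixed-width mask, which is a swapped-in technique and not necessarily what the module does. Both function names are hypothetical, and the parser is lenient about whitespace and a leading `+` sign because `%d` accepts them.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "HTTP/<major>.<minor> <status>[ reason]".
 * Returns 1 on success, 0 if this is not an HTTP/1-style status line
 * (in which case the caller may treat the data as an HTTP/0.9 body). */
static int parse_status_line(const char *line,
                             int *major, int *minor, int *status)
{
    int consumed = 0;
    char next;

    assert(line && major && minor && status);
    if (sscanf(line, "HTTP/%d.%d %d%n",
               major, minor, status, &consumed) != 3)
        return 0;
    if (*major < 0 || *minor < 0 || *status < 100 || *status > 999)
        return 0;

    /* The status code must end at a space or at the end of the line. */
    next = line[consumed];
    return next == ' ' || next == '\0' || next == '\r' || next == '\n';
}

/* Keepalive to the backend is only kept for HTTP/1.1 or later. */
static int backend_allows_keepalive(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 1);
}
```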
/* we must accept 3 kinds of date, but generate only 1 kind of date */
/* munge the Location and URI response headers according to */
const char *
wa =
"WWW-Authenticate";
"proxy: origin server sent 401 without w-a header");
/* Is it an HTTP/0.9 response? If so, send the extra data */
/* send body - but only if a body is expected */
(r->
status >
199) &&
/* not any 1xx response */
/* We need to copy the output headers and treat them as input
 * headers as well. BUT, we need to do this before we remove
 * TE, so that they are preserved accordingly for
 * ap_http_filter to know where to end.
 */
"proxy: start body send");
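The body is only sent when a body is expected. Only the `status > 199` test survives in this listing, so the sketch below fills in the remaining conditions from RFC 2616 section 4.3 (no body for HEAD requests or for 204 and 304 responses) as an assumption. `response_has_body` is a hypothetical name.

```c
#include <assert.h>

/* Return 1 if a response body should be read and forwarded.
 * header_only is non-zero for a HEAD request. */
static int response_has_body(int header_only, int status)
{
    return !header_only       /* HEAD: headers only */
        && status > 199       /* not any 1xx response */
        && status != 204      /* No Content (assumed, per RFC 2616 4.3) */
        && status != 304;     /* Not Modified (assumed, per RFC 2616 4.3) */
}
```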
/*
 * if we are overriding the errors, we can't put the content
 * of the page into the brigade
 */
/* read the body, pass it to the output filters */
r->
server,
"proxy (PID %d): readbytes: %#x",
/* found the last brigade? */
/* if this is the last brigade, cleanup the
 * backend connection first to prevent the
 * backend server from hanging around waiting
 * for a slow client to eat these bytes
 */
/* signal that we must leave */
/* try send what we read */
/* Ack! Phbtt! Die! User aborted! */
p_conn->
close =
1;
/* this causes socket close below */
/* make sure we always clean up after ourselves */
/* if we are done, leave */
/* the code above this checks for 'OK' which is what the hook expects */
/* clear r->status for override error, otherwise ErrorDocument
 * thinks that this is a recursive error, and doesn't find the
 */
/* If there are no KeepAlives, or if the connection has been signalled
 * to close, close the socket and clean up
 */
/* if the connection is < HTTP/1.1, or Connection: close,
 * we close the socket, otherwise we leave it open for KeepAlive support
 */
/*
 * This handles http:// URLs, and other URLs using a remote proxy over http
 * If proxyhost is NULL, then contact the server directly, otherwise
 * Note that if a proxy is used, then URLs other than http: can be accessed,
 * also, if we have trouble which is clearly specific to the proxy, then
 * we return DECLINED so that we can try another proxy. (Or the direct
 */
/* Note: Memory pool allocation.
 * A downstream keepalive connection is always connected to the existence
 * (or not) of an upstream keepalive connection. If this is not done then
 * load balancing against multiple backend servers breaks (one backend
 * server ends up taking 100% of the load), and the risk is run of
 * downstream keepalive connections being kept open unnecessarily. This
 * keeps webservers busy and ties up resources.
 *
 * As a result, we allocate all sockets out of the upstream connection
 * pool, and when we want to reuse a socket, we check first whether the
 * connection ID of the current upstream connection is the same as that
 * of the connection when the socket was opened.
 */
"proxy: HTTPS: declining URL %s"
" (mod_ssl not configured?)",
url);
"proxy: HTTP: declining URL %s",
url);
return DECLINED;
/* only interested in HTTP, or FTP via proxy */
"proxy: HTTP: serving URL %s",
url);
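The handler declines HTTPS URLs when SSL support is unavailable, and otherwise serves only HTTP, or FTP when going through a remote proxy. A minimal sketch of that scheme check follows. The exact conditions in the module are not visible in this listing, so the logic is an assumption based on the log messages and comments, and `accept_scheme` and the `SKETCH_*` codes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical result codes, standing in for the handler's OK/DECLINED. */
#define SKETCH_OK        0
#define SKETCH_DECLINED (-1)

/* Decide whether this handler takes the URL.
 * proxyname is NULL for a direct connection; ssl_available is non-zero
 * when SSL support for backend connections is configured. */
static int accept_scheme(const char *url, const char *proxyname,
                         int ssl_available)
{
    assert(url != NULL);
    if (strncasecmp(url, "https:", 6) == 0)
        return ssl_available ? SKETCH_OK : SKETCH_DECLINED;
    if (strncasecmp(url, "http:", 5) == 0)
        return SKETCH_OK;
    if (strncasecmp(url, "ftp:", 4) == 0 && proxyname != NULL)
        return SKETCH_OK;      /* FTP only via a remote HTTP proxy */
    return SKETCH_DECLINED;
}
```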
/* only use stored info for top-level pages. Sub requests don't share */
/* create space for state information */
/* Step One: Determine Who To Connect To */
/* Step Two: Make the Connection */
/* Step Three: Send the Request */
/* Step Four: Receive the Response */
/* clean up even if there is an error */
/* Step Five: Clean Up */
NULL,    /* create per-directory config structure */
NULL,    /* merge per-directory config structures */
NULL,    /* create per-server config structure */
NULL,    /* merge per-server config structures */
NULL,    /* command apr_table_t */