mod_proxy_http.c revision 40b22d3b20454959fe51fdc89907908d77701078
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/* HTTP routines for Apache proxy */

/*
 * Canonicalise http-like URLs.
 * scheme is the scheme for the URL
 * url is the URL starting with the first '/'
 * def_port is the default port for this scheme.
 */
/* ap_port_of_scheme() */
"proxy: HTTP: canonicalising URL %s", url);
/* We break the URL into host, port, path, search */
"error parsing URL %s: %s",
/*
 * Now parse path/search args, according to rfc1738.
 * In a reverse proxy, our URL has been processed, so canonicalise
 * unless proxy-nocanon is set to say it's raw.
 * In a forward proxy, we have the original and MUST NOT MANGLE it.
 */
default:
    /* wtf are we doing here? */
    path = url;    /* this is the raw path */

/* Clear all connection-based headers from the incoming headers table */

/*
 * Warning       = "Warning" ":" 1#warning-value
 * warning-value = warn-code SP warn-agent SP warn-text [SP warn-date]
 * warn-agent    = ( host [ ":" port ] ) | pseudonym
 *                 ; the name or pseudonym of the server adding
 *                 ; the Warning header, for use in debugging
 * warn-text     = quoted-string
 * warn-date     = <"> HTTP-date <">
 *
 * Buggrit, use a bloomin' regexp!
 * (\d{3}\s+\S+\s+\".*?\"(\s+\"(.*?)\")?) --> whole in $1, date in $3
 */
/* OK, we have a date here */

const char te_hdr[] = "Transfer-Encoding: chunked" CRLF;
/* add empty line at the end of the headers */
"proxy: pass request body failed to %pI (%s)",

char chunk_hdr[20];  /* room for the hex chunk length and CRLF */
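As a sketch of what that 20-byte buffer is for: in chunked transfer coding, each chunk of the request body is preceded by its length in hexadecimal followed by CRLF. The helper name below is illustrative, not from the source:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a chunk-size line ("<hex>\r\n") into buf,
 * as chunked transfer coding requires before each chunk of data.
 * Returns the number of characters written, or -1 if buf is too small. */
static int format_chunk_header(char *buf, size_t buflen, size_t chunk_len)
{
    int n = snprintf(buf, buflen, "%zx\r\n", chunk_len);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

A 20-byte buffer comfortably holds the hex digits of any 64-bit length plus CRLF and the terminating NUL.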
/* must be here due to transient bucket. */

/* If this brigade contains EOS, either stop or remove it. */
/* We can't pass this EOS to the output_filters. */

/* Append the end-of-chunk CRLF */

/*
 * We never sent the header brigade, so go ahead and save input_brigade
 * in the bb brigade. (At least) in the SSL case input_brigade contains
 * transient buckets whose data would get overwritten during the next
 * call of ap_get_brigade in the loop. ap_save_brigade ensures these
 * buckets are set aside. Calling ap_save_brigade with NULL as filter
 * is OK, because the bb brigade has already been created and does not
 * need to be created by ap_save_brigade.
 */

/* The request is flushed below this loop with the chunk EOS header */

/* We never sent the header brigade because there was no request body;
 * the input brigade still has an EOS which we can't pass to the output_filters. */

/* Now we have headers-only, or the chunk EOS mark; flush it */

"proxy: could not parse request Content-Length (%s)",
/* If this brigade contains EOS, either stop or remove it. */
/* We can't pass this EOS to the output_filters. */

/*
 * C-L < bytes streamed?!?
 * We will error out after the body is completely consumed, but we
 * can't stream more bytes at the back end since they would in part
 * be interpreted as another request! If nothing is sent, this
 * prevents HTTP Response Splitting.
 */
"proxy: read more bytes of request body than expected "

/*
 * We never sent the header brigade, so go ahead and save input_brigade
 * in the bb brigade. (At least) in the SSL case input_brigade contains
 * transient buckets whose data would get overwritten during the next
 * call of ap_get_brigade in the loop. ap_save_brigade ensures these
 * buckets are set aside. Calling ap_save_brigade with NULL as filter
 * is OK, because the bb brigade has already been created and does not
 * need to be created by ap_save_brigade.
 */

/* Once we hit EOS, we are ready to flush. */
"proxy: client %s given Content-Length did not match"

/* We never sent the header brigade since there was no request body;
 * send it now with the flush flag. */

/* If this brigade contains EOS, either stop or remove it. */
/* We can't pass this EOS to the output_filters. */

/*
 * LimitRequestBody does not affect Proxy requests (Should it?).
 * Let it take effect if we decide to store the body in a
 * temporary file on disk.
 */
"proxy: Request body is larger than the"

/* can't spool any more in memory; write latest brigade to disk */
"proxy: search for temporary directory failed");
"proxy: creation of temporary file in directory %s failed",
"proxy: write to temporary file %s failed",
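The spill-to-disk step above can be sketched in portable C. This is only illustrative: the module itself uses `apr_temp_dir_get`/`apr_file_mktemp` and bucket brigades, and the function name here is invented:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative sketch: once a request body outgrows the in-memory
 * buffer, write what we have (plus the newest data) to a temp file. */
static FILE *spool_body_to_disk(const char *mem, size_t mem_len,
                                const char *extra, size_t extra_len)
{
    FILE *tmp = tmpfile();          /* unlinked temp file, auto-cleaned */
    if (tmp == NULL) {
        return NULL;                /* "creation of temporary file failed" */
    }
    if (fwrite(mem, 1, mem_len, tmp) != mem_len ||
        fwrite(extra, 1, extra_len, tmp) != extra_len) {
        fclose(tmp);                /* "write to temporary file failed" */
        return NULL;
    }
    return tmp;
}
```

Each failure path corresponds to one of the log messages above; spooling the whole body lets the proxy compute an exact Content-Length afterwards.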
/*
 * Save input_brigade in body_brigade. (At least) in the SSL case
 * input_brigade contains transient buckets whose data would get
 * overwritten during the next call of ap_get_brigade in the loop.
 * ap_save_brigade ensures these buckets are set aside. Calling
 * ap_save_brigade with NULL as filter is OK, because body_brigade
 * has already been created and does not need to be created by
 * ap_save_brigade.
 */

/* This is all a single brigade, pass with flush flagged */

/*
 * Send the HTTP/1.1 request to the remote server.
 * To be compliant, we only use 100-Continue for requests with bodies.
 * We also make sure we won't be talking HTTP/1.0 as well.
 * According to RFC 2616 8.2.3 we are not allowed to forward an
 * Expect: 100-continue to an HTTP/1.0 server. Instead we MUST return
 * an HTTP_EXPECTATION_FAILED.
 */

/* Don't want to use r->hostname, as the incoming header might have a
 * port attached. */
"proxy: no HTTP 0.9 request (with no host line) "
"on incoming request and preserve host set "
"forcing hostname to be %s for uri %s",
/* Block all outgoing Via: headers */

/* If USE_CANONICAL_NAME_OFF was configured for the proxy virtual host,
 * then the server name returned by ap_get_server_name() is the
 * origin server name (which does not make much sense with Via: headers)
 * so we use the proxy vhost's name instead.
 */
/* Create a "Via:" request header entry and merge it */
/* Generate outgoing Via: header with/without server comment: */

/* Use HTTP/1.1 100-Continue as a quick "HTTP ping" test */

/* X-Forwarded-*: handling
 *
 * These request headers are only really useful when mod_proxy
 * is used in a reverse proxy configuration, so that useful info
 * about the client can be passed through the reverse proxy and on
 * to the backend server, which may require the information to
 * function properly.
 *
 * In a forward proxy situation, these options are a potential
 * privacy violation, as information about clients behind the proxy
 * is revealed to arbitrary servers out there on the internet.
 *
 * The HTTP/1.1 Via: header is designed for passing client
 * information through proxies to a server, and should be used in
 * a forward proxy configuration instead of X-Forwarded-*. See the
 * ProxyVia option for details.
 */

/* Add X-Forwarded-For: so that the upstream has a chance to
 * determine where the original request came from. */

/* Add X-Forwarded-Host: so that upstream knows what the
 * original request hostname was. */

/* Add X-Forwarded-Server: so that upstream knows what the
 * name of this proxy server is (if there is more than one).
 * XXX: This duplicates Via: - do we strictly need it? */

/*
 * Make a copy of the headers_in table before clearing the connection
 * headers as we need the connection headers later in the http output
 * filter to prepare the correct response headers.
 *
 * Note: We need to take r->pool for apr_table_copy as the key/value
 * pairs in r->headers_in have been created out of r->pool and
 * p might be (and actually is) a longer-living pool. This would
 * trigger the bad pool ancestry abort in apr_table_copy if
 * apr is compiled with APR_POOL_DEBUG.
 */

/* send request headers */

/* Clear out hop-by-hop request headers not to send.
 * RFC2616 13.5.1 says we should strip these headers.
 */

/* Do we want to strip Proxy-Authorization?
 * If we haven't used it, then NO.
 * If we have used it then MAYBE: RFC2616 says we MAY propagate it.
 * So let's make it configurable by env.
 */
if (r->user != NULL) {
    /* we've authenticated */

/* Skip Transfer-Encoding and Content-Length for now. */

/* We have headers, let's figure out our request body... */

/* Sub-requests never use keepalives, and mustn't pass request bodies.
 * Because the new logic looks at input_brigade, we will self-terminate
 * input_brigade and jump past all of the request body logic...
 * Reading anything with ap_get_brigade is likely to consume the
 * main request's body or read beyond EOS - which would be unpleasant.
 *
 * An exception: when a kept_body is present, then subrequests CAN
 * pass request bodies, and we DON'T skip the body.
 */
/* XXX: Why DON'T sub-requests use keepalives? */

/* We only understand chunked. Other modules might inject
 * (and therefore, decode) other flavors but we don't know
 * that they can and have done so unless they remove
 * their decoding from the headers_in T-E list.
 * XXX: Make this extensible, but in doing so, presume the
 * encoding has been done by the extensions' handler, and
 * do not modify add_te_chunked's logic.
 */
"proxy: %s Transfer-Encoding is not supported",
"proxy: client %s (%s) requested Transfer-Encoding "
"chunked body with Content-Length (C-L ignored)",
/*
 * Prefetch MAX_MEM_SPOOL bytes.
 * This helps us avoid any election of C-L v.s. T-E
 * request bodies, since we are willing to keep in
 * memory this much data, in any case. This gives
 * us an instant C-L election if the body is of some
 * reasonable size.
 */
"proxy: prefetch request body failed to %pI (%s)"

/*
 * Save temp_brigade in input_brigade. (At least) in the SSL case
 * temp_brigade contains transient buckets whose data would get
 * overwritten during the next call of ap_get_brigade in the loop.
 * ap_save_brigade ensures these buckets are set aside. Calling
 * ap_save_brigade with NULL as filter is OK, because input_brigade
 * has already been created and does not need to be created by
 * ap_save_brigade.
 */

"proxy: processing prefetched request body failed"
" to %pI (%s) from %s (%s)",
/* Ensure we don't hit a wall where we have a buffer too small
 * for ap_get_brigade's filters to fetch us another bucket;
 * surrender once we hit 80 bytes less than MAX_MEM_SPOOL.
 */

/* Use chunked request body encoding or send a content-length body?
 *
 *   We have no request body (handled by RB_STREAM_CL)
 *   We have a request body length <= MAX_MEM_SPOOL
 *   The administrator has setenv force-proxy-request-1.0
 *   The client sent a C-L body, and the administrator has
 *     not setenv proxy-sendchunked or has set setenv proxy-sendcl
 *   The client sent a T-E body, and the administrator has
 *     setenv proxy-sendcl, and not setenv proxy-sendchunked
 *
 * If both proxy-sendcl and proxy-sendchunked are set, the
 * behavior is the same as if neither were set: large bodies
 * that can't be read will be forwarded in their original form.
 *
 * To ensure maximum compatibility, setenv proxy-sendcl.
 * To reduce server resource use, setenv proxy-sendchunked.
 * Then address specific servers with conditional setenv
 * options to restore the default behavior where desirable.
 *
 * We have to compute content length by reading the entire request
 * body; if the request body is not small, we'll spool the remaining
 * input to a temporary file. Chunked is always preferable.
 *
 * We can only trust the client-provided C-L if the T-E header
 * is absent, and the filters are unchanged (the body won't
 * be resized by another content filter).
 */

/* The whole thing fit, so our decision is trivial: use
 * the filtered bytes read from the client for the request body.
 * If we expected no body, and read no body, do not set a C-L.
 */

/* This is an appropriate default; very efficient for no-body
 * requests, and has the behavior that it will not add any C-L
 * when the old_cl_val is NULL.
 */

/* Yes I hate gotos. This is the subrequest shortcut. */

/*
 * Handle the Connection: header if we do an HTTP/1.1 request:
 * If we plan to close the backend connection, send Connection: close,
 * otherwise send Connection: Keep-Alive.
 */

/* send the request body, if any */
/* shouldn't be possible */
/* apr_status_t value has been logged in lower level method */
"proxy: pass request body failed to %pI (%s)"

= { "Date", "Expires", "Last-Modified", NULL };
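One way to read the election described above is as a pure decision function. The enum values mirror the source's `RB_STREAM_CL`/`RB_STREAM_CHUNKED`/`RB_SPOOL_CL` method names, but this mapping of the conditions is an interpretation, not the module's exact logic (which also weighs the prefetched byte count and whether input filters changed the body):

```c
#include <assert.h>

typedef enum { RB_STREAM_CL, RB_STREAM_CHUNKED, RB_SPOOL_CL } rb_method;

/* Illustrative sketch of the C-L vs. chunked election described above. */
static rb_method choose_body_method(int have_te, int have_cl,
                                    int force10, int sendcl, int sendchunked)
{
    if (force10) {
        return RB_SPOOL_CL;        /* HTTP/1.0 backend: must send a C-L */
    }
    if (have_te && sendcl && !sendchunked) {
        return RB_SPOOL_CL;        /* re-spool a T-E body to compute a C-L */
    }
    if (have_cl && !have_te && (sendcl || !sendchunked)) {
        return RB_STREAM_CL;       /* trust the client-provided C-L */
    }
    return RB_STREAM_CHUNKED;      /* otherwise forward chunked */
}
```

Keeping the function side-effect free makes the precedence of the env toggles easy to test in isolation.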
/*
 * Note: pread_len is the length of the response that we've mistakenly
 * read (assuming that we don't consider that an error via
 * ProxyBadHeader StartBody). This depends on buffer actually being
 * local storage to the calling code in order for pread_len to make
 * any sense at all, since we depend on buffer still containing
 * what was read by ap_getline() upon return.
 *
 * Read header lines until we get the empty separator line, a read error,
 * the connection closes (EOF), or we timeout.
 */
"Headers received from backend:");
/* We may encounter invalid headers, usually from buggy
 * MS IIS servers, so we need to determine just how to handle
 * them. We can either ignore them, assume that they mark the
 * start-of-body (eg: a missing CRLF) or (the default) mark
 * the headers as totally bogus and return a 500. The sole
 * exception is an extra "HTTP/1.0 200, OK" line sprinkled
 * in between the usual MIME headers, which is a favorite
 * IIS bug.
 */
/* XXX: The mask check is buggy if we ever see an HTTP/1.10 */

/* Nope, it wasn't even an extra HTTP header. Give up. */

/* If we've already started loading headers_out, then
 * return what we've accumulated so far, in the hopes
 * that they are useful; also note that we likely pre-read
 * the first line of the response.
 */
"proxy: Starting body due to bogus non-header in headers "
"proxy: No HTTP headers "

/* this is the psc->badopt == bad_ignore case */
"proxy: Ignoring bogus HTTP header "

/* XXX: RFC2068 defines only SP and HT as whitespace, this test is
 * wrong... and so are many others probably.
 */
++value;    /* Skip to start of value */

/* should strip trailing whitespace as well */

/* Make sure we add so as not to destroy duplicated headers.
 * Modify headers requiring canonicalisation and/or affected
 * by ProxyPassReverse and family with process_proxy_header.
 */

/* the header was too long; at the least we should skip extra data */
/* soak up the extra data */
if (len == 0)
    /* time to exit the larger loop as well */

/*
 * Limit the number of interim responses we send back to the client.
 * Otherwise we suffer from a memory build-up. Besides, there is NO
 * sense in sending back an unlimited number of interim responses to
 * the client. Thus if we cross this limit, send back a 502 (Bad Gateway).
 */
{
"Keep-Alive",
"Proxy-Authenticate",
"TE",
"Trailer",
"Upgrade",
NULL};
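A minimal sketch of how a NULL-terminated list like the one above is typically consulted when stripping hop-by-hop headers per RFC 2616 13.5.1 (the helper name is illustrative, not from the source):

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Illustrative helper: case-insensitive membership test against a
 * NULL-terminated list of hop-by-hop header names. */
static int is_hop_by_hop(const char *name)
{
    static const char *const hop_headers[] = {
        "Keep-Alive", "Proxy-Authenticate", "TE", "Trailer", "Upgrade", NULL
    };
    for (size_t i = 0; hop_headers[i] != NULL; i++) {
        if (strcasecmp(name, hop_headers[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Header field names are case-insensitive, hence `strcasecmp` rather than `strcmp`.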
/* Setup for 100-Continue timeout if appropriate */
"proxy: could not set 100-Continue timeout");
/* Get response from the remote server, and pass it up the filter chain. */

/* In case anyone needs to know, this is a fake request that is really a
 * response. */

/* handle one potential stray CRLF */

"proxy: error reading status line from remote "

/*
 * If we are a reverse proxy request, shut down the connection
 * WITHOUT ANY response to trigger a retry by the client
 * if allowed (as for idempotent requests).
 * BUT currently we should not do this if the request is the
 * first request on a keepalive connection as browsers like
 * seamonkey only display an empty page in this case and do
 * not do a retry. We should also not do this on a
 * connection which times out; instead handle it as
 * we normally would handle timeouts.
 */
"proxy: Closing connection to client because"
" reading from backend server %s:%d failed."

/*
 * Add an EOC bucket to signal the ap_http_header_filter
 * that it should get out of our way, BUT ensure that the
 * EOC bucket is inserted BEFORE an EOS bucket in bb as
 * some resource filters like mod_deflate pass everything
 * up to the EOS down the chain immediately and send the
 * remainder of the brigade later (or even never). But in
 * this case the ap_http_header_filter does not get out of
 * our way soon enough.
 */

/* Mark the backend connection for closing */
/* Need to return OK to avoid sending an error message */

"proxy: NOT Closing connection to client"
" although reading from backend server %s:%d"
"Error reading from remote server");
/* XXX: Is this a real headers length sent from remote? */

/* Is it an HTTP/1 response?
 * This is buggy if we ever see an HTTP/1.10.
 */

/* If not an HTTP/1 message or
 * if the status line was > 8192 bytes
 */
apr_pstrcat(p, "Corrupt status line returned by remote "

/* 2616 requires the space in Status-Line; the origin
 * server may have sent one but ap_rgetline_core will
 * have stripped it.
 */

/* The status out of the front is the same as the status coming in
 * from the back, until further notice.
 */

/* N.B. for HTTP/1.0 clients, we have to fold line-wrapped headers */
/* Also, take care with headers with multiple occurrences. */

/* First, tuck away all already existing cookies */

/* shove the headers direct into r->headers_out */
r->server,
"proxy: bad HTTP/%d.%d header "

/*
 * ap_send_error relies on a headers_out to be present. We
 * are in a bad position here, so force everything we send out
 * to have nothing to do with the incoming packet.
 */

/* Now, add in the just read cookies */
/* and now load 'em all in */

/* can't have both Content-Length and Transfer-Encoding */
/*
 * 2616 section 4.4, point 3: "if both Transfer-Encoding
 * and Content-Length are received, the latter MUST be
 * ignored."
 *
 * To help mitigate HTTP Splitting, unset Content-Length
 * and shut down the backend server connection.
 * XXX: We ought to treat such a response as uncacheable.
 */
"proxy: server %s:%d returned Transfer-Encoding"

/*
 * Save a possible Transfer-Encoding header as we need it later for
 * ap_http_filter to know where to end.
 */

/* strip connection listed hop-by-hop headers from response */
/* Clear hop-by-hop headers */
/* Delete warnings with wrong date */

/* handle Via header in response */
/* If USE_CANONICAL_NAME_OFF was configured for the proxy virtual host,
 * then the server name returned by ap_get_server_name() is the
 * origin server name (which does not make much sense with Via: headers)
 * so we use the proxy vhost's name instead.
 */
/* create a "Via:" response header entry and merge it */

/* cancel keepalive if HTTP/1.0 or less */
/* an http/0.9 response */
/* Reset to old timeout iff we've adjusted it */

/*
 * RFC2616 tells us to forward this.
 * OTOH, an interim response here may mean the backend
 * is playing sillybuggers. The Client didn't ask for
 * it within the defined HTTP/1.1 mechanisms, and if
 * it's an extension, it may also be unsupported by us.
 * There's also the possibility that changing existing
 * behaviour here might break something.
 * So let's make it configurable.
 */
"proxy-interim-response");
"proxy: HTTP: received interim %d response",

/* FIXME: refine this to be able to specify per-response-status
 * policies and maybe also add an option to bail out with 502
 */
"undefined proxy interim response policy");
/* Moved the fixups of Date headers and those affected by ... */

const char *wa = "WWW-Authenticate";
"proxy: origin server sent 401 without WWW-Authenticate header");
/*
 * Is it an HTTP/0.9 response or did we maybe preread the 1st line of
 * the response? If so, load the extra data. These are 2 mutually
 * exclusive possibilities, that just happen to require very
 * similar behavior.
 */

/*
 * At this point in response processing of a 0.9 response,
 * we don't know yet whether data is binary or not.
 * mod_charset_lite will get control later on, so it cannot
 * decide on the conversion of this buffer full of data.
 * However, chances are that we are not really talking to an
 * HTTP/0.9 server, but to some different protocol, therefore
 * the best guess IMHO is to always treat the buffer as "text/x":
 */

/* PR 41646: get HEAD right with ProxyErrorOverride */

/* Clear r->status for override error, otherwise ErrorDocument
 * thinks that this is a recursive error, and doesn't find the ...
 */

/* Discard body, if one is expected */
/* send body - but only if a body is expected */

/*
 * We need to copy the output headers and treat them as input
 * headers as well. BUT, we need to do this before we remove
 * TE, so that they are preserved accordingly for
 * ap_http_filter to know where to end.
 *
 * Restore the Transfer-Encoding header from the response if we saved
 * one before and there is none left. We need it for
 * ap_http_filter. See above.
 */
"proxy: start body send");
/*
 * If we are overriding the errors, we can't put the content
 * of the page into the brigade.
 */

/* read the body, pass it to the output filters */

/* Handle the case where the error document is itself reverse
 * proxied and was successful. We must maintain any previous
 * error status so that an underlying error (eg HTTP_NOT_FOUND)
 * doesn't become an HTTP_OK.
 */

/* ap_get_brigade will return success with an empty brigade
 * for a non-blocking read which would block:
 */
/* flush to the client and switch to blocking mode */

/* In this case, we are in real trouble because
 * our backend bailed on us. Pass along a 502 error.
 */
"proxy: error reading response");

/* next time try a non-blocking read */

r->server, "proxy (PID %d): readbytes: %#x",

/* Switch the allocator lifetime of the buckets */
/* found the last brigade? */
/* signal that we must leave */
/* try send what we read */
/* Ack! Phbtt! Die! User aborted! */
/* make sure we always clean up after ourselves */

/* Pass the EOS bucket down the filter chain. */
/* Ack! Phbtt! Die! User aborted! */

/* See define of AP_MAX_INTERIM_RESPONSES for why */
"Too many (%d) interim responses from origin server",
/* If our connection with the client is to be aborted, return DONE. */

/*
 * This handles http:// URLs, and other URLs using a remote proxy over http.
 * If proxyhost is NULL, then contact the server directly, otherwise
 * go via the proxy.
 * Note that if a proxy is used, then URLs other than http: can be accessed,
 * also, if we have trouble which is clearly specific to the proxy, then
 * we return DECLINED so that we can try another proxy. (Or the direct
 * route.)
 */

/*
 * Use a shorter-lived pool to reduce memory usage
 * and avoid a memory leak.
 */

if (u == NULL || u[1] != '/' || u[2] != '/' || u[3] == '\0')
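The condition above, with `u` pointing at the `:` that ends the scheme, rejects URLs that lack a `//` plus a non-empty authority. The same test as a standalone sketch (the function name is illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative: verify "//host" follows the scheme, mirroring the
 * u == NULL || u[1] != '/' || u[2] != '/' || u[3] == '\0' test,
 * where u points at the ':' terminating the scheme. */
static int has_authority(const char *url)
{
    const char *u = strchr(url, ':');
    return !(u == NULL || u[1] != '/' || u[2] != '/' || u[3] == '\0');
}
```

So `http://example.com/` passes, while `http:/example.com/`, a bare `http://`, or a scheme-less string are declined.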
/* scheme is lowercase */
"proxy: HTTPS: declining URL %s" " (mod_ssl not configured?)", url);
"proxy: HTTP: declining URL %s", url);
return DECLINED;

/* only interested in HTTP, or FTP via proxy */
"proxy: HTTP: serving URL %s", url);
/* create space for state information */

/*
 * In the case that we are handling a reverse proxy connection and this
 * is not a request that is coming over an already kept-alive connection
 * with the client, do NOT reuse the connection to the backend, because
 * we cannot forward a failure to the client in this case as the client
 * does NOT expect this in this situation.
 * Yes, this creates a performance penalty.
 */

/* Step One: Determine Who To Connect To */

/* Step Two: Make the Connection */
"proxy: HTTP: failed to make connection to backend: %s",
/* Step Three: Create conn_rec */

/*
 * On SSL connections set a note on the connection what CN is
 * requested, such that mod_ssl can check if it is requested to do
 * so.
 */

/* Step Four: Send the Request
 * On the off-chance that we forced a 100-Continue as a
 * kinda HTTP ping test, allow for retries
 */
"proxy: HTTP: 100-Continue failed to %pI (%s)",
/* Step Five: Receive the Response... Fall thru to cleanup */

warn_rx = ap_pregcomp(p,
    "[0-9]{3}[ \t]+[^ \t]+[ \t]+\"[^\"]*\"([ \t]+\"([^\"]+)\")?", 0);
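The same ERE can be exercised with POSIX `regcomp` as a sketch of how the optional warn-date lands in capture group 2 (the helper name is illustrative; the module uses `ap_pregcomp`/`ap_regexec` on `p`):

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Illustrative: extract the optional quoted warn-date (capture group 2)
 * from a Warning header value, using the same ERE as above. */
static int extract_warn_date(const char *value, char *date, size_t datelen)
{
    static const char pattern[] =
        "[0-9]{3}[ \t]+[^ \t]+[ \t]+\"[^\"]*\"([ \t]+\"([^\"]+)\")?";
    regex_t rx;
    regmatch_t m[3];
    int found = 0;

    if (regcomp(&rx, pattern, REG_EXTENDED) != 0) {
        return 0;
    }
    if (regexec(&rx, value, 3, m, 0) == 0 && m[2].rm_so != -1) {
        size_t len = (size_t)(m[2].rm_eo - m[2].rm_so);
        if (len < datelen) {
            memcpy(date, value + m[2].rm_so, len);
            date[len] = '\0';
            found = 1;
        }
    }
    regfree(&rx);
    return found;
}
```

When the warn-date group fails to participate in the match, `rm_so` is -1, which is how "warnings with wrong date" (or no date) are distinguished from dated ones.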
NULL,    /* create per-directory config structure */
NULL,    /* merge per-directory config structures */
NULL,    /* create per-server config structure */
NULL,    /* merge per-server config structures */
NULL,    /* command apr_table_t */