<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE manualpage SYSTEM "/style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="/style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<manualpage metafile="tech.xml.meta">
<parentdocument href="./">Rewrite</parentdocument>

  <title>Apache mod_rewrite Technical Details</title>

<summary>
<p>This document discusses some of the technical details of mod_rewrite
and URL matching.</p>
</summary>
<seealso><a href="/mod/mod_rewrite.html">Module documentation</a></seealso>
<seealso><a href="intro.html">mod_rewrite introduction</a></seealso>
<seealso><a href="remapping.html">Redirection and remapping</a></seealso>
<seealso><a href="access.html">Controlling access</a></seealso>
<seealso><a href="vhosts.html">Virtual hosts</a></seealso>
<seealso><a href="proxy.html">Proxying</a></seealso>
<seealso><a href="rewritemap.html">Using RewriteMap</a></seealso>
<seealso><a href="advanced.html">Advanced techniques</a></seealso>
<seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso>

<section id="Internal"><title>Internal Processing</title>

    <p>The internal processing of this module is complex, but it is
    worth explaining once, even to the average user, because
    understanding it helps you avoid common mistakes and exploit the
    module's full functionality.</p>
</section>

<section id="InternalAPI"><title>API Phases</title>

    <p>Apache processes an HTTP request in phases, and the Apache API
    provides a hook for each of these phases. mod_rewrite uses two of
    these hooks: the URL-to-filename translation hook, which runs
    after the HTTP request has been read but before any authorization
    starts, and the Fixup hook, which is triggered after the
    authorization phases and after the per-directory configuration
    files (<code>.htaccess</code>) have been read, but before the
    content handler is activated.</p>

    <p>After a request comes in and Apache has determined the
    corresponding server (or virtual server), the rewriting engine
    processes all mod_rewrite directives from the per-server
    configuration in the URL-to-filename phase. A few steps later,
    once the final data directories have been found, the
    per-directory configuration directives of mod_rewrite are
    triggered in the Fixup phase. In both situations mod_rewrite
    rewrites URLs either to new URLs or to filenames, although
    there is no obvious distinction between the two. The API was
    not designed to be used this way, but as of Apache 1.x it is
    the only way mod_rewrite can operate. To make this clearer,
    remember the following two points:</p>

    <ol>
      <li>Although mod_rewrite rewrites URLs to URLs, URLs to
      filenames, and even filenames to filenames, the API
      currently provides only a URL-to-filename hook. In Apache
      2.0 the two missing hooks will be added to make the
      processing clearer. This has no drawbacks for the user; it
      is simply a fact to remember: mod_rewrite does more in the
      URL-to-filename hook than the API intends for it.</li>

      <li>
        mod_rewrite also provides URL manipulations in
        per-directory context, <em>i.e.</em>, within
        <code>.htaccess</code> files, even though these files are
        reached long after the URLs have been translated to
        filenames. It has to be this way because
        <code>.htaccess</code> files live in the filesystem, so
        processing has already reached this stage. In other
        words, according to the API phases it is by then too late
        for any URL manipulation. To overcome this
        chicken-and-egg problem mod_rewrite uses a trick: when you
        manipulate a URL or filename in per-directory context,
        mod_rewrite first rewrites the filename back to its
        corresponding URL (which is usually impossible, but the
        <code>RewriteBase</code> directive provides the means to
        achieve it) and then initiates a new internal sub-request
        with the new URL. This restarts processing of the API
        phases.

        <p>mod_rewrite tries hard to make this complicated step
        transparent to the user, but remember: while URL
        manipulations in per-server context are fast and
        efficient, per-directory rewrites are comparatively slow
        and inefficient because of this chicken-and-egg problem.
        On the other hand, this is the only way mod_rewrite can
        provide (locally restricted) URL manipulations to the
        average user.</p>
      </li>
    </ol>

    <p>Keep these two points in mind.</p>
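
    <p>As a minimal sketch of the two contexts, the same rewrite can
    be written once for the per-server configuration and once for an
    <code>.htaccess</code> file. The two fragments are alternatives,
    not meant to be combined. The <code>/docs/</code> URL prefix and
    the file names are hypothetical and chosen only for
    illustration.</p>

    <example><pre>
# Per-server context (main server configuration or a virtual host).
# Processed in the URL-to-filename phase; the Pattern is matched
# against the full URL-path, including the leading slash.
RewriteEngine On
RewriteRule ^/docs/old\.html$ /docs/new.html [L]

# Per-directory context (.htaccess in the directory served as /docs/).
# Processed in the Fixup phase; the directory prefix is removed before
# the Pattern is matched, and RewriteBase names the URL prefix used to
# turn the result back into a URL.
RewriteEngine On
RewriteBase /docs/
RewriteRule ^old\.html$ new.html [L]
</pre></example>

    <p>Where both forms are possible, the per-server form avoids the
    additional internal request that the per-directory form
    incurs.</p>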
</section>

<section id="InternalRuleset"><title>Ruleset Processing</title>

    <p>When mod_rewrite is triggered in these two API phases, it
    reads the configured rulesets from its configuration structure
    (which was created either at startup, for per-server context, or
    during the directory walk of the Apache kernel, for
    per-directory context). The URL rewriting engine is then started
    with the contained ruleset (one or more rules together with
    their conditions). The operation of the URL rewriting engine
    itself is exactly the same in both configuration contexts; only
    the processing of the final result differs.</p>

    <p>The order of rules in the ruleset is important because the
    rewriting engine processes them in a special (and not very
    obvious) order. The rewriting engine loops through the ruleset
    rule by rule (<directive
    module="mod_rewrite">RewriteRule</directive> directives), and
    when a particular rule matches, it then loops through any
    corresponding conditions (<code>RewriteCond</code>
    directives). For historical reasons the conditions are written
    before the rule, so the control flow is a little long-winded. See
    Figure 1 for more details.</p>
<p class="figure">
      <img src="/images/rewrite_rule_flow.png"
          alt="Flow of RewriteRule and RewriteCond matching" /><br />
      <dfn>Figure 1:</dfn> The control flow through the rewriting ruleset
</p>
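
    <p>A small, hypothetical per-server ruleset illustrates why rule
    order matters: unless a flag ends processing, the URL produced
    by one rule becomes the input of the rules that follow it. The
    paths are invented for the example.</p>

    <example><pre>
# A request for /alpha is rewritten to /beta by the first rule and then
# to /gamma by the second rule, which sees the already rewritten URL.
RewriteEngine On
RewriteRule ^/alpha$ /beta
RewriteRule ^/beta$  /gamma

# If the two rules were listed in the opposite order, a request for
# /alpha would end at /beta, because the /beta rule would already have
# been tried (and skipped) before the URL was changed.
</pre></example>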
    <p>As Figure 1 shows, the URL is first matched against the
    <em>Pattern</em> of each rule. If the match fails, mod_rewrite
    immediately stops processing this rule and continues with the
    next one. If the <em>Pattern</em> matches, mod_rewrite looks
    for corresponding rule conditions. If none are present, it
    substitutes the URL with a new value constructed from the
    string <em>Substitution</em> and continues with its rule
    loop. If conditions exist, it starts an inner loop that
    processes them in the order in which they are listed. For
    conditions the logic is different: the pattern is not matched
    against the current URL. Instead, mod_rewrite first creates a
    string <em>TestString</em> by expanding variables,
    back-references, map lookups, <em>etc.</em>, and then tries to
    match <em>CondPattern</em> against it. If the pattern does not
    match, the complete set of conditions and the corresponding
    rule fail. If the pattern matches, the next condition is
    processed, and so on until no conditions remain. If all
    conditions match, processing continues with the substitution
    of the URL with <em>Substitution</em>.</p>
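
    <p>The evaluation order described above can be sketched with a
    rule that carries two conditions. This is an illustration for
    per-server context, not a recommended configuration; the
    <code>/pages/</code> and <code>/cache/</code> paths and the
    <code>nocache</code> query parameter are assumptions made for
    the example.</p>

    <example><pre>
# Numbers give the order of evaluation, which differs from the order
# in which the lines are written.
RewriteEngine On

# (2) TestString is expanded, then tested; $1 is the group captured
#     by the RewriteRule Pattern below.
RewriteCond %{DOCUMENT_ROOT}/cache/$1.html -f
# (3) Tested only if the previous condition matched.
RewriteCond %{QUERY_STRING} !nocache
# (1) The Pattern is matched against the URL first.
# (4) The Substitution is applied only if every condition matched.
RewriteRule ^/pages/(.+)$ /cache/$1.html [L]
</pre></example>

    <p>The <code>$1</code> back-reference in the first condition is
    available only because the <em>Pattern</em> is matched before
    the conditions are evaluated.</p>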

</section>

</manualpage>