<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<!--%hypertext -->
<!-- mod_rewrite.html -->
<!-- Documentation for the mod_rewrite Apache module -->
<HTML>
<HEAD>
<TITLE>Apache module mod_rewrite</TITLE>
</HEAD>

<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<BODY
 BGCOLOR="#FFFFFF"
 TEXT="#000000"
 LINK="#0000FF"
 VLINK="#000080"
 ALINK="#FF0000"
>
<BLOCKQUOTE><!-- page indentation -->
<!--#include virtual="header.html" -->

<BR>
<H1 ALIGN="CENTER">Module mod_rewrite<BR>URL Rewriting Engine</H1>

This module is contained in the <CODE>mod_rewrite.c</CODE> file and is
available in Apache 1.2 and later. It provides a rule-based rewriting engine
to rewrite requested URLs on the fly. It is not compiled into the server by
default. To use <CODE>mod_rewrite</CODE> you have to enable the following line
in the server build <CODE>Configuration</CODE> file:
<PRE>
    AddModule  modules/standard/mod_rewrite.o
</PRE>

<P>
<HR NOSHADE SIZE=1>

<BR>
<H2>Summary</H2>

<BLOCKQUOTE>
<BLOCKQUOTE>
<BLOCKQUOTE>
<EM>``The great thing about mod_rewrite is it gives you all the
configurability and flexibility of Sendmail. The downside to
mod_rewrite is that it gives you all the configurability and
flexibility of Sendmail.''</EM>
<DIV ALIGN=RIGHT>
-- Brian Behlendorf<BR>
Apache Group
</DIV>
</BLOCKQUOTE>
</BLOCKQUOTE>
</BLOCKQUOTE>

<BLOCKQUOTE>
<BLOCKQUOTE>
<BLOCKQUOTE>
<EM>``
Despite the tons of examples and docs, mod_rewrite
is voodoo. Damned cool voodoo, but still voodoo.
''</EM>
<DIV ALIGN=RIGHT>
-- Brian Moore<BR>
bem@news.cmc.net
</DIV>
</BLOCKQUOTE>
</BLOCKQUOTE>
</BLOCKQUOTE>

Welcome to mod_rewrite, the Swiss Army Knife of URL manipulation!

<P>
This module uses a rule-based rewriting engine (based on a regular-expression
parser) to rewrite requested URLs on the fly. It supports an unlimited number
of rules and an unlimited number of attached rule conditions for each rule to
provide a really flexible and powerful URL manipulation mechanism. The URL
manipulations can depend on various tests: server variables, environment
variables, HTTP headers, time stamps and even external database lookups in
various formats can all be used to achieve really granular URL matching.

<P>
This module operates on the full URLs (including the path-info part) both in
per-server context (<CODE>httpd.conf</CODE>) and per-directory context
(<CODE>.htaccess</CODE>) and can even generate query-string parts in the
result. The rewritten result can lead to internal sub-processing, external
request redirection or even to an internal proxy throughput.

<P>
But all this functionality and flexibility has its drawback: complexity. So
don't expect to understand this entire module in just one day.

<P>
This module was invented and originally written in April 1996<BR>
and gifted exclusively to The Apache Group in July 1997 by

<P>
<BLOCKQUOTE>
<A HREF="http://www.engelschall.com/"><CODE>Ralf S. Engelschall</CODE></A><BR>
<A HREF="mailto:rse@engelschall.com"><CODE>rse@engelschall.com</CODE></A><BR>
<A HREF="http://www.engelschall.com/"><CODE>www.engelschall.com</CODE></A>
</BLOCKQUOTE>

<P>
<HR NOSHADE SIZE=1>

<H2>Table Of Contents</H2>

<P>
<STRONG>Internal Processing</STRONG>
<UL>
    <LI><A HREF="#InternalAPI">API Phases</A>
    <LI><A HREF="#InternalRuleset">Ruleset Processing</A>
    <LI><A HREF="#InternalBackRefs">Regex Back-Reference Availability</A>
</UL>
<P>
<STRONG>Configuration Directives</STRONG>
<UL>
    <LI><A HREF="#RewriteEngine">RewriteEngine</A>
    <LI><A HREF="#RewriteOptions">RewriteOptions</A>
    <LI><A HREF="#RewriteLog">RewriteLog</A>
    <LI><A HREF="#RewriteLogLevel">RewriteLogLevel</A>
    <LI><A HREF="#RewriteLock">RewriteLock</A>
    <LI><A HREF="#RewriteMap">RewriteMap</A>
    <LI><A HREF="#RewriteBase">RewriteBase</A>
    <LI><A HREF="#RewriteCond">RewriteCond</A>
    <LI><A HREF="#RewriteRule">RewriteRule</A>
</UL>
<P>
<STRONG>Miscellaneous</STRONG>
<UL>
    <LI><A HREF="#EnvVar">Environment Variables</A>
    <LI><A HREF="#Solutions">Practical Solutions</A>
</UL>

<P>
<HR NOSHADE SIZE=1>

<CENTER>
<H1><A NAME="Internal">Internal Processing</A></H1>
</CENTER>

<P>
<HR NOSHADE SIZE=1>

<P>
The internal processing of this module is very complex but needs to be
explained once even to the average user to avoid common mistakes and to let
you exploit its full functionality.

<H2><A NAME="InternalAPI">API Phases</A></H2>

<P>
First you have to understand that when Apache processes an HTTP request it
does this in phases. A hook for each of these phases is provided by the Apache
API. Mod_rewrite uses two of these hooks: the URL-to-filename translation
hook, which is used after the HTTP request has been read and before any
authorization starts, and the Fixup hook, which is triggered after the
authorization phases and after the per-directory config files
(<CODE>.htaccess</CODE>) have been read, but before the content handler is
activated.

<P>
So, after a request comes in and Apache has determined the corresponding
server (or virtual server), the rewriting engine starts processing all
mod_rewrite directives from the per-server configuration in the
URL-to-filename phase. A few steps later, when the final data directories are
found, the per-directory configuration directives of mod_rewrite are triggered
in the Fixup phase. In both situations mod_rewrite rewrites URLs either to new
URLs or to filenames, although there is no obvious distinction between them.
This is a usage of the API which was not intended when the API
was designed, but as of Apache 1.x this is the only way mod_rewrite can
operate. To make this clearer, remember the following two points:

<OL>
<LI>Although mod_rewrite rewrites URLs to URLs, URLs to filenames and even
    filenames to filenames, the API currently provides only a
    URL-to-filename hook. In Apache 2.0 the two missing hooks
    will be added to make the processing clearer. But this
    point has no drawbacks for the user, it is just a fact which
    should be remembered: Apache does more in the URL-to-filename hook
    than the API intends for it.
<P>
<LI>Unbelievably, mod_rewrite provides URL manipulations in per-directory
    context, <EM>i.e.</EM>, within <CODE>.htaccess</CODE> files, although
    these are
    reached a very long time after the URLs were translated to filenames (this
    has to be this way, because <CODE>.htaccess</CODE> files stay in the
    filesystem, so processing has already reached this stage).
    In other words: according to the API phases, at this time it
    is too late for any URL manipulations. To overcome this chicken and egg
    problem mod_rewrite uses a trick: when you manipulate a URL/filename in
    per-directory context, mod_rewrite first rewrites the filename back to its
    corresponding URL (which is usually impossible, but see the
    <CODE>RewriteBase</CODE> directive below for the trick to achieve this)
    and then initiates a new internal sub-request with the new URL. This leads
    to a new processing of the API phases from the beginning.
    <P>
    Again mod_rewrite tries hard to make this complicated step totally
    transparent to the user, but you should remember here: while URL
    manipulations in per-server context are really fast and efficient,
    per-directory rewrites are slow and inefficient due to this chicken and
    egg problem. But on the other hand this is the only way mod_rewrite can
    provide (locally restricted) URL manipulations to the average user.
</OL>

<P>
Don't forget these two points!
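<P>
To make the two contexts concrete, here is a minimal sketch of the same
rewrite written once for each of them. The URL prefixes <CODE>/old</CODE> and
<CODE>/new</CODE> are hypothetical and only serve as illustration; the
directives themselves are described in detail below. In per-server context the
<EM>Pattern</EM> is matched against the full URL path:
<BLOCKQUOTE>
<PRE>
# httpd.conf: per-server context, runs in the URL-to-filename phase
RewriteEngine on
RewriteRule   ^/old/(.*)$  /new/$1
</PRE>
</BLOCKQUOTE>
<P>
In per-directory context the rule runs in the Fixup phase, the directory
prefix is stripped before the <EM>Pattern</EM> is matched, and
<CODE>RewriteBase</CODE> names the URL prefix that corresponds to the
directory (assumed here to be served as <CODE>/old</CODE>):
<BLOCKQUOTE>
<PRE>
# .htaccess in the directory served as /old: per-directory context
RewriteEngine on
RewriteBase   /old
RewriteRule   ^(.*)$  /new/$1
</PRE>
</BLOCKQUOTE>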

<H2><A NAME="InternalRuleset">Ruleset Processing</A></H2>

Now when mod_rewrite is triggered in these two API phases, it reads the
configured rulesets from its configuration structure (which itself was either
created on startup for per-server context or during the directory walk of the
Apache kernel for per-directory context). Then the URL rewriting engine is
started with the contained ruleset (one or more rules together with their
conditions). The operation of the URL rewriting engine itself is exactly the
same for both configuration contexts. Just the final result processing is
different.

<P>
The order of rules in the ruleset is important because the rewriting engine
processes them in a special order. And this order is not very obvious. The
rule is this: The rewriting engine loops through the ruleset rule by rule
(<CODE>RewriteRule</CODE> directives!) and when a particular rule matches it
optionally loops through existing corresponding conditions
(<CODE>RewriteCond</CODE> directives). For historical reasons the
conditions are given first, so the control flow is a little long-winded. See
Figure 1 for more details.

<P>
<DIV ALIGN=CENTER>
<TABLE CELLSPACING=0 CELLPADDING=2 BORDER=0>
<TR>
<TD BGCOLOR="#CCCCCC"><IMG
                 SRC="/images/mod_rewrite_fig1.gif"
                 WIDTH="428" HEIGHT="385"
                 ALT="[Needs graphics capability to display]"></TD>
</TR>
<TR>
<TD ALIGN=CENTER>
<STRONG>Figure 1:</STRONG> The control flow through the rewriting ruleset
</TD>
</TR>
</TABLE>
</DIV>

<P>
As you can see, first the URL is matched against the <EM>Pattern</EM> of each
rule. When it fails mod_rewrite immediately stops processing this rule and
continues with the next rule. If the <EM>Pattern</EM> matches, mod_rewrite
looks for corresponding rule conditions. If none are present, it just
substitutes the URL with a new value which is constructed from the string
<EM>Substitution</EM> and goes on with its rule-looping. But
if conditions exist, it starts an inner loop for processing them in the order
in which they are listed. For conditions the logic is different: We don't
match a pattern against the current URL. Instead we first create a string
<EM>TestString</EM> by expanding variables, back-references, map lookups,
<EM>etc.</EM> and then we try to match <EM>CondPattern</EM> against it. If the
pattern doesn't match, the complete set of conditions and the corresponding
rule fail. If the pattern matches, then the next condition is processed
until no more conditions are available. If all conditions match, processing
continues with the substitution of the URL with <EM>Substitution</EM>.
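<P>
As an illustration of this control flow, consider the following sketch of a
single rule with two conditions. The URL paths and the host name are
hypothetical and chosen only for this example:
<BLOCKQUOTE>
<PRE>
RewriteEngine on
RewriteCond   %{HTTP_USER_AGENT}  ^Lynx
RewriteCond   %{REMOTE_HOST}      \.example\.com$
RewriteRule   ^/docs/(.*)$        /text-only/$1
</PRE>
</BLOCKQUOTE>
<P>
Although the two <CODE>RewriteCond</CODE> lines are written first, the engine
starts with the <EM>Pattern</EM> <CODE>^/docs/(.*)$</CODE> of the
<CODE>RewriteRule</CODE>. Only if it matches are the conditions tested, in the
order listed, and only if both match is the URL replaced by the
<EM>Substitution</EM>.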
<H2><A NAME="InternalBackRefs">Regex Back-Reference Availability</A></H2>
One important thing has to be remembered here: Whenever you
use parentheses in <EM>Pattern</EM> or in one of the <EM>CondPattern</EM>s,
back-references are internally created which can be used with the
strings <CODE>$N</CODE> and <CODE>%N</CODE> (see below). These
are available for creating the strings <EM>Substitution</EM> and
<EM>TestString</EM>. Figure 2 shows the locations to which the back-references
are transferred for expansion.
<P>
<DIV ALIGN=CENTER>
<TABLE CELLSPACING=0 CELLPADDING=2 BORDER=0>
<TR>
<TD BGCOLOR="#CCCCCC"><IMG
SRC="/images/mod_rewrite_fig2.gif"
WIDTH="381" HEIGHT="179"
ALT="[Needs graphics capability to display]"></TD>
</TR>
<TR>
<TD ALIGN=CENTER>
<STRONG>Figure 2:</STRONG> The back-reference flow through a rule
</TD>
</TR>
</TABLE>
</DIV>
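<P>
A short sketch may help here, assuming a hypothetical domain
<CODE>example.com</CODE> and a hypothetical <CODE>/sites/</CODE> URL prefix.
The parentheses in the <EM>CondPattern</EM> create <CODE>%1</CODE>, the
parentheses in the <EM>Pattern</EM> create <CODE>$1</CODE>, and both are
expanded in the <EM>Substitution</EM>:
<BLOCKQUOTE>
<PRE>
RewriteEngine on
RewriteCond   %{HTTP_HOST}  ^([^.]+)\.example\.com$
RewriteRule   ^/(.*)$       /sites/%1/$1
</PRE>
</BLOCKQUOTE>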
<P>
We know, this was a crash course of mod_rewrite's internal processing. But
you will benefit from this knowledge when reading the following documentation
of the available directives.
<P>
<HR NOSHADE SIZE=1>
<CENTER>
<H1><A NAME="Configuration">Configuration Directives</A></H1>
</CENTER>
<P>
<HR NOSHADE SIZE=1>
<H3><A NAME="RewriteEngine">RewriteEngine</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A>
<CODE>RewriteEngine</CODE> {<CODE>on,off</CODE>}<BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A>
<STRONG><CODE>RewriteEngine off</CODE></STRONG><BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A>
server config, virtual host, directory, .htaccess<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> FileInfo<BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.2<BR>
<P>
The <CODE>RewriteEngine</CODE> directive enables or disables the runtime
rewriting engine. If it is set to <CODE>off</CODE> this module does no runtime
processing at all. It does not even update the <CODE>SCRIPT_URx</CODE>
environment variables.
<P>
Use this directive to disable the module instead of commenting out
all <CODE>RewriteRule</CODE> directives!
<P>
Note that, by default, rewrite configurations are not inherited.
This means that you need to have a <CODE>RewriteEngine on</CODE>
directive for each virtual host you wish to use it in.
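<P>
For instance, in the following sketch (the address, host name and file names
are hypothetical) the virtual host needs its own <CODE>RewriteEngine on</CODE>
even though the main server has already enabled the engine:
<BLOCKQUOTE>
<PRE>
# main server
RewriteEngine on
RewriteRule   ^/old\.html$  /new.html

&lt;VirtualHost 192.0.2.10&gt;
ServerName    www.example.com
# not inherited from the main server, so it has to be repeated here
RewriteEngine on
RewriteRule   ^/old\.html$  /new.html
&lt;/VirtualHost&gt;
</PRE>
</BLOCKQUOTE>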
<P>
<HR NOSHADE SIZE=1>
<P>
<H3><A NAME="RewriteOptions">RewriteOptions</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> <CODE>RewriteOptions</CODE> <EM>Option</EM><BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>None</EM><BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> server config, virtual host, directory,
.htaccess<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> FileInfo<BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.2<BR>
<P>
The <CODE>RewriteOptions</CODE> directive sets some special options for the
current per-server or per-directory configuration. The <EM>Option</EM>
string can be one of the following:
<UL>
<LI>'<STRONG><CODE>inherit</CODE></STRONG>'<BR>
    This forces the current configuration to inherit the configuration of the
    parent. In per-virtual-server context this means that the maps,
    conditions and rules of the main server get inherited. In per-directory
    context this means that conditions and rules of the parent directory's
    <CODE>.htaccess</CODE> configuration get inherited.
</UL>
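<P>
A minimal sketch of the per-virtual-server case, again with a hypothetical
address, host name and rule: the virtual host defines no rules of its own but
picks up the rule of the main server through <CODE>inherit</CODE>.
<BLOCKQUOTE>
<PRE>
# main server
RewriteEngine on
RewriteRule   ^/intranet/(.*)$  /internal/$1

&lt;VirtualHost 192.0.2.10&gt;
ServerName     www.example.com
RewriteEngine  on
RewriteOptions inherit
&lt;/VirtualHost&gt;
</PRE>
</BLOCKQUOTE>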
<P>
<HR NOSHADE SIZE=1>
<P>
<H3><A NAME="RewriteLog">RewriteLog</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> <CODE>RewriteLog</CODE> <EM>Filename</EM><BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>None</EM><BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> server config, virtual host<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>Not applicable</EM><BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.2<BR>
<P>
The <CODE>RewriteLog</CODE> directive sets the name of the file to which the
server logs any rewriting actions it performs. If the name does not begin
with a slash ('<CODE>/</CODE>') then it is assumed to be relative to the
<EM>Server Root</EM>. The directive should occur only once per server
config.
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice</STRONG>: To disable the logging of rewriting actions it is
not recommended to set <EM>Filename</EM>
to <CODE>/dev/null</CODE>, because although the rewriting engine does
not create output to a logfile it still creates the logfile
output internally. <STRONG>This will slow down the server with no advantage
to the administrator!</STRONG>
To disable logging either remove or comment out the
<CODE>RewriteLog</CODE> directive or use <CODE>RewriteLogLevel 0</CODE>!
</TD></TR>
</TABLE>
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Security</STRONG>: See the <A
HREF="/misc/security_tips.html">Apache Security
Tips</A> document for details on why your security could be compromised if the
directory where logfiles are stored is writable by anyone other than the user
that starts the server.
</TD></TR>
</TABLE>
<P>
<STRONG>Example:</STRONG>
<BLOCKQUOTE>
<PRE>
RewriteLog "/usr/local/var/apache/logs/rewrite.log"
</PRE>
</BLOCKQUOTE>
<P>
<HR NOSHADE SIZE=1>
<P>
<H3><A NAME="RewriteLogLevel">RewriteLogLevel</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> <CODE>RewriteLogLevel</CODE> <EM>Level</EM><BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <STRONG><CODE>RewriteLogLevel 0</CODE></STRONG>
<BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> server config, virtual host<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>Not applicable</EM><BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.2<BR>
<P>
The <CODE>RewriteLogLevel</CODE> directive sets the verbosity level of the
rewriting logfile. The default level 0 means no logging, while 9 or more means
that practically all actions are logged.
<P>
To disable the logging of rewriting actions simply set <EM>Level</EM> to 0.
This disables all rewrite action logs.
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice:</STRONG> Using a high value for <EM>Level</EM> will slow down
your Apache
server dramatically! Use the rewriting logfile only for debugging or at least
at <EM>Level</EM> not greater than 2!
</TD></TR>
</TABLE>
<P>
<STRONG>Example:</STRONG>
<BLOCKQUOTE>
<PRE>
RewriteLogLevel 3
</PRE>
</BLOCKQUOTE>
<P>
<HR NOSHADE SIZE=1>
<P>
<H3><A NAME="RewriteLock">RewriteLock</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> <CODE>RewriteLock</CODE> <EM>Filename</EM><BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>None</EM><BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> server config<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>Not applicable</EM><BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.3<BR>
<P>
This directive sets the filename for a synchronization lockfile which
mod_rewrite needs to communicate with <SAMP>RewriteMap</SAMP>
<EM>programs</EM>. Set this lockfile to a local path (not on an NFS-mounted
device) when you want to use a rewriting map-program. It is not required for
any other type of rewriting map.
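<P>
<STRONG>Example:</STRONG> a minimal sketch of a lockfile declared together with
a map-program. Both paths and the map name <CODE>mymap</CODE> are hypothetical
and only serve as an illustration; adjust them to your installation.
<BLOCKQUOTE>
<PRE>
# lockfile on a local (non-NFS) filesystem; hypothetical path
RewriteLock /var/lock/rewrite.lock
# external rewriting program synchronized via the lockfile above
RewriteMap  mymap prg:/path/to/bin/map.pl
</PRE>
</BLOCKQUOTE>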
<P>
<HR NOSHADE SIZE=1>
<P>
<H3><A NAME="RewriteMap">RewriteMap</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> <CODE>RewriteMap</CODE> <EM>MapName </EM>
<EM>MapType</EM><CODE>:</CODE><EM>MapSource</EM><BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> not used per default<BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> server config, virtual host<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>Not applicable</EM><BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.2 (partially), Apache 1.3<BR>
<P>
The <CODE>RewriteMap</CODE> directive defines a <EM>Rewriting Map</EM>
which can be used inside rule substitution strings by the mapping-functions
to insert/substitute fields through a key lookup. The source of this
lookup can be of various types.
<P>
The <A NAME="mapfunc"><EM>MapName</EM></A> is the name of the map and will
be used to specify a mapping-function for the substitution strings of a
rewriting rule via one of the following constructs:
<BLOCKQUOTE><STRONG>
<CODE>${</CODE> <EM>MapName</EM> <CODE>:</CODE> <EM>LookupKey</EM>
<CODE>}</CODE><BR>
<CODE>${</CODE> <EM>MapName</EM> <CODE>:</CODE> <EM>LookupKey</EM>
<CODE>|</CODE> <EM>DefaultValue</EM> <CODE>}</CODE>
</STRONG></BLOCKQUOTE>
When such a construct occurs, the map <EM>MapName</EM>
is consulted and the key <EM>LookupKey</EM> is looked up. If the key is
found, the map-function construct is replaced by the corresponding
<EM>SubstValue</EM>. If the key is not found, it is replaced by
<EM>DefaultValue</EM>, or by
the empty string if no <EM>DefaultValue</EM> was specified.
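<P>
As an illustration, a rule using such a construct might look like the
following sketch. It assumes a plain text map file at
<CODE>/path/to/file/map.txt</CODE> which contains the pair
<CODE>Ralf.S.Engelschall rse</CODE>; the map name, the URL layout
(<CODE>/user/</CODE>, <CODE>/u/</CODE>) and the default value
<CODE>nobody</CODE> are hypothetical.
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
RewriteMap  real-to-user txt:/path/to/file/map.txt
# $1 is the lookup key; unknown keys fall back to the default ``nobody''
RewriteRule ^/user/([^/]+)/?(.*)$ /u/${real-to-user:$1|nobody}/$2
</PRE></TD></TR>
</TABLE>
<P>
Under these assumptions a request for
<CODE>/user/Ralf.S.Engelschall/index.html</CODE> would be rewritten to
<CODE>/u/rse/index.html</CODE>, while an unknown name would end up under
<CODE>/u/nobody/</CODE>.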
<P>
The following combinations for <EM>MapType</EM> and <EM>MapSource</EM>
can be used:
<UL>
<LI><STRONG>Standard Plain Text</STRONG><BR>
MapType: <CODE>txt</CODE>, MapSource: Unix filesystem path to valid regular
file
<P>
This is the standard rewriting map feature where the <EM>MapSource</EM> is
a plain ASCII file containing either blank lines, comment lines (starting
with a '#' character) or pairs like the following, one per line:
<BLOCKQUOTE><STRONG>
<EM>MatchingKey</EM> <EM>SubstValue</EM>
</STRONG></BLOCKQUOTE>
<P>
Example:
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
##
## map.txt -- rewriting map
##
Ralf.S.Engelschall rse # Bastard Operator From Hell
Mr.Joe.Average joe # Mr. Average
</PRE></TD></TR>
</TABLE>
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
RewriteMap real-to-user txt:/path/to/file/map.txt
</PRE></TD></TR>
</TABLE>
<P>
<LI><STRONG>Randomized Plain Text</STRONG><BR>
MapType: <CODE>rnd</CODE>, MapSource: Unix filesystem path to valid regular
file
<P>
This is identical to the Standard Plain Text variant above but with a
special post-processing feature: after a value is looked up, it is split at
any ``<CODE>|</CODE>'' characters it contains, which have the meaning of
``or''. In other words, they indicate a set of alternatives from which the
actual returned value is chosen randomly. Although this may sound odd, it
was designed for load balancing in a reverse proxy situation where
the looked up values are server names.
<P>
Example:
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
##
## map.txt -- rewriting map
##
static www1|www2|www3|www4
dynamic www5|www6
</PRE></TD></TR>
</TABLE>
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
RewriteMap servers rnd:/path/to/file/map.txt
</PRE></TD></TR>
</TABLE>
<P>
<LI><STRONG>Hash File</STRONG><BR>
MapType: <CODE>dbm</CODE>, MapSource: Unix filesystem path to valid
regular file
<P>
Here the source is a binary NDBM format file containing the same contents
as a <EM>Plain Text</EM> format file, but in a special representation
which is optimized for really fast lookups. You can create such a file with
any NDBM tool or with the following Perl script:
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
#!/path/to/bin/perl
##
## txt2dbm -- convert txt map to dbm format
##
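($txtmap, $dbmmap) = @ARGV;
open(TXT, "&lt;$txtmap") or die "Cannot open $txtmap: $!\n";
dbmopen(%DB, $dbmmap, 0644) or die "Cannot open $dbmmap: $!\n";
while (&lt;TXT&gt;) {
    # skip comment lines and blank lines
    next if (m|^\s*#.*| or m|^\s*$|);
    # store the first two fields; a trailing comment is ignored
    $DB{$1} = $2 if (m|^\s*(\S+)\s+(\S+)|);
}
dbmclose(%DB);
close(TXT);</PRE></TD></TR>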
</TABLE>
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>$ txt2dbm map.txt map.db </PRE></TD></TR>
</TABLE>
<P>
<LI><STRONG>Internal Function</STRONG><BR>
MapType: <CODE>int</CODE>, MapSource: Internal Apache function
<P>
Here the source is an internal Apache function. Currently you cannot
create your own, but the following functions already exist:
<UL>
<LI><STRONG>toupper</STRONG>:<BR>
Converts the looked up key to all upper case.
<LI><STRONG>tolower</STRONG>:<BR>
Converts the looked up key to all lower case.
<LI><STRONG>escape</STRONG>:<BR>
Translates special characters in the looked up key to hex-encodings.
<LI><STRONG>unescape</STRONG>:<BR>
Translates hex-encodings in the looked up key back to special characters.
</UL>
<P>
<LI><STRONG>External Rewriting Program</STRONG><BR>
MapType: <CODE>prg</CODE>, MapSource: Unix filesystem path to valid
regular file
<P>
Here the source is a Unix program, not a map file. To create it you can use
the language of your choice, but the result has to be a runnable Unix
executable (<EM>i.e.</EM>, either object-code or a script with the
magic cookie trick '<CODE>#!/path/to/interpreter</CODE>' as the first
line).
<P>
This program is started once at startup of the Apache server and then
communicates with the rewriting engine over its <CODE>stdin</CODE> and
<CODE>stdout</CODE> file-handles. For each map-function lookup it will
receive the key to look up as a newline-terminated string on
<CODE>stdin</CODE>. It then has to give back the looked-up value as a
newline-terminated string on <CODE>stdout</CODE> or the four-character
string ``<CODE>NULL</CODE>'' if it fails (<EM>i.e.</EM>, there is no
corresponding value
for the given key). A trivial program which will implement a 1:1 map
(<EM>i.e.</EM>, key == value) could be:
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
#!/usr/bin/perl
$| = 1;
while (&lt;STDIN&gt;) {
# ...here any transformations
# or lookups should occur...
print $_;
}
</PRE></TD></TR>
</TABLE>
<P>
But be very careful:<BR>
<OL>
<LI>``<EM>Keep the program simple, stupid</EM>'' (KISS), because
if this program hangs it will lead to a hang of the Apache server
when the rule occurs.
<LI>Avoid one common mistake: never do buffered I/O on <CODE>stdout</CODE>!
This will cause a deadlock! Hence the ``<CODE>$|=1</CODE>'' in the
above example...
<LI>Use the <SAMP>RewriteLock</SAMP> directive to define a lockfile
mod_rewrite can use to synchronize the communication to the program.
Per default no such synchronization takes place.
</OL>
</UL>
The <CODE>RewriteMap</CODE> directive can occur more than once. For each
mapping-function use one <CODE>RewriteMap</CODE> directive to declare its
rewriting mapfile. While you cannot <STRONG>declare</STRONG> a map in
per-directory context it is of course possible to <STRONG>use</STRONG>
this map in per-directory context.
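<P>
The following sketch declares two maps side by side and then uses them in
rules. The map names, the <CODE>example.com</CODE> hostnames and the URL
prefixes are hypothetical; the first rule additionally assumes that the proxy
module is available and that the map file contains a <CODE>static</CODE> key
listing server names.
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
RewriteMap  servers   rnd:/path/to/file/map.txt
RewriteMap  lowercase int:tolower

# pick one of the ``static'' servers at random and proxy the request to it
RewriteRule ^/static/(.*)$ http://${servers:static}.example.com/static/$1 [P,L]
# redirect any other URL containing upper-case letters to its lower-case form
RewriteRule ^/(.*[A-Z].*)$ /${lowercase:$1} [R,L]
</PRE></TD></TR>
</TABLE>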
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice:</STRONG> For plain text and DBM format files the looked-up
keys are cached in-core
until the <CODE>mtime</CODE> of the mapfile changes or the server does a
restart. This way you can have map-functions in rules which are used
for <STRONG>every</STRONG> request. This is no problem, because the
external lookup only happens once!
</TD></TR>
</TABLE>
<P>
<HR NOSHADE SIZE=1>
<P>
<H3><A NAME="RewriteBase">RewriteBase</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> <CODE>RewriteBase</CODE> <EM>BaseURL</EM><BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>default is the physical directory path</EM>
<BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> directory, .htaccess<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>FileInfo</EM><BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.2<BR>
<P>
The <CODE>RewriteBase</CODE> directive explicitly sets the base URL for
per-directory rewrites. As you will see below, <CODE>RewriteRule</CODE> can be
used in per-directory config files (<CODE>.htaccess</CODE>). There it will act
locally, <EM>i.e.</EM>, the local directory prefix is stripped at this stage of
processing and your rewriting rules act only on the remainder. At the end
it is automatically added back.
<P>
When a substitution occurs for a new URL, this module has to re-inject the URL
into the server processing. To be able to do this it needs to know what the
corresponding URL-prefix or URL-base is. By default this prefix is the
corresponding filepath itself. <STRONG>But at most websites URLs are
<STRONG>NOT</STRONG> directly related to physical filename paths, so this
assumption will usually be wrong!</STRONG> In that case you have to use the
<CODE>RewriteBase</CODE> directive to specify the correct URL-prefix.
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice:</STRONG> If your webserver's URLs are <STRONG>not</STRONG>
directly related to physical file paths, you have to use
<CODE>RewriteBase</CODE> in every
<CODE>.htaccess</CODE> file where you want to use <CODE>RewriteRule</CODE>
directives.
</TD></TR>
</TABLE>
<P>
<STRONG>Example:</STRONG>
<BLOCKQUOTE>
Assume the following per-directory config file:
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
#
# /abc/def/.htaccess -- per-dir config file for directory /abc/def
# Remember: /abc/def is the physical path of /xyz, <EM>i.e.</EM>, the server
# has an 'Alias /xyz /abc/def' directive <EM>e.g.</EM>
#
RewriteEngine On
# let the server know that we are reached via /xyz and not
# via the physical path prefix /abc/def
RewriteBase /xyz
# now the rewriting rules
RewriteRule ^oldstuff\.html$ newstuff.html
</PRE></TD></TR>
</TABLE>
<P>
In the above example, a request to <CODE>/xyz/oldstuff.html</CODE>
gets correctly
rewritten to the physical file <CODE>/abc/def/newstuff.html</CODE>.
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<FONT SIZE=-1>
<STRONG>Notice - For the Apache hackers:</STRONG><BR>
The following list gives detailed information about the internal
processing steps:
<P>
<PRE>
Request:
/xyz/oldstuff.html
Internal Processing:
/xyz/oldstuff.html -&gt; /abc/def/oldstuff.html (per-server Alias)
/abc/def/oldstuff.html -&gt; /abc/def/newstuff.html (per-dir RewriteRule)
/abc/def/newstuff.html -&gt; /xyz/newstuff.html (per-dir RewriteBase)
/xyz/newstuff.html -&gt; /abc/def/newstuff.html (per-server Alias)
Result:
/abc/def/newstuff.html
</PRE>
This seems very complicated but is the correct Apache internal processing,
because the per-directory rewriting comes too late in the process. So,
when it occurs the (rewritten) request has to be re-injected into the Apache
kernel! BUT: While this seems like a serious overhead, it really isn't, because
this re-injection happens entirely inside the Apache server and the same
procedure is used by many other operations inside Apache. So, you can be
sure the design and implementation is correct.
</FONT>
</TD></TR>
</TABLE>
</BLOCKQUOTE>
<P>
<HR NOSHADE SIZE=1>
<P>
<H3><A NAME="RewriteCond">RewriteCond</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> <CODE>RewriteCond</CODE> <EM>TestString</EM>
<EM>CondPattern</EM><BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>None</EM><BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> server config, virtual host, directory,
.htaccess<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>FileInfo</EM><BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.2 (partially), Apache 1.3<BR>
<P>
The <CODE>RewriteCond</CODE> directive defines a rule condition. Precede a
<CODE>RewriteRule</CODE> directive with one or more <CODE>RewriteCond</CODE>
directives.
The following rewriting rule is only used if its pattern matches the current
state of the URI <STRONG>and</STRONG> if these additional conditions apply
too.
<P>
<EM>TestString</EM> is a string which can contain the following
expanded constructs in addition to plain text:
<UL>
<LI><STRONG>RewriteRule backreferences</STRONG>: These are backreferences of
the form
<BLOCKQUOTE><STRONG>
<CODE>$N</CODE>
</STRONG></BLOCKQUOTE>
(0 &lt;= N &lt;= 9) which provide access to the grouped parts (parentheses!)
of the pattern from the corresponding <CODE>RewriteRule</CODE> directive (the
one following the current bunch of <CODE>RewriteCond</CODE> directives).
<P>
<LI><STRONG>RewriteCond backreferences</STRONG>: These are backreferences of
the form
<BLOCKQUOTE><STRONG>
<CODE>%N</CODE>
</STRONG></BLOCKQUOTE>
(1 &lt;= N &lt;= 9) which provide access to the grouped parts (parentheses!) of
the pattern from the last matched <CODE>RewriteCond</CODE> directive in the
current bunch of conditions.
<P>
<LI><STRONG>Server-Variables</STRONG>: These are variables
of the form
<BLOCKQUOTE><STRONG>
<CODE>%{</CODE> <EM>NAME_OF_VARIABLE</EM> <CODE>}</CODE>
</STRONG></BLOCKQUOTE>
where <EM>NAME_OF_VARIABLE</EM> can be a string
of the following list:
<P>
<TABLE BGCOLOR="#F0F0F0" CELLSPACING=0 CELLPADDING=5>
<TR>
<TD VALIGN=TOP>
<STRONG>HTTP headers:</STRONG><P>
<FONT SIZE=-1>
HTTP_USER_AGENT<BR>
HTTP_REFERER<BR>
HTTP_COOKIE<BR>
HTTP_FORWARDED<BR>
HTTP_HOST<BR>
HTTP_PROXY_CONNECTION<BR>
HTTP_ACCEPT<BR>
</FONT>
</TD>
<TD VALIGN=TOP>
<STRONG>connection &amp; request:</STRONG><P>
<FONT SIZE=-1>
REMOTE_ADDR<BR>
REMOTE_HOST<BR>
REMOTE_USER<BR>
REMOTE_IDENT<BR>
REQUEST_METHOD<BR>
SCRIPT_FILENAME<BR>
PATH_INFO<BR>
QUERY_STRING<BR>
AUTH_TYPE<BR>
</FONT>
</TD>
</TR>
<TR>
<TD VALIGN=TOP>
<STRONG>server internals:</STRONG><P>
<FONT SIZE=-1>
DOCUMENT_ROOT<BR>
SERVER_ADMIN<BR>
SERVER_NAME<BR>
SERVER_ADDR<BR>
SERVER_PORT<BR>
SERVER_PROTOCOL<BR>
SERVER_SOFTWARE<BR>
</FONT>
</TD>
<TD VALIGN=TOP>
<STRONG>system stuff:</STRONG><P>
<FONT SIZE=-1>
TIME_YEAR<BR>
TIME_MON<BR>
TIME_DAY<BR>
TIME_HOUR<BR>
TIME_MIN<BR>
TIME_SEC<BR>
TIME_WDAY<BR>
TIME<BR>
</FONT>
</TD>
<TD VALIGN=TOP>
<STRONG>specials:</STRONG><P>
<FONT SIZE=-1>
API_VERSION<BR>
THE_REQUEST<BR>
REQUEST_URI<BR>
REQUEST_FILENAME<BR>
IS_SUBREQ<BR>
</FONT>
</TD>
</TR>
</TABLE>
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice:</STRONG> These variables all correspond to the similarly named
HTTP MIME-headers, C variables of the Apache server or <CODE>struct tm</CODE>
fields of the Unix system.
</TD></TR>
</TABLE>
</UL>
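<P>
The constructs above can be combined, as in the following sketch. It uses a
server-variable as <EM>TestString</EM> and picks up the grouped part of the
<EM>CondPattern</EM> via <CODE>%1</CODE> in the following rule. The domain
<CODE>example.com</CODE> and the <CODE>/vhosts/</CODE> layout are
hypothetical, and the pattern assumes that the ``<CODE>Host:</CODE>'' header
carries no port number.
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
# %1 = the name between ``www.'' and ``.example.com'' in the Host header
RewriteCond %{HTTP_HOST} ^www\.([^.]+)\.example\.com$ [NC]
# $1 = the requested path without its leading slash
RewriteRule ^/(.*)$ /vhosts/%1/$1 [L]
</PRE></TD></TR>
</TABLE>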
<P>
Special Notes:
<OL>
<LI>The variables SCRIPT_FILENAME and REQUEST_FILENAME contain the same
value, <EM>i.e.</EM>, the value of the <CODE>filename</CODE> field of
the internal
<CODE>request_rec</CODE> structure of the Apache server. The first name is
just the
commonly known CGI variable name while the second is the consistent
counterpart to REQUEST_URI (which contains the value of the <CODE>uri</CODE>
field of <CODE>request_rec</CODE>).
<P>
<LI>There is the special format: <CODE>%{ENV:variable}</CODE> where
<EM>variable</EM> can be any environment variable. This is looked-up via
internal Apache structures and (if not found there) via <CODE>getenv()</CODE>
from the Apache server process.
<P>
<LI>There is the special format: <CODE>%{HTTP:header}</CODE> where
<EM>header</EM> can be any HTTP MIME-header name. This is looked-up
from the HTTP request. Example: <CODE>%{HTTP:Proxy-Connection}</CODE>
is the value of the HTTP header ``<CODE>Proxy-Connection:</CODE>''.
<P>
<LI>There is the special format <CODE>%{LA-U:variable}</CODE> for look-aheads
which perform an internal (URL-based) sub-request to determine the final value
of <EM>variable</EM>. Use this when you want to use a variable for rewriting
which actually is set later in an API phase and thus is not available at the
current stage. For instance when you want to rewrite according to the
<CODE>REMOTE_USER</CODE> variable from within the per-server context
(<CODE>httpd.conf</CODE> file) you have to use <CODE>%{LA-U:REMOTE_USER}</CODE>
because this variable is set by the authorization phases which come
<EM>after</EM> the URL translation phase where mod_rewrite operates. On the
other hand, because mod_rewrite implements its per-directory context
(<CODE>.htaccess</CODE> file) via the Fixup phase of the API and because the
authorization phases come <EM>before</EM> this phase, you can just use
<CODE>%{REMOTE_USER}</CODE> there.
<P>
<LI>There is the special format: <CODE>%{LA-F:variable}</CODE> which performs an
internal (filename-based) sub-request to determine the final value of
<EM>variable</EM>. Most of the time this is the same as LA-U above.
</OL>
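<P>
Two of the special formats from these notes might be used as in the following
sketch for the per-server context. The <CODE>/private/</CODE> and
<CODE>/home/</CODE> paths are hypothetical, and the second rule assumes that
<CODE>/private/</CODE> is protected by authentication so that
<CODE>REMOTE_USER</CODE> gets set at all.
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
# deny requests which carry a non-empty ``Proxy-Connection:'' header
RewriteCond %{HTTP:Proxy-Connection} .
RewriteRule .* - [F]

# REMOTE_USER is not yet known in this phase, so look ahead for it
RewriteCond %{LA-U:REMOTE_USER} ^(.+)$
RewriteRule ^/private/(.*)$ /home/%1/private/$1 [L]
</PRE></TD></TR>
</TABLE>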
<P>
<EM>CondPattern</EM> is the condition pattern, <EM>i.e.</EM>, a regular
expression
which gets applied to the current instance of the <EM>TestString</EM>,
<EM>i.e.</EM>, <EM>TestString</EM> gets evaluated and then matched against
<EM>CondPattern</EM>.
<P>
<STRONG>Remember:</STRONG> <EM>CondPattern</EM> is a standard
<EM>Extended Regular Expression</EM> with some additions:
<OL>
<LI>You can precede the pattern string with a '<CODE>!</CODE>' character
(exclamation mark) to specify a <STRONG>non</STRONG>-matching pattern.
<P>
<LI>
There are some special variants of <EM>CondPatterns</EM>. Instead of real
regular expression strings you can also use one of the following:
<P>
<UL>
<LI>'<STRONG>&lt;CondPattern</STRONG>' (is lexicographically lower)<BR>
Treats the <EM>CondPattern</EM> as a plain string and compares it
lexicographically to <EM>TestString</EM> and results in a true expression if
<EM>TestString</EM> is lexicographically lower than <EM>CondPattern</EM>.
<P>
<LI>'<STRONG>&gt;CondPattern</STRONG>' (is lexicographically greater)<BR>
Treats the <EM>CondPattern</EM> as a plain string and compares it
lexicographically to <EM>TestString</EM> and results in a true expression if
<EM>TestString</EM> is lexicographically greater than <EM>CondPattern</EM>.
<P>
<LI>'<STRONG>=CondPattern</STRONG>' (is lexicographically equal)<BR>
Treats the <EM>CondPattern</EM> as a plain string and compares it
lexicographically to <EM>TestString</EM> and results in a true expression if
<EM>TestString</EM> is lexicographically equal to <EM>CondPattern</EM>, <EM>i.e.</EM>, the
two strings are exactly equal (character by character).
If <EM>CondPattern</EM> is just <SAMP>""</SAMP> (two quotation marks) this
compares <EM>TestString</EM> against the empty string.
<P>
<LI>'<STRONG>-d</STRONG>' (is <STRONG>d</STRONG>irectory)<BR>
Treats the <EM>TestString</EM> as a pathname and
tests if it exists and is a directory.
<P>
<LI>'<STRONG>-f</STRONG>' (is regular <STRONG>f</STRONG>ile)<BR>
Treats the <EM>TestString</EM> as a pathname and
tests if it exists and is a regular file.
<P>
<LI>'<STRONG>-s</STRONG>' (is regular file with <STRONG>s</STRONG>ize)<BR>
Treats the <EM>TestString</EM> as a pathname and
tests if it exists and is a regular file with size greater than zero.
<P>
<LI>'<STRONG>-l</STRONG>' (is symbolic <STRONG>l</STRONG>ink)<BR>
Treats the <EM>TestString</EM> as a pathname and
tests if it exists and is a symbolic link.
<P>
<LI>'<STRONG>-F</STRONG>' (is existing file via subrequest)<BR>
Checks if <EM>TestString</EM> is a valid file and accessible via all the
server's currently-configured access controls for that path. This uses an
internal subrequest to determine the check, so use it with care because it
decreases your server's performance!
<P>
<LI>'<STRONG>-U</STRONG>' (is existing URL via subrequest)<BR>
Checks if <EM>TestString</EM> is a valid URL and accessible via all the
server's
currently-configured access controls for that path. This uses an internal
subrequest to determine the check, so use it with care because it decreases
your server's performance!
</UL>
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice:</STRONG>
All of these tests can also be prefixed by a not ('!') character
to negate their meaning.
</TD></TR>
</TABLE>
</OL>
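<P>
A sketch of the special <EM>CondPattern</EM> variants in use, written for the
per-server context. All file names and URL prefixes are hypothetical. The
first block relies on the lexicographic comparisons to serve a different page
during the day; the second one uses a negated file test.
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
# true if the current time (HHMM) is after 07:00 and before 19:00
RewriteCond %{TIME_HOUR}%{TIME_MIN} &gt;0700
RewriteCond %{TIME_HOUR}%{TIME_MIN} &lt;1900
RewriteRule ^/foo\.html$ /foo.day.html [L]
# reached only when the conditions above did not apply
RewriteRule ^/foo\.html$ /foo.night.html [L]

# rewrite only if the requested document is not an existing regular file
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule ^/docs/(.+)$ /archive/docs/$1 [L]
</PRE></TD></TR>
</TABLE>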
<P>
Additionally you can set special flags for <EM>CondPattern</EM> by appending
<BLOCKQUOTE><STRONG>
<CODE>[</CODE><EM>flags</EM><CODE>]</CODE>
</STRONG></BLOCKQUOTE>
as the third argument to the <CODE>RewriteCond</CODE> directive. <EM>Flags</EM>
is a comma-separated list of the following flags:
<UL>
<LI>'<STRONG><CODE>nocase|NC</CODE></STRONG>' (<STRONG>n</STRONG>o <STRONG>c</STRONG>ase)<BR>
This makes the condition test case-insensitive, <EM>i.e.</EM>, there is
no difference between 'A-Z' and 'a-z' both in the expanded
<EM>TestString</EM> and the <EM>CondPattern</EM>.
<P>
<LI>'<STRONG><CODE>ornext|OR</CODE></STRONG>' (<STRONG>or</STRONG> next condition)<BR>
Use this to combine rule conditions with a local OR instead of the
implicit AND. Typical example:
<P>
<BLOCKQUOTE><PRE>
RewriteCond %{REMOTE_HOST} ^host1.* [OR]
RewriteCond %{REMOTE_HOST} ^host2.* [OR]
RewriteCond %{REMOTE_HOST} ^host3.*
RewriteRule ...some special stuff for any of these hosts...
</PRE></BLOCKQUOTE>
Without this flag you would have to write down the cond/rule three times.
</UL>
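<P>
The <CODE>NC</CODE> flag can be sketched in the same way. The following
rules refuse to deliver GIF images when the request names a foreign page in
its ``<CODE>Referer:</CODE>'' header; the hostname
<CODE>www.example.com</CODE> is hypothetical and is compared
case-insensitively.
<P>
<BLOCKQUOTE><PRE>
# an empty Referer is always accepted
RewriteCond %{HTTP_REFERER} !^$
# so is our own (hypothetical) host, in any spelling
RewriteCond %{HTTP_REFERER} !^http://www\.example\.com/ [NC]
RewriteRule \.gif$ - [F]
</PRE></BLOCKQUOTE>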
<P>
<STRONG>Example:</STRONG>
<BLOCKQUOTE>
To rewrite the homepage of a site according to the ``<CODE>User-Agent:</CODE>''
header of the request, you can use the following:
<BLOCKQUOTE><PRE>
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*
RewriteRule ^/$ /homepage.max.html [L]
RewriteCond %{HTTP_USER_AGENT} ^Lynx.*
RewriteRule ^/$ /homepage.min.html [L]
RewriteRule ^/$ /homepage.std.html [L]
</PRE></BLOCKQUOTE>
Interpretation: If you use Netscape Navigator as your browser (which identifies
itself as 'Mozilla'), then you get the max homepage, which includes
frames, <EM>etc.</EM> If you use the Lynx browser (which is terminal-based), then you
get the min homepage, which contains no images, no tables, <EM>etc.</EM> If you
use any other browser you get the standard homepage.
</BLOCKQUOTE>
<P>
<HR NOSHADE SIZE=1>
<P>
<H3><A NAME="RewriteRule">RewriteRule</A></H3>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> <CODE>RewriteRule</CODE> <EM>Pattern</EM> <EM>Substitution</EM><BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>None</EM><BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> server config, virtual host, directory, .htaccess<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>FileInfo</EM><BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Extension<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_rewrite.c<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Apache 1.2 (partially), Apache 1.3<BR>
<P>
The <CODE>RewriteRule</CODE> directive is the real rewriting workhorse. The
directive can occur more than once. Each directive then defines one single
rewriting rule. The <STRONG>definition order</STRONG> of these rules is
<STRONG>important</STRONG>, because this order is used when applying the rules at
run-time.
<P>
<A NAME="patterns"><EM>Pattern</EM></A> is a <A NAME="regexp">regular expression</A>
(System V8 for Apache 1.1.x, POSIX for Apache 1.2.x)
which gets applied to the current URL. Here ``current'' means the value of the
URL when this rule gets applied. This may not be the original requested
URL, because there could be any number of rules before which already matched
and made alterations to it.
<P>
Some hints about the syntax of regular expressions:
<P>
<TABLE BGCOLOR="#F0F0F0" CELLSPACING=0 CELLPADDING=5>
<TR>
<TD VALIGN=TOP>
<PRE>
<STRONG>Text:</STRONG>
<STRONG><CODE>.</CODE></STRONG> Any single character
<STRONG><CODE>[</CODE></STRONG>chars<STRONG><CODE>]</CODE></STRONG> Character class: One of chars
<STRONG><CODE>[^</CODE></STRONG>chars<STRONG><CODE>]</CODE></STRONG> Character class: None of chars
text1<STRONG><CODE>|</CODE></STRONG>text2 Alternative: text1 or text2
<STRONG>Quantifiers:</STRONG>
<STRONG><CODE>?</CODE></STRONG> 0 or 1 of the preceding text
<STRONG><CODE>*</CODE></STRONG> 0 or N of the preceding text (N &gt; 0)
<STRONG><CODE>+</CODE></STRONG> 1 or N of the preceding text (N &gt; 1)
<STRONG>Grouping:</STRONG>
<STRONG><CODE>(</CODE></STRONG>text<STRONG><CODE>)</CODE></STRONG> Grouping of text
(either to set the borders of an alternative or
for making backreferences where the <STRONG>N</STRONG>th group can
be used on the RHS of a RewriteRule with <CODE>$</CODE><STRONG>N</STRONG>)
<STRONG>Anchors:</STRONG>
<STRONG><CODE>^</CODE></STRONG> Start of line anchor
<STRONG><CODE>$</CODE></STRONG> End of line anchor
<STRONG>Escaping:</STRONG>
<STRONG><CODE>\</CODE></STRONG>char escape that particular char
(for instance to specify the chars "<CODE>.[]()</CODE>" <EM>etc.</EM>)
</PRE>
</TD>
</TR>
</TABLE>
<P>
For more information about regular expressions either have a look at your
local regex(3) manpage or its <CODE>src/regex/regex.3</CODE> copy in the
Apache 1.3 distribution. When you are interested in more detailed and deeper
information about regular expressions and their variants (POSIX regex, Perl
regex, <EM>etc.</EM>) have a look at the following dedicated book on this topic:
<BLOCKQUOTE>
<EM>Mastering Regular Expressions</EM><BR>
Jeffrey E.F. Friedl<BR>
Nutshell Handbook Series<BR>
O'Reilly &amp; Associates, Inc. 1997<BR>
ISBN 1-56592-257-3<BR>
</BLOCKQUOTE>
<P>
Additionally in mod_rewrite the NOT character ('<CODE>!</CODE>') is a possible
pattern prefix. This gives you the ability to negate a pattern; to say, for
instance: ``<EM>if the current URL does <STRONG>NOT</STRONG> match this
pattern</EM>''. This can be used for special cases where it is better to match
the negative pattern or as a last default rule.
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice:</STRONG> When using the NOT character to negate a pattern you cannot
have grouped wildcard parts in the pattern. This is impossible because when
the pattern does NOT match, there are no contents for the groups. In
consequence, if negated patterns are used, you cannot use <CODE>$N</CODE> in the
substitution string!
</TD></TR>
</TABLE>
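<P>
A negated pattern might be used as in the following sketch, which forbids
every URL outside a hypothetical <CODE>/public/</CODE> area. Because the
pattern is negated, no <CODE>$N</CODE> appears in the rule.
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
# applies to all URLs which do NOT start with /public/
RewriteRule !^/public/ - [F]
</PRE></TD></TR>
</TABLE>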
<P>
<A NAME="rhs"><EM>Substitution</EM></A> of a rewriting rule is the string
which is substituted for (or replaces) the original URL for which
<EM>Pattern</EM> matched. Besides plain text you can use
<OL>
<LI>back-references <CODE>$N</CODE> to the RewriteRule pattern
<LI>back-references <CODE>%N</CODE> to the last matched RewriteCond pattern
<LI>server-variables as in rule condition test-strings (<CODE>%{VARNAME}</CODE>)
<LI><A HREF="#mapfunc">mapping-function</A> calls (<CODE>${mapname:key|default}</CODE>)
</OL>
Back-references are <CODE>$</CODE><STRONG>N</STRONG> (<STRONG>N</STRONG>=0..9) identifiers which
will be replaced by the contents of the <STRONG>N</STRONG>th group of the matched
<EM>Pattern</EM>. The server-variables are the same as for the
<EM>TestString</EM> of a <CODE>RewriteCond</CODE> directive. The
mapping-functions come from the <CODE>RewriteMap</CODE> directive and are
explained there. These three types of variables are expanded in the order of
the above list.
<P>
As already mentioned above, all the rewriting rules are applied to the
<EM>Substitution</EM> (in the order of definition in the config file). The
URL is <STRONG>completely replaced</STRONG> by the <EM>Substitution</EM> and the
rewriting process goes on until there are no more rules (unless explicitly
terminated by an <CODE><STRONG>L</STRONG></CODE> flag - see below).
<P>
There is a special substitution string named '<CODE>-</CODE>' which means:
<STRONG>NO substitution</STRONG>! Sounds silly? No, it is useful to provide rewriting
rules which <STRONG>only</STRONG> match some URLs but do no substitution, <EM>e.g.</EM>, in
conjunction with the <STRONG>C</STRONG> (chain) flag to be able to have more than one
pattern to be applied before a substitution occurs.
<P>
One more note: You can even create URLs in the substitution string containing
a query string part. Just use a question mark inside the substitution string
to indicate that the following stuff should be re-injected into the
QUERY_STRING. When you want to erase an existing query string, end the
substitution string with just the question mark.
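<P>
Both uses of the question mark can be sketched as follows; the URLs and the
CGI script name are hypothetical.
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
# build a query string from a back-reference
RewriteRule ^/item/([0-9]+)$ /cgi-bin/item.cgi?id=$1 [L]
# trailing question mark: erase any query string of the original request
RewriteRule ^/old\.html$ /new.html? [L]
</PRE></TD></TR>
</TABLE>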
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice</STRONG>: There is a special feature. When you prefix a substitution
field with <CODE>http://</CODE><EM>thishost</EM>[<EM>:thisport</EM>] then
<STRONG>mod_rewrite</STRONG> automatically strips it out. This auto-reduction on
implicit external redirect URLs is a useful and important feature when
used in combination with a mapping-function which generates the hostname
part. Have a look at the first example in the example section below to
understand this.
</TD></TR>
</TABLE>
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Remember:</STRONG> An unconditional external redirect to your own server will
not work with the prefix <CODE>http://thishost</CODE> because of this feature.
To achieve such a self-redirect, you have to use the <STRONG>R</STRONG>-flag (see
below).
</TD></TR>
</TABLE>
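<P>
A self-redirect of this kind might look like the following sketch, which
relies on the <STRONG>R</STRONG> and <STRONG>L</STRONG> flags described below.
The <CODE>/u/</CODE> and <CODE>/docs/</CODE> layouts are hypothetical.
<P>
<TABLE BORDER=0 CELLSPACING=1 CELLPADDING=5 BGCOLOR="#F0F0F0">
<TR><TD><PRE>
# send the client to the canonical /u/ form with a temporary redirect
RewriteRule ^/~(.+)$ /u/$1 [R,L]
# the same idea with an explicit response code
RewriteRule ^/olddocs/(.*)$ /docs/$1 [R=permanent,L]
</PRE></TD></TR>
</TABLE>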
<P>
Additionally you can set special flags for <EM>Substitution</EM> by appending
<BLOCKQUOTE><STRONG>
<CODE>[</CODE><EM>flags</EM><CODE>]</CODE>
</STRONG></BLOCKQUOTE>
as the third argument to the <CODE>RewriteRule</CODE> directive. <EM>Flags</EM> is a
comma-separated list of the following flags:
<UL>
<LI>'<STRONG><CODE>redirect|R</CODE> [=<EM>code</EM>]</STRONG>' (force <A NAME="redirect"><STRONG>r</STRONG>edirect</A>)<BR>
Prefix <EM>Substitution</EM>
with <CODE>http://thishost[:thisport]/</CODE> (which makes the new URL a URI) to
force an external redirection. If no <EM>code</EM> is given, an HTTP response
of 302 (MOVED TEMPORARILY) is used. If you want to use other response
codes in the range 300-400 just specify them as a number or use
one of the following symbolic names: <CODE>temp</CODE> (default), <CODE>permanent</CODE>,
<CODE>seeother</CODE>.
Use it for rules which should
canonicalize the URL and give it back to the client, <EM>e.g.</EM>, translate
``<CODE>/~</CODE>'' into ``<CODE>/u/</CODE>'' or always append a slash to
<CODE>/u/</CODE><EM>user</EM>, etc.<BR>
<P>
<STRONG>Notice:</STRONG> When you use this flag, make sure that the
substitution field is a valid URL! If not, you are redirecting to an
invalid location! And remember that this flag itself only prefixes the
URL with <CODE>http://thishost[:thisport]/</CODE>, but rewriting goes on.
Usually you also want to stop and do the redirection immediately. To stop
the rewriting you also have to provide the 'L' flag.
<P>
<LI>'<STRONG><CODE>forbidden|F</CODE></STRONG>' (force URL to be <STRONG>f</STRONG>orbidden)<BR>
This forces the current URL to be forbidden, <EM>i.e.</EM>, it immediately sends
back an HTTP response of 403 (FORBIDDEN). Use this flag in conjunction with
appropriate RewriteConds to conditionally block some URLs.
<P>
<LI>'<STRONG><CODE>gone|G</CODE></STRONG>' (force URL to be <STRONG>g</STRONG>one)<BR>
This forces the current URL to be gone, <EM>i.e.</EM>, it immediately sends back an
HTTP response of 410 (GONE). Use this flag to mark pages which no longer
exist as gone.
<P>
<LI>'<STRONG><CODE>proxy|P</CODE></STRONG>' (force <STRONG>p</STRONG>roxy)<BR>
This flag forces the substitution part to be handled internally as a proxy
request and immediately (<EM>i.e.</EM>, rewriting rule processing stops here) put
through the <A HREF="mod_proxy.html">proxy module</A>. You have to make
sure that the substitution string is a valid URI (<EM>e.g.</EM>, typically starting
with <CODE>http://</CODE><EM>hostname</EM>) which can be handled by the
Apache proxy module. If not, you get an error from the proxy module. Use
this flag to achieve a more powerful implementation of the <A
HREF="mod_proxy.html#proxypass">ProxyPass</A> directive, to map
remote content into the namespace of the local server.
<P>
Notice: To use this functionality make sure you have the proxy module
compiled into your Apache server program. If you don't know, check
whether <CODE>mod_proxy.c</CODE> is part of the ``<CODE>httpd -l</CODE>''
output. If yes, this functionality is available to mod_rewrite. If not,
then you first have to rebuild the ``<CODE>httpd</CODE>'' program with
mod_proxy enabled.
<P>
<LI>'<STRONG><CODE>last|L</CODE></STRONG>' (<STRONG>l</STRONG>ast rule)<BR>
Stop the rewriting process here and
don't apply any more rewriting rules. This corresponds to the Perl
<CODE>last</CODE> command or the <CODE>break</CODE> command from the C
language. Use this flag to prevent the currently rewritten URL from being
rewritten further by following rules which may be wrong. For
example, use it to rewrite the root-path URL ('<CODE>/</CODE>') to a real
one, <EM>e.g.</EM>, '<CODE>/e/www/</CODE>'.
<P>
<LI>'<STRONG><CODE>next|N</CODE></STRONG>' (<STRONG>n</STRONG>ext round)<BR>
Re-run the rewriting process (starting again with the first rewriting
rule). Here the URL to match is again not the original URL but the URL
from the last rewriting rule. This corresponds to the Perl
<CODE>next</CODE> command or the <CODE>continue</CODE> command from the C
language. Use this flag to restart the rewriting process, <EM>i.e.</EM>, to
immediately go to the top of the loop. <BR>
<STRONG>But be careful not to create an infinite loop!</STRONG>
<P>
<LI>'<STRONG><CODE>chain|C</CODE></STRONG>' (<STRONG>c</STRONG>hained with next rule)<BR>
This flag chains the current rule with the next rule (which itself can
also be chained with its following rule, <EM>etc.</EM>). This has the following
effect: if a rule matches, then processing continues as usual, <EM>i.e.</EM>, the
flag has no effect. If the rule does <STRONG>not</STRONG> match, then all following
chained rules are skipped. For instance, use it to remove the
``<CODE>.www</CODE>'' part inside a per-directory rule set when you let an
external redirect happen (where the ``<CODE>.www</CODE>'' part should not
occur!).
<P>
<LI>'<STRONG><CODE>type|T</CODE></STRONG>=<EM>MIME-type</EM>' (force MIME <STRONG>t</STRONG>ype)<BR>
Force the MIME-type of the target file to be <EM>MIME-type</EM>. For
instance, this can be used to simulate the <CODE>mod_alias</CODE>
directive <CODE>ScriptAlias</CODE> which internally forces all files inside
the mapped directory to have a MIME type of
``<CODE>application/x-httpd-cgi</CODE>''.
<P>
<LI>'<STRONG><CODE>nosubreq|NS</CODE></STRONG>' (used only if <STRONG>n</STRONG>o internal <STRONG>s</STRONG>ub-request)<BR>
This flag forces the rewriting engine to skip a rewriting rule if the
current request is an internal sub-request. For instance, sub-requests
occur internally in Apache when <CODE>mod_include</CODE> tries to find out
information about possible directory default files (<CODE>index.xxx</CODE>).
On sub-requests it is not always useful, and sometimes even causes a failure,
if the complete set of rules is applied. Use this flag to exclude some rules.<BR>
<P>
Use the following rule for your decision: whenever you prefix some URLs
with CGI-scripts to force them to be processed by the CGI-script, the
chance is high that you will run into problems (or even overhead) on sub-requests.
In these cases, use this flag.
<P>
<LI>'<STRONG><CODE>nocase|NC</CODE></STRONG>' (<STRONG>n</STRONG>o <STRONG>c</STRONG>ase)<BR>
This makes the <EM>Pattern</EM> case-insensitive, <EM>i.e.</EM>, there is
no difference between 'A-Z' and 'a-z' when <EM>Pattern</EM> is matched
against the current URL.
<P>
<LI>'<STRONG><CODE>qsappend|QSA</CODE></STRONG>' (<STRONG>q</STRONG>uery <STRONG>s</STRONG>tring
<STRONG>a</STRONG>ppend)<BR>
This flag forces the rewriting engine to append a query
string part in the substitution string to the existing one instead of
replacing it. Use this when you want to add more data to the query string
via a rewrite rule.
<P>
<LI>'<STRONG><CODE>passthrough|PT</CODE></STRONG>' (<STRONG>p</STRONG>ass <STRONG>t</STRONG>hrough to next handler)<BR>
This flag forces the rewriting engine to set the <CODE>uri</CODE> field
of the internal <CODE>request_rec</CODE> structure to the value
of the <CODE>filename</CODE> field. This flag is just a hack to be able
to post-process the output of <CODE>RewriteRule</CODE> directives by
<CODE>Alias</CODE>, <CODE>ScriptAlias</CODE>, <CODE>Redirect</CODE>, <EM>etc.</EM> directives
from other URI-to-filename translators. A trivial example to show the
semantics:
If you want to rewrite <CODE>/abc</CODE> to <CODE>/def</CODE> via the rewriting
engine of <CODE>mod_rewrite</CODE> and then <CODE>/def</CODE> to <CODE>/ghi</CODE>
with <CODE>mod_alias</CODE>:
<PRE>
RewriteRule ^/abc(.*) /def$1 [PT]
Alias /def /ghi
</PRE>
If you omit the <CODE>PT</CODE> flag then <CODE>mod_rewrite</CODE>
will do its job fine, <EM>i.e.</EM>, it rewrites <CODE>uri=/abc/...</CODE> to
<CODE>filename=/def/...</CODE> as a full API-compliant URI-to-filename
translator should do. Then <CODE>mod_alias</CODE> comes and tries to do a
URI-to-filename transition which will not work.
<P>
Notice: <STRONG>You have to use this flag if you want to intermix directives
of different modules which contain URL-to-filename translators</STRONG>. The
typical example is the use of <CODE>mod_alias</CODE> and
<CODE>mod_rewrite</CODE>.
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<FONT SIZE=-1>
<STRONG>Notice - For the Apache hackers:</STRONG><BR>
If the current Apache API had a
filename-to-filename hook in addition to the URI-to-filename hook, then
we wouldn't need this flag! But without such a hook this flag is the
only solution. The Apache Group has discussed this problem and will
add such hooks into Apache version 2.0.
</FONT>
</TD></TR>
</TABLE>
<P>
<LI>'<STRONG><CODE>skip|S</CODE></STRONG>=<EM>num</EM>' (<STRONG>s</STRONG>kip next rule(s))<BR>
This flag forces the rewriting engine to skip the next <EM>num</EM> rules
in sequence when the current rule matches. Use this to make pseudo
if-then-else constructs: The last rule of the then-clause becomes
a <CODE>skip=N</CODE> where N is the number of rules in the else-clause.
(This is <STRONG>not</STRONG> the same as the 'chain|C' flag!)
<P>
<LI>'<STRONG><CODE>env|E=</CODE></STRONG><EM>VAR</EM>:<EM>VAL</EM>' (set <STRONG>e</STRONG>nvironment variable)<BR>
This forces an environment variable named <EM>VAR</EM> to be set to the
value <EM>VAL</EM>, where <EM>VAL</EM> can contain regexp backreferences
<CODE>$N</CODE> and <CODE>%N</CODE> which will be expanded. You can use this flag
more than once to set more than one variable. The variables can later be
dereferenced in many situations, but the usual location will be from
within XSSI (via <CODE>&lt;!--#echo var="VAR"--&gt;</CODE>) or CGI (<EM>e.g.</EM>
<CODE>$ENV{'VAR'}</CODE>). Additionally, you can dereference it in a
following RewriteCond pattern via <CODE>%{ENV:VAR}</CODE>. Use this to strip
information from URLs while still remembering it.
</UL>
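<P>
The following is a minimal sketch of how some of these flags are typically
combined, not a recommended ruleset. All URL paths, the robot name
<CODE>BadRobot</CODE>, the script <CODE>/cgi-bin/item.cgi</CODE> and the
variable name <CODE>LANG_PREFIX</CODE> are hypothetical and only serve as
illustration:
<BLOCKQUOTE>
<PRE>
# R with L: canonicalize /~user to /u/user via a permanent external
# redirect and stop rewriting immediately
RewriteRule ^/~([^/]+)/?(.*)  /u/$1/$2  [R=permanent,L]

# F: forbid a subtree for one (hypothetical) robot; the substitution
# string "-" means "no substitution"
RewriteCond %{HTTP_USER_AGENT}  ^BadRobot
RewriteRule ^/private/          -  [F]

# G: mark a removed page as gone
RewriteRule ^/old-page\.html$   -  [G]

# QSA: add id=N to whatever query string the request already carried
RewriteRule ^/item/([0-9]+)$    /cgi-bin/item.cgi?id=$1  [QSA,L]

# E: strip a language prefix from the URL but remember it in the
# environment variable LANG_PREFIX
RewriteRule ^/lang/([a-z][a-z])/(.*)  /$2  [E=LANG_PREFIX:$1]
</PRE>
</BLOCKQUOTE>
Note that comments have to stand on lines of their own, because Apache does
not accept trailing comments after a directive.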
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice:</STRONG> Never forget that <EM>Pattern</EM> gets applied to a complete URL
in per-server configuration files. <STRONG>But in per-directory configuration
files, the per-directory prefix (which always is the same for a specific
directory!) gets automatically <EM>removed</EM> for the pattern matching and
automatically <EM>added</EM> after the substitution has been done.</STRONG> This feature is
essential for many sorts of rewriting, because without this prefix stripping
you have to match the parent directory which is not always possible.
<P>
There is one exception: If a substitution string starts with
``<CODE>http://</CODE>'' then the directory prefix will <STRONG>not</STRONG> be added and an
external redirect or proxy throughput (if flag <STRONG>P</STRONG> is used!) is forced!
</TD></TR>
</TABLE>
<P>
<TABLE WIDTH="70%" BORDER=0 BGCOLOR="#E0E0F0" CELLSPACING=0 CELLPADDING=10>
<TR><TD>
<STRONG>Notice:</STRONG> To enable the rewriting engine for per-directory configuration files
you need to set ``<CODE>RewriteEngine On</CODE>'' in these files <STRONG>and</STRONG>
have ``<CODE>Options FollowSymLinks</CODE>'' enabled. If your administrator has
disabled override of <CODE>FollowSymLinks</CODE> for a user's directory, then
you cannot use the rewriting engine. This restriction is needed for
security reasons.
</TD></TR>
</TABLE>
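<P>
A per-directory setup along these lines might look as follows. This is only
a sketch: it assumes that the administrator allows the
<CODE>Options</CODE> and rewriting directives to be overridden in
<CODE>.htaccess</CODE> files, and the paths are placeholders:
<BLOCKQUOTE>
<PRE>
# file .htaccess in /physical/path/to/somepath
Options FollowSymLinks
RewriteEngine On
RewriteBase /somepath

# the pattern is matched against the URL with the per-directory prefix
# already removed, so it starts with "localpath", not "/somepath/localpath"
RewriteRule ^localpath(.*)  otherpath$1
</PRE>
</BLOCKQUOTE>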
<P>
Here are all possible substitution combinations and their meanings:
<P>
<STRONG>Inside per-server configuration (<CODE>httpd.conf</CODE>)<BR>
for request ``<CODE>GET /somepath/pathinfo</CODE>'':</STRONG><BR>
<P>
<TABLE BGCOLOR="#F0F0F0" CELLSPACING=0 CELLPADDING=5>
<TR>
<TD>
<PRE>
<STRONG>Given Rule</STRONG>                                       <STRONG>Resulting Substitution</STRONG>
-----------------------------------------------  -------------------------------------------
^/somepath(.*) otherpath$1                       not supported, because invalid!
^/somepath(.*) otherpath$1 [R]                   not supported, because invalid!
^/somepath(.*) otherpath$1 [P]                   not supported, because invalid!
-----------------------------------------------  -------------------------------------------
^/somepath(.*) /otherpath$1                      /otherpath/pathinfo
^/somepath(.*) /otherpath$1 [R]                  http://thishost/otherpath/pathinfo
                                                 via external redirection
^/somepath(.*) /otherpath$1 [P]                  not supported, because silly!
-----------------------------------------------  -------------------------------------------
^/somepath(.*) http://thishost/otherpath$1       /otherpath/pathinfo
^/somepath(.*) http://thishost/otherpath$1 [R]   http://thishost/otherpath/pathinfo
                                                 via external redirection
^/somepath(.*) http://thishost/otherpath$1 [P]   not supported, because silly!
-----------------------------------------------  -------------------------------------------
^/somepath(.*) http://otherhost/otherpath$1      http://otherhost/otherpath/pathinfo
                                                 via external redirection
^/somepath(.*) http://otherhost/otherpath$1 [R]  http://otherhost/otherpath/pathinfo
                                                 via external redirection
                                                 (the [R] flag is redundant)
^/somepath(.*) http://otherhost/otherpath$1 [P]  http://otherhost/otherpath/pathinfo
                                                 via internal proxy
</PRE>
</TD>
</TR>
</TABLE>
<P>
<STRONG>Inside per-directory configuration for <CODE>/somepath</CODE><BR>
(<EM>i.e.</EM>, file <CODE>.htaccess</CODE> in dir <CODE>/physical/path/to/somepath</CODE> containing
<CODE>RewriteBase /somepath</CODE>)<BR> for
request ``<CODE>GET /somepath/localpath/pathinfo</CODE>'':</STRONG><BR>
<P>
<TABLE BGCOLOR="#F0F0F0" CELLSPACING=0 CELLPADDING=5>
<TR>
<TD>
<PRE>
<STRONG>Given Rule</STRONG>                                       <STRONG>Resulting Substitution</STRONG>
-----------------------------------------------  -------------------------------------------
^localpath(.*) otherpath$1                       /somepath/otherpath/pathinfo
^localpath(.*) otherpath$1 [R]                   http://thishost/somepath/otherpath/pathinfo
                                                 via external redirection
^localpath(.*) otherpath$1 [P]                   not supported, because silly!
-----------------------------------------------  -------------------------------------------
^localpath(.*) /otherpath$1                      /otherpath/pathinfo
^localpath(.*) /otherpath$1 [R]                  http://thishost/otherpath/pathinfo
                                                 via external redirection
^localpath(.*) /otherpath$1 [P]                  not supported, because silly!
-----------------------------------------------  -------------------------------------------
^localpath(.*) http://thishost/otherpath$1       /otherpath/pathinfo
^localpath(.*) http://thishost/otherpath$1 [R]   http://thishost/otherpath/pathinfo
                                                 via external redirection
^localpath(.*) http://thishost/otherpath$1 [P]   not supported, because silly!
-----------------------------------------------  -------------------------------------------
^localpath(.*) http://otherhost/otherpath$1      http://otherhost/otherpath/pathinfo
                                                 via external redirection
^localpath(.*) http://otherhost/otherpath$1 [R]  http://otherhost/otherpath/pathinfo
                                                 via external redirection
                                                 (the [R] flag is redundant)
^localpath(.*) http://otherhost/otherpath$1 [P]  http://otherhost/otherpath/pathinfo
                                                 via internal proxy
</PRE>
</TD>
</TR>
</TABLE>
<P>
<STRONG>Example:</STRONG>
<P>
<BLOCKQUOTE>
We want to rewrite URLs of the form
<BLOCKQUOTE>
<CODE>/</CODE> <EM>Language</EM>
<CODE>/~</CODE> <EM>Realname</EM>
<CODE>/.../</CODE> <EM>File</EM>
</BLOCKQUOTE>
into
<BLOCKQUOTE>
<CODE>/u/</CODE> <EM>Username</EM>
<CODE>/.../</CODE> <EM>File</EM>
<CODE>.</CODE> <EM>Language</EM>
</BLOCKQUOTE>
<P>
We take the rewrite mapfile from above and save it under
<CODE>/path/to/file/map.txt</CODE>. Then we only have to add the
following lines to the Apache server configuration file:
<BLOCKQUOTE>
<PRE>
RewriteLog /path/to/file/rewrite.log
RewriteMap real-to-user txt:/path/to/file/map.txt
RewriteRule ^/([^/]+)/~([^/]+)/(.*)$ /u/${real-to-user:$2|nobody}/$3.$1
</PRE>
</BLOCKQUOTE>
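<P>
To see how the rule works, here is a sketch of one request being traced
through it. The map entry and the request URL are invented for illustration
and are not taken from the real map file:
<BLOCKQUOTE>
<PRE>
# hypothetical entry in /path/to/file/map.txt
John.Doe   jdoe

# request:       /en/~John.Doe/docs/index.html
# $1 = en        $2 = John.Doe        $3 = docs/index.html
# map lookup:    ${real-to-user:John.Doe|nobody}  gives  jdoe
# result:        /u/jdoe/docs/index.html.en
</PRE>
</BLOCKQUOTE>
If the real name is not found in the map, the default value
<CODE>nobody</CODE> given after the ``<CODE>|</CODE>'' is used instead.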
</BLOCKQUOTE>
<P>
<HR NOSHADE SIZE=1>
<CENTER>
<H1><A NAME="Miscelleneous">Miscellaneous</A></H1>
</CENTER>
<P>
<HR NOSHADE SIZE=1>
<H2><A NAME="EnvVar">Environment Variables</A></H2>
This module keeps track of two additional (non-standard) CGI/SSI environment
variables named <CODE>SCRIPT_URL</CODE> and <CODE>SCRIPT_URI</CODE>. These contain
the <EM>logical</EM> Web-view of the current resource, while the standard CGI/SSI
variables <CODE>SCRIPT_NAME</CODE> and <CODE>SCRIPT_FILENAME</CODE> contain the
<EM>physical</EM> System-view.
<P>
Notice: These variables hold the URI/URL <EM>as they were initially
requested</EM>, <EM>i.e.</EM>, in a state <EM>before</EM> any rewriting. This is
important because the rewriting process is primarily used to rewrite logical
URLs to physical pathnames.
<P>
<STRONG>Example:</STRONG>
<BLOCKQUOTE>
<PRE>
SCRIPT_NAME=/sw/lib/w3s/tree/global/u/rse/.www/index.html
SCRIPT_FILENAME=/u/rse/.www/index.html
SCRIPT_URL=/u/rse/
SCRIPT_URI=http://en1.engelschall.com/u/rse/
</PRE>
</BLOCKQUOTE>
<P>
<HR NOSHADE SIZE=1>
<H2><A NAME="Solutions">Practical Solutions</A></H2>
A comprehensive collection of practical solutions for URL-based
problems is available from the author of mod_rewrite. There you will find real-life
rulesets and additional information.
<BLOCKQUOTE>
<STRONG>Apache URL Rewriting Guide</STRONG><BR>
<STRONG><A HREF="http://www.engelschall.com/pw/apache/rewriteguide/"
>http://www.engelschall.com/pw/apache/rewriteguide/</A></STRONG>
</BLOCKQUOTE>
<!--#include virtual="footer.html" -->
</BLOCKQUOTE><!-- page indentation -->
</BODY>
</HTML>
<!--/%hypertext -->