<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>Apache module mod_charset_lite</TITLE>
</HEAD>
<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<BODY
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#000080"
ALINK="#FF0000"
>
<!--#include virtual="header.html" -->
<H1 ALIGN="CENTER">Module mod_charset_lite</H1>
<P>
This module is contained in the <CODE>mod_charset_lite.c</CODE> file and is
available in Apache 2.0 and later. It provides the ability to specify
character set translation, or recoding, by directory, location, or virtual
server. It is not compiled into the server by default.
<CODE>mod_charset_lite</CODE> requires that Apache be compiled with
<CODE>APACHE_XLATE</CODE> defined.
</P>
<P>
This module provides a small subset of the configuration mechanisms
implemented by Russian Apache and its associated <CODE>mod_charset</CODE>.
</P>
<H2>Summary</H2>
<P>
This is an <STRONG>experimental</STRONG> module and should be used with
care. Experiment with your <CODE>mod_charset_lite</CODE> configuration to
ensure that it performs the desired function.
</P>
<P>
<CODE>mod_charset_lite</CODE> allows the administrator to specify the
source character set of objects as well as the character set they should
be translated into before being sent to the client.
<CODE>mod_charset_lite</CODE> does not translate the data itself but
instead tells Apache what translation to perform.
<CODE>mod_charset_lite</CODE> is applicable to EBCDIC and ASCII
host environments. In an EBCDIC environment, Apache normally translates
text content from the code page of the Apache process locale to
ISO-8859-1. <CODE>mod_charset_lite</CODE> can be used to specify that
a different translation is to be performed. In an ASCII environment,
Apache normally performs no translation, so <CODE>mod_charset_lite</CODE>
is needed in order for any translation to take place.
</P>
<H2>Directives</H2>
<UL>
<LI><A HREF="#charsetsourceenc">CharsetSourceEnc</A>
<LI><A HREF="#charsetdefault">CharsetDefault</A>
<LI><A HREF="#charsetdebug">CharsetDebug</A>
</UL>
<HR>
<H2><A NAME="charsetsourceenc">CharsetSourceEnc</A></H2>
<P>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> CharsetSourceEnc <EM>charset</EM>
<BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>None</EM>
<BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> directory, virtual host
<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>FileInfo</EM>
<BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Experimental
<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_charset_lite
<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Only available in Apache 2.0 or later
<P>
The <CODE>CharsetSourceEnc</CODE> directive specifies the source charset
of files in the associated container.
</P>
<P>
The value of the <EM>charset</EM> argument must be accepted as a valid
character set name by the character set support in APR. Generally, this
means that it must be supported by iconv.
</P>
Example:
<PRE>
&lt;Directory "/export/home/trawick/apacheinst/htdocs/convert"&gt;
CharsetSourceEnc UTF-16BE
CharsetDefault ISO8859-1
&lt;/Directory&gt;
</PRE>
<P>
The character set names in this example work with the iconv
translation support in Solaris 8.
</P>
<H2><A NAME="charsetdefault">CharsetDefault</A></H2>
<P>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> CharsetDefault <EM>charset</EM>
<BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>None</EM>
<BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> directory, virtual host
<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>FileInfo</EM>
<BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Experimental
<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_charset_lite
<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Only available in Apache 2.0 or later
<P>
The <CODE>CharsetDefault</CODE> directive specifies the charset that
content in the associated container should be translated to.
</P>
<P>
The value of the <EM>charset</EM> argument must be accepted as a valid
character set name by the character set support in APR. Generally, this
means that it must be supported by iconv.
</P>
Example:
<PRE>
&lt;Directory "/export/home/trawick/apacheinst/htdocs/convert"&gt;
CharsetSourceEnc UTF-16BE
CharsetDefault ISO8859-1
&lt;/Directory&gt;
</PRE>
<H2><A NAME="charsetdebug">CharsetDebug</A></H2>
<P>
<A
HREF="directive-dict.html#Syntax"
REL="Help"
><STRONG>Syntax:</STRONG></A> CharsetDebug <EM>on|off</EM>
<BR>
<A
HREF="directive-dict.html#Default"
REL="Help"
><STRONG>Default:</STRONG></A> <EM>off</EM>
<BR>
<A
HREF="directive-dict.html#Context"
REL="Help"
><STRONG>Context:</STRONG></A> directory, virtual host
<BR>
<A
HREF="directive-dict.html#Override"
REL="Help"
><STRONG>Override:</STRONG></A> <EM>FileInfo</EM>
<BR>
<A
HREF="directive-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Experimental
<BR>
<A
HREF="directive-dict.html#Module"
REL="Help"
><STRONG>Module:</STRONG></A> mod_charset_lite
<BR>
<A
HREF="directive-dict.html#Compatibility"
REL="Help"
><STRONG>Compatibility:</STRONG></A> Only available in Apache 2.0 or later
<P>
The <CODE>CharsetDebug</CODE> directive specifies whether or not
verbose logging should be performed by <CODE>mod_charset_lite</CODE>.
Such logging is written to the Apache error log with level
<EM>debug</EM>.
</P>
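<P>
As a minimal sketch, the following configuration enables verbose logging for
the directory used in the examples above. It assumes that the server's
<CODE>LogLevel</CODE> is set to <CODE>debug</CODE> so that messages logged at
that level reach the error log. The directory path and character set names are
simply carried over from the earlier examples.
</P>
<PRE>
# Server-wide: let debug-level messages through to the error log
LogLevel debug

&lt;Directory "/export/home/trawick/apacheinst/htdocs/convert"&gt;
CharsetSourceEnc UTF-16BE
CharsetDefault ISO8859-1
# Turn on verbose logging from mod_charset_lite for this directory
CharsetDebug on
&lt;/Directory&gt;
</PRE>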
<H2>Common Problems</H2>
<H3>Invalid character set names</H3>
<P>
The character set name parameters of <CODE>CharsetSourceEnc</CODE> and
<CODE>CharsetDefault</CODE> must be acceptable to the translation mechanism
used by APR on the system where <CODE>mod_charset_lite</CODE> is deployed.
These character set names are not standardized and are usually not the same
as the corresponding values used in HTTP headers. Currently, APR can only use
iconv(3), so you can easily test your character set names using the iconv(1)
program, as follows:
</P>
<PRE>
iconv -f charsetsourceenc-value -t charsetdefault-value
</PRE>
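<P>
For instance, the names from the <CODE>CharsetSourceEnc</CODE> example above
could be checked as sketched below. This assumes an iconv(1) that exits with a
non-zero status when it does not recognize a name, which is common but not
guaranteed on every platform. The empty input is only there so that the
command returns immediately.
</P>
<PRE>
# Feed empty input so that only the character set names are exercised
printf '' | iconv -f UTF-16BE -t ISO8859-1
# A status of 0 suggests that both names were accepted
echo "iconv exit status: $?"
</PRE>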
<H3>Mismatch between character set of content and translation rules</H3>
<P>
If the translation rules do not match the actual character set of the
content, translation can fail in various ways, including:
</P>
<UL>
<LI>
The translation mechanism may return an error code, and the connection
will be aborted.
<LI>
The translation mechanism may silently place special characters (e.g., question
marks) in the output buffer when it cannot translate the input buffer.
</UL>
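<P>
A mismatch of this kind can often be reproduced outside of Apache with
iconv(1). The sketch below deliberately declares the wrong source character
set: the single byte 0xE9 is a valid ISO-8859-1 character but is not a
complete UTF-8 sequence. How iconv(1) reacts is implementation-dependent. Some
versions report an error and exit with a non-zero status, while others may
substitute characters in the output.
</P>
<PRE>
# Octal 351 is the byte 0xE9, which on its own is not a complete UTF-8 sequence
printf '\351' | iconv -f UTF-8 -t ISO8859-1
echo "iconv exit status: $?"
</PRE>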
<!--#include virtual="footer.html" -->
</BODY>
</HTML>