<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<TITLE>Apache Content Negotiation</TITLE>
<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<H1 ALIGN="CENTER">Content Negotiation</H1>
Apache's support for content negotiation has been updated to meet the
HTTP/1.1 specification. It can choose the best representation of a
resource based on the browser-supplied preferences for media type,
languages, character set and encoding. It also implements a
couple of features to give more intelligent handling of requests from
browsers which send incomplete negotiation information. <P>
Content negotiation is provided by the
<CODE>mod_negotiation</CODE> module,
which is compiled in by default.
<H2>About Content Negotiation</H2>
A resource may be available in several different representations. For
example, it might be available in different languages or different
media types, or a combination. One way of selecting the most
appropriate choice is to give the user an index page, and let them
select. However, it is often possible for the server to choose
automatically. This works because browsers can send, as part of each
request, information about which representations they prefer. For
example, a browser could indicate that it would like to see
information in French if possible, and otherwise in English. Browsers
indicate their preferences by headers in the request. To request only
French representations, the browser would send
<CODE>Accept-Language: fr</CODE>.
<P>
Note that this preference will only be applied when there is a choice
of representations and they vary by language.
<P>
As an example of a more complex request, this browser has been
configured to accept French and English, but prefer French, and to
accept various media types, preferring HTML over plain text or other
text types, and preferring GIF or JPEG over other media types, but also
allowing any other media type as a last resort:
<PRE>
  Accept-Language: fr; q=1.0, en; q=0.5
</PRE>
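The way quality values order a browser's preferences can be sketched in a few lines of Python. This is only an illustration, not Apache's code: the function name <CODE>parse_accept</CODE> is hypothetical, and the sketch ignores details such as parameters other than <CODE>q</CODE>.

```python
def parse_accept(header):
    """Parse an Accept-style header value into (value, q) pairs, best first."""
    items = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        value = fields[0]
        q = 1.0  # a missing q parameter means the highest preference
        for field in fields[1:]:
            name, _, raw = field.partition("=")
            if name.strip().lower() == "q":
                q = float(raw.strip())
        items.append((value, q))
    # sorted() is stable, so items with equal q keep their header order
    return sorted(items, key=lambda item: item[1], reverse=True)


prefs = parse_accept("fr; q=1.0, en; q=0.5")
print(prefs)
```

The same function can be applied to the value of an Accept, Accept-Charset or Accept-Encoding header, since all four share the comma-separated list syntax with optional <CODE>q</CODE> parameters.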
Apache 1.2 supports 'server driven' content negotiation, as defined in
the HTTP/1.1 specification, using the Accept,
Accept-Language, Accept-Charset and Accept-Encoding request headers.
<P>
Apache 1.3.4 also supports 'transparent' content negotiation, which is
an experimental negotiation protocol defined in RFC 2295 and RFC 2296.
It does not offer support for 'feature negotiation' as defined in
these RFCs.
<P>
A <STRONG>resource</STRONG> is a conceptual entity identified by a URI
(RFC 2396). An HTTP server like Apache provides access to
<STRONG>representations</STRONG> of the resource(s) within its namespace,
with each representation in the form of a sequence of bytes with a
defined media type, character set, encoding, etc. Each resource may be
associated with zero, one, or more than one representation
at any given time. If multiple representations are available,
the resource is referred to as <STRONG>negotiable</STRONG> and each of its
representations is termed a <STRONG>variant</STRONG>. The ways in which the
variants for a negotiable resource vary are called the
<STRONG>dimensions</STRONG> of negotiation.
<H2>Negotiation in Apache</H2>
In order to negotiate a resource, the server needs to be given
information about each of the variants. This is done in one of two
ways:
<UL>
  <LI> Using a type map (<EM>i.e.</EM>, a <CODE>*.var</CODE> file) which
       names the files containing the variants explicitly, or
  <LI> Using a 'MultiViews' search, where the server does an implicit
       filename pattern match and chooses from among the results.
</UL>
<H3>Using a type-map file</H3>
A type map is a document which is associated with the handler
named <CODE>type-map</CODE> (or, for backwards-compatibility with
older Apache configurations, the mime type
<CODE>application/x-type-map</CODE>). Note that to use this feature,
you must have a handler set in the configuration that defines a
file suffix as <CODE>type-map</CODE>; this is best done with an
<CODE>AddHandler type-map var</CODE> directive
in the server configuration file. See the comments in the sample config
file for more details.
<P>
Type map files have an entry for each available variant; these entries
consist of contiguous HTTP-format header lines. Entries for
different variants are separated by blank lines. Blank lines are
illegal within an entry. It is conventional to begin a map file with
an entry for the combined entity as a whole (although this
is not required, and if present will be ignored). An example
If the variants have different source qualities, that may be indicated
by the "qs" parameter to the media type, as in this picture (available
as jpeg, gif, or ASCII-art):
<P>
qs values can vary in the range 0.000 to 1.000. Note that any variant with
a qs value of 0.000 will never be chosen. Variants with no 'qs'
parameter value are given a qs factor of 1.0. The qs parameter indicates
the relative 'quality' of this variant compared to the other available
variants, independent of the client's capabilities. For example, a jpeg
file is usually of higher source quality than an ASCII file if it is
attempting to represent a photograph. However, if the resource being
represented is original ASCII art, then an ASCII representation would
have a higher source quality than a jpeg representation. A qs value
is therefore specific to a given variant depending on the nature of
the resource it represents.
<P>
The full list of headers recognized is:
<DL>
  <DT> <CODE>URI:</CODE>
  <DD> URI of the file containing the variant (of the given media
       type, encoded with the given content encoding). These are
       interpreted as URLs relative to the map file; they must be on
       the same server (!), and they must refer to files to which the
       client would be granted access if they were to be requested
       directly.
  <DT> <CODE>Content-Type:</CODE>
  <DD> Media type; charset, level and "qs" parameters may be given. These
       are often referred to as MIME types; typical media types are
       <CODE>image/gif</CODE>, <CODE>text/plain</CODE>, or
       <CODE>text/html;&nbsp;level=3</CODE>.
  <DT> <CODE>Content-Language:</CODE>
  <DD> The languages of the variant, specified as Internet standard
       language tags from RFC 1766 (<EM>e.g.</EM>, <CODE>en</CODE> for English,
       <CODE>ko</CODE> for Korean, <EM>etc.</EM>).
  <DT> <CODE>Content-Encoding:</CODE>
  <DD> If the file is compressed, or otherwise encoded, rather than
       containing the actual raw data, this says how that was done.
       Apache only recognizes encodings that are defined by an
       <CODE>AddEncoding</CODE> directive.
       This normally includes the encodings <CODE>x-compress</CODE>
       for compress'd files, and <CODE>x-gzip</CODE> for gzip'd files.
       The <CODE>x-</CODE> prefix is ignored for encoding comparisons.
  <DT> <CODE>Content-Length:</CODE>
  <DD> The size of the file. Specifying content
       lengths in the type-map allows the server to compare file sizes
       without checking the actual files.
  <DT> <CODE>Description:</CODE>
  <DD> A human-readable textual description of the variant. If Apache cannot
       find any appropriate variant to return, it will return an error
       response which lists all available variants instead. Such a variant
       list will include the human-readable variant descriptions.
</DL>
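To make the type-map format more concrete, the following Python sketch splits a map into entries at blank lines and reads the "qs" parameter from each Content-Type header. It is an illustration under stated assumptions, not Apache's parser: the names <CODE>parse_type_map</CODE> and <CODE>source_quality</CODE> and the sample file names are hypothetical, and header continuation lines are not handled.

```python
def parse_type_map(text):
    """Split type-map text into entries; each entry maps header name -> value."""
    entries, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates one variant's entry from the next.
            if current:
                entries.append(current)
                current = {}
            continue
        name, _, value = line.partition(":")
        current[name.strip().lower()] = value.strip()
    if current:
        entries.append(current)
    return entries


def source_quality(entry):
    """Return the qs parameter of Content-Type, or 1.0 when it is absent."""
    params = entry.get("content-type", "").split(";")[1:]
    for param in params:
        key, _, raw = param.partition("=")
        if key.strip().lower() == "qs":
            return float(raw.strip())
    return 1.0


# Hypothetical map for a picture available in three formats.
SAMPLE = """\
URI: foo

URI: foo.jpeg
Content-Type: image/jpeg; qs=0.8

URI: foo.gif
Content-Type: image/gif; qs=0.5

URI: foo.txt
Content-Type: text/plain; qs=0.01
"""

entries = parse_type_map(SAMPLE)
for entry in entries:
    print(entry["uri"], source_quality(entry))
```

The first entry describes the combined entity and carries no Content-Type, so the sketch reports the default qs of 1.0 for it; a real consumer would skip that entry.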
<H3>Multiviews</H3>
<CODE>MultiViews</CODE> is a per-directory option, meaning it can be set with
an <CODE>Options</CODE> directive within a <CODE>&lt;Directory&gt;</CODE>,
<CODE>&lt;Location&gt;</CODE> or <CODE>&lt;Files&gt;</CODE>
section, or (if <CODE>AllowOverride</CODE>
is properly set) in <CODE>.htaccess</CODE> files. Note that
<CODE>Options All</CODE> does not set <CODE>MultiViews</CODE>; you
have to ask for it by name.
<P>
The effect of <CODE>MultiViews</CODE> is as follows: if the server
receives a request for <CODE>/some/dir/foo</CODE>,
<CODE>/some/dir</CODE> has <CODE>MultiViews</CODE> enabled, and
<CODE>/some/dir/foo</CODE> does <EM>not</EM> exist, then the server reads the
directory looking for files named foo.*, and effectively fakes up a
type map which names all those files, assigning them the same media
types and content-encodings they would have if the client had asked for
one of them by name. It then chooses the best match to the client's
requirements.
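The directory search can be pictured with a short Python sketch. It is not how Apache is implemented: the function name <CODE>multiviews_candidates</CODE> is hypothetical, the directory listing is passed in as a plain list, and Python's standard <CODE>mimetypes</CODE> table stands in for the server's own suffix mappings.

```python
import mimetypes


def multiviews_candidates(requested, filenames):
    """Return (filename, media_type, encoding) for each file named requested.*"""
    prefix = requested + "."
    found = []
    for name in sorted(filenames):
        if name.startswith(prefix):
            # guess_type maps suffixes to a media type and a content encoding
            media_type, encoding = mimetypes.guess_type(name)
            found.append((name, media_type, encoding))
    return found


listing = ["foo.html", "foo.gif", "bar.html", "foobar.txt"]
cands = multiviews_candidates("foo", listing)
for cand in cands:
    print(cand)
```

Only <CODE>foo.gif</CODE> and <CODE>foo.html</CODE> match here; <CODE>foobar.txt</CODE> is skipped because the match requires a dot immediately after the requested name.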
<P>
<CODE>MultiViews</CODE> may also apply to searches for the file named by the
<CODE>DirectoryIndex</CODE> directive, if the server is trying to
index a directory. If the configuration files specify
<CODE>DirectoryIndex index</CODE>, the server will arbitrate among the
files named <CODE>index.*</CODE> that are present.
<P>
If one of the files found when reading the directory is a CGI script,
it's not obvious what should happen. The code gives that case
special treatment: if the request was a POST, or a GET with
QUERY_ARGS or PATH_INFO, the script is given an extremely high quality
rating, and generally invoked; otherwise it is given an extremely low
quality rating, which generally causes one of the other views (if any)
to be retrieved.
<H2>The Negotiation Methods</H2>
After Apache has obtained a list of the variants for a given resource,
either from a type-map file or from the filenames in the directory, it
invokes one of two methods to decide on the 'best' variant to
return, if any. It is not necessary to know any of the details of how
negotiation actually takes place in order to use Apache's content
negotiation features. However, the rest of this document explains the
methods used for those interested.
<P>
There are two negotiation methods:
<UL>
<LI><STRONG>Server driven negotiation with the Apache
algorithm</STRONG> is used in the normal case. The Apache algorithm is
explained in more detail below. When this algorithm is used, Apache
can sometimes 'fiddle' the quality factor of a particular dimension to
achieve a better result. The ways Apache can fiddle quality factors are
explained in more detail below.
<LI><STRONG>Transparent content negotiation</STRONG> is used when the
browser specifically requests this through the mechanism defined in RFC
2295. This negotiation method gives the browser full control over
deciding on the 'best' variant; the result is therefore dependent on
the specific algorithms used by the browser. As part of the
transparent negotiation process, the browser can ask Apache to run the
'remote variant selection algorithm' defined in RFC 2296.
</UL>
<H3>Dimensions of Negotiation</H3>
<TABLE>
<TR VALIGN="top"><TH>Dimension</TH><TH>Notes</TH></TR>
<TR VALIGN="top"><TD>Media Type</TD>
<TD>Browser indicates preferences with the Accept header field. Each item
can have an associated quality factor. Variant descriptions can also
have a quality factor (the "qs" parameter).</TD></TR>
<TR VALIGN="top"><TD>Language</TD>
<TD>Browser indicates preferences with the Accept-Language header field.
Each item can have a quality factor. Variants can be associated with none, one
or more than one language.</TD></TR>
<TR VALIGN="top"><TD>Encoding</TD>
<TD>Browser indicates preference with the Accept-Encoding header field.
Each item can have a quality factor.</TD></TR>
<TR VALIGN="top"><TD>Charset</TD>
<TD>Browser indicates preference with the Accept-Charset header field.
Each item can have a quality factor.
Variants can indicate a charset as a parameter of the media type.</TD></TR>
</TABLE>
<H3>Apache Negotiation Algorithm</H3>
Apache can use the following algorithm to select the 'best' variant
(if any) to return to the browser. This algorithm is not
further configurable. It operates as follows:
<OL>
<LI>First, for each dimension of the negotiation, check the appropriate
<EM>Accept*</EM> header field and assign a quality to each
variant. If the <EM>Accept*</EM> header for any dimension implies that this
variant is not acceptable, eliminate it. If no variants remain, go
to step 4.

<LI>Select the 'best' variant by a process of elimination. Each of the
following tests is applied in order. Any variants not selected at each
test are eliminated. After each test, if only one variant remains,
select it as the best match and proceed to step 3. If more than one
variant remains, move on to the next test.

<OL>
<LI>Multiply the quality factor from the Accept header with the
  quality-of-source factor for this variant's media type, and select
  the variants with the highest value.

<LI>Select the variants with the highest language quality factor.

<LI>Select the variants with the best language match, using either the
  order of languages in the Accept-Language header (if present), or else
  the order of languages in the <CODE>LanguagePriority</CODE>
  directive.

<LI>Select the variants with the highest 'level' media parameter.

<LI>Select variants with the best charset media parameters,
  as given on the Accept-Charset header line. Charset ISO-8859-1
  is acceptable unless explicitly excluded. Variants with a
  <CODE>text/*</CODE> media type but not explicitly associated
  with a particular charset are assumed to be in ISO-8859-1.

<LI>Select those variants which have associated
  charset media parameters that are <EM>not</EM> ISO-8859-1.
  If there are no such variants, select all variants instead.

<LI>Select the variants with the best encoding. If there are
  variants with an encoding that is acceptable to the user-agent,
  select only these variants. Otherwise, if there is a mix of encoded
  and non-encoded variants, select only the unencoded variants.
  If either all variants are encoded or all variants are not encoded,
  select all variants.

<LI>Select the variants with the smallest content length.

<LI>Select the first variant of those remaining. This will be either the
  first listed in the type-map file, or when variants are read from
  the directory, the one whose file name comes first when sorted.
</OL>

<LI>The algorithm has now selected one 'best' variant, so return
  it as the response. The HTTP response header Vary is set to indicate the
  dimensions of negotiation (browsers and caches can use this
  information when caching the resource). End.

<LI>To get here means no variant was selected (because none are acceptable
  to the browser). Return a 406 status (meaning "No acceptable representation")
  with a response body consisting of an HTML document listing the
  available variants. Also set the HTTP Vary header to indicate the
  dimensions of variance.
</OL>
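The elimination process can be sketched in Python as follows. This is a deliberately reduced illustration, not Apache's implementation: it covers only the media type, language quality, content length and "first remaining" tests, and leaves out language order, level, charset and encoding. The function names, the dictionary layout of a variant and the sample file names are all assumptions made for the example.

```python
def media_q(accept, media_type):
    """Quality the Accept preferences give a media type (0.0 if unacceptable)."""
    major = media_type.split("/")[0]
    for pattern in (media_type, major + "/*", "*/*"):
        if pattern in accept:
            return accept[pattern]
    return 0.0


def language_q(accept_language, lang):
    """Language quality; variants with no language get a very low value."""
    if not accept_language:
        return 1.0  # no Accept-Language header: every language is fine
    if lang is None:
        return 0.001
    return accept_language.get(lang, 0.0)


def choose_variant(variants, accept, accept_language):
    """Pick a variant by elimination; None stands for the 406 response."""
    # Step 1: assign qualities and drop unacceptable variants.
    scored = []
    for v in variants:
        mq = media_q(accept, v["type"]) * v.get("qs", 1.0)
        lq = language_q(accept_language, v.get("lang"))
        if mq > 0 and lq > 0:
            scored.append((v, mq, lq))
    if not scored:
        return None  # step 4: nothing acceptable
    # Step 2: each test keeps only the best-scoring variants.
    tests = [
        lambda s: s[1],                    # Accept quality times qs
        lambda s: s[2],                    # language quality
        lambda s: -s[0].get("length", 0),  # smallest content length
    ]
    for key in tests:
        best = max(key(s) for s in scored)
        scored = [s for s in scored if key(s) == best]
        if len(scored) == 1:
            break
    # Last test: the first remaining variant wins (step 3 returns it).
    return scored[0][0]


variants = [
    {"name": "foo.en.html", "type": "text/html", "lang": "en"},
    {"name": "foo.fr.html", "type": "text/html", "lang": "fr"},
    {"name": "foo.fr.txt", "type": "text/plain", "lang": "fr"},
]
accept = {"text/html": 1.0, "text/*": 0.8, "*/*": 0.1}
accept_language = {"fr": 1.0, "en": 0.5}
best = choose_variant(variants, accept, accept_language)
print(best["name"])
```

In this run the media type test removes the plain text variant and the language test then removes the English one, so the French HTML variant is returned.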
<H2><A NAME="better">Fiddling with Quality Values</A></H2>
Apache sometimes changes the quality values from what would be
expected by a strict interpretation of the Apache negotiation
algorithm above. This is to get a better result from the algorithm for
browsers which do not send full or accurate information. Some of the
most popular browsers send Accept header information which would
otherwise result in the selection of the wrong variant in many
cases. If a browser sends full and correct information these fiddles
will not be applied.
<H3>Media Types and Wildcards</H3>
The Accept: request header indicates preferences for media types. It
can also include 'wildcard' media types, such as "image/*" or "*/*"
where the * matches any string. So a request including
<CODE>Accept: image/*, */*</CODE>
would indicate that any type starting "image/" is acceptable,
as is any other type (so the first "image/*" is redundant). Some
browsers routinely send wildcards in addition to explicit types they
can handle.
The intention of this is to indicate that the explicitly
listed types are preferred, but if a different representation is
available, that is ok too. However, under the basic algorithm, as given
above, the */* wildcard has exactly equal preference to all the other
types, so they are not being preferred. The browser should really have
sent a request with a lower quality (preference) value for */*, such
as <CODE>Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01</CODE>.
The explicit types have no quality factor, so they default to a
preference of 1.0 (the highest). The wildcard */* is given
a low preference of 0.01, so other types will only be returned if
no variant matches an explicitly listed type.
<P>
If the Accept: header contains <EM>no</EM> q factors at all, Apache sets
the q value of "*/*", if present, to 0.01 to emulate the desired
behavior. It also sets the q value of wildcards of the format
"type/*" to 0.02 (so these are preferred over matches against
"*/*"). If any media type on the Accept: header contains a q factor,
these special values are <EM>not</EM> applied, so requests from browsers
which send the correct information to start with work as expected.
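The rule above is small enough to state as code. The Python sketch below is an illustration only: the function name <CODE>fiddle_wildcards</CODE> is hypothetical, and the header is assumed to be already split into (media range, q) pairs, with <CODE>None</CODE> marking an item that carried no q factor.

```python
def fiddle_wildcards(accept_items):
    """Map each media range to its effective q value.

    accept_items is a list of (media_range, q) pairs; q is None when the
    browser sent no q factor for that item.
    """
    any_explicit_q = any(q is not None for _, q in accept_items)
    result = {}
    for media_range, q in accept_items:
        if any_explicit_q:
            # The browser sent q factors: take the header at face value.
            result[media_range] = 1.0 if q is None else q
        elif media_range == "*/*":
            result[media_range] = 0.01
        elif media_range.endswith("/*"):
            result[media_range] = 0.02
        else:
            result[media_range] = 1.0
    return result


no_q = fiddle_wildcards([("text/html", None), ("image/*", None), ("*/*", None)])
with_q = fiddle_wildcards([("text/html", None), ("*/*", 0.5)])
print(no_q)
print(with_q)
```

The check for "*/*" comes before the check for "type/*" because "*/*" also ends in "/*".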
<H3>Variants with no Language</H3>
If some of the variants for a particular resource have a language
attribute, and some do not, those variants with no language
are given a very low language quality factor of 0.001.<P>
The reason for setting this language quality factor for
variants with no language to a very low value is to allow
for a default variant which can be supplied if none of the
other variants match the browser's language preferences.
For example, consider the situation with three variants:
The meaning of a variant with no language is that it is
always acceptable to the browser. If the request Accept-Language
H2>Extensions to Transparent Content Negotiation</
H2>
5065N/AApache extends the transparent content negotiation protocol (RFC 2295)
5065N/Aas follows. A new <
CODE> {encoding ..}</
CODE> element is used in
5065N/Avariant lists to label variants which are available with a specific
5065N/Acontent-encoding only. The implementation of the
5065N/Avariants in the list, and to use them as candidate variants whenever
5065N/Atheir encodings are acceptable according to the Accept-Encoding
5065N/Arequest header. The
RVSA/
1.0 implementation does not round computed
5065N/Aquality factors to 5 decimal places before choosing the best variant.
<H2>Note on hyperlinks and naming conventions</H2>
If you are using language negotiation you can choose between
different naming conventions, because files can have more than one
extension, and the order of the extensions is normally irrelevant.
<P>
A typical file has a MIME-type extension (<EM>e.g.</EM>, <SAMP>html</SAMP>),
maybe an encoding extension (<EM>e.g.</EM>, <SAMP>gz</SAMP>), and of course a
language extension (<EM>e.g.</EM>, <SAMP>en</SAMP>) when we have different
language variants of this file.
<P>
Here are some more examples of filenames together with valid and invalid
hyperlinks:
<TABLE BORDER=1 CELLPADDING=8 CELLSPACING=0>
</TABLE>
Looking at the table above you will notice that it is always possible to
use the name without any extensions in a hyperlink (<EM>e.g.</EM>, <SAMP>foo</SAMP>).
The advantage is that you can hide the actual type of a
document or file and can change it later, <EM>e.g.</EM>, from <SAMP>html</SAMP>
to <SAMP>shtml</SAMP> or <SAMP>cgi</SAMP> without changing any
hyperlink references.
<P>
If you want to continue to use a MIME-type in your hyperlinks (<EM>e.g.</EM>,
<SAMP>foo.html</SAMP>) the language extension (including an encoding extension
if there is one) must be on the right hand side of the MIME-type extension
(<EM>e.g.</EM>, <SAMP>foo.html.en</SAMP>).
<H2>Note on Caching</H2>
When a cache stores a representation, it associates it with the request URL.
The next time that URL is requested, the cache can use the stored
representation. But if the resource is negotiable at the server,
this might result in only the first requested variant being cached, and
subsequent cache hits might return the wrong response. To prevent this,
Apache normally marks all responses that are returned after content negotiation
as non-cacheable by HTTP/1.0 clients. Apache also supports the HTTP/1.1
protocol features to allow caching of negotiated responses. <P>
For requests which come from an HTTP/1.0 compliant client (either a
browser or a cache), the directive <TT>CacheNegotiatedDocs</TT> can be
used to allow caching of responses which were subject to negotiation.
This directive can be given in the server config or virtual host, and
takes no arguments. It has no effect on requests from HTTP/1.1 clients.