 * Copyright (c) 1998, 2011, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any

 * This entity resolver class provides a number of utilities which can help
 * manage external parsed entities in XML. These are commonly used
 * to hold markup declarations that are to be used as part of a Document
 * Type Declaration (DTD), or to hold text marked up with XML.
 *
 * <P> Features include: <UL>
 *
 * <LI> Static factory methods are provided for constructing SAX InputSource
 * objects from Files, URLs, or MIME objects. This eliminates a class of
 * error-prone coding in applications.
 * <LI> Character encodings for XML documents are correctly supported: <UL>
 *
 * <LI> The encodings defined in the RFCs for MIME content types
 * (2046 for general MIME, and 2376 for XML in particular) are
 * supported, handling <em>charset=...</em> attributes and accepting
 * content types which are known to be safe for use with XML;
 *
 * <LI> The character encoding autodetection algorithm identified
 * in the XML specification is used, and leverages all of
 * the JDK 1.1 (and later) character encoding support.
 *
 * <LI> The use of MIME typing may optionally be disabled, forcing the
 * use of autodetection, to support web servers which don't correctly
 * report MIME types for XML. For example, they may report text that
 * is encoded in EUC-JP as being US-ASCII text, leading to fatal
 * errors during parsing.
 *
 * <LI> The InputSource objects returned by this class always
 * have a <code>java.io.Reader</code> available as the "character
 *
 * <LI> Catalog entries can map public identifiers to Java resources or
 * to local URLs. These are used to reduce network dependencies and loads,
 * and will often be used for external DTD components. For example, packages
 * shipping DTD files as resources in JAR files can eliminate network traffic
 * when accessing them, and sites may provide local caches of common DTDs.
 * Note that no particular catalog syntax is supported by this class, only
 * the notion of a set of entries.
 *
 * <P> Subclasses can perform tasks such as supporting new URI schemes for
 * URIs which are not URLs, such as URNs (see RFC 2396) or for accessing
 * (see RFC 2387). They may also be used to support particular catalog
 * SGML/Open Catalog (SOCAT)</a> which supports the SGML notion of "Formal
 * Public Identifiers" (FPIs).
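The MIME charset handling named in the feature list above can be illustrated with a small sketch. This is not the resolver's own code: the class `CharsetSketch` and its methods `charsetOf` and `chooseEncoding` are hypothetical names introduced here. It assumes that an explicit `charset=...` attribute wins regardless of URL scheme, that `text/*` types without one default to US-ASCII unless the scheme is "file", and that a `null` result means "autodetect". RFC 822 comments and backslash escapes are deliberately not handled.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of the resolver class. */
final class CharsetSketch {
    private CharsetSketch() {}

    /** Returns the value of the charset attribute, or null if there is none. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        // parts[0] is the media type; later parts are attribute=value pairs
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String attr = parts[i].trim();
            int eq = attr.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = attr.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = attr.substring(eq + 1).trim();
            // double quotes around the value are optional
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            return value.isEmpty() ? null : value;
        }
        return null;
    }

    /** Returns an encoding name, or null to signal that autodetection should be used. */
    static String chooseEncoding(String contentType, String scheme) {
        String charset = charsetOf(contentType);
        if (charset != null) {
            return charset; // assumption: an explicit label always wins
        }
        if (contentType == null || "file".equalsIgnoreCase(scheme)) {
            return null; // files carry no reliable encoding information
        }
        int semi = contentType.indexOf(';');
        String type = (semi < 0 ? contentType : contentType.substring(0, semi))
                .trim().toLowerCase(Locale.ROOT);
        if (type.startsWith("text/")) {
            return "US-ASCII"; // MIME default for text/* types
        }
        return null; // application/* has no default
    }
}
```

A caller would pass the `null` result on to whatever autodetection routine it uses; disabling MIME typing altogether corresponds to skipping `chooseEncoding` and always autodetecting.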
 * @author David Brownell
 * @version 1.3 00/02/24

    // table mapping public IDs to (local) URIs

    // tables mapping public IDs to resources and classloaders

    // table of MIME content types (less attributes!) known
    // to be mostly "OK" to use with XML MIME entities. the
    // be (or become) safe.

 * Constructs a resolver.

 * Returns an input source, using the MIME type information and URL
 * scheme to statically determine the correct character encoding if
 * possible and otherwise autodetecting it. MIME carefully specifies
 * the character encoding defaults, and how attributes of the content
 * type can change it. XML further specifies two mandatory encodings
 * (UTF-8 and UTF-16), and includes an XML declaration which can be
 * used to internally label most documents encoded using US-ASCII
 * supersets (such as Shift_JIS, EUC-JP, ISO-2022-*, ISO-8859-*, and
 *
 * <P> This method can be used to access XML documents which do not
 * have URIs (such as servlet input streams, or most JavaMail message
 * entities) and to support access methods such as HTTP POST or PUT.
 * (URLs normally return content using the GET method.)
 *
 * <P> <em> The caller should set the system ID in order for relative URIs
 * found in this document to be interpreted correctly.</em> In some cases,
 * a custom resolver will need to be used; for example, documents
 * relative URLs would refer to other documents in that bundle.
 *
 * @param contentType The MIME content type for the source for which
 * an InputSource is desired, such as <em>text/xml;charset=utf-8</em>.
 * @param stream The input byte stream for the input source.
 * @param checkType If true, this verifies that the content type is known
 * @param scheme Unless this is "file", unspecified MIME types
 * default to US-ASCII.
 * Files are always autodetected since most
 * file systems discard character encoding information.

    // use "charset=..." if it's available
    // strip out subsequent attributes
    // strip out rfc822 comments
    // double quotes are optional
    // XXX "\;", "\)" etc were mishandled above

    // "text/*" MIME types have hard-wired character set
    // defaults, as specified in the RFCs. For XML, we
    // ignore the system "file.encoding" property since
    // autodetection is more correct.
    // "application/*" has no default

 * Creates an input source from a given URI.
 *
 * @param uri the URI (system ID) for the entity
 * @param checkType if true, the MIME content type for the entity
 * is checked for document type and character set encoding.

 * Creates an input source from a given file, autodetecting
 * the character encoding.

    // On JDK 1.2 and later, simplify this:
    // "path = file.toURL ().toString ()".

 * Resolve the given entity into an input source. If the name can't
 * be mapped to a preferred form of the entity, the URI is used. To
 * resolve the entity, first a local catalog mapping names to URIs is
 * consulted. If no mapping is found there, a catalog mapping names
 * to Java resources is consulted. Finally, if neither mapping found
 * a copy of the entity, the specified URI is used.
 *
 * <P> When a URI is used, <a href="#createInputSource">
 * createInputSource</a> is used to correctly deduce the character
 * encoding used by this entity. No MIME type checking is done.
 *
 * @param name Used to find alternate copies of the entity, when
 * this value is non-null; this is the XML "public ID".
 * @param uri Used when no alternate copy of the entity is found;
 * this is the XML "system ID", normally a URI.

    // prefer explicit URI mappings, then bundled resources...
    // ...and treat all URIs the same (as URLs for now).
    // System.out.println ("++ URI: " + url);

 * Returns true if this resolver is ignoring MIME types in the documents
 * it returns, to work around bugs in how servers have reported the
 * documents' MIME types.

 * Tells the resolver whether to ignore MIME types in the documents it
 * retrieves. Many web servers assign text documents a default
 * character encoding even when that encoding is incorrect. For example,
 * all HTTP text documents default to use ISO-8859-1 (used for Western
 * European languages), and other MIME sources default text documents
 * to use US-ASCII (a seven-bit encoding). For XML documents which
 * include text encoding declarations (as most should do), these server
 * bugs can be worked around by ignoring the MIME type entirely.

    // maps the public ID to an alternate URI, if one is registered

 * Registers the given public ID as corresponding to a particular
 * URI, typically a local copy. This URI will be used in preference
 * to ones provided as system IDs in XML entity declarations. This
 * mechanism would most typically be used for Document Type Definitions
 * (DTDs), where the public IDs are formally managed and versioned.
 *
 * @param publicId The managed public ID being mapped
 * @param uri The URI of the preferred copy of that entity

    // return the resource as a stream
    // System.out.println ("++ PUBLIC: " + publicId);
    // System.out.println ("++ Resource: " + resourceName);
    // System.out.println ("++ Loader: " + loader);

 * Registers a given public ID as corresponding to a particular Java
 * resource in a given class loader, typically distributed with a
 * software package. This resource will be preferred over system IDs
 * included in XML documents.
 * This mechanism should most typically be
 * used for Document Type Definitions (DTDs), where the public IDs are
 * formally managed and versioned.
 *
 * <P> If a mapping to a URI has been provided, that mapping takes
 * precedence over this one.
 *
 * @param publicId The managed public ID being mapped
 * @param resourceName The name of the Java resource
 * @param loader The class loader holding the resource, or null if
 * it is a system resource.
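
The lookup order described in the comments above (explicit URI mappings first, then bundled Java resources, then the system ID) can be sketched as follows. This is a minimal illustration, not the resolver's actual implementation: the class `CatalogSketch`, its method names, the `resource:` prefix on the result, and the public IDs used in the usage example are all hypothetical. Class loaders are left out so the sketch stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical catalog illustrating the documented precedence rules. */
final class CatalogSketch {
    private final Map<String, String> idToUri = new HashMap<>();
    private final Map<String, String> idToResource = new HashMap<>();

    /** Maps a public ID to a preferred (typically local) URI. */
    void registerUri(String publicId, String uri) {
        idToUri.put(publicId, uri);
    }

    /** Maps a public ID to the name of a bundled resource. */
    void registerResource(String publicId, String resourceName) {
        idToResource.put(publicId, resourceName);
    }

    /**
     * Returns where the entity should be read from: a registered URI,
     * else a registered resource (marked with a "resource:" prefix),
     * else the system ID that the document itself supplied.
     */
    String locate(String publicId, String systemId) {
        if (publicId != null) {
            String uri = idToUri.get(publicId);
            if (uri != null) {
                return uri; // URI mappings take precedence
            }
            String resource = idToResource.get(publicId);
            if (resource != null) {
                return "resource:" + resource;
            }
        }
        return systemId; // no alternate copy is known
    }

    public static void main(String[] args) {
        CatalogSketch catalog = new CatalogSketch();
        catalog.registerResource("-//Example//DTD Sample//EN", "dtds/sample.dtd");
        System.out.println(
            catalog.locate("-//Example//DTD Sample//EN", "http://example.com/sample.dtd"));
        catalog.registerUri("-//Example//DTD Sample//EN", "file:/local/sample.dtd");
        System.out.println(
            catalog.locate("-//Example//DTD Sample//EN", "http://example.com/sample.dtd"));
    }
}
```

Keeping the two tables separate, rather than merging them into one map, makes the precedence rule explicit: registering a URI for a public ID later overrides an earlier resource mapping without removing it.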