IRI.hs revision 5ce020a9257f3aa096a3c8f9cce4eb61d2741011
Module      :  $Header$
Copyright   :  (c) DFKI GmbH 2012
License     :  GPLv2 or higher, see LICENSE.txt
Maintainer  :  Eugen Kuksa <eugenk@informatik.uni-bremen.de>
Stability   :  provisional
Portability :  portable
This module defines functions for handling IRIs. It is substantially the
same as the Network.URI module by Graham Klyne, but is extended to IRI
support [2] and even Manchester-Syntax-IRI [3], [4] and CURIE [5].

Four methods are provided for parsing different
kinds of IRI string (as noted in [1], [2]):
'parseIRIReference',
'parseRelativeReference' and
'parseAbsoluteIRI'.

Additional methods are provided for parsing an abbreviated IRI:
'parseIRIManchester' according to [3], [4], and 'parseIRICurie'
according to [5].

Further, four methods are provided for classifying different
kinds of IRI string (as noted in [1], [2]):
'isIRIReference',
'isRelativeReference' and
'isAbsoluteIRI'.

Additionally, classification of full, abbreviated and simple IRIs is provided
by 'isIRIManchester' and 'isIRICurie'.

The abbreviated syntaxes [3], [4], [5] provide three different kinds of IRI.
An existing element of type IRI can be classified as one of those kinds.

Most of the code has been copied from the Network.URI implementation,
but it is extended to IRI, Manchester-syntax and CURIE.
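The three kinds of IRI mentioned above (full, abbreviated, simple) can be told apart roughly by surface shape. The following is a minimal sketch, not the module's implementation: it assumes that a full IRI is written in angle brackets, that an abbreviated IRI has the form `prefix:local`, and that everything else counts as simple. The names `IRIKindSketch` and `classifySketch` are hypothetical and do not exist in this module.

```haskell
-- Hypothetical names throughout; this only illustrates the three kinds.
data IRIKindSketch = FullSketch | AbbreviatedSketch | SimpleSketch
  deriving (Eq, Show)

-- | Classify a string by surface shape only (assumption, no real parsing):
--   "<...>"        is treated as a full IRI,
--   "prefix:local" as an abbreviated IRI,
--   anything else  as a simple IRI.
classifySketch :: String -> IRIKindSketch
classifySketch s
  | isBracketed s = FullSketch
  | ':' `elem` s = AbbreviatedSketch
  | otherwise = SimpleSketch
  where
    -- True when the string starts with '<' and ends with '>'.
    isBracketed ('<' : rest@(_ : _)) = last rest == '>'
    isBracketed _ = False
```

The real classifiers run full parsers over the input, so they reject malformed strings that this shape-based check would still accept.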
(3) <http://www.w3.org/TR/2009/NOTE-owl2-manchester-syntax-20091027/>

(4) <http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/>

(5) <http://www.w3.org/TR/rdfa-core/#s_curies>
    -- * The IRI type
    , IRIAuth (..)
    , PNameLn (..)
    -- * Conversion
    , simpleIdToIRI
    , parseIRIReference
    , parseRelativeReference
    , parseAbsoluteIRI
    , parseIRICurie
    , parseIRIReferenceCurie
    , parseIRIManchester
    -- * Test for strings containing various kinds of IRI
    , isIRIReference
    , isRelativeReference
    , isAbsoluteIRI
    , isIRIReferenceCurie
    , isIRIManchester
    , isIPv6address
    , isIPv4address
    -- * Relative IRIs
    , relativeTo
    , nonStrictRelativeTo
    , relativeFrom
    -- * Operations on IRI strings
    {- | Support for putting strings into IRI-friendly
    escaped format and getting them back again. -}
    , iriToString
    , iriToStringUnsecure
    , iriToStringShort
    , iriToStringShortUnsecure
    , iriToStringFullUnsecure
    , isReserved, isUnreserved
    , isAllowedInIRI, isUnescapedInIRI
    , escapeIRIChar
    , escapeIRIString
    , unEscapeString
    -- * Parser combinators, special additions to export list
    , iriReference
    , irelativeRef
    , absoluteIRI
    , iriReferenceCurie
    , iriManchester
    -- * IRI Normalization functions
    , expandCurie
    , normalizeCase
    , normalizeEscape
    , normalizePathSegments
    ( GenParser, ParseError
    , parse, (<|>), (<?>), try
    , option, many, many1
    , char, satisfy, oneOf, string, digit, eof
import Control.Monad (MonadPlus (..))
import Data.Char (ord, chr, isHexDigit, toLower, toUpper, digitToInt)
import Numeric (showIntAtBase)
import Data.Ord (comparing)
import Data.Map as Map (Map, lookup)
-- * The IRI datatype
{- | Represents a general universal resource identifier using
its component parts.
For example, for the (full) IRI
> foo://anonymous@www.haskell.org:42/ghc?query#frag
or the abbreviated IRI
, iriAuthority :: Maybe IRIAuth -- ^ @\/\/anonymous\@www.haskell.org:42@
, iriRegName :: String -- ^ @www.haskell.org@
-- | do we have a full (possibly expanded) IRI (i.e. for comparisons)
-- | do we have an abbreviated IRI (i.e. for pretty printing)
-- compares full/expanded IRI (if expanded) or abbreviated part if not expanded
-- | Converts an IRI to a String in full/expanded form, including the Auth info and without enclosing brackets.
-- | Parses a CURIE <http://www.w3.org/TR/rdfa-core/#s_curies>
-- http://www.w3.org/TR/2009/REC-xml-names-20091208/#NT-NCName
[[[Above was a comment originally in GHC Network/URI.hs:
alphaChar = satisfy isAlphaChar -- or: Parsec.letter ?
digitChar = satisfy isDigitChar -- or: Parsec.digit ?
hexDigitChar = satisfy isHexDigitChar -- or: Parsec.hexDigit ?
> "http://example.com/Root/sub1/name2#frag"
> `relativeFrom` "http://example.com/Root/sub2/name2#frag"
> == "../sub1/name2#frag"
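The example above can be reproduced with a small path-only sketch. This is an illustration under simplifying assumptions, not the module's `relativeFrom`: it works on plain strings, assumes both inputs share the same scheme and authority, ignores queries, and carries the target's fragment over unchanged. The names `splitOnSlash`, `dropCommon` and `relativePathSketch` are hypothetical.

```haskell
import Data.List (intercalate)

-- | Split a string on '/' (hypothetical helper; escaping is not handled).
splitOnSlash :: String -> [String]
splitOnSlash s = case break (== '/') s of
  (seg, []) -> [seg]
  (seg, _ : rest) -> seg : splitOnSlash rest

-- | Drop the longest common prefix of two lists.
dropCommon :: Eq a => [a] -> [a] -> ([a], [a])
dropCommon (x : xs) (y : ys) | x == y = dropCommon xs ys
dropCommon xs ys = (xs, ys)

-- | Express the target relative to the base: one ".." for every base
--   directory that is not shared, then the remaining target segments.
relativePathSketch :: String -> String -> String
relativePathSketch target base =
  let (targetPath, fragment) = break (== '#') target
      basePath = takeWhile (/= '#') base
      targetSegs = splitOnSlash targetPath
      -- Compare directories only; the final segment is the "file" part.
      (targetRest, baseRest) =
        dropCommon (init targetSegs) (init (splitOnSlash basePath))
      ups = map (const "..") baseRest
  in intercalate "/" (ups ++ targetRest ++ [last targetSegs]) ++ fragment
```

Under these assumptions, `relativePathSketch "/Root/sub1/name2#frag" "/Root/sub2/name2#frag"` evaluates to `"../sub1/name2#frag"`, matching the example.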
(i.e. the result always ends with '/')
-- @Nothing@ iff there is no IRI @i@ assigned to the prefix of @c@ or the concatenation of @i@ and @abbrevPath c@ is not a valid IRI
case Map.lookup pn prefixMap of
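The lookup-and-concatenate behaviour described in the comment above can be sketched on plain strings. This is a simplified illustration, not the module's `expandCurie`: it assumes a prefix map from prefix names to IRI strings, and it replaces real IRI validation with a stand-in check. The names `expandCurieSketch` and `looksLikeIRI` are hypothetical.

```haskell
import qualified Data.Map as Map

-- | Expand "prefix:local" using a prefix map. Returns Nothing when the
--   input has no colon, when the prefix is unknown, or when the
--   concatenation fails the (stand-in) validity check.
expandCurieSketch :: Map.Map String String -> String -> Maybe String
expandCurieSketch prefixMap curie =
  case break (== ':') curie of
    (prefix, ':' : local) -> do
      base <- Map.lookup prefix prefixMap
      let expanded = base ++ local
      if looksLikeIRI expanded then Just expanded else Nothing
    _ -> Nothing

-- | Stand-in for real IRI validation (assumption): non-empty and free of
--   spaces, angle brackets and double quotes.
looksLikeIRI :: String -> Bool
looksLikeIRI s = not (null s) && all (`notElem` " <>\"") s
```

For instance, with a map that binds `ex` to `http://example.org/`, the input `ex:thing` expands to `Just "http://example.org/thing"`, while an unbound prefix yields `Nothing`.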