CollationElementIterator.java revision 2362
 * Copyright (c) 1996, 2005, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 * (C) Copyright Taligent, Inc. 1996, 1997 - All Rights Reserved
 * (C) Copyright IBM Corp. 1996-1998 - All Rights Reserved
 * The original version of this source code and documentation is copyrighted
 * and owned by Taligent, Inc., a wholly-owned subsidiary of IBM. These
 * materials are provided under terms of a License Agreement between Taligent
 * and Sun. This technology is protected by multiple US and International
 * patents. This notice and attribution to Taligent may not be removed.
 * Taligent is a registered trademark of Taligent, Inc.
 * The <code>CollationElementIterator</code> class is used as an iterator
 * to walk through each character of an international string. Use the iterator
 * to return the ordering priority of the positioned character. The ordering
 * priority of a character, which we refer to as a key, defines how a character
 * is collated in the given collation object.
 * For example, consider the following in Spanish:
 * <blockquote>
 * "ca" -> the first key is key('c') and second key is key('a').
 * "cha" -> the first key is key('ch') and second key is key('a').
 * </blockquote>
 * And in German,
 * <blockquote>
 * "\u00e4b" -> the first key is key('a'), the second key is key('e'), and
 * the third key is key('b').
 * </blockquote>
 * The key of a character is an integer composed of primary order (short),
 * secondary order (byte), and tertiary order (byte). Java strictly defines
 * the size and signedness of its primitive data types. Therefore, the static
 * functions <code>primaryOrder</code>, <code>secondaryOrder</code>, and
 * <code>tertiaryOrder</code> return <code>int</code>, <code>short</code>,
 * and <code>short</code> respectively to ensure the correctness of the key
 * value.
 * Example of the iterator usage,
 * <blockquote>
 * String testString = "This is a test";
 * RuleBasedCollator ruleBasedCollator = (RuleBasedCollator)Collator.getInstance();
 * CollationElementIterator collationElementIterator = ruleBasedCollator.getCollationElementIterator(testString);
 * int primaryOrder = CollationElementIterator.primaryOrder(collationElementIterator.next());
 * </blockquote>
 * <code>CollationElementIterator.next</code> returns the collation order
 * of the next character. A collation order consists of a primary order,
 * a secondary order, and a tertiary order. The data type of the collation
 * order is <strong>int</strong>. The upper 16 bits of a collation order
 * are its primary order; the next 8 bits are the secondary order and the
 * lowest 8 bits are the tertiary order.
 * @see Collator
 * @see RuleBasedCollator
 * @author Helena Shih, Laura Werner, Richard Gillam
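The key layout described in the class comment above (16 bits of primary order, then 8 bits each of secondary and tertiary order) can be illustrated with a small standalone sketch. It is not part of this file. The class name `CollationKeyDemo`, its helper methods, and the three-letter rule string are hypothetical and chosen only so the result does not depend on the default locale; the `java.text` calls are the public API.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;

final class CollationKeyDemo {

    // Hypothetical rule set (a before b before c), used only to keep the demo deterministic.
    static RuleBasedCollator demoCollator() {
        try {
            return new RuleBasedCollator("< a < b < c");
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    // Collects every collation element of text until NULLORDER signals the end of the string.
    static List<Integer> elements(RuleBasedCollator collator, String text) {
        CollationElementIterator it = collator.getCollationElementIterator(text);
        List<Integer> orders = new ArrayList<>();
        for (int order = it.next(); order != CollationElementIterator.NULLORDER; order = it.next()) {
            orders.add(order);
        }
        return orders;
    }

    // Reassembles a key from its parts: upper 16 bits primary, next 8 bits secondary, lowest 8 bits tertiary.
    static int pack(int primary, short secondary, short tertiary) {
        return (primary << 16) | ((secondary & 0xff) << 8) | (tertiary & 0xff);
    }

    public static void main(String[] args) {
        for (int order : elements(demoCollator(), "cab")) {
            int primary = CollationElementIterator.primaryOrder(order);
            short secondary = CollationElementIterator.secondaryOrder(order);
            short tertiary = CollationElementIterator.tertiaryOrder(order);
            System.out.printf("key=0x%08x primary=%d secondary=%d tertiary=%d repacked=%b%n",
                    order, primary, secondary, tertiary, pack(primary, secondary, tertiary) == order);
        }
    }
}
```

Repacking the three components and comparing against the original `int` is a quick way to check that the accessors and the documented bit layout agree.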
public final class CollationElementIterator
 * Null order which indicates the end of string is reached by the
 * cursor.
 public final static int NULLORDER = 0xffffffff;
 * CollationElementIterator constructor. This takes the source string and
 * the collation object. The cursor will walk through the source string based
 * on the predefined collation rules. If the source string is empty,
 * NULLORDER will be returned on the calls to next().
 * @param sourceText the source string.
 * @param owner the collation object.
 CollationElementIterator(String sourceText, RuleBasedCollator owner) {
public void reset()
public int next()
return NULLORDER;
return order;
return NULLORDER;
return UNMAPPEDCHARVALUE;
int consonant;
public int previous()
return NULLORDER;
return order;
return NULLORDER;
return ch;
int vowel;
return order;
next();
public int getOffset()
int lastValue,
int[] lastExpansion,
boolean forward) {
int[] result;
if (!forward) {
return result;
--maxLength;
> maxLength) {
return order;
--maxLength;
> maxLength) {
return order;
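The constructor comment above notes that an empty source string makes `next()` return `NULLORDER`, and the class also exposes `reset()`. A minimal sketch of both behaviours follows, assuming a hypothetical demo class `IteratorResetDemo` and a hypothetical three-letter rule string; neither is part of this file, and only public `java.text` calls are used.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class IteratorResetDemo {

    // Hypothetical rule set, used only so the sketch does not depend on the default locale.
    static RuleBasedCollator demoCollator() {
        try {
            return new RuleBasedCollator("< a < b < c");
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts the elements remaining from the iterator's current position.
    static int countRemaining(CollationElementIterator it) {
        int count = 0;
        while (it.next() != CollationElementIterator.NULLORDER) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        RuleBasedCollator collator = demoCollator();

        // An empty source string has no elements, so the first call already reports the end.
        CollationElementIterator empty = collator.getCollationElementIterator("");
        System.out.println("empty ends immediately: "
                + (empty.next() == CollationElementIterator.NULLORDER));

        CollationElementIterator it = collator.getCollationElementIterator("abc");
        int firstPass = countRemaining(it);  // walks the whole string
        int exhausted = countRemaining(it);  // cursor is already at the end
        it.reset();                          // moves the cursor back to the start
        int secondPass = countRemaining(it); // walks the whole string again
        System.out.println(firstPass + " " + exhausted + " " + secondPass);
    }
}
```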