Character.java revision 1602
 * Copyright 2002-2009 Sun Microsystems, Inc. All Rights Reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Sun designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Sun in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
 * CA 95054 USA or visit www.sun.com if you need additional information or
 * have any questions.

 * The <code>Character</code> class wraps a value of the primitive
 * type <code>char</code> in an object. An object of type
 * <code>Character</code> contains a single field whose type is
 * <code>char</code>.
 *
 * In addition, this class provides several methods for determining
 * a character's category (lowercase letter, digit, etc.) and for converting
 * characters from uppercase to lowercase and vice versa.
 *
 * Character information is based on the Unicode Standard, version 5.1.0.
 * The methods and data of class <code>Character</code> are defined by
 * the information in the <i>UnicodeData</i> file that is part of the
 * Unicode Character Database maintained by the Unicode
 * Consortium. This file specifies various properties including name
 * and general category for every defined Unicode code point or
 * The file and its description are available from the Unicode Consortium at:
 *
 * <h4><a name="unicode">Unicode Character Representations</a></h4>
 *
 * <p>The <code>char</code> data type (and therefore the value that a
 * <code>Character</code> object encapsulates) are based on the
 * original Unicode specification, which defined characters as
 * fixed-width 16-bit entities. The Unicode standard has since been
 * changed to allow for characters whose representation requires more
 * than 16 bits. The range of legal <em>code point</em>s is now
 * U+0000 to U+10FFFF, known as <em>Unicode scalar value</em>.
 * definition</i></a> of the U+<i>n</i> notation in the Unicode
 *
 * <p>The set of characters from U+0000 to U+FFFF is sometimes
 * referred to as the <em>Basic Multilingual Plane (BMP)</em>. <a
 * name="supplementary">Characters</a> whose code points are greater
 * than U+FFFF are called <em>supplementary character</em>s. The Java
 * 2 platform uses the UTF-16 representation in <code>char</code>
 * arrays and in the <code>String</code> and <code>StringBuffer</code>
 * classes. In this representation, supplementary characters are
 * represented as a pair of <code>char</code> values, the first from
 * the <em>high-surrogates</em> range, (&#92;uD800-&#92;uDBFF), the
 * second from the <em>low-surrogates</em> range
 * (&#92;uDC00-&#92;uDFFF).
 *
 * <p>A <code>char</code> value, therefore, represents Basic
 * Multilingual Plane (BMP) code points, including the surrogate
 * code points, or code units of the UTF-16 encoding.
 * An <code>int</code> value represents all Unicode code points,
 * including supplementary code points. The lower (least significant)
 * 21 bits of <code>int</code> are used to represent Unicode code
 * points and the upper (most significant) 11 bits must be zero.
 * Unless otherwise specified, the behavior with respect to
 * supplementary characters and surrogate <code>char</code> values is
 *
 * <li>The methods that only accept a <code>char</code> value cannot support
 * supplementary characters. They treat <code>char</code> values from the
 * surrogate ranges as undefined characters. For example,
 * <code>Character.isLetter('\uD840')</code> returns <code>false</code>, even though
 * this specific value if followed by any low-surrogate value in a string
 * would represent a letter.
 *
 * <li>The methods that accept an <code>int</code> value support all
 * Unicode characters, including supplementary characters. For
 * example, <code>Character.isLetter(0x2F81A)</code> returns
 * <code>true</code> because the code point value represents a letter
 *
 * <p>In the Java SE API documentation, <em>Unicode code point</em> is
 * used for character values in the range between U+0000 and U+10FFFF,
 * and <em>Unicode code unit</em> is used for 16-bit
 * <code>char</code> values that are code units of the <em>UTF-16</em>
 * encoding. For more information on Unicode terminology, refer to the

 * The minimum radix available for conversion to and from strings.
 * The constant value of this field is the smallest value permitted
 * for the radix argument in radix-conversion methods such as the
 * <code>digit</code> method, the <code>forDigit</code>
 * method, and the <code>toString</code> method of class
 *
 * @see java.lang.Character#digit(char, int)
 * @see java.lang.Character#forDigit(int, int)
 * @see java.lang.Integer#toString(int, int)
 * @see java.lang.Integer#valueOf(java.lang.String)

 * The maximum radix available for conversion to and from strings.
 * The constant value of this field is the largest value permitted
 * for the radix argument in radix-conversion methods such as the
 * <code>digit</code> method, the <code>forDigit</code>
 * method, and the <code>toString</code> method of class
 *
 * @see java.lang.Character#digit(char, int)
 * @see java.lang.Character#forDigit(int, int)
 * @see java.lang.Integer#toString(int, int)
 * @see java.lang.Integer#valueOf(java.lang.String)

 * The constant value of this field is the smallest value of type
 * <code>char</code>, <code>'\u0000'</code>.
public static final char MIN_VALUE = '\u0000';
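// A minimal sketch of how the radix bounds documented above behave in
// practice. The class name RadixSketch and the helper render are hypothetical
// and exist only for illustration; Character.digit, Character.forDigit,
// MIN_RADIX and MAX_RADIX are the real API.

```java
public class RadixSketch {
    // Hypothetical helper: renders a non-negative value in the given radix
    // with Character.forDigit, rejecting radixes outside the documented bounds.
    static String render(int value, int radix) {
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("radix out of range: " + radix);
        }
        if (value < 0) {
            throw new IllegalArgumentException("negative value: " + value);
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        for (int v = value; v > 0; v /= radix) {
            // Digits come out least significant first, so reverse at the end.
            sb.append(Character.forDigit(v % radix, radix));
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(Character.MIN_RADIX);       // 2
        System.out.println(Character.MAX_RADIX);       // 36
        System.out.println(Character.digit('z', 36));  // 35
        System.out.println(Character.digit('z', 37));  // -1, radix above MAX_RADIX
        System.out.println(render(255, 16));           // ff
    }
}
```

// Character.digit signals an unusable radix by returning -1 rather than
// throwing, which is why the helper checks the bounds explicitly.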
 * The constant value of this field is the largest value of type
 * <code>char</code>, <code>'\uFFFF'</code>.
public static final char MAX_VALUE = '\uFFFF';
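// Because a char tops out at U+FFFF, supplementary characters need the int
// overloads described in the class comment. The sketch below illustrates that
// split with the same code point the comment uses (0x2F81A). The class name
// CodePointSketch and the helper countCodePoints are hypothetical.

```java
public class CodePointSketch {
    // Hypothetical helper: counts code points by walking UTF-16 code units,
    // advancing by two units whenever a surrogate pair is encountered.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int cp = 0x2F81A; // supplementary code point from the class comment
        char[] units = Character.toChars(cp);

        System.out.println(units.length);                        // 2
        System.out.println(Character.isHighSurrogate(units[0])); // true
        System.out.println(Character.isLowSurrogate(units[1]));  // true

        // The char overload sees only half of the pair.
        System.out.println(Character.isLetter(units[0]));        // false
        // The int overload sees the whole code point.
        System.out.println(Character.isLetter(cp));              // true

        System.out.println(Character.toCodePoint(units[0], units[1]) == cp); // true
        System.out.println(countCodePoints("a" + new String(units)));        // 2
    }
}
```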
 * The <code>Class</code> instance representing the primitive type

 * Normative general types

 * General character types

 * General category "Cn" in the Unicode specification.
 * General category "Lu" in the Unicode specification.
 * General category "Ll" in the Unicode specification.
 * General category "Lt" in the Unicode specification.
 * General category "Lm" in the Unicode specification.
 * General category "Lo" in the Unicode specification.
 * General category "Mn" in the Unicode specification.
 * General category "Me" in the Unicode specification.
 * General category "Mc" in the Unicode specification.
 * General category "Nd" in the Unicode specification.
 * General category "Nl" in the Unicode specification.
 * General category "No" in the Unicode specification.
 * General category "Zs" in the Unicode specification.
 * General category "Zl" in the Unicode specification.
 * General category "Zp" in the Unicode specification.
 * General category "Cc" in the Unicode specification.
 * General category "Cf" in the Unicode specification.
 * General category "Co" in the Unicode specification.
 * General category "Cs" in the Unicode specification.
 * General category "Pd" in the Unicode specification.
 * General category "Ps" in the Unicode specification.
 * General category "Pe" in the Unicode specification.
 * General category "Pc" in the Unicode specification.
 * General category "Po" in the Unicode specification.
 * General category "Sm" in the Unicode specification.
 * General category "Sc" in the Unicode specification.
 * General category "Sk" in the Unicode specification.
 * General category "So" in the Unicode specification.
 * General category "Pi" in the Unicode specification.
 * General category "Pf" in the Unicode specification.

 * Error flag. Use int (code point) to avoid confusion with U+FFFF.
static final int ERROR = 0xFFFFFFFF;
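// The general-category constants listed above are what Character.getType
// returns. As a rough illustration, the sketch below maps a handful of them
// back to their two-letter Unicode abbreviations. The class name
// CategorySketch and the helper abbreviation are hypothetical, and the switch
// covers only a few categories on purpose.

```java
public class CategorySketch {
    // Hypothetical helper: returns the Unicode abbreviation for a few general
    // categories; every category not listed here is reported as "other".
    static String abbreviation(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:     return "Lu";
            case Character.LOWERCASE_LETTER:     return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER: return "Nd";
            case Character.SPACE_SEPARATOR:      return "Zs";
            case Character.CONTROL:              return "Cc";
            case Character.SURROGATE:            return "Cs";
            case Character.CURRENCY_SYMBOL:      return "Sc";
            default:                             return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(abbreviation('A'));    // Lu
        System.out.println(abbreviation('a'));    // Ll
        System.out.println(abbreviation('7'));    // Nd
        System.out.println(abbreviation(' '));    // Zs
        System.out.println(abbreviation('$'));    // Sc
        System.out.println(abbreviation(0xD800)); // Cs
    }
}
```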
 * Undefined bidirectional character type. Undefined <code>char</code>
 * values have undefined directionality in the Unicode specification.

 * Strong bidirectional character type "L" in the Unicode specification.
 * Strong bidirectional character type "R" in the Unicode specification.
 * Strong bidirectional character type "AL" in the Unicode specification.
 * Weak bidirectional character type "EN" in the Unicode specification.
 * Weak bidirectional character type "ES" in the Unicode specification.
 * Weak bidirectional character type "ET" in the Unicode specification.
 * Weak bidirectional character type "AN" in the Unicode specification.
 * Weak bidirectional character type "CS" in the Unicode specification.
 * Weak bidirectional character type "NSM" in the Unicode specification.
 * Weak bidirectional character type "BN" in the Unicode specification.
 * Neutral bidirectional character type "B" in the Unicode specification.
 * Neutral bidirectional character type "S" in the Unicode specification.
 * Neutral bidirectional character type "WS" in the Unicode specification.
 * Neutral bidirectional character type "ON" in the Unicode specification.
 * Strong bidirectional character type "LRE" in the Unicode specification.
 * Strong bidirectional character type "LRO" in the Unicode specification.
 * Strong bidirectional character type "RLE" in the Unicode specification.
 * Strong bidirectional character type "RLO" in the Unicode specification.
 * Weak bidirectional character type "PDF" in the Unicode specification.

 * Unicode high-surrogate code unit</a>
 * in the UTF-16 encoding, constant <code>'\uD800'</code>.
 * A high-surrogate is also known as a <i>leading-surrogate</i>.

 * Unicode high-surrogate code unit</a>
 * in the UTF-16 encoding, constant <code>'\uDBFF'</code>.
 * A high-surrogate is also known as a <i>leading-surrogate</i>.

 * Unicode low-surrogate code unit</a>
 * in the UTF-16 encoding, constant <code>'\uDC00'</code>.
 * A low-surrogate is also known as a <i>trailing-surrogate</i>.

 * Unicode low-surrogate code unit</a>
 * in the UTF-16 encoding, constant <code>'\uDFFF'</code>.
 * A low-surrogate is also known as a <i>trailing-surrogate</i>.

 * The minimum value of a Unicode surrogate code unit in the
 * UTF-16 encoding, constant <code>'\uD800'</code>.

 * The maximum value of a Unicode surrogate code unit in the
 * UTF-16 encoding, constant <code>'\uDFFF'</code>.

 * Unicode supplementary code point</a>, constant {@code U+10000}.

 * Unicode code point</a>, constant {@code U+0000}.

 * Unicode code point</a>, constant {@code U+10FFFF}.

 * Instances of this class represent particular subsets of the Unicode
 * character set. The only family of subsets defined in the
 * <code>Character</code> class is <code>{@link Character.UnicodeBlock
 * UnicodeBlock}</code>. Other portions of the Java API may define other
 * subsets for their own purposes.

 * Constructs a new <code>Subset</code> instance.
 *
 * @exception NullPointerException if name is <code>null</code>
 * @param name The name of this subset

 * Compares two <code>Subset</code> objects for equality.
 * This method returns <code>true</code> if and only if
 * <code>this</code> and the argument refer to the same
 * object; since this method is <code>final</code>, this
 * guarantee holds for all subclasses.

 * Returns the standard hash code as defined by the
 * <code>{@link Object#hashCode}</code> method. This method
 * is <code>final</code> in order to ensure that the
 * <code>equals</code> and <code>hashCode</code> methods will
 * be consistent in all subclasses.

 * Returns the name of this subset.

 * A family of character subsets representing the character blocks in the
 * Unicode specification. Character blocks generally define characters
 * used for a specific script or purpose. A character is contained by
 * at most one Unicode block.

 * Create a UnicodeBlock with the given identifier name.
 * This name must be the same as the block identifier.
 * Create a UnicodeBlock with the given identifier name and

 * Create a UnicodeBlock with the given identifier name and

 * Constant for the "Basic Latin" Unicode character block.

 * Constant for the "Latin-1 Supplement" Unicode character block.
new UnicodeBlock(
"LATIN_1_SUPPLEMENT",
new String[]{
"Latin-1 Supplement",
"Latin-1Supplement"});
* Constant for the "Latin Extended-A" Unicode character block. new UnicodeBlock(
"LATIN_EXTENDED_A",
new String[]{
"Latin Extended-A",
"LatinExtended-A"});
* Constant for the "Latin Extended-B" Unicode character block. new UnicodeBlock(
"LATIN_EXTENDED_B",
new String[] {
"Latin Extended-B",
"LatinExtended-B"});
* Constant for the "IPA Extensions" Unicode character block. * Constant for the "Spacing Modifier Letters" Unicode character block. "SpacingModifierLetters"});
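// Each block above is registered under its identifier plus the alias strings
// passed to the constructor (the canonical name, and the same name with spaces
// removed). A minimal sketch of how those names can be looked up through the
// public Character.UnicodeBlock.forName method follows. The class name
// BlockNameSketch and the helper isKnownBlockName are hypothetical.

```java
public class BlockNameSketch {
    // Hypothetical helper: true if forName accepts the given block name.
    // forName throws IllegalArgumentException for names it does not know.
    static boolean isKnownBlockName(String name) {
        try {
            Character.UnicodeBlock.forName(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Character.UnicodeBlock expected =
            Character.UnicodeBlock.SPACING_MODIFIER_LETTERS;

        // Canonical name, name without spaces, and the constant's identifier.
        System.out.println(
            Character.UnicodeBlock.forName("Spacing Modifier Letters") == expected); // true
        System.out.println(
            Character.UnicodeBlock.forName("SpacingModifierLetters") == expected);   // true
        System.out.println(
            Character.UnicodeBlock.forName("SPACING_MODIFIER_LETTERS") == expected); // true

        // Lookup is case-insensitive.
        System.out.println(
            Character.UnicodeBlock.forName("spacing modifier letters") == expected); // true

        System.out.println(isKnownBlockName("No Such Block")); // false
    }
}
```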
* Constant for the "Combining Diacritical Marks" Unicode character block. new UnicodeBlock(
"COMBINING_DIACRITICAL_MARKS",
new String[] {
"Combining Diacritical Marks",
"CombiningDiacriticalMarks" });
* Constant for the "Greek and Coptic" Unicode character block. * This block was previously known as the "Greek" block. * Constant for the "Cyrillic" Unicode character block. * Constant for the "Armenian" Unicode character block. * Constant for the "Hebrew" Unicode character block. * Constant for the "Arabic" Unicode character block. * Constant for the "Devanagari" Unicode character block. * Constant for the "Bengali" Unicode character block. * Constant for the "Gurmukhi" Unicode character block. * Constant for the "Gujarati" Unicode character block. * Constant for the "Oriya" Unicode character block. * Constant for the "Tamil" Unicode character block. * Constant for the "Telugu" Unicode character block. * Constant for the "Kannada" Unicode character block. * Constant for the "Malayalam" Unicode character block. * Constant for the "Thai" Unicode character block. * Constant for the "Lao" Unicode character block. * Constant for the "Tibetan" Unicode character block. * Constant for the "Georgian" Unicode character block. * Constant for the "Hangul Jamo" Unicode character block. * Constant for the "Latin Extended Additional" Unicode character block. "LatinExtendedAdditional"});
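// The note above that "Greek and Coptic" was previously known as "Greek" has
// a practical consequence: the constant keeps its old identifier, and both
// names resolve to it. A small sketch, assuming a current JDK; the class name
// RenamedBlockSketch and the helper resolvesTo are hypothetical.

```java
public class RenamedBlockSketch {
    // Hypothetical helper: true if the given name resolves to the given block.
    static boolean resolvesTo(String name, Character.UnicodeBlock block) {
        return Character.UnicodeBlock.forName(name) == block;
    }

    public static void main(String[] args) {
        Character.UnicodeBlock greek = Character.UnicodeBlock.GREEK;

        System.out.println(resolvesTo("Greek and Coptic", greek)); // true
        System.out.println(resolvesTo("Greek", greek));            // true

        // toString reports the identifier, not the current Unicode block name.
        System.out.println(greek.toString());                      // GREEK
    }
}
```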
* Constant for the "Greek Extended" Unicode character block. * Constant for the "General Punctuation" Unicode character block. new UnicodeBlock(
"GENERAL_PUNCTUATION",
new String[] {
"General Punctuation",
"GeneralPunctuation"});
* Constant for the "Superscripts and Subscripts" Unicode character block. new UnicodeBlock(
"SUPERSCRIPTS_AND_SUBSCRIPTS",
new String[] {
"Superscripts and Subscripts",
"SuperscriptsandSubscripts" });
* Constant for the "Currency Symbols" Unicode character block. new UnicodeBlock(
"CURRENCY_SYMBOLS",
new String[] {
"Currency Symbols",
"CurrencySymbols"});
 * Constant for the "Combining Diacritical Marks for Symbols" Unicode character block.
 * This block was previously known as "Combining Marks for Symbols".
new UnicodeBlock(
"COMBINING_MARKS_FOR_SYMBOLS",
new String[] {
"Combining Diacritical Marks for Symbols",
"CombiningDiacriticalMarksforSymbols",
"Combining Marks for Symbols",
"CombiningMarksforSymbols" });
* Constant for the "Letterlike Symbols" Unicode character block. new UnicodeBlock(
"LETTERLIKE_SYMBOLS",
new String[] {
"Letterlike Symbols",
"LetterlikeSymbols"});
* Constant for the "Number Forms" Unicode character block. * Constant for the "Arrows" Unicode character block. * Constant for the "Mathematical Operators" Unicode character block. "MathematicalOperators"});
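// To illustrate how the block constants above are reached from a character,
// the sketch below looks up one code point from each of the three blocks just
// declared, using the public Character.UnicodeBlock.of(int) method. The class
// name BlockLookupSketch and the helper blockName are hypothetical.

```java
public class BlockLookupSketch {
    // Hypothetical helper: returns the identifier of the block containing the
    // code point, or "none" when of(int) reports no block for it.
    static String blockName(int codePoint) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        return block == null ? "none" : block.toString();
    }

    public static void main(String[] args) {
        System.out.println(blockName(0x2167)); // NUMBER_FORMS (roman numeral eight)
        System.out.println(blockName(0x2192)); // ARROWS (rightwards arrow)
        System.out.println(blockName(0x2211)); // MATHEMATICAL_OPERATORS (n-ary summation)
    }
}
```

// Passing the code point as an int rather than a char keeps the same helper
// usable for supplementary characters.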
* Constant for the "Miscellaneous Technical" Unicode character block. "MiscellaneousTechnical"});
* Constant for the "Control Pictures" Unicode character block. new UnicodeBlock(
"CONTROL_PICTURES",
new String[] {
"Control Pictures",
"ControlPictures"});
* Constant for the "Optical Character Recognition" Unicode character block. new UnicodeBlock(
"OPTICAL_CHARACTER_RECOGNITION",
new String[] {
"Optical Character Recognition",
"OpticalCharacterRecognition"});
* Constant for the "Enclosed Alphanumerics" Unicode character block. "EnclosedAlphanumerics"});
* Constant for the "Box Drawing" Unicode character block. * Constant for the "Block Elements" Unicode character block. * Constant for the "Geometric Shapes" Unicode character block. new UnicodeBlock(
"GEOMETRIC_SHAPES",
new String[] {
"Geometric Shapes",
"GeometricShapes"});
* Constant for the "Miscellaneous Symbols" Unicode character block. "MiscellaneousSymbols"});
* Constant for the "Dingbats" Unicode character block. * Constant for the "CJK Symbols and Punctuation" Unicode character block. new UnicodeBlock(
"CJK_SYMBOLS_AND_PUNCTUATION",
new String[] {
"CJK Symbols and Punctuation",
"CJKSymbolsandPunctuation"});
* Constant for the "Hiragana" Unicode character block. * Constant for the "Katakana" Unicode character block. * Constant for the "Bopomofo" Unicode character block. * Constant for the "Hangul Compatibility Jamo" Unicode character block. "HangulCompatibilityJamo"});
* Constant for the "Kanbun" Unicode character block. * Constant for the "Enclosed CJK Letters and Months" Unicode character block. new UnicodeBlock(
"ENCLOSED_CJK_LETTERS_AND_MONTHS",
new String[] {
"Enclosed CJK Letters and Months",
"EnclosedCJKLettersandMonths"});
* Constant for the "CJK Compatibility" Unicode character block. new UnicodeBlock(
"CJK_COMPATIBILITY",
new String[] {
"CJK Compatibility",
"CJKCompatibility"});
* Constant for the "CJK Unified Ideographs" Unicode character block. "CJKUnifiedIdeographs"});
* Constant for the "Hangul Syllables" Unicode character block. new UnicodeBlock(
"HANGUL_SYLLABLES",
new String[] {
"Hangul Syllables",
"HangulSyllables"});
* Constant for the "Private Use Area" Unicode character block. new UnicodeBlock(
"PRIVATE_USE_AREA",
new String[] {
"Private Use Area",
"PrivateUseArea"});
* Constant for the "CJK Compatibility Ideographs" Unicode character block. new String[] {
"CJK Compatibility Ideographs",
"CJKCompatibilityIdeographs"});
* Constant for the "Alphabetic Presentation Forms" Unicode character block. new UnicodeBlock(
"ALPHABETIC_PRESENTATION_FORMS",
new String[] {
"Alphabetic Presentation Forms",
"AlphabeticPresentationForms"});
* Constant for the "Arabic Presentation Forms-A" Unicode character block. new UnicodeBlock(
"ARABIC_PRESENTATION_FORMS_A",
new String[] {
"Arabic Presentation Forms-A",
"ArabicPresentationForms-A"});
* Constant for the "Combining Half Marks" Unicode character block. * Constant for the "CJK Compatibility Forms" Unicode character block. "CJKCompatibilityForms"});
* Constant for the "Small Form Variants" Unicode character block. * Constant for the "Arabic Presentation Forms-B" Unicode character block. new UnicodeBlock(
"ARABIC_PRESENTATION_FORMS_B",
new String[] {
"Arabic Presentation Forms-B",
"ArabicPresentationForms-B"});
* Constant for the "Halfwidth and Fullwidth Forms" Unicode character block. new String[] {
"Halfwidth and Fullwidth Forms",
"HalfwidthandFullwidthForms"});
 * Constant for the "Specials" Unicode character block.

 * @deprecated As of J2SE 5, use {@link #HIGH_SURROGATES},
 *             {@link #HIGH_PRIVATE_USE_SURROGATES}, and
 *             {@link #LOW_SURROGATES}. These new constants match
 *             the block definitions of the Unicode Standard.
 *             The {@link #of(char)} and {@link #of(int)} methods
 *             return the new constants, not SURROGATES_AREA.

 * Constant for the "Syriac" Unicode character block.
 * Constant for the "Thaana" Unicode character block.
 * Constant for the "Sinhala" Unicode character block.
 * Constant for the "Myanmar" Unicode character block.
 * Constant for the "Ethiopic" Unicode character block.
 * Constant for the "Cherokee" Unicode character block.
 * Constant for the "Unified Canadian Aboriginal Syllabics" Unicode character block.
new String[] {
"Unified Canadian Aboriginal Syllabics",
"UnifiedCanadianAboriginalSyllabics"});
* Constant for the "Ogham" Unicode character block. * Constant for the "Runic" Unicode character block. * Constant for the "Khmer" Unicode character block. * Constant for the "Mongolian" Unicode character block. * Constant for the "Braille Patterns" Unicode character block. * Constant for the "CJK Radicals Supplement" Unicode character block. "CJKRadicalsSupplement"});
* Constant for the "Kangxi Radicals" Unicode character block. * Constant for the "Ideographic Description Characters" Unicode character block. new UnicodeBlock(
"IDEOGRAPHIC_DESCRIPTION_CHARACTERS",
new String[] {
"Ideographic Description Characters",
"IdeographicDescriptionCharacters"});
* Constant for the "Bopomofo Extended" Unicode character block. * Constant for the "CJK Unified Ideographs Extension A" Unicode character block. new UnicodeBlock(
"CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A",
new String[] {
"CJK Unified Ideographs Extension A",
"CJKUnifiedIdeographsExtensionA"});
* Constant for the "Yi Syllables" Unicode character block. * Constant for the "Yi Radicals" Unicode character block. * Constant for the "Cyrillic Supplementary" Unicode character block. new String[] {
"Cyrillic Supplementary",
* Constant for the "Tagalog" Unicode character block. * Constant for the "Hanunoo" Unicode character block. * Constant for the "Buhid" Unicode character block. * Constant for the "Tagbanwa" Unicode character block. * Constant for the "Limbu" Unicode character block. * Constant for the "Tai Le" Unicode character block. * Constant for the "Khmer Symbols" Unicode character block. * Constant for the "Phonetic Extensions" Unicode character block. new UnicodeBlock(
"PHONETIC_EXTENSIONS",
new String[] {
"Phonetic Extensions",
"PhoneticExtensions"});
* Constant for the "Miscellaneous Mathematical Symbols-A" Unicode character block. new String[]{
"Miscellaneous Mathematical Symbols-A",
"MiscellaneousMathematicalSymbols-A"});
* Constant for the "Supplemental Arrows-A" Unicode character block. "SupplementalArrows-A"});
* Constant for the "Supplemental Arrows-B" Unicode character block. "SupplementalArrows-B"});
* Constant for the "Miscellaneous Mathematical Symbols-B" Unicode character block. new String[] {
"Miscellaneous Mathematical Symbols-B",
"MiscellaneousMathematicalSymbols-B"});
* Constant for the "Supplemental Mathematical Operators" Unicode character block. new String[]{
"Supplemental Mathematical Operators",
"SupplementalMathematicalOperators"} );
* Constant for the "Miscellaneous Symbols and Arrows" Unicode character block. new UnicodeBlock(
"MISCELLANEOUS_SYMBOLS_AND_ARROWS",
new String[] {
"Miscellaneous Symbols and Arrows",
"MiscellaneousSymbolsandArrows"});
* Constant for the "Katakana Phonetic Extensions" Unicode character block. new UnicodeBlock(
"KATAKANA_PHONETIC_EXTENSIONS",
new String[] {
"Katakana Phonetic Extensions",
"KatakanaPhoneticExtensions"});
* Constant for the "Yijing Hexagram Symbols" Unicode character block. "YijingHexagramSymbols"});
* Constant for the "Variation Selectors" Unicode character block. new UnicodeBlock(
"VARIATION_SELECTORS",
new String[] {
"Variation Selectors",
"VariationSelectors"});
* Constant for the "Linear B Syllabary" Unicode character block. new UnicodeBlock(
"LINEAR_B_SYLLABARY",
new String[] {
"Linear B Syllabary",
"LinearBSyllabary"});
* Constant for the "Linear B Ideograms" Unicode character block. new UnicodeBlock(
"LINEAR_B_IDEOGRAMS",
new String[] {
"Linear B Ideograms",
"LinearBIdeograms"});
* Constant for the "Aegean Numbers" Unicode character block. * Constant for the "Old Italic" Unicode character block. * Constant for the "Gothic" Unicode character block. * Constant for the "Ugaritic" Unicode character block. * Constant for the "Deseret" Unicode character block. * Constant for the "Shavian" Unicode character block. * Constant for the "Osmanya" Unicode character block. * Constant for the "Cypriot Syllabary" Unicode character block. new UnicodeBlock(
"CYPRIOT_SYLLABARY",
new String[] {
"Cypriot Syllabary",
"CypriotSyllabary"});
* Constant for the "Byzantine Musical Symbols" Unicode character block. "ByzantineMusicalSymbols"});
* Constant for the "Musical Symbols" Unicode character block. * Constant for the "Tai Xuan Jing Symbols" Unicode character block. * Constant for the "Mathematical Alphanumeric Symbols" Unicode character block. new String[] {
"Mathematical Alphanumeric Symbols",
"MathematicalAlphanumericSymbols"});
* Constant for the "CJK Unified Ideographs Extension B" Unicode character block. new String[] {
"CJK Unified Ideographs Extension B",
"CJKUnifiedIdeographsExtensionB"});
* Constant for the "CJK Compatibility Ideographs Supplement" Unicode character block. new String[]{
"CJK Compatibility Ideographs Supplement",
"CJKCompatibilityIdeographsSupplement"});
* Constant for the "Tags" Unicode character block. * Constant for the "Variation Selectors Supplement" Unicode character block. new UnicodeBlock(
"VARIATION_SELECTORS_SUPPLEMENT",
new String[] {
"Variation Selectors Supplement",
"VariationSelectorsSupplement"});
* Constant for the "Supplementary Private Use Area-A" Unicode character block. new String[] {
"Supplementary Private Use Area-A",
"SupplementaryPrivateUseArea-A"});
* Constant for the "Supplementary Private Use Area-B" Unicode character block. new String[] {
"Supplementary Private Use Area-B",
"SupplementaryPrivateUseArea-B"});
 * Constant for the "High Surrogates" Unicode character block.
 * This block represents codepoint values in the high surrogate
 * range: 0xD800 through 0xDB7F

 * Constant for the "High Private Use Surrogates" Unicode character block.
 * This block represents codepoint values in the high surrogate
 * range: 0xDB80 through 0xDBFF
new UnicodeBlock(
"HIGH_PRIVATE_USE_SURROGATES",
new String[] {
"High Private Use Surrogates",
"HighPrivateUseSurrogates"});
 * Constant for the "Low Surrogates" Unicode character block.
 * This block represents codepoint values in the low surrogate
 * range: 0xDC00 through 0xDFFF

 * Constant for the "Arabic Supplement" Unicode character block.
new String[] {
"Arabic Supplement",
* Constant for the "NKo" Unicode character block. * Constant for the "Ethiopic Supplement" Unicode character block. new String[] {
"Ethiopic Supplement",
* Constant for the "New Tai Lue" Unicode character block. * Constant for the "Buginese" Unicode character block. * Constant for the "Balinese" Unicode character block. * Constant for the "Sundanese" Unicode character block. * Constant for the "Lepcha" Unicode character block. * Constant for the "Ol Chiki" Unicode character block. * Constant for the "Phonetic Extensions Supplement" Unicode character new String[] {
"Phonetic Extensions Supplement",
"PhoneticExtensionsSupplement"});
* Constant for the "Combining Diacritical Marks Supplement" Unicode new String[] {
"Combining Diacritical Marks Supplement",
"CombiningDiacriticalMarksSupplement"});
* Constant for the "Glagolitic" Unicode character block. * Constant for the "Latin Extended-C" Unicode character block. new String[] {
"Latin Extended-C",
* Constant for the "Coptic" Unicode character block. * Constant for the "Georgian Supplement" Unicode character block. new String[] {
"Georgian Supplement",
* Constant for the "Tifinagh" Unicode character block. * Constant for the "Ethiopic Extended" Unicode character block. new String[] {
"Ethiopic Extended",
* Constant for the "Cyrillic Extended-A" Unicode character block. new String[] {
"Cyrillic Extended-A",
* Constant for the "Supplemental Punctuation" Unicode character block. new String[] {
"Supplemental Punctuation",
"SupplementalPunctuation"});
* Constant for the "CJK Strokes" Unicode character block. * Constant for the "Vai" Unicode character block. * Constant for the "Cyrillic Extended-B" Unicode character block. new String[] {
"Cyrillic Extended-B",
* Constant for the "Modifier Tone Letters" Unicode character block. new String[] {
"Modifier Tone Letters",
* Constant for the "Latin Extended-D" Unicode character block. new String[] {
"Latin Extended-D",
* Constant for the "Syloti Nagri" Unicode character block. new String[] {
"Syloti Nagri",
* Constant for the "Phags-pa" Unicode character block. * Constant for the "Saurashtra" Unicode character block. * Constant for the "Kayah Li" Unicode character block. * Constant for the "Rejang" Unicode character block. * Constant for the "Cham" Unicode character block. * Constant for the "Vertical Forms" Unicode character block. new String[] {
"Vertical Forms",
* Constant for the "Ancient Greek Numbers" Unicode character block. new String[] {
"Ancient Greek Numbers",
* Constant for the "Ancient Symbols" Unicode character block. new String[] {
"Ancient Symbols",
* Constant for the "Phaistos Disc" Unicode character block. new String[] {
"Phaistos Disc",
* Constant for the "Lycian" Unicode character block. * Constant for the "Carian" Unicode character block. * Constant for the "Old Persian" Unicode character block. * Constant for the "Phoenician" Unicode character block. * Constant for the "Lydian" Unicode character block. * Constant for the "Kharoshthi" Unicode character block. * Constant for the "Cuneiform" Unicode character block. * Constant for the "Cuneiform Numbers and Punctuation" Unicode new String[] {
"Cuneiform Numbers and Punctuation",
"CuneiformNumbersandPunctuation"});
* Constant for the "Ancient Greek Musical Notation" Unicode character new String[] {
"Ancient Greek Musical Notation",
"AncientGreekMusicalNotation"});
* Constant for the "Counting Rod Numerals" Unicode character block. new String[] {
"Counting Rod Numerals",
* Constant for the "Mahjong Tiles" Unicode character block. new String[] {
"Mahjong Tiles",
* Constant for the "Domino Tiles" Unicode character block. new String[] {
"Domino Tiles",
0x0000,
// 0000..007F; Basic Latin 0x0080,
// 0080..00FF; Latin-1 Supplement 0x0100,
// 0100..017F; Latin Extended-A 0x0180,
// 0180..024F; Latin Extended-B 0x0250,
// 0250..02AF; IPA Extensions 0x02B0,
// 02B0..02FF; Spacing Modifier Letters 0x0300,
// 0300..036F; Combining Diacritical Marks 0x0370,
// 0370..03FF; Greek and Coptic 0x0400,
// 0400..04FF; Cyrillic 0x0500,
// 0500..052F; Cyrillic Supplement 0x0530,
// 0530..058F; Armenian 0x0590,
// 0590..05FF; Hebrew 0x0600,
// 0600..06FF; Arabic 0x0700,
// 0700..074F; Syria 0x0750,
// 0750..077F; Arabic Supplement 0x0780,
// 0780..07BF; Thaana 0x07C0,
// 07C0..07FF; NKo 0x0900,
// 0900..097F; Devanagari 0x0980,
// 0980..09FF; Bengali 0x0A00,
// 0A00..0A7F; Gurmukhi 0x0A80,
// 0A80..0AFF; Gujarati 0x0B00,
// 0B00..0B7F; Oriya 0x0B80,
// 0B80..0BFF; Tamil 0x0C00,
// 0C00..0C7F; Telugu 0x0C80,
// 0C80..0CFF; Kannada 0x0D00,
// 0D00..0D7F; Malayalam 0x0D80,
// 0D80..0DFF; Sinhala 0x0E00,
// 0E00..0E7F; Thai 0x0E80,
// 0E80..0EFF; Lao 0x0F00,
// 0F00..0FFF; Tibetan 0x1000,
// 1000..109F; Myanmar 0x10A0,
// 10A0..10FF; Georgian 0x1100,
// 1100..11FF; Hangul Jamo 0x1200,
// 1200..137F; Ethiopic 0x1380,
// 1380..139F; Ethiopic Supplement 0x13A0,
// 13A0..13FF; Cherokee 0x1400,
// 1400..167F; Unified Canadian Aboriginal Syllabics 0x1680,
// 1680..169F; Ogham 0x16A0,
// 16A0..16FF; Runic 0x1700,
// 1700..171F; Tagalog 0x1720,
// 1720..173F; Hanunoo 0x1740,
// 1740..175F; Buhid 0x1760,
// 1760..177F; Tagbanwa 0x1780,
// 1780..17FF; Khmer 0x1800,
// 1800..18AF; Mongolian 0x1900,
// 1900..194F; Limbu 0x1950,
// 1950..197F; Tai Le 0x1980,
// 1980..19DF; New Tai Lue 0x19E0,
// 19E0..19FF; Khmer Symbols 0x1A00,
// 1A00..1A1F; Buginese 0x1B00,
// 1B00..1B7F; Balinese 0x1B80,
// 1B80..1BBF; Sundanese 0x1C00,
// 1C00..1C4F; Lepcha 0x1C50,
// 1C50..1C7F; Ol Chiki 0x1D00,
// 1D00..1D7F; Phonetic Extensions 0x1D80,
// 1D80..1DBF; Phonetic Extensions Supplement 0x1DC0,
// 1DC0..1DFF; Combining Diacritical Marks Supplement 0x1E00,
// 1E00..1EFF; Latin Extended Additional 0x1F00,
// 1F00..1FFF; Greek Extended 0x2000,
// 2000..206F; General Punctuation 0x2070,
// 2070..209F; Superscripts and Subscripts 0x20A0,
// 20A0..20CF; Currency Symbols 0x20D0,
// 20D0..20FF; Combining Diacritical Marks for Symbols 0x2100,
// 2100..214F; Letterlike Symbols 0x2150,
// 2150..218F; Number Forms 0x2190,
// 2190..21FF; Arrows 0x2200,
// 2200..22FF; Mathematical Operators 0x2300,
// 2300..23FF; Miscellaneous Technical 0x2400,
// 2400..243F; Control Pictures 0x2440,
// 2440..245F; Optical Character Recognition 0x2460,
// 2460..24FF; Enclosed Alphanumerics 0x2500,
// 2500..257F; Box Drawing 0x2580,
// 2580..259F; Block Elements 0x25A0,
// 25A0..25FF; Geometric Shapes 0x2600,
// 2600..26FF; Miscellaneous Symbols 0x2700,
// 2700..27BF; Dingbats 0x27C0,
// 27C0..27EF; Miscellaneous Mathematical Symbols-A 0x27F0,
// 27F0..27FF; Supplemental Arrows-A 0x2800,
// 2800..28FF; Braille Patterns 0x2900,
// 2900..297F; Supplemental Arrows-B 0x2980,
// 2980..29FF; Miscellaneous Mathematical Symbols-B 0x2A00,
// 2A00..2AFF; Supplemental Mathematical Operators 0x2B00,
// 2B00..2BFF; Miscellaneous Symbols and Arrows 0x2C00,
// 2C00..2C5F; Glagolitic 0x2C60,
// 2C60..2C7F; Latin Extended-C 0x2C80,
// 2C80..2CFF; Coptic 0x2D00,
// 2D00..2D2F; Georgian Supplement 0x2D30,
// 2D30..2D7F; Tifinagh 0x2D80,
// 2D80..2DDF; Ethiopic Extended 0x2DE0,
// 2DE0..2DFF; Cyrillic Extended-A 0x2E00,
// 2E00..2E7F; Supplemental Punctuation 0x2E80,
// 2E80..2EFF; CJK Radicals Supplement 0x2F00,
// 2F00..2FDF; Kangxi Radicals 0x2FF0,
// 2FF0..2FFF; Ideographic Description Characters 0x3000,
// 3000..303F; CJK Symbols and Punctuation 0x3040,
// 3040..309F; Hiragana 0x30A0,
// 30A0..30FF; Katakana 0x3100,
// 3100..312F; Bopomofo 0x3130,
// 3130..318F; Hangul Compatibility Jamo 0x3190,
// 3190..319F; Kanbun 0x31A0,
// 31A0..31BF; Bopomofo Extended 0x31C0,
// 31C0..31EF; CJK Strokes 0x31F0,
// 31F0..31FF; Katakana Phonetic Extensions 0x3200,
// 3200..32FF; Enclosed CJK Letters and Months 0x3300,
// 3300..33FF; CJK Compatibility 0x3400,
// 3400..4DBF; CJK Unified Ideographs Extension A 0x4DC0,
// 4DC0..4DFF; Yijing Hexagram Symbols 0x4E00,
// 4E00..9FFF; CJK Unified Ideographs 0xA000,
// A000..A48F; Yi Syllables 0xA490,
// A490..A4CF; Yi Radicals 0xA500,
// A500..A63F; Vai 0xA640,
// A640..A69F; Cyrillic Extended-B 0xA700,
// A700..A71F; Modifier Tone Letters 0xA720,
// A720..A7FF; Latin Extended-D 0xA800,
// A800..A82F; Syloti Nagri 0xA840,
// A840..A87F; Phags-pa 0xA880,
// A880..A8DF; Saurashtra 0xA900,
// A900..A92F; Kayah Li 0xA930,
// A930..A95F; Rejang 0xAA00,
// AA00..AA5F; Cham 0xAC00,
// AC00..D7AF; Hangul Syllables 0xD800,
// D800..DB7F; High Surrogates 0xDB80,
// DB80..DBFF; High Private Use Surrogates 0xDC00,
// DC00..DFFF; Low Surrogates 0xE000,
// E000..F8FF; Private Use Area 0xF900,
// F900..FAFF; CJK Compatibility Ideographs 0xFB00,
// FB00..FB4F; Alphabetic Presentation Forms 0xFB50,
// FB50..FDFF; Arabic Presentation Forms-A 0xFE00,
// FE00..FE0F; Variation Selectors 0xFE10,
// FE10..FE1F; Vertical Forms 0xFE20,
// FE20..FE2F; Combining Half Marks 0xFE30,
// FE30..FE4F; CJK Compatibility Forms 0xFE50,
// FE50..FE6F; Small Form Variants 0xFE70,
// FE70..FEFF; Arabic Presentation Forms-B 0xFF00,
// FF00..FFEF; Halfwidth and Fullwidth Forms 0xFFF0,
// FFF0..FFFF; Specials 0x10000,
// 10000..1007F; Linear B Syllabary 0x10080,
// 10080..100FF; Linear B Ideograms 0x10100,
// 10100..1013F; Aegean Numbers 0x10140,
// 10140..1018F; Ancient Greek Numbers 0x10190,
// 10190..101CF; Ancient Symbols 0x101D0,
// 101D0..101FF; Phaistos Disc 0x10280,
// 10280..1029F; Lycian 0x102A0,
// 102A0..102DF; Carian 0x10300,
// 10300..1032F; Old Italic 0x10330,
// 10330..1034F; Gothic 0x10380,
// 10380..1039F; Ugaritic 0x103A0,
// 103A0..103DF; Old Persian 0x10400,
// 10400..1044F; Deseret 0x10450,
// 10450..1047F; Shavian 0x10480,
// 10480..104AF; Osmanya 0x10800,
// 10800..1083F; Cypriot Syllabary 0x10900,
// 10900..1091F; Phoenician 0x10920,
// 10920..1093F; Lydian 0x10A00,
// 10A00..10A5F; Kharoshthi 0x12000,
// 12000..123FF; Cuneiform 0x12400,
// 12400..1247F; Cuneiform Numbers and Punctuation 0x1D000,
// 1D000..1D0FF; Byzantine Musical Symbols 0x1D100,
// 1D100..1D1FF; Musical Symbols 0x1D200,
// 1D200..1D24F; Ancient Greek Musical Notation 0x1D300,
// 1D300..1D35F; Tai Xuan Jing Symbols 0x1D360,
// 1D360..1D37F; Counting Rod Numerals 0x1D400,
// 1D400..1D7FF; Mathematical Alphanumeric Symbols 0x1F000,
// 1F000..1F02F; Mahjong Tiles 0x1F030,
// 1F030..1F09F; Domino Tiles 0x20000,
// 20000..2A6DF; CJK Unified Ideographs Extension B 0x2F800,
// 2F800..2FA1F; CJK Compatibility Ideographs Supplement 0xE0000,
// E0000..E007F; Tags 0xE0100,
// E0100..E01EF; Variation Selectors Supplement 0xF0000,
// F0000..FFFFF; Supplementary Private Use Area-A 0x100000,
// 100000..10FFFF; Supplementary Private Use Area-B * Returns the object representing the Unicode block containing the * given character, or <code>null</code> if the character is not a * member of a defined block. * <p><b>Note:</b> This method cannot handle <a * characters</a>. To support all Unicode characters, * including supplementary characters, use the {@link * @param c The character in question * @return The <code>UnicodeBlock</code> instance representing the * Unicode block of which this character is a member, or * <code>null</code> if the character is not a member of any * Returns the object representing the Unicode block * containing the given character (Unicode code point), or * <code>null</code> if the character is not a member of a * @param codePoint the character (Unicode code point) in question. * @return The <code>UnicodeBlock</code> instance representing the * Unicode block of which this character is a member, or * <code>null</code> if the character is not a member of any * @exception IllegalArgumentException if the specified * <code>codePoint</code> is an invalid Unicode code point. * @see Character#isValidCodePoint(int) // invariant: top > current >= bottom && codePoint >= unicodeBlockStarts[bottom] * Returns the UnicodeBlock with the given name. Block * names are determined by The Unicode Standard. The file * Blocks-<version>.txt defines blocks for a particular * version of the standard. The {@link Character} class specifies * the version of the standard that it supports. * This method accepts block names in the following forms: * <li> Canonical block names as defined by the Unicode Standard. * For example, the standard defines a "Basic Latin" block. Therefore, this * method accepts "Basic Latin" as a valid block name. The documentation of * each UnicodeBlock provides the canonical name. * <li>Canonical block names with all spaces removed. For example, "BasicLatin" * is a valid block name for the "Basic Latin" block. 
* <li>The text representation of each constant UnicodeBlock identifier. * For example, this method will return the {@link #BASIC_LATIN} block if * provided with the "BASIC_LATIN" name. This form replaces all spaces and * hyphens in the canonical name with underscores. * Finally, character case is ignored for all of the valid block name forms. * For example, "BASIC_LATIN" and "basic_latin" are both valid block names. * The en_US locale's case mapping rules are used to provide case-insensitive * string comparisons for block name validation. * If the Unicode Standard changes block names, both the previous and * current names will be accepted. * @param blockName A <code>UnicodeBlock</code> name. * @return The <code>UnicodeBlock</code> instance identified * by <code>blockName</code> * @throws IllegalArgumentException if <code>blockName</code> is an * @throws NullPointerException if <code>blockName</code> is null * The value of the <code>Character</code>. private final char value;
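The block lookup and name-matching rules documented above can be illustrated with a short sketch. The class name `BlockLookupSketch` and the helper `blockOf` are hypothetical names introduced only for this example; the calls to `Character.UnicodeBlock.of(int)` and `Character.UnicodeBlock.forName(String)` are the methods described in the comments above.

```java
// Illustrative sketch only; BlockLookupSketch and blockOf are hypothetical names.
final class BlockLookupSketch {

    /** Returns the block containing the code point, or null if it belongs to no defined block. */
    static Character.UnicodeBlock blockOf(int codePoint) {
        // The int overload accepts supplementary code points; the char overload does not.
        return Character.UnicodeBlock.of(codePoint);
    }

    public static void main(String[] args) {
        // Canonical name, name without spaces, and identifier text (any case)
        // are all documented as accepted forms for the same block.
        Character.UnicodeBlock canonical = Character.UnicodeBlock.forName("Basic Latin");
        Character.UnicodeBlock noSpaces = Character.UnicodeBlock.forName("BasicLatin");
        Character.UnicodeBlock identifier = Character.UnicodeBlock.forName("basic_latin");
        System.out.println(canonical == noSpaces && noSpaces == identifier);

        // 'A' lies in 0000..007F; U+1D11E lies in 1D100..1D1FF (Musical Symbols).
        System.out.println(blockOf('A') == Character.UnicodeBlock.BASIC_LATIN);
        System.out.println(blockOf(0x1D11E) == Character.UnicodeBlock.MUSICAL_SYMBOLS);

        // An unknown block name is rejected with IllegalArgumentException.
        try {
            Character.UnicodeBlock.forName("No Such Block");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected unknown block name");
        }
    }
}
```

Passing an `int` to `of` is the safer default, since the `char` overload cannot represent code points above U+FFFF.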
/** use serialVersionUID from JDK 1.0.2 for interoperability */ * Constructs a newly allocated <code>Character</code> object that * represents the specified <code>char</code> value. * @param value the value to be represented by the * <code>Character</code> object. * Returns a <tt>Character</tt> instance representing the specified * If a new <tt>Character</tt> instance is not required, this method * should generally be used in preference to the constructor * {@link #Character(char)}, as this method is likely to yield * significantly better space and time performance by caching * frequently requested values. * This method will always cache values in the range '\u0000' * to '\u007f', inclusive, and may cache other values outside * @return a <tt>Character</tt> instance representing <tt>c</tt>. if (c <= 127) {
// must cache * Returns the value of this <code>Character</code> object. * @return the primitive <code>char</code> value represented by * Returns a hash code for this <code>Character</code>. * @return a hash code value for this object. * Compares this object against the specified object. * The result is <code>true</code> if and only if the argument is not * <code>null</code> and is a <code>Character</code> object that * represents the same <code>char</code> value as this object. * @param obj the object to compare with. * @return <code>true</code> if the objects are the same; * <code>false</code> otherwise. * Returns a <code>String</code> object representing this * <code>Character</code>'s value. The result is a string of * length 1 whose sole component is the primitive * <code>char</code> value represented by this * <code>Character</code> object. * @return a string representation of this object. * Returns a <code>String</code> object representing the * specified <code>char</code>. The result is a string of length * 1 consisting solely of the specified <code>char</code>. * @param c the <code>char</code> to be converted * @return the string representation of the specified <code>char</code> * Determines whether the specified code point is a valid * Unicode code point value</a>. * @param codePoint the Unicode code point to be tested * @return {@code true} if the specified code point value is between * {@link #MIN_CODE_POINT} and * {@link #MAX_CODE_POINT} inclusive; * {@code false} otherwise. * Determines whether the specified character (Unicode code point) * is in the <a href="#supplementary">supplementary character</a> range. * @param codePoint the character (Unicode code point) to be tested * @return {@code true} if the specified code point is between * {@link #MIN_SUPPLEMENTARY_CODE_POINT} and * {@link #MAX_CODE_POINT} inclusive; * {@code false} otherwise. 
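As a rough illustration of the range checks and the caching guarantee described above, the following sketch classifies an `int` using `isValidCodePoint` and `isSupplementaryCodePoint`. The class `CodePointRangeSketch` and its `classify` helper are hypothetical names made up for this example.

```java
// Illustrative sketch only; CodePointRangeSketch and classify are hypothetical names.
final class CodePointRangeSketch {

    /** Classifies a value as "invalid", "BMP", or "supplementary". */
    static String classify(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return "invalid"; // outside MIN_CODE_POINT..MAX_CODE_POINT
        }
        return Character.isSupplementaryCodePoint(codePoint) ? "supplementary" : "BMP";
    }

    public static void main(String[] args) {
        System.out.println(classify('A'));      // within U+0000..U+FFFF
        System.out.println(classify(0x10000));  // first supplementary code point
        System.out.println(classify(0x110000)); // one past MAX_CODE_POINT
        System.out.println(classify(-1));       // negative values are never valid

        // valueOf is documented to cache '\u0000'..'\u007f', so within that
        // range repeated calls return the same instance.
        System.out.println(Character.valueOf('a') == Character.valueOf('a'));
    }
}
```

Identity comparison is shown only to make the cache visible; ordinary code should compare `Character` objects with `equals`, because values outside the guaranteed range may or may not be cached.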
* Determines if the given {@code char} value is a * Unicode high-surrogate code unit</a> * (also known as <i>leading-surrogate code unit</i>). * <p>Such values do not represent characters by themselves, * but are used in the representation of * <a href="#supplementary">supplementary characters</a> * in the UTF-16 encoding. * @param ch the {@code char} value to be tested. * @return {@code true} if the {@code char} value is between * {@link #MIN_HIGH_SURROGATE} and * {@link #MAX_HIGH_SURROGATE} inclusive; * {@code false} otherwise. * @see #isLowSurrogate(char) * @see Character.UnicodeBlock#of(int) * Determines if the given {@code char} value is a * Unicode low-surrogate code unit</a> * (also known as <i>trailing-surrogate code unit</i>). * <p>Such values do not represent characters by themselves, * but are used in the representation of * <a href="#supplementary">supplementary characters</a> * in the UTF-16 encoding. * @param ch the {@code char} value to be tested. * @return {@code true} if the {@code char} value is between * {@link #MIN_LOW_SURROGATE} and * {@link #MAX_LOW_SURROGATE} inclusive; * {@code false} otherwise. * @see #isHighSurrogate(char) * Determines if the given {@code char} value is a Unicode * <i>surrogate code unit</i>. * <p>Such values do not represent characters by themselves, * but are used in the representation of * <a href="#supplementary">supplementary characters</a> * in the UTF-16 encoding. * <p>A char value is a surrogate code unit if and only if it is either * a {@linkplain #isLowSurrogate(char) low-surrogate code unit} or * a {@linkplain #isHighSurrogate(char) high-surrogate code unit}. * @param ch the {@code char} value to be tested. * @return {@code true} if the {@code char} value is between * {@link #MIN_SURROGATE} and * {@link #MAX_SURROGATE} inclusive; * {@code false} otherwise. * Determines whether the specified pair of <code>char</code> * Unicode surrogate pair</a>. 
* <p>This method is equivalent to the expression: * isHighSurrogate(high) && isLowSurrogate(low) * @param high the high-surrogate code value to be tested * @param low the low-surrogate code value to be tested * @return <code>true</code> if the specified high and * low-surrogate code values represent a valid surrogate pair; * <code>false</code> otherwise. * Determines the number of <code>char</code> values needed to * represent the specified character (Unicode code point). If the * specified character is equal to or greater than 0x10000, then * the method returns 2. Otherwise, the method returns 1. * <p>This method doesn't validate the specified character to be a * valid Unicode code point. The caller must validate the * character value using {@link #isValidCodePoint(int) isValidCodePoint} * @param codePoint the character (Unicode code point) to be tested. * @return 2 if the character is a valid supplementary character; 1 otherwise. * @see #isSupplementaryCodePoint(int) * Converts the specified surrogate pair to its supplementary code * point value. This method does not validate the specified * surrogate pair. The caller must validate it using {@link * #isSurrogatePair(char, char) isSurrogatePair} if necessary. * @param high the high-surrogate code unit * @param low the low-surrogate code unit * @return the supplementary code point composed from the * specified surrogate pair. // return ((high - MIN_HIGH_SURROGATE) << 10) // + (low - MIN_LOW_SURROGATE) // + MIN_SUPPLEMENTARY_CODE_POINT; * Returns the code point at the given index of the * <code>CharSequence</code>. If the <code>char</code> value at * the given index in the <code>CharSequence</code> is in the * high-surrogate range, the following index is less than the * length of the <code>CharSequence</code>, and the * <code>char</code> value at the following index is in the * low-surrogate range, then the supplementary code point * corresponding to this surrogate pair is returned. 
Otherwise, * the <code>char</code> value at the given index is returned. * @param seq a sequence of <code>char</code> values (Unicode code * @param index the index to the <code>char</code> values (Unicode * code units) in <code>seq</code> to be converted * @return the Unicode code point at the given index * @exception NullPointerException if <code>seq</code> is null. * @exception IndexOutOfBoundsException if the value * <code>index</code> is negative or not less than * {@link CharSequence#length() seq.length()}. * Returns the code point at the given index of the * <code>char</code> array. If the <code>char</code> value at * the given index in the <code>char</code> array is in the * high-surrogate range, the following index is less than the * length of the <code>char</code> array, and the * <code>char</code> value at the following index is in the * low-surrogate range, then the supplementary code point * corresponding to this surrogate pair is returned. Otherwise, * the <code>char</code> value at the given index is returned. * @param a the <code>char</code> array * @param index the index to the <code>char</code> values (Unicode * code units) in the <code>char</code> array to be converted * @return the Unicode code point at the given index * @exception NullPointerException if <code>a</code> is null. * @exception IndexOutOfBoundsException if the value * <code>index</code> is negative or not less than * the length of the <code>char</code> array. * Returns the code point at the given index of the * <code>char</code> array, where only array elements with * <code>index</code> less than <code>limit</code> can be used. If * the <code>char</code> value at the given index in the * <code>char</code> array is in the high-surrogate range, the * following index is less than the <code>limit</code>, and the * <code>char</code> value at the following index is in the * low-surrogate range, then the supplementary code point * corresponding to this surrogate pair is returned. 
Otherwise, * the <code>char</code> value at the given index is returned. * @param a the <code>char</code> array * @param index the index to the <code>char</code> values (Unicode * code units) in the <code>char</code> array to be converted * @param limit the index after the last array element that can be used in the * <code>char</code> array * @return the Unicode code point at the given index * @exception NullPointerException if <code>a</code> is null. * @exception IndexOutOfBoundsException if the <code>index</code> * argument is negative or not less than the <code>limit</code> * argument, or if the <code>limit</code> argument is negative or * greater than the length of the <code>char</code> array. * Returns the code point preceding the given index of the * <code>CharSequence</code>. If the <code>char</code> value at * <code>(index - 1)</code> in the <code>CharSequence</code> is in * the low-surrogate range, <code>(index - 2)</code> is not * negative, and the <code>char</code> value at <code>(index - * 2)</code> in the <code>CharSequence</code> is in the * high-surrogate range, then the supplementary code point * corresponding to this surrogate pair is returned. Otherwise, * the <code>char</code> value at <code>(index - 1)</code> is * @param seq the <code>CharSequence</code> instance * @param index the index following the code point that should be returned * @return the Unicode code point value before the given index. * @exception NullPointerException if <code>seq</code> is null. * @exception IndexOutOfBoundsException if the <code>index</code> * argument is less than 1 or greater than {@link * CharSequence#length() seq.length()}. * Returns the code point preceding the given index of the * <code>char</code> array. 
If the <code>char</code> value at * <code>(index - 1)</code> in the <code>char</code> array is in * the low-surrogate range, <code>(index - 2)</code> is not * negative, and the <code>char</code> value at <code>(index - * 2)</code> in the <code>char</code> array is in the * high-surrogate range, then the supplementary code point * corresponding to this surrogate pair is returned. Otherwise, * the <code>char</code> value at <code>(index - 1)</code> is * @param a the <code>char</code> array * @param index the index following the code point that should be returned * @return the Unicode code point value before the given index. * @exception NullPointerException if <code>a</code> is null. * @exception IndexOutOfBoundsException if the <code>index</code> * argument is less than 1 or greater than the length of the * <code>char</code> array * Returns the code point preceding the given index of the * <code>char</code> array, where only array elements with * <code>index</code> greater than or equal to <code>start</code> * can be used. If the <code>char</code> value at <code>(index - * 1)</code> in the <code>char</code> array is in the * low-surrogate range, <code>(index - 2)</code> is not less than * <code>start</code>, and the <code>char</code> value at * <code>(index - 2)</code> in the <code>char</code> array is in * the high-surrogate range, then the supplementary code point * corresponding to this surrogate pair is returned. Otherwise, * the <code>char</code> value at <code>(index - 1)</code> is * @param a the <code>char</code> array * @param index the index following the code point that should be returned * @param start the index of the first array element in the * <code>char</code> array * @return the Unicode code point value before the given index. * @exception NullPointerException if <code>a</code> is null. 
* @exception IndexOutOfBoundsException if the <code>index</code> * argument is not greater than the <code>start</code> argument or * is greater than the length of the <code>char</code> array, or * if the <code>start</code> argument is negative or not less than * the length of the <code>char</code> array. * Converts the specified character (Unicode code point) to its * UTF-16 representation. If the specified code point is a BMP * (Basic Multilingual Plane or Plane 0) value, the same value is * stored in <code>dst[dstIndex]</code>, and 1 is returned. If the * specified code point is a supplementary character, its * surrogate values are stored in <code>dst[dstIndex]</code> * (high-surrogate) and <code>dst[dstIndex+1]</code> * (low-surrogate), and 2 is returned. * @param codePoint the character (Unicode code point) to be converted. * @param dst an array of <code>char</code> in which the * <code>codePoint</code>'s UTF-16 value is stored. * @param dstIndex the start index into the <code>dst</code> * array where the converted value is stored. * @return 1 if the code point is a BMP code point, 2 if the * code point is a supplementary code point. * @exception IllegalArgumentException if the specified * <code>codePoint</code> is not a valid Unicode code point. * @exception NullPointerException if the specified <code>dst</code> is null. * @exception IndexOutOfBoundsException if <code>dstIndex</code> * is negative or not less than <code>dst.length</code>, or if * <code>dst</code> at <code>dstIndex</code> doesn't have enough * array element(s) to store the resulting <code>char</code> * value(s). (If <code>dstIndex</code> is equal to * <code>dst.length-1</code> and the specified * <code>codePoint</code> is a supplementary character, the * high-surrogate value is not stored in * <code>dst[dstIndex]</code>.) * Converts the specified character (Unicode code point) to its * UTF-16 representation stored in a <code>char</code> array. 
If * the specified code point is a BMP (Basic Multilingual Plane or * Plane 0) value, the resulting <code>char</code> array has * the same value as <code>codePoint</code>. If the specified code * point is a supplementary code point, the resulting * <code>char</code> array has the corresponding surrogate pair. * @param codePoint a Unicode code point * @return a <code>char</code> array having * <code>codePoint</code>'s UTF-16 representation. * @exception IllegalArgumentException if the specified * <code>codePoint</code> is not a valid Unicode code point. // We write elements "backwards" to guarantee all-or-nothing * Returns the number of Unicode code points in the text range of * the specified char sequence. The text range begins at the * specified <code>beginIndex</code> and extends to the * <code>char</code> at index <code>endIndex - 1</code>. Thus the * length (in <code>char</code>s) of the text range is * <code>endIndex-beginIndex</code>. Unpaired surrogates within * the text range count as one code point each. * @param seq the char sequence * @param beginIndex the index to the first <code>char</code> of * @param endIndex the index after the last <code>char</code> of * @return the number of Unicode code points in the specified text * @exception NullPointerException if <code>seq</code> is null. * @exception IndexOutOfBoundsException if the * <code>beginIndex</code> is negative, or <code>endIndex</code> * is larger than the length of the given sequence, or * <code>beginIndex</code> is larger than <code>endIndex</code>. * Returns the number of Unicode code points in a subarray of the * <code>char</code> array argument. The <code>offset</code> * argument is the index of the first <code>char</code> of the * subarray and the <code>count</code> argument specifies the * length of the subarray in <code>char</code>s. Unpaired * surrogates within the subarray count as one code point each. 
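The conversion and counting methods described above fit together as a round trip between code points and UTF-16 code units. The sketch below is one way to exercise them; `SurrogateRoundTripSketch` and `roundTrip` are hypothetical names used only for illustration.

```java
// Illustrative sketch only; SurrogateRoundTripSketch and roundTrip are hypothetical names.
final class SurrogateRoundTripSketch {

    /** Encodes a valid code point as UTF-16 and decodes it back again. */
    static int roundTrip(int codePoint) {
        char[] units = Character.toChars(codePoint); // one char for BMP, two for supplementary
        return Character.codePointAt(units, 0);
    }

    public static void main(String[] args) {
        int clef = 0x1D11E; // a supplementary code point (Musical Symbols block)
        char[] units = Character.toChars(clef);

        // charCount predicts how many chars toChars produces.
        System.out.println(units.length == Character.charCount(clef));
        // The two chars form a valid high/low surrogate pair.
        System.out.println(Character.isSurrogatePair(units[0], units[1]));
        // toCodePoint reverses the encoding (it does not validate its input).
        System.out.println(Character.toCodePoint(units[0], units[1]) == clef);

        // Length in chars and length in code points differ once a pair is present.
        String text = "a" + new String(units) + "b";
        System.out.println(text.length());
        System.out.println(Character.codePointCount(text, 0, text.length()));
    }
}
```

Because `toCodePoint` skips validation, callers that cannot trust their input would normally check `isSurrogatePair` first, as the comments above recommend.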
* @param a the <code>char</code> array * @param offset the index of the first <code>char</code> in the * given <code>char</code> array * @param count the length of the subarray in <code>char</code>s * @return the number of Unicode code points in the specified subarray * @exception NullPointerException if <code>a</code> is null. * @exception IndexOutOfBoundsException if <code>offset</code> or * <code>count</code> is negative, or if <code>offset + * count</code> is larger than the length of the given array. * Returns the index within the given char sequence that is offset * from the given <code>index</code> by <code>codePointOffset</code> * code points. Unpaired surrogates within the text range given by * <code>index</code> and <code>codePointOffset</code> count as * one code point each. * @param seq the char sequence * @param index the index to be offset * @param codePointOffset the offset in code points * @return the index within the char sequence * @exception NullPointerException if <code>seq</code> is null. * @exception IndexOutOfBoundsException if <code>index</code> * is negative or larger than the length of the char sequence, * or if <code>codePointOffset</code> is positive and the * subsequence starting with <code>index</code> has fewer than * <code>codePointOffset</code> code points, or if * <code>codePointOffset</code> is negative and the subsequence * before <code>index</code> has fewer than the absolute value * of <code>codePointOffset</code> code points. * Returns the index within the given <code>char</code> subarray * that is offset from the given <code>index</code> by * <code>codePointOffset</code> code points. The * <code>start</code> and <code>count</code> arguments specify a * subarray of the <code>char</code> array. Unpaired surrogates * within the text range given by <code>index</code> and * <code>codePointOffset</code> count as one code point each. 
* @param a the <code>char</code> array * @param start the index of the first <code>char</code> of the * @param count the length of the subarray in <code>char</code>s * @param index the index to be offset * @param codePointOffset the offset in code points * @return the index within the subarray * @exception NullPointerException if <code>a</code> is null. * @exception IndexOutOfBoundsException * if <code>start</code> or <code>count</code> is negative, * or if <code>start + count</code> is larger than the length of * or if <code>index</code> is less than <code>start</code> or * larger than <code>start + count</code>, * or if <code>codePointOffset</code> is positive and the text range * starting with <code>index</code> and ending with <code>start * + count - 1</code> has fewer than <code>codePointOffset</code> code * points, * or if <code>codePointOffset</code> is negative and the text range * starting with <code>start</code> and ending with <code>index * - 1</code> has fewer than the absolute value of * <code>codePointOffset</code> code points. * Determines if the specified character is a lowercase character. * A character is lowercase if its general category type, provided * by <code>Character.getType(ch)</code>, is * <code>LOWERCASE_LETTER</code>. * The following are examples of lowercase characters: * a b c d e f g h i j k l m n o p q r s t u v w x y z * '\u00DF' '\u00E0' '\u00E1' '\u00E2' '\u00E3' '\u00E4' '\u00E5' '\u00E6' * '\u00E7' '\u00E8' '\u00E9' '\u00EA' '\u00EB' '\u00EC' '\u00ED' '\u00EE' * '\u00EF' '\u00F0' '\u00F1' '\u00F2' '\u00F3' '\u00F4' '\u00F5' '\u00F6' * '\u00F8' '\u00F9' '\u00FA' '\u00FB' '\u00FC' '\u00FD' '\u00FE' '\u00FF' * <p> Many other Unicode characters are lowercase too. * <p><b>Note:</b> This method cannot handle <a * href="#supplementary"> supplementary characters</a>. To support * all Unicode characters, including supplementary characters, use * the {@link #isLowerCase(int)} method. * @param ch the character to be tested. 
* @return <code>true</code> if the character is lowercase; * <code>false</code> otherwise. * @see java.lang.Character#isLowerCase(char) * @see java.lang.Character#isTitleCase(char) * @see java.lang.Character#toLowerCase(char) * @see java.lang.Character#getType(char) * Determines if the specified character (Unicode code point) is a * A character is lowercase if its general category type, provided * by {@link Character#getType getType(codePoint)}, is * <code>LOWERCASE_LETTER</code>. * The following are examples of lowercase characters: * a b c d e f g h i j k l m n o p q r s t u v w x y z * '\u00DF' '\u00E0' '\u00E1' '\u00E2' '\u00E3' '\u00E4' '\u00E5' '\u00E6' * '\u00E7' '\u00E8' '\u00E9' '\u00EA' '\u00EB' '\u00EC' '\u00ED' '\u00EE' * '\u00EF' '\u00F0' '\u00F1' '\u00F2' '\u00F3' '\u00F4' '\u00F5' '\u00F6' * '\u00F8' '\u00F9' '\u00FA' '\u00FB' '\u00FC' '\u00FD' '\u00FE' '\u00FF' * <p> Many other Unicode characters are lowercase too. * @param codePoint the character (Unicode code point) to be tested. * @return <code>true</code> if the character is lowercase; * <code>false</code> otherwise. * @see java.lang.Character#isLowerCase(int) * @see java.lang.Character#isTitleCase(int) * @see java.lang.Character#toLowerCase(int) * @see java.lang.Character#getType(int) * Determines if the specified character is an uppercase character. * A character is uppercase if its general category type, provided by * <code>Character.getType(ch)</code>, is <code>UPPERCASE_LETTER</code>. 
* The following are examples of uppercase characters: * A B C D E F G H I J K L M N O P Q R S T U V W X Y Z * '\u00C0' '\u00C1' '\u00C2' '\u00C3' '\u00C4' '\u00C5' '\u00C6' '\u00C7' * '\u00C8' '\u00C9' '\u00CA' '\u00CB' '\u00CC' '\u00CD' '\u00CE' '\u00CF' * '\u00D0' '\u00D1' '\u00D2' '\u00D3' '\u00D4' '\u00D5' '\u00D6' '\u00D8' * '\u00D9' '\u00DA' '\u00DB' '\u00DC' '\u00DD' '\u00DE' * <p> Many other Unicode characters are uppercase too.<p> * <p><b>Note:</b> This method cannot handle <a * href="#supplementary"> supplementary characters</a>. To support * all Unicode characters, including supplementary characters, use * the {@link #isUpperCase(int)} method. * @param ch the character to be tested. * @return <code>true</code> if the character is uppercase; * <code>false</code> otherwise. * @see java.lang.Character#isLowerCase(char) * @see java.lang.Character#isTitleCase(char) * @see java.lang.Character#toUpperCase(char) * @see java.lang.Character#getType(char) * Determines if the specified character (Unicode code point) is an uppercase character. * A character is uppercase if its general category type, provided by * {@link Character#getType(int) getType(codePoint)}, is <code>UPPERCASE_LETTER</code>. * The following are examples of uppercase characters: * A B C D E F G H I J K L M N O P Q R S T U V W X Y Z * '\u00C0' '\u00C1' '\u00C2' '\u00C3' '\u00C4' '\u00C5' '\u00C6' '\u00C7' * '\u00C8' '\u00C9' '\u00CA' '\u00CB' '\u00CC' '\u00CD' '\u00CE' '\u00CF' * '\u00D0' '\u00D1' '\u00D2' '\u00D3' '\u00D4' '\u00D5' '\u00D6' '\u00D8' * '\u00D9' '\u00DA' '\u00DB' '\u00DC' '\u00DD' '\u00DE' * <p> Many other Unicode characters are uppercase too.<p> * @param codePoint the character (Unicode code point) to be tested. * @return <code>true</code> if the character is uppercase; * <code>false</code> otherwise. 
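The difference between the `char` and `int` overloads of the case tests shows up only with supplementary characters. The sketch below counts uppercase letters by iterating over code points; `CaseCheckSketch` and `countUpperCase` are hypothetical names, and U+10400 (a Deseret capital letter) is chosen here merely as a convenient supplementary uppercase example.

```java
// Illustrative sketch only; CaseCheckSketch and countUpperCase are hypothetical names.
final class CaseCheckSketch {

    /** Counts uppercase characters, walking the text one code point at a time. */
    static int countUpperCase(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int codePoint = Character.codePointAt(text, i);
            if (Character.isUpperCase(codePoint)) { // int overload sees supplementary letters
                count++;
            }
            i += Character.charCount(codePoint);    // advance by 1 or 2 chars
        }
        return count;
    }

    public static void main(String[] args) {
        // "A", "b", then U+10400 encoded as a surrogate pair.
        String text = "Ab" + new String(Character.toChars(0x10400));
        System.out.println(countUpperCase(text));

        // Testing the surrogate halves one char at a time misses the letter:
        // neither half is an uppercase character by itself.
        System.out.println(Character.isUpperCase(text.charAt(2)));
    }
}
```

Iterating with `codePointAt` and `charCount` keeps the loop correct for both BMP and supplementary input, at the cost of slightly more bookkeeping than a plain `char` loop.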
* @see java.lang.Character#isLowerCase(int) * @see java.lang.Character#isTitleCase(int) * @see java.lang.Character#toUpperCase(int) * @see java.lang.Character#getType(int) * Determines if the specified character is a titlecase character. * A character is a titlecase character if its general * category type, provided by <code>Character.getType(ch)</code>, * is <code>TITLECASE_LETTER</code>. * Some characters look like pairs of Latin letters. For example, there * is an uppercase letter that looks like "LJ" and has a corresponding * lowercase letter that looks like "lj". A third form, which looks like "Lj", * is the appropriate form to use when rendering a word in lowercase * with initial capitals, as for a book title. * These are some of the Unicode characters for which this method returns * <li><code>LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON</code> * <li><code>LATIN CAPITAL LETTER L WITH SMALL LETTER J</code> * <li><code>LATIN CAPITAL LETTER N WITH SMALL LETTER J</code> * <li><code>LATIN CAPITAL LETTER D WITH SMALL LETTER Z</code> * <p> Many other Unicode characters are titlecase too.<p> * <p><b>Note:</b> This method cannot handle <a * href="#supplementary"> supplementary characters</a>. To support * all Unicode characters, including supplementary characters, use * the {@link #isTitleCase(int)} method. * @param ch the character to be tested. * @return <code>true</code> if the character is titlecase; * <code>false</code> otherwise. * @see java.lang.Character#isLowerCase(char) * @see java.lang.Character#isUpperCase(char) * @see java.lang.Character#toTitleCase(char) * @see java.lang.Character#getType(char) * Determines if the specified character (Unicode code point) is a titlecase character. * A character is a titlecase character if its general * category type, provided by {@link Character#getType(int) getType(codePoint)}, * is <code>TITLECASE_LETTER</code>. * Some characters look like pairs of Latin letters. 
For example, there
 * is an uppercase letter that looks like "LJ" and has a corresponding
 * lowercase letter that looks like "lj". A third form, which looks like "Lj",
 * is the appropriate form to use when rendering a word in lowercase
 * with initial capitals, as for a book title.
 * These are some of the Unicode characters for which this method returns
 * <code>true</code>:
 * <ul>
 * <li><code>LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON</code>
 * <li><code>LATIN CAPITAL LETTER L WITH SMALL LETTER J</code>
 * <li><code>LATIN CAPITAL LETTER N WITH SMALL LETTER J</code>
 * <li><code>LATIN CAPITAL LETTER D WITH SMALL LETTER Z</code>
 * </ul>
 * <p> Many other Unicode characters are titlecase too.
 * @param codePoint the character (Unicode code point) to be tested.
 * @return <code>true</code> if the character is titlecase;
 * <code>false</code> otherwise.
 * @see java.lang.Character#isLowerCase(int)
 * @see java.lang.Character#isUpperCase(int)
 * @see java.lang.Character#toTitleCase(int)
 * @see java.lang.Character#getType(int)

 * Determines if the specified character is a digit.
 * A character is a digit if its general category type, provided
 * by <code>Character.getType(ch)</code>, is
 * <code>DECIMAL_DIGIT_NUMBER</code>.
 * Some Unicode character ranges that contain digits:
 * <ul>
 * <li><code>'\u0030'</code> through <code>'\u0039'</code>,
 *     ISO-LATIN-1 digits (<code>'0'</code> through <code>'9'</code>)
 * <li><code>'\u0660'</code> through <code>'\u0669'</code>,
 *     Arabic-Indic digits
 * <li><code>'\u06F0'</code> through <code>'\u06F9'</code>,
 *     Extended Arabic-Indic digits
 * <li><code>'\u0966'</code> through <code>'\u096F'</code>,
 *     Devanagari digits
 * <li><code>'\uFF10'</code> through <code>'\uFF19'</code>,
 *     Fullwidth digits
 * </ul>
 * Many other character ranges contain digits as well.
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isDigit(int)} method.
 * @param ch the character to be tested.
 * @return <code>true</code> if the character is a digit;
 * <code>false</code> otherwise.
 * @see java.lang.Character#digit(char, int)
 * @see java.lang.Character#forDigit(int, int)
 * @see java.lang.Character#getType(char)

 * Determines if the specified character (Unicode code point) is a digit.
 * A character is a digit if its general category type, provided
 * by {@link Character#getType(int) getType(codePoint)}, is
 * <code>DECIMAL_DIGIT_NUMBER</code>.
 * Some Unicode character ranges that contain digits:
 * <ul>
 * <li><code>'\u0030'</code> through <code>'\u0039'</code>,
 *     ISO-LATIN-1 digits (<code>'0'</code> through <code>'9'</code>)
 * <li><code>'\u0660'</code> through <code>'\u0669'</code>,
 *     Arabic-Indic digits
 * <li><code>'\u06F0'</code> through <code>'\u06F9'</code>,
 *     Extended Arabic-Indic digits
 * <li><code>'\u0966'</code> through <code>'\u096F'</code>,
 *     Devanagari digits
 * <li><code>'\uFF10'</code> through <code>'\uFF19'</code>,
 *     Fullwidth digits
 * </ul>
 * Many other character ranges contain digits as well.
 * @param codePoint the character (Unicode code point) to be tested.
 * @return <code>true</code> if the character is a digit;
 * <code>false</code> otherwise.
 * @see java.lang.Character#forDigit(int, int)
 * @see java.lang.Character#getType(int)

 * Determines if a character is defined in Unicode.
 * A character is defined if at least one of the following is true:
 * <ul>
 * <li>It has an entry in the UnicodeData file.
 * <li>It has a value in a range defined by the UnicodeData file.
 * </ul>
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isDefined(int)} method.
 * @param ch the character to be tested
 * @return <code>true</code> if the character has a defined meaning
 * in Unicode; <code>false</code> otherwise.
* @see java.lang.Character#isDigit(char) * @see java.lang.Character#isLetter(char) * @see java.lang.Character#isLetterOrDigit(char) * @see java.lang.Character#isLowerCase(char) * @see java.lang.Character#isTitleCase(char) * @see java.lang.Character#isUpperCase(char) * Determines if a character (Unicode code point) is defined in Unicode. * A character is defined if at least one of the following is true: * <li>It has an entry in the UnicodeData file. * <li>It has a value in a range defined by the UnicodeData file. * @param codePoint the character (Unicode code point) to be tested. * @return <code>true</code> if the character has a defined meaning * in Unicode; <code>false</code> otherwise. * @see java.lang.Character#isDigit(int) * @see java.lang.Character#isLetter(int) * @see java.lang.Character#isLetterOrDigit(int) * @see java.lang.Character#isLowerCase(int) * @see java.lang.Character#isTitleCase(int) * @see java.lang.Character#isUpperCase(int) * Determines if the specified character is a letter. * A character is considered to be a letter if its general * category type, provided by <code>Character.getType(ch)</code>, * is any of the following: * <li> <code>UPPERCASE_LETTER</code> * <li> <code>LOWERCASE_LETTER</code> * <li> <code>TITLECASE_LETTER</code> * <li> <code>MODIFIER_LETTER</code> * <li> <code>OTHER_LETTER</code> * Not all letters have case. Many characters are * letters but are neither uppercase nor lowercase nor titlecase. * <p><b>Note:</b> This method cannot handle <a * href="#supplementary"> supplementary characters</a>. To support * all Unicode characters, including supplementary characters, use * the {@link #isLetter(int)} method. * @param ch the character to be tested. * @return <code>true</code> if the character is a letter; * <code>false</code> otherwise. 
* @see java.lang.Character#isDigit(char) * @see java.lang.Character#isJavaIdentifierStart(char) * @see java.lang.Character#isJavaLetter(char) * @see java.lang.Character#isJavaLetterOrDigit(char) * @see java.lang.Character#isLetterOrDigit(char) * @see java.lang.Character#isLowerCase(char) * @see java.lang.Character#isTitleCase(char) * @see java.lang.Character#isUnicodeIdentifierStart(char) * @see java.lang.Character#isUpperCase(char) * Determines if the specified character (Unicode code point) is a letter. * A character is considered to be a letter if its general * category type, provided by {@link Character#getType(int) getType(codePoint)}, * is any of the following: * <li> <code>UPPERCASE_LETTER</code> * <li> <code>LOWERCASE_LETTER</code> * <li> <code>TITLECASE_LETTER</code> * <li> <code>MODIFIER_LETTER</code> * <li> <code>OTHER_LETTER</code> * Not all letters have case. Many characters are * letters but are neither uppercase nor lowercase nor titlecase. * @param codePoint the character (Unicode code point) to be tested. * @return <code>true</code> if the character is a letter; * <code>false</code> otherwise. * @see java.lang.Character#isDigit(int) * @see java.lang.Character#isJavaIdentifierStart(int) * @see java.lang.Character#isLetterOrDigit(int) * @see java.lang.Character#isLowerCase(int) * @see java.lang.Character#isTitleCase(int) * @see java.lang.Character#isUnicodeIdentifierStart(int) * @see java.lang.Character#isUpperCase(int) * Determines if the specified character is a letter or digit. * A character is considered to be a letter or digit if either * <code>Character.isLetter(char ch)</code> or * <code>Character.isDigit(char ch)</code> returns * <code>true</code> for the character. * <p><b>Note:</b> This method cannot handle <a * href="#supplementary"> supplementary characters</a>. To support * all Unicode characters, including supplementary characters, use * the {@link #isLetterOrDigit(int)} method. * @param ch the character to be tested. 
* @return <code>true</code> if the character is a letter or digit; * <code>false</code> otherwise. * @see java.lang.Character#isDigit(char) * @see java.lang.Character#isJavaIdentifierPart(char) * @see java.lang.Character#isJavaLetter(char) * @see java.lang.Character#isJavaLetterOrDigit(char) * @see java.lang.Character#isLetter(char) * @see java.lang.Character#isUnicodeIdentifierPart(char) * Determines if the specified character (Unicode code point) is a letter or digit. * A character is considered to be a letter or digit if either * {@link #isLetter(int) isLetter(codePoint)} or * {@link #isDigit(int) isDigit(codePoint)} returns * <code>true</code> for the character. * @param codePoint the character (Unicode code point) to be tested. * @return <code>true</code> if the character is a letter or digit; * <code>false</code> otherwise. * @see java.lang.Character#isDigit(int) * @see java.lang.Character#isJavaIdentifierPart(int) * @see java.lang.Character#isLetter(int) * @see java.lang.Character#isUnicodeIdentifierPart(int) * Determines if the specified character is permissible as the first * character in a Java identifier. * A character may start a Java identifier if and only if * one of the following is true: * <li> {@link #isLetter(char) isLetter(ch)} returns <code>true</code> * <li> {@link #getType(char) getType(ch)} returns <code>LETTER_NUMBER</code> * <li> ch is a currency symbol (such as "$") * <li> ch is a connecting punctuation character (such as "_"). * @param ch the character to be tested. * @return <code>true</code> if the character may start a Java * identifier; <code>false</code> otherwise. * @see java.lang.Character#isJavaLetterOrDigit(char) * @see java.lang.Character#isJavaIdentifierStart(char) * @see java.lang.Character#isJavaIdentifierPart(char) * @see java.lang.Character#isLetter(char) * @see java.lang.Character#isLetterOrDigit(char) * @see java.lang.Character#isUnicodeIdentifierStart(char) * @deprecated Replaced by isJavaIdentifierStart(char). 
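The start-character conditions listed above combine with the part-character rule to check a whole string. The following is a minimal sketch, not part of <code>java.lang.Character</code>: the class name <code>IdentifierCheck</code> and the helper <code>looksLikeIdentifier</code> are hypothetical, and the check ignores reserved words, which a real identifier test would also have to reject. It walks the string by code point so that supplementary characters go through the <code>int</code> overloads.

```java
public class IdentifierCheck {

    // Hypothetical helper (an assumption for illustration, not a JDK method).
    // Returns true if every code point in s satisfies the start/part rules.
    // Reserved words such as "class" are NOT rejected by this sketch.
    public static boolean looksLikeIdentifier(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Advance by one or two chars depending on whether the code point
        // is in the BMP or is a supplementary character.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("_count1")); // true
        System.out.println(looksLikeIdentifier("$x"));      // true
        System.out.println(looksLikeIdentifier("1abc"));    // false: digit cannot start
        System.out.println(looksLikeIdentifier("a-b"));     // false: '-' is not a part character
    }
}
```

The <code>SourceVersion#isIdentifier(CharSequence)</code> method referenced in the <code>@see</code> tags of the non-deprecated methods is the library route for a complete check.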
 * Determines if the specified character may be part of a Java
 * identifier as other than the first character.
 * A character may be part of a Java identifier if and only if any
 * of the following are true:
 * <ul>
 * <li> it is a letter
 * <li> it is a currency symbol (such as <code>'$'</code>)
 * <li> it is a connecting punctuation character (such as <code>'_'</code>)
 * <li> it is a digit
 * <li> it is a numeric letter (such as a Roman numeral character)
 * <li> it is a combining mark
 * <li> it is a non-spacing mark
 * <li> <code>isIdentifierIgnorable</code> returns
 * <code>true</code> for the character.
 * </ul>
 * @param ch the character to be tested.
 * @return <code>true</code> if the character may be part of a
 * Java identifier; <code>false</code> otherwise.
 * @see java.lang.Character#isJavaLetter(char)
 * @see java.lang.Character#isJavaIdentifierStart(char)
 * @see java.lang.Character#isJavaIdentifierPart(char)
 * @see java.lang.Character#isLetter(char)
 * @see java.lang.Character#isLetterOrDigit(char)
 * @see java.lang.Character#isUnicodeIdentifierPart(char)
 * @see java.lang.Character#isIdentifierIgnorable(char)
 * @deprecated Replaced by isJavaIdentifierPart(char).

 * Determines if the specified character is
 * permissible as the first character in a Java identifier.
 * A character may start a Java identifier if and only if
 * one of the following conditions is true:
 * <ul>
 * <li> {@link #isLetter(char) isLetter(ch)} returns <code>true</code>
 * <li> {@link #getType(char) getType(ch)} returns <code>LETTER_NUMBER</code>
 * <li> ch is a currency symbol (such as "$")
 * <li> ch is a connecting punctuation character (such as "_").
 * </ul>
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isJavaIdentifierStart(int)} method.
 * @param ch the character to be tested.
 * @return <code>true</code> if the character may start a Java identifier;
 * <code>false</code> otherwise.
 * @see java.lang.Character#isJavaIdentifierPart(char)
 * @see java.lang.Character#isLetter(char)
 * @see java.lang.Character#isUnicodeIdentifierStart(char)
 * @see javax.lang.model.SourceVersion#isIdentifier(CharSequence)

 * Determines if the character (Unicode code point) is
 * permissible as the first character in a Java identifier.
 * A character may start a Java identifier if and only if
 * one of the following conditions is true:
 * <ul>
 * <li> {@link #isLetter(int) isLetter(codePoint)}
 *      returns <code>true</code>
 * <li> {@link #getType(int) getType(codePoint)}
 *      returns <code>LETTER_NUMBER</code>
 * <li> the referenced character is a currency symbol (such as "$")
 * <li> the referenced character is a connecting punctuation character
 *      (such as "_").
 * </ul>
 * @param codePoint the character (Unicode code point) to be tested.
 * @return <code>true</code> if the character may start a Java identifier;
 * <code>false</code> otherwise.
 * @see java.lang.Character#isJavaIdentifierPart(int)
 * @see java.lang.Character#isLetter(int)
 * @see java.lang.Character#isUnicodeIdentifierStart(int)
 * @see javax.lang.model.SourceVersion#isIdentifier(CharSequence)

 * Determines if the specified character may be part of a Java
 * identifier as other than the first character.
 * A character may be part of a Java identifier if any of the following
 * are true:
 * <ul>
 * <li> it is a letter
 * <li> it is a currency symbol (such as <code>'$'</code>)
 * <li> it is a connecting punctuation character (such as <code>'_'</code>)
 * <li> it is a digit
 * <li> it is a numeric letter (such as a Roman numeral character)
 * <li> it is a combining mark
 * <li> it is a non-spacing mark
 * <li> <code>isIdentifierIgnorable</code> returns
 * <code>true</code> for the character
 * </ul>
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isJavaIdentifierPart(int)} method.
 * @param ch the character to be tested.
 * @return <code>true</code> if the character may be part of a
 * Java identifier; <code>false</code> otherwise.
 * @see java.lang.Character#isIdentifierIgnorable(char)
 * @see java.lang.Character#isJavaIdentifierStart(char)
 * @see java.lang.Character#isLetterOrDigit(char)
 * @see java.lang.Character#isUnicodeIdentifierPart(char)
 * @see javax.lang.model.SourceVersion#isIdentifier(CharSequence)

 * Determines if the character (Unicode code point) may be part of a Java
 * identifier as other than the first character.
 * A character may be part of a Java identifier if any of the following
 * are true:
 * <ul>
 * <li> it is a letter
 * <li> it is a currency symbol (such as <code>'$'</code>)
 * <li> it is a connecting punctuation character (such as <code>'_'</code>)
 * <li> it is a digit
 * <li> it is a numeric letter (such as a Roman numeral character)
 * <li> it is a combining mark
 * <li> it is a non-spacing mark
 * <li> {@link #isIdentifierIgnorable(int)
 * isIdentifierIgnorable(codePoint)} returns <code>true</code> for
 * the character
 * </ul>
 * @param codePoint the character (Unicode code point) to be tested.
 * @return <code>true</code> if the character may be part of a
 * Java identifier; <code>false</code> otherwise.
 * @see java.lang.Character#isIdentifierIgnorable(int)
 * @see java.lang.Character#isJavaIdentifierStart(int)
 * @see java.lang.Character#isLetterOrDigit(int)
 * @see java.lang.Character#isUnicodeIdentifierPart(int)
 * @see javax.lang.model.SourceVersion#isIdentifier(CharSequence)

 * Determines if the specified character is permissible as the
 * first character in a Unicode identifier.
 * A character may start a Unicode identifier if and only if
 * one of the following conditions is true:
 * <ul>
 * <li> {@link #isLetter(char) isLetter(ch)} returns <code>true</code>
 * <li> {@link #getType(char) getType(ch)} returns
 *      <code>LETTER_NUMBER</code>.
 * </ul>
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isUnicodeIdentifierStart(int)} method.
 * @param ch the character to be tested.
 * @return <code>true</code> if the character may start a Unicode
 * identifier; <code>false</code> otherwise.
 * @see java.lang.Character#isJavaIdentifierStart(char)
 * @see java.lang.Character#isLetter(char)
 * @see java.lang.Character#isUnicodeIdentifierPart(char)

 * Determines if the specified character (Unicode code point) is permissible as the
 * first character in a Unicode identifier.
 * A character may start a Unicode identifier if and only if
 * one of the following conditions is true:
 * <ul>
 * <li> {@link #isLetter(int) isLetter(codePoint)}
 *      returns <code>true</code>
 * <li> {@link #getType(int) getType(codePoint)}
 *      returns <code>LETTER_NUMBER</code>.
 * </ul>
 * @param codePoint the character (Unicode code point) to be tested.
 * @return <code>true</code> if the character may start a Unicode
 * identifier; <code>false</code> otherwise.
 * @see java.lang.Character#isJavaIdentifierStart(int)
 * @see java.lang.Character#isLetter(int)
 * @see java.lang.Character#isUnicodeIdentifierPart(int)

 * Determines if the specified character may be part of a Unicode
 * identifier as other than the first character.
 * A character may be part of a Unicode identifier if and only if
 * one of the following statements is true:
 * <ul>
 * <li> it is a letter
 * <li> it is a connecting punctuation character (such as <code>'_'</code>)
 * <li> it is a digit
 * <li> it is a numeric letter (such as a Roman numeral character)
 * <li> it is a combining mark
 * <li> it is a non-spacing mark
 * <li> <code>isIdentifierIgnorable</code> returns
 * <code>true</code> for this character.
 * </ul>
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isUnicodeIdentifierPart(int)} method.
 * @param ch the character to be tested.
 * @return <code>true</code> if the character may be part of a
 * Unicode identifier; <code>false</code> otherwise.
 * @see java.lang.Character#isIdentifierIgnorable(char)
 * @see java.lang.Character#isJavaIdentifierPart(char)
 * @see java.lang.Character#isLetterOrDigit(char)
 * @see java.lang.Character#isUnicodeIdentifierStart(char)

 * Determines if the specified character (Unicode code point) may be part of a Unicode
 * identifier as other than the first character.
 * A character may be part of a Unicode identifier if and only if
 * one of the following statements is true:
 * <ul>
 * <li> it is a letter
 * <li> it is a connecting punctuation character (such as <code>'_'</code>)
 * <li> it is a digit
 * <li> it is a numeric letter (such as a Roman numeral character)
 * <li> it is a combining mark
 * <li> it is a non-spacing mark
 * <li> <code>isIdentifierIgnorable</code> returns
 * <code>true</code> for this character.
 * </ul>
 * @param codePoint the character (Unicode code point) to be tested.
 * @return <code>true</code> if the character may be part of a
 * Unicode identifier; <code>false</code> otherwise.
 * @see java.lang.Character#isIdentifierIgnorable(int)
 * @see java.lang.Character#isJavaIdentifierPart(int)
 * @see java.lang.Character#isLetterOrDigit(int)
 * @see java.lang.Character#isUnicodeIdentifierStart(int)

 * Determines if the specified character should be regarded as
 * an ignorable character in a Java identifier or a Unicode identifier.
 * The following Unicode characters are ignorable in a Java identifier
 * or a Unicode identifier:
 * <ul>
 * <li>ISO control characters that are not whitespace
 * <ul>
 * <li><code>'\u0000'</code> through <code>'\u0008'</code>
 * <li><code>'\u000E'</code> through <code>'\u001B'</code>
 * <li><code>'\u007F'</code> through <code>'\u009F'</code>
 * </ul>
 * <li>all characters that have the <code>FORMAT</code> general
 * category value
 * </ul>
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isIdentifierIgnorable(int)} method.
 * @param ch the character to be tested.
 * @return <code>true</code> if the character is an ignorable control
 * character that may be part of a Java or Unicode identifier;
 * <code>false</code> otherwise.
 * @see java.lang.Character#isJavaIdentifierPart(char)
 * @see java.lang.Character#isUnicodeIdentifierPart(char)

 * Determines if the specified character (Unicode code point) should be regarded as
 * an ignorable character in a Java identifier or a Unicode identifier.
 * The following Unicode characters are ignorable in a Java identifier
 * or a Unicode identifier:
 * <ul>
 * <li>ISO control characters that are not whitespace
 * <ul>
 * <li><code>'\u0000'</code> through <code>'\u0008'</code>
 * <li><code>'\u000E'</code> through <code>'\u001B'</code>
 * <li><code>'\u007F'</code> through <code>'\u009F'</code>
 * </ul>
 * <li>all characters that have the <code>FORMAT</code> general
 * category value
 * </ul>
 * @param codePoint the character (Unicode code point) to be tested.
 * @return <code>true</code> if the character is an ignorable control
 * character that may be part of a Java or Unicode identifier;
 * <code>false</code> otherwise.
 * @see java.lang.Character#isJavaIdentifierPart(int)
 * @see java.lang.Character#isUnicodeIdentifierPart(int)

 * Converts the character argument to lowercase using case
 * mapping information from the UnicodeData file.
 * Note that
 * <code>Character.isLowerCase(Character.toLowerCase(ch))</code>
 * does not always return <code>true</code> for some ranges of
 * characters, particularly those that are symbols or ideographs.
 * <p>In general, {@link java.lang.String#toLowerCase()} should be used to map
 * characters to lowercase. <code>String</code> case mapping methods
 * have several benefits over <code>Character</code> case mapping methods.
 * <code>String</code> case mapping methods can perform locale-sensitive
 * mappings, context-sensitive mappings, and 1:M character mappings, whereas
 * the <code>Character</code> case mapping methods cannot.
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>.
To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #toLowerCase(int)} method.
 * @param ch the character to be converted.
 * @return the lowercase equivalent of the character, if any;
 * otherwise, the character itself.
 * @see java.lang.Character#isLowerCase(char)
 * @see java.lang.String#toLowerCase()

 * Converts the character (Unicode code point) argument to
 * lowercase using case mapping information from the UnicodeData
 * file.
 * Note that
 * <code>Character.isLowerCase(Character.toLowerCase(codePoint))</code>
 * does not always return <code>true</code> for some ranges of
 * characters, particularly those that are symbols or ideographs.
 * <p>In general, {@link java.lang.String#toLowerCase()} should be used to map
 * characters to lowercase. <code>String</code> case mapping methods
 * have several benefits over <code>Character</code> case mapping methods.
 * <code>String</code> case mapping methods can perform locale-sensitive
 * mappings, context-sensitive mappings, and 1:M character mappings, whereas
 * the <code>Character</code> case mapping methods cannot.
 * @param codePoint the character (Unicode code point) to be converted.
 * @return the lowercase equivalent of the character (Unicode code
 * point), if any; otherwise, the character itself.
 * @see java.lang.Character#isLowerCase(int)
 * @see java.lang.String#toLowerCase()

 * Converts the character argument to uppercase using case mapping
 * information from the UnicodeData file.
 * Note that
 * <code>Character.isUpperCase(Character.toUpperCase(ch))</code>
 * does not always return <code>true</code> for some ranges of
 * characters, particularly those that are symbols or ideographs.
 * <p>In general, {@link java.lang.String#toUpperCase()} should be used to map
 * characters to uppercase. <code>String</code> case mapping methods
 * have several benefits over <code>Character</code> case mapping methods.
 * <code>String</code> case mapping methods can perform locale-sensitive
 * mappings, context-sensitive mappings, and 1:M character mappings, whereas
 * the <code>Character</code> case mapping methods cannot.
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #toUpperCase(int)} method.
 * @param ch the character to be converted.
 * @return the uppercase equivalent of the character, if any;
 * otherwise, the character itself.
 * @see java.lang.Character#isUpperCase(char)
 * @see java.lang.String#toUpperCase()

 * Converts the character (Unicode code point) argument to
 * uppercase using case mapping information from the UnicodeData
 * file.
 * Note that
 * <code>Character.isUpperCase(Character.toUpperCase(codePoint))</code>
 * does not always return <code>true</code> for some ranges of
 * characters, particularly those that are symbols or ideographs.
 * <p>In general, {@link java.lang.String#toUpperCase()} should be used to map
 * characters to uppercase. <code>String</code> case mapping methods
 * have several benefits over <code>Character</code> case mapping methods.
 * <code>String</code> case mapping methods can perform locale-sensitive
 * mappings, context-sensitive mappings, and 1:M character mappings, whereas
 * the <code>Character</code> case mapping methods cannot.
 * @param codePoint the character (Unicode code point) to be converted.
 * @return the uppercase equivalent of the character, if any;
 * otherwise, the character itself.
 * @see java.lang.Character#isUpperCase(int)
 * @see java.lang.String#toUpperCase()

 * Converts the character argument to titlecase using case mapping
 * information from the UnicodeData file. If a character has no
 * explicit titlecase mapping and is not itself a titlecase char
 * according to UnicodeData, then the uppercase mapping is
 * returned as an equivalent titlecase mapping.
If the
 * <code>char</code> argument is already a titlecase
 * <code>char</code>, the same <code>char</code> value will be
 * returned.
 * Note that
 * <code>Character.isTitleCase(Character.toTitleCase(ch))</code>
 * does not always return <code>true</code> for some ranges of
 * characters.
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #toTitleCase(int)} method.
 * @param ch the character to be converted.
 * @return the titlecase equivalent of the character, if any;
 * otherwise, the character itself.
 * @see java.lang.Character#isTitleCase(char)
 * @see java.lang.Character#toLowerCase(char)
 * @see java.lang.Character#toUpperCase(char)

 * Converts the character (Unicode code point) argument to titlecase using case mapping
 * information from the UnicodeData file. If a character has no
 * explicit titlecase mapping and is not itself a titlecase char
 * according to UnicodeData, then the uppercase mapping is
 * returned as an equivalent titlecase mapping. If the
 * character argument is already a titlecase
 * character, the same character value will be
 * returned.
 * Note that
 * <code>Character.isTitleCase(Character.toTitleCase(codePoint))</code>
 * does not always return <code>true</code> for some ranges of
 * characters.
 * @param codePoint the character (Unicode code point) to be converted.
 * @return the titlecase equivalent of the character, if any;
 * otherwise, the character itself.
 * @see java.lang.Character#isTitleCase(int)
 * @see java.lang.Character#toLowerCase(int)
 * @see java.lang.Character#toUpperCase(int)

 * Returns the numeric value of the character <code>ch</code> in the
 * specified radix.
 * If the radix is not in the range <code>MIN_RADIX</code> &lt;=
 * <code>radix</code> &lt;= <code>MAX_RADIX</code> or if the
 * value of <code>ch</code> is not a valid digit in the specified
 * radix, <code>-1</code> is returned.
A character is a valid digit
 * if at least one of the following is true:
 * <ul>
 * <li>The method <code>isDigit</code> is <code>true</code> of the character
 *     and the Unicode decimal digit value of the character (or its
 *     single-character decomposition) is less than the specified radix.
 *     In this case the decimal digit value is returned.
 * <li>The character is one of the uppercase Latin letters
 *     <code>'A'</code> through <code>'Z'</code> and its code is less than
 *     <code>radix + 'A' - 10</code>.
 *     In this case, <code>ch - 'A' + 10</code>
 *     is returned.
 * <li>The character is one of the lowercase Latin letters
 *     <code>'a'</code> through <code>'z'</code> and its code is less than
 *     <code>radix + 'a' - 10</code>.
 *     In this case, <code>ch - 'a' + 10</code>
 *     is returned.
 * </ul>
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #digit(int, int)} method.
 * @param ch the character to be converted.
 * @param radix the radix.
 * @return the numeric value represented by the character in the
 * specified radix.
 * @see java.lang.Character#forDigit(int, int)
 * @see java.lang.Character#isDigit(char)

 * Returns the numeric value of the specified character (Unicode
 * code point) in the specified radix.
 * <p>If the radix is not in the range <code>MIN_RADIX</code> &lt;=
 * <code>radix</code> &lt;= <code>MAX_RADIX</code> or if the
 * character is not a valid digit in the specified
 * radix, <code>-1</code> is returned. A character is a valid digit
 * if at least one of the following is true:
 * <ul>
 * <li>The method {@link #isDigit(int) isDigit(codePoint)} is <code>true</code> of the character
 *     and the Unicode decimal digit value of the character (or its
 *     single-character decomposition) is less than the specified radix.
 *     In this case the decimal digit value is returned.
 * <li>The character is one of the uppercase Latin letters
 *     <code>'A'</code> through <code>'Z'</code> and its code is less than
 *     <code>radix + 'A' - 10</code>.
 *     In this case, <code>codePoint - 'A' + 10</code>
 *     is returned.
 * <li>The character is one of the lowercase Latin letters
 *     <code>'a'</code> through <code>'z'</code> and its code is less than
 *     <code>radix + 'a' - 10</code>.
 *     In this case, <code>codePoint - 'a' + 10</code>
 *     is returned.
 * </ul>
 * @param codePoint the character (Unicode code point) to be converted.
 * @param radix the radix.
 * @return the numeric value represented by the character in the
 * specified radix.
 * @see java.lang.Character#forDigit(int, int)
 * @see java.lang.Character#isDigit(int)

 * Returns the <code>int</code> value that the specified Unicode
 * character represents. For example, the character
 * <code>'\u216C'</code> (the Roman numeral fifty) will return
 * an <code>int</code> with a value of 50.
 * The letters A-Z in their uppercase (<code>'\u0041'</code> through
 * <code>'\u005A'</code>), lowercase
 * (<code>'\u0061'</code> through <code>'\u007A'</code>), and
 * full width variant (<code>'\uFF21'</code> through
 * <code>'\uFF3A'</code> and <code>'\uFF41'</code> through
 * <code>'\uFF5A'</code>) forms have numeric values from 10
 * through 35. This is independent of the Unicode specification,
 * which does not assign numeric values to these <code>char</code>
 * values.
 * If the character does not have a numeric value, then -1 is returned.
 * If the character has a numeric value that cannot be represented as a
 * nonnegative integer (for example, a fractional value), then -2
 * is returned.
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #getNumericValue(int)} method.
 * @param ch the character to be converted.
 * @return the numeric value of the character, as a nonnegative <code>int</code>
 * value; -2 if the character has a numeric value that is not a
 * nonnegative integer; -1 if the character has no numeric value.
 * @see java.lang.Character#forDigit(int, int)
 * @see java.lang.Character#isDigit(char)

 * Returns the <code>int</code> value that the specified
 * character (Unicode code point) represents. For example, the character
 * <code>'\u216C'</code> (the Roman numeral fifty) will return
 * an <code>int</code> with a value of 50.
 * The letters A-Z in their uppercase (<code>'\u0041'</code> through
 * <code>'\u005A'</code>), lowercase
 * (<code>'\u0061'</code> through <code>'\u007A'</code>), and
 * full width variant (<code>'\uFF21'</code> through
 * <code>'\uFF3A'</code> and <code>'\uFF41'</code> through
 * <code>'\uFF5A'</code>) forms have numeric values from 10
 * through 35. This is independent of the Unicode specification,
 * which does not assign numeric values to these <code>char</code>
 * values.
 * If the character does not have a numeric value, then -1 is returned.
 * If the character has a numeric value that cannot be represented as a
 * nonnegative integer (for example, a fractional value), then -2
 * is returned.
 * @param codePoint the character (Unicode code point) to be converted.
 * @return the numeric value of the character, as a nonnegative <code>int</code>
 * value; -2 if the character has a numeric value that is not a
 * nonnegative integer; -1 if the character has no numeric value.
 * @see java.lang.Character#forDigit(int, int)
 * @see java.lang.Character#isDigit(int)

 * Determines if the specified character is ISO-LATIN-1 white space.
 * This method returns <code>true</code> for the following five
 * characters only:
 * <table>
 * <tr><td><code>'\t'</code></td> <td><code>'\u0009'</code></td>
 *     <td><code>HORIZONTAL TABULATION</code></td></tr>
 * <tr><td><code>'\n'</code></td> <td><code>'\u000A'</code></td>
 *     <td><code>NEW LINE</code></td></tr>
 * <tr><td><code>'\f'</code></td> <td><code>'\u000C'</code></td>
 *     <td><code>FORM FEED</code></td></tr>
 * <tr><td><code>'\r'</code></td> <td><code>'\u000D'</code></td>
 *     <td><code>CARRIAGE RETURN</code></td></tr>
 * <tr><td><code>' '</code></td> <td><code>'\u0020'</code></td>
 *     <td><code>SPACE</code></td></tr>
 * </table>
 *
 * @param      ch   the character to be tested.
 * @return     <code>true</code> if the character is ISO-LATIN-1 white
 *             space; <code>false</code> otherwise.
 * @see        java.lang.Character#isSpaceChar(char)
 * @see        java.lang.Character#isWhitespace(char)
 * @deprecated Replaced by isWhitespace(char).

            (1L << 0x0020)) >> ch) & 1L) != 0);
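// A minimal sketch of the bit-mask membership test that the fragment above
// appears to use, assuming one mask bit for each of the five codes listed in
// the table. The names Latin1SpaceMaskSketch, SPACE_MASK and isLatin1Space
// are hypothetical and are not part of java.lang.Character.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class Latin1SpaceMaskSketch {

    // One bit per white-space code named in the documentation table.
    private static final long SPACE_MASK =
          (1L << 0x0009)   // HORIZONTAL TABULATION
        | (1L << 0x000A)   // NEW LINE
        | (1L << 0x000C)   // FORM FEED
        | (1L << 0x000D)   // CARRIAGE RETURN
        | (1L << 0x0020);  // SPACE

    static boolean isLatin1Space(char ch) {
        // Range check first: a long shift only uses the low six bits of the
        // shift distance, so larger values of ch would otherwise alias.
        return ch <= 0x0020 && ((SPACE_MASK >> ch) & 1L) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isLatin1Space('\t'));          // true
        System.out.println(isLatin1Space((char) 0x000B)); // false (vertical tab)
    }
}
```

// The range check matters: without it, a value such as 0x0060 would be
// shifted by 0x0060 mod 64 = 0x0020 and wrongly match the SPACE bit.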
 * Determines if the specified character is a Unicode space character.
 * A character is considered to be a space character if and only if
 * it is specified to be a space character by the Unicode standard. This
 * method returns true if the character's general category type is any of
 * the following:
 * <ul>
 * <li> <code>SPACE_SEPARATOR</code>
 * <li> <code>LINE_SEPARATOR</code>
 * <li> <code>PARAGRAPH_SEPARATOR</code>
 * </ul>
 *
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isSpaceChar(int)} method.
 *
 * @param   ch      the character to be tested.
 * @return  <code>true</code> if the character is a space character;
 *          <code>false</code> otherwise.
 * @see     java.lang.Character#isWhitespace(char)

 * Determines if the specified character (Unicode code point) is a
 * Unicode space character. A character is considered to be a
 * space character if and only if it is specified to be a space
 * character by the Unicode standard. This method returns true if
 * the character's general category type is any of the following:
 * <ul>
 * <li> {@link #SPACE_SEPARATOR}
 * <li> {@link #LINE_SEPARATOR}
 * <li> {@link #PARAGRAPH_SEPARATOR}
 * </ul>
 *
 * @param   codePoint the character (Unicode code point) to be tested.
 * @return  <code>true</code> if the character is a space character;
 *          <code>false</code> otherwise.
 * @see     java.lang.Character#isWhitespace(int)

 * Determines if the specified character is white space according to Java.
 * A character is a Java whitespace character if and only if it satisfies
 * one of the following criteria:
 * <ul>
 * <li> It is a Unicode space character (<code>SPACE_SEPARATOR</code>,
 *      <code>LINE_SEPARATOR</code>, or <code>PARAGRAPH_SEPARATOR</code>)
 *      but is not also a non-breaking space (<code>'\u00A0'</code>,
 *      <code>'\u2007'</code>, <code>'\u202F'</code>).
 * <li> It is <code>'\u0009'</code>, HORIZONTAL TABULATION.
 * <li> It is <code>'\u000A'</code>, LINE FEED.
 * <li> It is <code>'\u000B'</code>, VERTICAL TABULATION.
 * <li> It is <code>'\u000C'</code>, FORM FEED.
 * <li> It is <code>'\u000D'</code>, CARRIAGE RETURN.
 * <li> It is <code>'\u001C'</code>, FILE SEPARATOR.
 * <li> It is <code>'\u001D'</code>, GROUP SEPARATOR.
 * <li> It is <code>'\u001E'</code>, RECORD SEPARATOR.
 * <li> It is <code>'\u001F'</code>, UNIT SEPARATOR.
 * </ul>
 *
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isWhitespace(int)} method.
 *
 * @param   ch the character to be tested.
 * @return  <code>true</code> if the character is a Java whitespace
 *          character; <code>false</code> otherwise.
 * @see     java.lang.Character#isSpaceChar(char)

 * Determines if the specified character (Unicode code point) is
 * white space according to Java. A character is a Java
 * whitespace character if and only if it satisfies one of the
 * following criteria:
 * <ul>
 * <li> It is a Unicode space character ({@link #SPACE_SEPARATOR},
 *      {@link #LINE_SEPARATOR}, or {@link #PARAGRAPH_SEPARATOR})
 *      but is not also a non-breaking space (<code>'\u00A0'</code>,
 *      <code>'\u2007'</code>, <code>'\u202F'</code>).
 * <li> It is <code>'\u0009'</code>, HORIZONTAL TABULATION.
 * <li> It is <code>'\u000A'</code>, LINE FEED.
 * <li> It is <code>'\u000B'</code>, VERTICAL TABULATION.
 * <li> It is <code>'\u000C'</code>, FORM FEED.
 * <li> It is <code>'\u000D'</code>, CARRIAGE RETURN.
 * <li> It is <code>'\u001C'</code>, FILE SEPARATOR.
 * <li> It is <code>'\u001D'</code>, GROUP SEPARATOR.
 * <li> It is <code>'\u001E'</code>, RECORD SEPARATOR.
 * <li> It is <code>'\u001F'</code>, UNIT SEPARATOR.
 * </ul>
 *
 * @param   codePoint the character (Unicode code point) to be tested.
 * @return  <code>true</code> if the character is a Java whitespace
 *          character; <code>false</code> otherwise.
 * @see     java.lang.Character#isSpaceChar(int)

 * Determines if the specified character is an ISO control
 * character. A character is considered to be an ISO control
 * character if its code is in the range <code>'\u0000'</code>
 * through <code>'\u001F'</code> or in the range
 * <code>'\u007F'</code> through <code>'\u009F'</code>.
 *
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isISOControl(int)} method.
 *
 * @param   ch      the character to be tested.
 * @return  <code>true</code> if the character is an ISO control character;
 *          <code>false</code> otherwise.
 * @see     java.lang.Character#isSpaceChar(char)
 * @see     java.lang.Character#isWhitespace(char)

 * Determines if the referenced character (Unicode code point) is an ISO control
 * character. A character is considered to be an ISO control
 * character if its code is in the range <code>'\u0000'</code>
 * through <code>'\u001F'</code> or in the range
 * <code>'\u007F'</code> through <code>'\u009F'</code>.
 *
 * @param   codePoint the character (Unicode code point) to be tested.
 * @return  <code>true</code> if the character is an ISO control character;
 *          <code>false</code> otherwise.
 * @see     java.lang.Character#isSpaceChar(int)
 * @see     java.lang.Character#isWhitespace(int)

 * Returns a value indicating a character's general category.
 *
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #getType(int)} method.
 *
 * @param   ch      the character to be tested.
 * @return  a value of type <code>int</code> representing the
 *          character's general category.
 * @see     java.lang.Character#COMBINING_SPACING_MARK
 * @see     java.lang.Character#CONNECTOR_PUNCTUATION
 * @see     java.lang.Character#CONTROL
 * @see     java.lang.Character#CURRENCY_SYMBOL
 * @see     java.lang.Character#DASH_PUNCTUATION
 * @see     java.lang.Character#DECIMAL_DIGIT_NUMBER
 * @see     java.lang.Character#ENCLOSING_MARK
 * @see     java.lang.Character#END_PUNCTUATION
 * @see     java.lang.Character#FINAL_QUOTE_PUNCTUATION
 * @see     java.lang.Character#FORMAT
 * @see     java.lang.Character#INITIAL_QUOTE_PUNCTUATION
 * @see     java.lang.Character#LETTER_NUMBER
 * @see     java.lang.Character#LINE_SEPARATOR
 * @see     java.lang.Character#LOWERCASE_LETTER
 * @see     java.lang.Character#MATH_SYMBOL
 * @see     java.lang.Character#MODIFIER_LETTER
 * @see     java.lang.Character#MODIFIER_SYMBOL
 * @see     java.lang.Character#NON_SPACING_MARK
 * @see     java.lang.Character#OTHER_LETTER
 * @see     java.lang.Character#OTHER_NUMBER
 * @see     java.lang.Character#OTHER_PUNCTUATION
 * @see     java.lang.Character#OTHER_SYMBOL
 * @see     java.lang.Character#PARAGRAPH_SEPARATOR
 * @see     java.lang.Character#PRIVATE_USE
 * @see     java.lang.Character#SPACE_SEPARATOR
 * @see     java.lang.Character#START_PUNCTUATION
 * @see     java.lang.Character#SURROGATE
 * @see     java.lang.Character#TITLECASE_LETTER
 * @see     java.lang.Character#UNASSIGNED
 * @see     java.lang.Character#UPPERCASE_LETTER

 * Returns a value indicating a character's general category.
 *
 * @param   codePoint the character (Unicode code point) to be tested.
 * @return  a value of type <code>int</code> representing the
 *          character's general category.
 * @see     Character#COMBINING_SPACING_MARK COMBINING_SPACING_MARK
 * @see     Character#CONNECTOR_PUNCTUATION CONNECTOR_PUNCTUATION
 * @see     Character#CONTROL CONTROL
 * @see     Character#CURRENCY_SYMBOL CURRENCY_SYMBOL
 * @see     Character#DASH_PUNCTUATION DASH_PUNCTUATION
 * @see     Character#DECIMAL_DIGIT_NUMBER DECIMAL_DIGIT_NUMBER
 * @see     Character#ENCLOSING_MARK ENCLOSING_MARK
 * @see     Character#END_PUNCTUATION END_PUNCTUATION
 * @see     Character#FINAL_QUOTE_PUNCTUATION FINAL_QUOTE_PUNCTUATION
 * @see     Character#FORMAT FORMAT
 * @see     Character#INITIAL_QUOTE_PUNCTUATION INITIAL_QUOTE_PUNCTUATION
 * @see     Character#LETTER_NUMBER LETTER_NUMBER
 * @see     Character#LINE_SEPARATOR LINE_SEPARATOR
 * @see     Character#LOWERCASE_LETTER LOWERCASE_LETTER
 * @see     Character#MATH_SYMBOL MATH_SYMBOL
 * @see     Character#MODIFIER_LETTER MODIFIER_LETTER
 * @see     Character#MODIFIER_SYMBOL MODIFIER_SYMBOL
 * @see     Character#NON_SPACING_MARK NON_SPACING_MARK
 * @see     Character#OTHER_LETTER OTHER_LETTER
 * @see     Character#OTHER_NUMBER OTHER_NUMBER
 * @see     Character#OTHER_PUNCTUATION OTHER_PUNCTUATION
 * @see     Character#OTHER_SYMBOL OTHER_SYMBOL
 * @see     Character#PARAGRAPH_SEPARATOR PARAGRAPH_SEPARATOR
 * @see     Character#PRIVATE_USE PRIVATE_USE
 * @see     Character#SPACE_SEPARATOR SPACE_SEPARATOR
 * @see     Character#START_PUNCTUATION START_PUNCTUATION
 * @see     Character#SURROGATE SURROGATE
 * @see     Character#TITLECASE_LETTER TITLECASE_LETTER
 * @see     Character#UNASSIGNED UNASSIGNED
 * @see     Character#UPPERCASE_LETTER UPPERCASE_LETTER

 * Determines the character representation for a specific digit in
 * the specified radix. If the value of <code>radix</code> is not a
 * valid radix, or the value of <code>digit</code> is not a valid
 * digit in the specified radix, the null character
 * (<code>'\u0000'</code>) is returned.
 *
 * The <code>radix</code> argument is valid if it is greater than or
 * equal to <code>MIN_RADIX</code> and less than or equal to
 * <code>MAX_RADIX</code>. The <code>digit</code> argument is valid if
 * <code>0 &lt;= digit &lt; radix</code>.
 *
 * If the digit is less than 10, then
 * <code>'0' + digit</code> is returned. Otherwise, the value
 * <code>'a' + digit - 10</code> is returned.
 *
 * @param   digit   the number to convert to a character.
 * @param   radix   the radix.
 * @return  the <code>char</code> representation of the specified digit
 *          in the specified radix.
 * @see     java.lang.Character#MIN_RADIX
 * @see     java.lang.Character#MAX_RADIX
 * @see     java.lang.Character#digit(char, int)

            return (char)('0' + digit);
        return (char)('a' - 10 + digit);
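// A minimal sketch that restates the documented rules for converting a digit
// to its character form, assuming the validity checks described in the
// comment above. The names ForDigitSketch and digitToChar are hypothetical;
// only Character.forDigit, Character.MIN_RADIX and Character.MAX_RADIX are
// real library members.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class ForDigitSketch {

    static char digitToChar(int digit, int radix) {
        // An invalid radix yields the null character.
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            return '\0';
        }
        // The digit must satisfy 0 <= digit < radix.
        if (digit < 0 || digit >= radix) {
            return '\0';
        }
        // Digits below 10 map to '0'..'9', the rest to 'a'..'z'.
        return (char) (digit < 10 ? '0' + digit : 'a' - 10 + digit);
    }

    public static void main(String[] args) {
        System.out.println(digitToChar(11, 16));        // b
        System.out.println(Character.forDigit(11, 16)); // b
    }
}
```

// Note that the result is always a lowercase letter for digits of 10 or
// more; callers that need uppercase output convert the result themselves.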
 * Returns the Unicode directionality property for the given
 * character. Character directionality is used to calculate the
 * visual ordering of text. The directionality value of undefined
 * <code>char</code> values is <code>DIRECTIONALITY_UNDEFINED</code>.
 *
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #getDirectionality(int)} method.
 *
 * @param  ch <code>char</code> for which the directionality property
 *            is requested.
 * @return the directionality property of the <code>char</code> value.
 *
 * @see Character#DIRECTIONALITY_UNDEFINED
 * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT
 * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT
 * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
 * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER
 * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
 * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
 * @see Character#DIRECTIONALITY_ARABIC_NUMBER
 * @see Character#DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
 * @see Character#DIRECTIONALITY_NONSPACING_MARK
 * @see Character#DIRECTIONALITY_BOUNDARY_NEUTRAL
 * @see Character#DIRECTIONALITY_PARAGRAPH_SEPARATOR
 * @see Character#DIRECTIONALITY_SEGMENT_SEPARATOR
 * @see Character#DIRECTIONALITY_WHITESPACE
 * @see Character#DIRECTIONALITY_OTHER_NEUTRALS
 * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
 * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
 * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
 * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
 * @see Character#DIRECTIONALITY_POP_DIRECTIONAL_FORMAT

 * Returns the Unicode directionality property for the given
 * character (Unicode code point). Character directionality is
 * used to calculate the visual ordering of text. The
 * directionality value of an undefined character is {@link
 * #DIRECTIONALITY_UNDEFINED}.
 * @param   codePoint the character (Unicode code point) for which
 *          the directionality property is requested.
 * @return  the directionality property of the character.
 *
 * @see Character#DIRECTIONALITY_UNDEFINED DIRECTIONALITY_UNDEFINED
 * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT DIRECTIONALITY_LEFT_TO_RIGHT
 * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT DIRECTIONALITY_RIGHT_TO_LEFT
 * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
 * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER DIRECTIONALITY_EUROPEAN_NUMBER
 * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
 * @see Character#DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
 * @see Character#DIRECTIONALITY_ARABIC_NUMBER DIRECTIONALITY_ARABIC_NUMBER
 * @see Character#DIRECTIONALITY_COMMON_NUMBER_SEPARATOR DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
 * @see Character#DIRECTIONALITY_NONSPACING_MARK DIRECTIONALITY_NONSPACING_MARK
 * @see Character#DIRECTIONALITY_BOUNDARY_NEUTRAL DIRECTIONALITY_BOUNDARY_NEUTRAL
 * @see Character#DIRECTIONALITY_PARAGRAPH_SEPARATOR DIRECTIONALITY_PARAGRAPH_SEPARATOR
 * @see Character#DIRECTIONALITY_SEGMENT_SEPARATOR DIRECTIONALITY_SEGMENT_SEPARATOR
 * @see Character#DIRECTIONALITY_WHITESPACE DIRECTIONALITY_WHITESPACE
 * @see Character#DIRECTIONALITY_OTHER_NEUTRALS DIRECTIONALITY_OTHER_NEUTRALS
 * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
 * @see Character#DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
 * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
 * @see Character#DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
 * @see Character#DIRECTIONALITY_POP_DIRECTIONAL_FORMAT DIRECTIONALITY_POP_DIRECTIONAL_FORMAT

 * Determines whether the character is mirrored according to the
 * Unicode specification. Mirrored characters should have their
 * glyphs horizontally mirrored when displayed in text that is
 * right-to-left. For example, <code>'\u0028'</code> LEFT
 * PARENTHESIS is semantically defined to be an <i>opening
 * parenthesis</i>. This will appear as a "(" in text that is
 * left-to-right but as a ")" in text that is right-to-left.
 *
 * <p><b>Note:</b> This method cannot handle <a
 * href="#supplementary"> supplementary characters</a>. To support
 * all Unicode characters, including supplementary characters, use
 * the {@link #isMirrored(int)} method.
 *
 * @param  ch <code>char</code> for which the mirrored property is requested
 * @return <code>true</code> if the char is mirrored, <code>false</code>
 *         if the <code>char</code> is not mirrored or is not defined.

 * Determines whether the specified character (Unicode code point)
 * is mirrored according to the Unicode specification. Mirrored
 * characters should have their glyphs horizontally mirrored when
 * displayed in text that is right-to-left. For example,
 * <code>'\u0028'</code> LEFT PARENTHESIS is semantically
 * defined to be an <i>opening parenthesis</i>. This will appear
 * as a "(" in text that is left-to-right but as a ")" in text
 * that is right-to-left.
 *
 * @param   codePoint the character (Unicode code point) to be tested.
 * @return  <code>true</code> if the character is mirrored, <code>false</code>
 *          if the character is not mirrored or is not defined.

 * Compares two <code>Character</code> objects numerically.
 *
 * @param   anotherCharacter   the <code>Character</code> to be compared.
 * @return  the value <code>0</code> if the argument <code>Character</code>
 *          is equal to this <code>Character</code>; a value less than
 *          <code>0</code> if this <code>Character</code> is numerically less
 *          than the <code>Character</code> argument; and a value greater than
 *          <code>0</code> if this <code>Character</code> is numerically greater
 *          than the <code>Character</code> argument (unsigned comparison).
 *          Note that this is strictly a numerical comparison; it is not
 *          locale-dependent.

 * Converts the character (Unicode code point) argument to uppercase using
 * information from the UnicodeData file.
 *
 * @param   codePoint   the character (Unicode code point) to be converted.
 * @return  either the uppercase equivalent of the character, if
 *          any, or an error flag (<code>Character.ERROR</code>)
 *          that indicates that a 1:M <code>char</code> mapping exists.
 * @see     java.lang.Character#isLowerCase(char)
 * @see     java.lang.Character#isUpperCase(char)
 * @see     java.lang.Character#toLowerCase(char)
 * @see     java.lang.Character#toTitleCase(char)

 * Converts the character (Unicode code point) argument to uppercase using case
 * mapping information from the SpecialCasing file in the Unicode
 * specification. If a character has no explicit uppercase
 * mapping, then the <code>char</code> itself is returned in the
 * <code>char[]</code>.
 *
 * @param   codePoint   the character (Unicode code point) to be converted.
 * @return  a <code>char[]</code> with the uppercased character.

        // As of Unicode 4.0, 1:M uppercasings only happen in the BMP.

 * The number of bits used to represent a <tt>char</tt> value in unsigned
 * binary form, constant {@code 16}.

    public static final int SIZE = 16;
 * Returns the value obtained by reversing the order of the bytes in the
 * specified <tt>char</tt> value.
 *
 * @return the value obtained by reversing (or, equivalently, swapping)
 *     the bytes in the specified <tt>char</tt> value.

        return (char) (((ch & 0xFF00) >> 8) | (ch << 8));
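// A short sketch of the byte swap shown above, checked against the public
// Character.reverseBytes method. The names ReverseBytesSketch and swapBytes
// are hypothetical and exist only for this illustration.

```java
// Hypothetical demo class; not part of java.lang.Character.
final class ReverseBytesSketch {

    static char swapBytes(char ch) {
        // The high byte moves down by 8 bits and the low byte moves up by
        // 8 bits; the cast to char discards everything above bit 15.
        return (char) (((ch & 0xFF00) >> 8) | (ch << 8));
    }

    public static void main(String[] args) {
        char swapped = swapBytes((char) 0x1234);
        System.out.println(Integer.toHexString(swapped)); // 3412
        System.out.println(swapped == Character.reverseBytes((char) 0x1234)); // true
    }
}
```

// Applying the swap twice returns the original value, which makes it a
// convenient way to convert between big-endian and little-endian UTF-16
// code units.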