/*
 * Copyright (c) 2009, 2010, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

/*
 * Four types of "DoubleByte" charsets are implemented in this class:
 *
 * (1) The "most widely used" multibyte charset, a combination of
 *     a singlebyte character set (usually the ASCII charset) and a
 *     doublebyte character set. The codepoint values of the singlebyte
 *     and doublebyte sets don't overlap. Microsoft's multibyte charsets
 *     and IBM's "DBCS_ASCII" charsets, such as IBM1381, 942, 943,
 *     948, 949 and 950, are such charsets.
 *
 * (2) IBM EBCDIC Mix multibyte charset. Uses SO and SI to shift (switch)
 *     in and out between the singlebyte character set and the doublebyte
 *     character set.
 *
 * (3) A "simple" form of the EUC encoding scheme: only the
 *     singlebyte character set G0 and one doublebyte character set
 *     G1 are defined; G2 (with SS2) and G3 (with SS3) are not used.
 *     So it is actually the same as the "typical" type (1) mentioned
 *     above, except that it returns "malformed" for SS2 and SS3 when
 *     decoding.
 *
 * (4) A "pure" doublebyte-only character set. From the implementation
 *     point of view, this is type (1) with "decodeSingle" always
 *     returning unmappable.
 *
 * For simplicity, all implementations share the same decoding and
 * encoding data structures.
 *
 *     public char decodeSingle(int b) {
 *         ...
 *     }
 *
 *     public char decodeDouble(int b1, int b2) {
 *         if (b2 < b2Min || b2 > b2Max)
 *             return UNMAPPABLE_DECODING;
 *         return b2c[b1][b2 - b2Min];
 *     }
 *
 *     (1) b2Min and b2Max are the corresponding min and max values of
 *         the low half of the double-byte.
 *     (2) The high 8 bits (b1) of the double-byte are used to index
 *         into b2c.
 *
 *     public int encodeChar(char ch) {
 *         return c2b[c2bIndex[ch >> 8] + (ch & 0xff)];
 *     }
 */

// Make some protected methods public for use by JISAutoDetect

// Check validity of dbcs ebcdic byte pair values
//
// First byte : 0x41 -- 0xFE
// Second byte: 0x41 -- 0xFE
// Doublebyte blank: 0x4040
//
// The validation implementation in "old" DBCS_IBM_EBCDIC and sun.io
//     if ((b1 != 0x40 || b2 != 0x40) &&
//         (b2 < 0x41 || b2 > 0xfe)) {...}
        || (b1 == 0x40 && b2 == 0x40); // DBCS-HOST SPACE

// don't check dp/dl together here, it's possible to
// decode a SO/SI without space in output buffer.

// The only thing we need to "override" is to check SS2/SS3 and
// return "malformed" if found

// init the c2b and c2bIndex tables from b2c.

// add c->b only nr entries
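
// The table layout described in the comments above can be tried out with a
// small standalone sketch. Everything here is an assumption made for
// illustration, not the code of this class: the class name DoubleByteSketch,
// the helpers put(), toy() and isValidEbcdicPair(), the use of 0xFFFD for
// both "unmappable" markers, the int[] type of c2bIndex, and the two-entry
// toy mapping. Only the decodeDouble/encodeChar lookups and the EBCDIC byte
// ranges are taken from the comments.

```java
import java.util.Arrays;

// Hypothetical toy codec for illustration; not the JDK class itself.
class DoubleByteSketch {
    static final char UNMAPPABLE_DECODING = '\uFFFD'; // assumed marker value
    static final char UNMAPPABLE_ENCODING = '\uFFFD'; // assumed marker value

    private final char[] b2cSB;      // single byte -> char
    private final char[][] b2c;      // b2c[b1][b2 - b2Min]; null row = unused lead byte
    private final int b2Min, b2Max;  // valid range of the second byte
    // Worst-case size: one all-unmappable segment plus one per high byte.
    private final char[] c2b = new char[0x101 * 0x100];
    private final int[] c2bIndex = new int[0x100]; // high byte of char -> segment offset
    private int nextSegment = 0x100;               // segment 0 stays all-unmappable

    DoubleByteSketch(char[] b2cSB, char[][] b2c, int b2Min, int b2Max) {
        this.b2cSB = b2cSB;
        this.b2c = b2c;
        this.b2Min = b2Min;
        this.b2Max = b2Max;
        // Build the c2b and c2bIndex tables from the b2c tables.
        Arrays.fill(c2b, UNMAPPABLE_ENCODING);
        for (int b = 0; b < b2cSB.length; b++)
            put(b2cSB[b], b);
        for (int b1 = 0; b1 < b2c.length; b1++) {
            if (b2c[b1] == null) continue;
            for (int i = 0; i < b2c[b1].length; i++)
                put(b2c[b1][i], (b1 << 8) | (b2Min + i));
        }
    }

    // Record char -> bytes, allocating a 256-entry segment on first use.
    private void put(char c, int bytes) {
        if (c == UNMAPPABLE_DECODING) return;
        int hi = c >> 8;
        if (c2bIndex[hi] == 0) {
            c2bIndex[hi] = nextSegment;
            nextSegment += 0x100;
        }
        c2b[c2bIndex[hi] + (c & 0xff)] = (char) bytes;
    }

    public char decodeSingle(int b) {
        return b2cSB[b];
    }

    public char decodeDouble(int b1, int b2) {
        if (b2 < b2Min || b2 > b2Max || b2c[b1] == null)
            return UNMAPPABLE_DECODING;
        return b2c[b1][b2 - b2Min];
    }

    public int encodeChar(char ch) {
        return c2b[c2bIndex[ch >> 8] + (ch & 0xff)];
    }

    // Byte-pair check using the ranges listed in the comment:
    // both bytes in 0x41..0xFE, or the doublebyte blank 0x4040.
    static boolean isValidEbcdicPair(int b1, int b2) {
        return (0x41 <= b1 && b1 <= 0xfe && 0x41 <= b2 && b2 <= 0xfe)
            || (b1 == 0x40 && b2 == 0x40);
    }

    // Invented toy tables: ASCII plus two double-byte entries under lead byte 0x81.
    static DoubleByteSketch toy() {
        char[] sb = new char[0x100];
        Arrays.fill(sb, UNMAPPABLE_DECODING);
        for (char c = 0; c < 0x80; c++) sb[c] = c;    // ASCII maps to itself
        char[][] db = new char[0x100][];
        db[0x81] = new char[] { '\u3042', '\u3044' }; // bytes 0x8140 and 0x8141
        return new DoubleByteSketch(sb, db, 0x40, 0x41);
    }

    public static void main(String[] args) {
        DoubleByteSketch codec = toy();
        System.out.printf("%04x%n", codec.encodeChar('\u3044'));     // prints 8141
        System.out.println(codec.decodeDouble(0x81, 0x40) == '\u3042'); // prints true
    }
}
```

// In this sketch a result above 0xFF from encodeChar means a double-byte
// sequence, and UNMAPPABLE_ENCODING means the char has no mapping; a real
// charset would also have to handle the SO/SI and SS2/SS3 cases noted above.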