UCDecoder.java revision 2362
/*
 * Copyright (c) 1995, 2000, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

/**
 * This class implements a robust character decoder. The decoder will
 * convert encoded text into binary data.
 *
 * The basic encoding unit is a 3 character atom. It encodes two bytes
 * of data. Bytes are encoded into a 64 character set; the characters
 * were chosen specifically because they appear in all codesets.
 * We don't care what their numerical equivalent is because
 * we use a character array to map them. This is like UUencoding
 * with the dependency on ASCII removed.
 *
 * The three chars that make up an atom are encoded as follows:
 *
 *      00xxxyyy 00axxxxx 00byyyyy
 *      00 = leading zeros, all values are 0 - 63
 *      xxxyyy - Top 3 bits of X, Top 3 bits of Y
 *      axxxxx - a = X parity bit, xxxxx lower 5 bits of X
 *      byyyyy - b = Y parity bit, yyyyy lower 5 bits of Y
 *
 * The atoms are arranged into lines suitable for inclusion into an
 * email message or text file. The number of bytes that are encoded
 * per line is 48, which keeps the total line length under 80 chars.
 *
 * Each line has the form:
 *
 *      *(LLSS)(DDDD)(DDDD)(DDDD)...(CRC)
 *
 *      Where each (xxx) represents a three character atom.
 *      (LLSS) - 8 bit length (high byte), and sequence number
 *      (DDDD) - Data byte atoms, if length is odd, last data
 *               atom has (DD00) (high byte data, low byte 0)
 *      (CRC)  - 16 bit CRC for the line, includes length,
 *               sequence, and all data bytes. If there is a
 *               zero pad byte (odd length) it is _NOT_
 *               included in the CRC.
 *
 * If an error is encountered during decoding this class throws a
 * CEFormatException. The specific detail messages are:
 *
 *      "UCDecoder: High byte parity error."
 *      "UCDecoder: Low byte parity error."
 *      "UCDecoder: Out of sequence line."
 *      "UCDecoder: CRC check failed."
 */

    /** This class encodes two bytes per atom. */

    /** This class encodes 48 bytes per line. */

    /* this is the UCE mapping of 0-63 to characters .. */
        (byte)'0',(byte)'1',(byte)'2',(byte)'3',(byte)'4',(byte)'5',(byte)'6',(byte)'7', // 0
        (byte)'8',(byte)'9',(byte)'A',(byte)'B',(byte)'C',(byte)'D',(byte)'E',(byte)'F', // 1
        (byte)'G',(byte)'H',(byte)'I',(byte)'J',(byte)'K',(byte)'L',(byte)'M',(byte)'N', // 2
        (byte)'O',(byte)'P',(byte)'Q',(byte)'R',(byte)'S',(byte)'T',(byte)'U',(byte)'V', // 3
        (byte)'W',(byte)'X',(byte)'Y',(byte)'Z',(byte)'a',(byte)'b',(byte)'c',(byte)'d', // 4
        (byte)'e',(byte)'f',(byte)'g',(byte)'h',(byte)'i',(byte)'j',(byte)'k',(byte)'l', // 5
        (byte)'m',(byte)'n',(byte)'o',(byte)'p',(byte)'q',(byte)'r',(byte)'s',(byte)'t', // 6
        (byte)'u',(byte)'v',(byte)'w',(byte)'x',(byte)'y',(byte)'z',(byte)'(',(byte)')'  // 7

    private byte tmp[] = new byte[2];
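
    /*
     * Illustrative sketch, not part of the original file: the atom layout
     * described in the class comment (00xxxyyy 00axxxxx 00byyyyy) could be
     * packed and unpacked roughly as follows. The names AtomSketch, ALPHABET,
     * parity, encodeAtom and decodeAtom are hypothetical, and the parity
     * convention (bit set when the byte has an odd number of 1 bits) is an
     * assumption; the class comment does not state which convention is used.
     */

```java
final class AtomSketch {
    // Assumed alphabet, in the same order as the 0-63 mapping table:
    // digits, upper case, lower case, then '(' and ')'.
    static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz()";

    // Assumed parity convention: 1 when the byte has an odd number of set bits.
    static int parity(int value) {
        return Integer.bitCount(value & 0xff) & 1;
    }

    // Packs bytes X and Y into the three 6-bit values 00xxxyyy 00axxxxx 00byyyyy.
    static String encodeAtom(int x, int y) {
        x &= 0xff;
        y &= 0xff;
        int first = ((x >>> 2) & 0x38) | ((y >>> 5) & 0x07); // top 3 bits of X, top 3 bits of Y
        int second = (parity(x) << 5) | (x & 0x1f);          // X parity bit + lower 5 bits of X
        int third = (parity(y) << 5) | (y & 0x1f);           // Y parity bit + lower 5 bits of Y
        return "" + ALPHABET.charAt(first) + ALPHABET.charAt(second) + ALPHABET.charAt(third);
    }

    // Reverses encodeAtom and rejects atoms whose parity bits do not match.
    static int[] decodeAtom(String atom) {
        int a = ALPHABET.indexOf(atom.charAt(0));
        int b = ALPHABET.indexOf(atom.charAt(1));
        int c = ALPHABET.indexOf(atom.charAt(2));
        if (a < 0 || b < 0 || c < 0) {
            throw new IllegalArgumentException("Character outside the 64 character set.");
        }
        int high = ((a & 0x38) << 2) + (b & 0x1f);
        int low = ((a & 0x07) << 5) + (c & 0x1f);
        if (parity(high) != ((b >> 5) & 1)) {
            throw new IllegalArgumentException("High byte parity error.");
        }
        if (parity(low) != ((c >> 5) & 1)) {
            throw new IllegalArgumentException("Low byte parity error.");
        }
        return new int[] { high, low };
    }
}
```

    /*
     * Under these assumptions, AtomSketch.encodeAtom(0x41, 0x42) yields the
     * atom "I12", and AtomSketch.decodeAtom("I12") returns { 0x41, 0x42 }.
     */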
    /**
     * Decode one atom - reads the characters from the input stream, decodes
     * them, and checks for valid parity.
     */
        byte a = -1, b = -1, c = -1;
        byte tmp[] = new byte[3];
        for (i = 0; (i < 64) && ((a == -1) || (b == -1) || (c == -1)); i++) {
        high_byte = (byte) (((a & 0x38) << 2) + (b & 0x1f));
        low_byte = (byte) (((a & 0x7) << 5) + (c & 0x1f));
        for (i = 1; i < 256; i = i * 2) {
    /**
     * decodeBufferPrefix initializes the sequence number to zero.
     */

    /**
     * decodeLinePrefix reads the sequence number and the number of
     * encoded bytes from the line. If the sequence number is not the
     * previous sequence number + 1 then an exception is thrown.
     *
     * UCE lines are line terminator immune: they all start with '*',
     * so the other thing this method does is scan for the next line
     * by looking for the '*' character.
     *
     * @exception CEFormatException out of sequence lines detected.
     */

    /**
     * This method reads the CRC that is at the end of every line and
     * verifies that it matches the computed CRC.
     *
     * @exception CEFormatException if CRC check fails.
     */
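
    /*
     * Illustrative sketch, not part of the original file: the line prefix
     * handling described above (scan for '*', then check that sequence
     * numbers increase by one) might look like this. LinePrefixSketch,
     * nextLineStart and checkSequence are hypothetical names. The sketch
     * assumes that the first line carries sequence number 0 and that the
     * 8 bit counter wraps around after 255; neither detail is stated in
     * the comments above.
     */

```java
final class LinePrefixSketch {
    // Sequence number expected on the next line; starts at zero.
    private int sequence = 0;

    // Returns the index just past the next '*' at or after 'from',
    // or -1 if no further line start exists. Anything before the '*',
    // including line terminators, is skipped.
    static int nextLineStart(byte[] data, int from) {
        for (int i = from; i < data.length; i++) {
            if (data[i] == '*') {
                return i + 1;
            }
        }
        return -1;
    }

    // Verifies the sequence number read from a line and advances the counter.
    void checkSequence(int sequenceFromLine) {
        if ((sequenceFromLine & 0xff) != sequence) {
            throw new IllegalStateException("Out of sequence line.");
        }
        sequence = (sequence + 1) & 0xff; // assumed 8 bit wrap-around
    }
}
```

    /*
     * A caller would invoke nextLineStart before decoding each line, decode
     * the (LLSS) atom that follows, and pass its low byte to checkSequence.
     */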