FloatingDecimal.java revision 3909
3909N/A * Copyright (c) 1996, 2011, Oracle and/or its affiliates. All rights reserved. 0N/A * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 0N/A * This code is free software; you can redistribute it and/or modify it 0N/A * under the terms of the GNU General Public License version 2 only, as 2362N/A * published by the Free Software Foundation. Oracle designates this 0N/A * particular file as subject to the "Classpath" exception as provided 2362N/A * by Oracle in the LICENSE file that accompanied this code. 0N/A * This code is distributed in the hope that it will be useful, but WITHOUT 0N/A * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 0N/A * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License 0N/A * version 2 for more details (a copy is included in the LICENSE file that 0N/A * accompanied this code). 0N/A * You should have received a copy of the GNU General Public License version 0N/A * 2 along with this work; if not, write to the Free Software Foundation, 0N/A * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 2362N/A * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA 2362N/A * or visit www.oracle.com if you need additional information or have any 0N/A * Constants of the implementation 0N/A * Most are IEEE-754 related. 0N/A * (There are more really boring constants at the end.) 0N/A * count number of bits from high-order 1 bit to low-order 1 bit, 0N/A // the strategy is to shift until we get a non-zero sign bit 0N/A // then shift until we have no bits left, counting the difference. 0N/A // we do byte shifting as a hack. Hope it helps. 0N/A if ( v ==
0L )
return 0;
0N/A while ( v >
0L ) {
// i.e. while ((v&highbit) == 0L ) 0N/A * Keep big powers of 5 handy for future reference. 0N/A assert p >=
0 : p;
// negative power of 5 0N/A // construct the value. 0N/A // in order to compute 5^p, 0N/A // compute its square root, 5^(p/2) and square. 0N/A // or, let q = p / 2, r = p -q, then 0N/A // 5^p = 5^(q+r) = 5^q * 5^r 0N/A // a common operation 0N/A // another common operation 0N/A * Make a floating double into a FDBigInt. 0N/A * This could also be structured as a FDBigInt 0N/A * constructor, but we'd have to build a lot of knowledge 0N/A * about floating-point representation into it, and we don't want to. 0N/A * AS A SIDE EFFECT, THIS METHOD WILL SET THE INSTANCE VARIABLES 0N/A * bigIntExp and bigIntNBits 0N/A * We now know where the high-order 1 bit is, 0N/A * and we know how many there are. 0N/A * Compute a number that is the ULP of the given value, 0N/A * More difficult if subtracting and the argument 0N/A * is a normalized a power of 2, as the ULP changes at these points. 0N/A private static double 0N/A // for subtraction from normalized, powers of 2, 0N/A // use next-smaller exponent 0N/A * Round a double to a float. 0N/A * In addition to the fraction bits of the double, 0N/A * look at the class instance variable roundDir, 0N/A * which should help us avoid double-rounding error. 0N/A * roundDir was set in hardValueOf if the estimate was 0N/A * close enough, but not exact. It tells us which direction 0N/A * of rounding is preferred. 0N/A // what we have here is special. 0N/A // don't worry, the right thing will happen. 0N/A * This is the easy subcase -- 0N/A * all the significant bits, after scaling, are held in lvalue. 0N/A * negSign and decExponent tell us what processing and scaling 0N/A * has already been done. Exceptional cases have already been 0N/A * lvalue is a finite number (not Inf, nor NaN) 0N/A * lvalue > 0L (not zero, nor negative). 0N/A * The only reason that we develop the digits here, rather than 0N/A * calling on Long.toString() is that we can do it a little faster, 0N/A * and besides want to treat trailing 0s specially. If Long.toString 0N/A * changes, we should re-evaluate this strategy! 0N/A // Discard non-significant low-order bits, while rounding, 0N/A // up to insignificant value. 0N/A // round up based on the low-order bits we're discarding 0N/A // even easier subcase! 0N/A // can do int arithmetic rather than long! 0N/A // same algorithm as above (same bugs, too ) 0N/A // but using long arithmetic. 0N/A // add one to the least significant digit. 0N/A // in the unlikely event there is a carry out, 0N/A // assert that this will only happen where there 0N/A // is only one digit, e.g. (float)1e-44 seems to do it. 0N/A while ( q ==
'9' && i >
0 ){
0N/A // carryout! High-order 1, rest 0s, larger exp. 0N/A // else fall through. 0N/A * FIRST IMPORTANT CONSTRUCTOR: DOUBLE 0N/A // discover and delete sign 0N/A // Discover obvious special cases of NaN and Infinity. 0N/A // Normalize denormalized numbers. 0N/A // Insert assumed high-order bit for normalized numbers. 0N/A // Subtract exponent bias. 0N/A // not a denorm, just a 0! 0N/A // call the routine that actually does all the hard work. 0N/A * SECOND IMPORTANT CONSTRUCTOR: SINGLE 0N/A // discover and delete sign 0N/A // Discover obvious special cases of NaN and Infinity. 0N/A // Normalize denormalized numbers. 0N/A // Insert assumed high-order bit for normalized numbers. 0N/A // Subtract exponent bias. 0N/A // not a denorm, just a 0! 0N/A // call the routine that actually does all the hard work. 0N/A int nTinyBits;
// number of these to the right of the point. 0N/A // Examine number. Determine if it is an easy case, 0N/A // which we can do pretty trivially using float/long conversion, 0N/A // or whether we must do real work. 0N/A // Look more closely at the number to decide if, 0N/A // with scaling by 10^nTinyBits, the result will fit in 0N/A * take the fraction bits, which are normalized. 0N/A * (a) nTinyBits == 0: Shift left or right appropriately 0N/A * to align the binary point at the extreme right, i.e. 0N/A * where a long int point is expected to be. The integer 0N/A * result is easily converted to a string. 0N/A * (b) nTinyBits > 0: Shift right by expShift-nFractBits, 0N/A * which effectively converts to long and scales by 0N/A * 2^nTinyBits. Then multiply by 5^nTinyBits to 0N/A * complete the scaling. We know this won't overflow 0N/A * because we just counted the number of bits necessary 0N/A * in the result. The integer you get from this can 0N/A * then be converted to a string pretty easily. 0N/A * The following causes excess digits to be printed 0N/A * out in the single-float case. Our manipulation of 0N/A * halfULP here is apparently not correct. If we 0N/A * better understand how this works, perhaps we can 0N/A * use this special case again. But for the time being, 0N/A * fractBits >>>= expShift+1-nFractBits; 0N/A * fractBits *= long5pow[ nTinyBits ]; 0N/A * halfULP = long5pow[ nTinyBits ] >> (1+nSignificantBits-nFractBits); 0N/A * developLongDigits( -nTinyBits, fractBits, halfULP ); 0N/A * This is the hard case. We are going to compute large positive 0N/A * integers B and S and integer decExp, s.t. 0N/A * d = ( B / S ) * 10^decExp 0N/A * Obvious choices are: 0N/A * decExp = floor( log10(d) ) 0N/A * B = d * 2^nTinyBits * 10^max( 0, -decExp ) 0N/A * S = 10^max( 0, decExp) * 2^nTinyBits 0N/A * (noting that nTinyBits has already been forced to non-negative) 0N/A * I am also going to compute a large positive integer 0N/A * M = (1/2^nSignificantBits) * 2^nTinyBits * 10^max( 0, -decExp ) 0N/A * i.e. M is (1/2) of the ULP of d, scaled like B. 0N/A * When we iterate through dividing B/S and picking off the 0N/A * quotient bits, we will know when to stop when the remainder 0N/A * We keep track of powers of 2 and powers of 5. 0N/A * Estimate decimal exponent. (If it is small-ish, 0N/A * we could double-check.) 0N/A * First, scale the mantissa bits such that 1 <= d2 < 2. 0N/A * We are then going to estimate 0N/A * log10(d2) ~=~ (d2-1.5)/1.5 + log(1.5) 0N/A * and so we can estimate 0N/A * log10(d) ~=~ log10(d2) + binExp * log10(2) 0N/A * take the floor and call it decExp. 0N/A * FIXME -- use more precise constants here. It costs no more. 0N/A (
d2-
1.5D)*
0.289529654D +
0.176091259 + (
double)
binExp *
0.301029995663981 );
0N/A int B2,
B5;
// powers of 2 and powers of 5, respectively, in B 0N/A int S2,
S5;
// powers of 2 and powers of 5, respectively, in S 0N/A int M2,
M5;
// powers of 2 and powers of 5, respectively, in M 0N/A int Bbits;
// binary digits needed to represent B, approx. 0N/A int tenSbits;
// binary digits needed to represent 10*S, approx. 0N/A * the long integer fractBits contains the (nFractBits) interesting 0N/A * bits from the mantissa of d ( hidden 1 added if necessary) followed 0N/A * by (expShift+1-nFractBits) zeros. In the interest of compactness, 0N/A * I will shift out those zeros before turning fractBits into a 0N/A * FDBigInt. The resulting whole number will be 0N/A * d * 2^(nFractBits-1-binExp). 0N/A * HACK!! For exact powers of two, the next smallest number 0N/A * is only half as far away as we think (because the meaning of 0N/A * ULP changes at power-of-two bounds) for this reason, we 0N/A * hack M2. Hope this works. 0N/A // since we cannot scale M down far enough, 0N/A // we must scale the other values up. 0N/A * Construct, Scale, iterate. 0N/A * Some day, we'll write a stopping test that takes 0N/A * account of the asymmetry of the spacing of floating-point 0N/A * numbers below perfect powers of 2 0N/A * 26 Sept 96 is not that day. 0N/A * So we use a symmetric test. 0N/A * Detect the special cases where all the numbers we are about 0N/A * to compute will fit in int or long integers. 0N/A * In these cases, we will avoid doing FDBigInt arithmetic. 0N/A * We use the same algorithms, except that we "normalize" 0N/A * our FDBigInts before iterating. This is to make division easier, 0N/A * as it makes our fist guess (quotient of high-order words) 0N/A * Some day, we'll write a stopping test that takes 0N/A * account of the asymmetry of the spacing of floating-point 0N/A * numbers below perfect powers of 2 0N/A * 26 Sept 96 is not that day. 0N/A * So we use a symmetric test. 0N/A // wa-hoo! They're all ints! 0N/A * Unroll the first iteration. If our decExp estimate 0N/A * was too high, our first quotient will be zero. In this 0N/A * case, we discard it and decrement decExp. 0N/A assert q <
10 : q;
// excessively large digit 0N/A // oops. Usually ignore leading zero. 0N/A * HACK! Java spec sez that we always have at least 0N/A * one digit after the . in either F- or E-form output. 0N/A * Thus we will need more than one digit if we're using 0N/A assert q <
10 : q;
// excessively large digit 0N/A // hack -- m might overflow! 0N/A // in this case, it is certainly > b, 0N/A // and b+m > tens, too, since that has overflowed 0N/A // still good! they're all longs! 0N/A * Unroll the first iteration. If our decExp estimate 0N/A * was too high, our first quotient will be zero. In this 0N/A * case, we discard it and decrement decExp. 0N/A q = (
int) ( b / s );
0N/A b =
10L * ( b % s );
0N/A assert q <
10 : q;
// excessively large digit 0N/A // oops. Usually ignore leading zero. 0N/A * HACK! Java spec sez that we always have at least 0N/A * one digit after the . in either F- or E-form output. 0N/A * Thus we will need more than one digit if we're using 0N/A q = (
int) ( b / s );
0N/A assert q <
10 : q;
// excessively large digit 0N/A // hack -- m might overflow! 0N/A // in this case, it is certainly > b, 0N/A // and b+m > tens, too, since that has overflowed 0N/A * We really must do FDBigInt arithmetic. 0N/A * Fist, construct our FDBigInt initial values. 0N/A // normalize so that division works better 0N/A * Unroll the first iteration. If our decExp estimate 0N/A * was too high, our first quotient will be zero. In this 0N/A * case, we discard it and decrement decExp. 0N/A assert q <
10 : q;
// excessively large digit 0N/A // oops. Usually ignore leading zero. 0N/A * HACK! Java spec sez that we always have at least 0N/A * one digit after the . in either F- or E-form output. 0N/A * Thus we will need more than one digit if we're using 0N/A assert q <
10 : q;
// excessively large digit 0N/A * Last digit gets rounded based on stopping condition. 0N/A // choose based on which digits we like. 0N/A // most brain-dead version 0N/A // print digits.digits. 0N/A // decExponent has 1, 2, or 3, digits 0N/A }
else if (e <=
99) {
0N/A return new char[
26];
0N/A // throws NullPointerException if null 0N/A // Check for NaN and Infinity strings 0N/A if(c ==
'N' || c ==
'I') {
// possible NaN or infinity 0N/A // compare Input string to "NaN" or "Infinity" 0N/A else // something is amiss, throw exception 0N/A // For the candidate string to be a NaN or infinity, 0N/A // all characters in input string and target char[] 0N/A // must be matched ==> j must equal targetChars.length 0N/A // and i must equal l 0N/A else {
// something went wrong, throw exception 0N/A }
else if (c ==
'0') {
// check for hexadecimal floating-point number 0N/A if (
ch ==
'x' ||
ch ==
'X' )
// possible hex string 0N/A }
// look for and process decimal floating-point string 0N/A break;
// out of switch. 0N/A break;
// out of switch. 0N/A // already saw one ., this is the 2nd. 0N/A break;
// out of switch. 0N/A * At this point, we've scanned all the digits and decimal 0N/A * point we're going to see. Trim off leading and trailing 0N/A * zeros, which will just confuse us later, and adjust 0N/A * our initial decimal exponent accordingly. 0N/A * we have seen i total characters. 0N/A * nLeadZero of them were zeros before any other digits. 0N/A * nTrailZero of them were zeros after any other digits. 0N/A * if ( decSeen ), then a . was seen after decPt characters 0N/A * ( including leading zeros which have been discarded ) 0N/A * nDigits characters were neither lead nor trailing 0N/A * special hack: if we saw no non-zero digits, then the 0N/A * Unfortunately, we feel honor-bound to keep parsing! 0N/A // we saw NO DIGITS AT ALL, 0N/A // not even a crummy 0! 0N/A // this is not allowed. 0N/A /* Our initial exponent is decPt, adjusted by the number of 0N/A * discarded zeros. Or, if there was no decPt, 0N/A * then its just nDigits adjusted by discarded trailing zeros. 0N/A * Look for 'e' or 'E' and an optionally signed integer. 0N/A if ( (i < l) && (((c =
in.
charAt(i) )==
'e') || (c ==
'E') ) ){
0N/A // the next character will cause integer 0N/A // The intent here is to end up with 0N/A // infinity or zero, as appropriate. 0N/A // The reason for yielding such a small decExponent, 0N/A // rather than something intuitive such as 0N/A // expSign*Integer.MAX_VALUE, is that this value 0N/A // is subject to further manipulation in 0N/A // doubleValue() and floatValue(), and I don't want 0N/A // it to be able to cause overflow there! 0N/A // (The only way we can get into trouble here is for 0N/A // really outrageous nDigits+nTrailZero, such as 2 billion. ) 0N/A // this should not overflow, since we tested 0N/A // for expVal > (MAX+N), where N >= abs(decExp) 0N/A // if we saw something not a digit ( or end of string ) 0N/A // after the [Ee][+-], without seeing any digits at all 0N/A // this is certainly an error. If we saw some digits, 0N/A // but then some trailing garbage, that might be ok. 0N/A // so we just fall through in that case. 0N/A * We parsed everything we could. 0N/A * If there are leftovers, then this is not good input! 0N/A * Take a FloatingDecimal, which we presumably just scanned in, 0N/A * and find out what its value is, as a double. 0N/A * AS A SIDE EFFECT, SET roundDir TO INDICATE PREFERRED 0N/A * ROUNDING DIRECTION in case the result is really destined 0N/A * for a single-precision float. 0N/A // First, check for NaN and Infinity values 0N/A * convert the lead kDigits to a long integer. 0N/A // (special performance hack: start to do it using int) 0N/A * lValue now contains a long integer with the value of 0N/A * the first kDigits digits of the number. 0N/A * dValue contains the (double) of the same. 0N/A * possibly an easy case. 0N/A * We know that the digits can be represented 0N/A * exactly. And if the exponent isn't too outrageous, 0N/A * the whole thing can be done with one operation, 0N/A * thus one rounding error. 0N/A * Note that all our constructors trim all leading and 0N/A * trailing zeros, so simple values (including zero) 0N/A * will always end up here 0N/A * Can get the answer with one operation, 0N/A * thus one roundoff. 0N/A * We can multiply dValue by 10^(slop) 0N/A * and it is still "small" and exact. 0N/A * Then we can multiply by 10^(exp-slop) 0N/A * with one rounding. 0N/A * Else we have a hard case with a positive exp. 0N/A * Can get the answer in one division. 0N/A * Else we have a hard case with a negative exp. 0N/A * The sum of digits plus exponent is greater than 0N/A * what we think we can do with one error. 0N/A * Start by approximating the right answer by, 0N/A * naively, scaling by powers of 10. 0N/A * Lets face it. This is going to be 0N/A * Infinity. Cut to the chase. 0N/A * The reason for the weird exp > 1 condition 0N/A * in the above loop was so that the last multiply 0N/A * would get unrolled. We handle it here. 0N/A * It could overflow. 0N/A * Look more closely at the result. 0N/A * If the exponent is just one too large, 0N/A * then use the maximum finite as our estimate 0N/A * value. Else call the result infinity 0N/A * ( I presume this could happen because 0N/A * rounding forces the result here to be 0N/A * an ULP or two larger than 0N/A * Double.MAX_VALUE ). 0N/A * Lets face it. This is going to be 0N/A * zero. Cut to the chase. 0N/A * The reason for the weird exp > 1 condition 0N/A * in the above loop was so that the last multiply 0N/A * would get unrolled. We handle it here. 0N/A * It could underflow. 0N/A * Look more closely at the result. 0N/A * If the exponent is just one too small, 0N/A * then use the minimum finite as our estimate 0N/A * value. Else call the result 0.0 0N/A * ( I presume this could happen because 0N/A * rounding forces the result here to be 0N/A * an ULP or two less than 0N/A * Double.MIN_VALUE ). 0N/A * dValue is now approximately the result. 0N/A * The hard part is adjusting it, by comparison 0N/A * with FDBigInt arithmetic. 0N/A * Formulate the EXACT big-number result as 0N/A /* AS A SIDE EFFECT, THIS METHOD WILL SET THE INSTANCE VARIABLES 0N/A * bigIntExp and bigIntNBits 0N/A * Scale bigD, bigB appropriately for 0N/A * big-integer operations. 0N/A * Naively, we multiply by powers of ten 0N/A * and powers of two. What we actually do 0N/A * is keep track of the powers of 5 and 0N/A * powers of 2 we would use, then factor out 0N/A * common divisors before doing the work. 0N/A int B2,
B5;
// powers of 2, 5 in bigB 0N/A int D2,
D5;
// powers of 2, 5 in bigD 0N/A int Ulp2;
// powers of 2 in halfUlp. 0N/A // shift bigB and bigD left by a number s. t. 0N/A // halfUlp is still an integer. 0N/A // This is going to be a denormalized number 0N/A // (if not actually zero). 0N/A // half an ULP is at 2^-(expBias+expShift+1) 0N/A // if there are common factors of 2, we might just as well 0N/A // factor them out, as they add nothing useful. 0N/A // do multiplications by powers of 5 and 2 0N/A // bigB is the scaled-big-int version of our floating-point 0N/A // bigD is the scaled-big-int version of the exact value 0N/A // as we understand it. 0N/A // halfUlp is 1/2 an ulp of bigB, except for special cases 0N/A // of exact powers of 2 0N/A // the plan is to compare bigB with bigD, and if the difference 0N/A // is less than halfUlp, then we're satisfied. Otherwise, 0N/A // use the ratio of difference to halfUlp to calculate a fudge 0N/A // factor to add to the floating value, then go 'round again. 0N/A // candidate is a normalized exact power of 2 and 0N/A // is too big. We will be subtracting. 0N/A // For our purposes, ulp is the ulp of the 0N/A // next smaller range. 0N/A // rats. Cannot de-scale ulp this far. 0N/A // must scale diff in other direction. 0N/A // the candidate is exactly right! 0N/A // this happens with surprising frequency 0N/A // difference is small. 0N/A // this is close enough 0N/A // difference is exactly half an ULP 0N/A // round to some other value maybe, then finish 0N/A // should check for bigIntNBits == 1 here?? 0N/A // difference is non-trivial. 0N/A // could scale addend by ratio of difference to 0N/A // halfUlp here, if we bothered to compute that difference. 0N/A // Most of the time ( I hope ) it is about 1 anyway. 0N/A continue;
// try again. 0N/A * Take a FloatingDecimal, which we presumably just scanned in, 0N/A * and find out what its value is, as a float. 0N/A * This is distinct from doubleValue() to avoid the extremely 0N/A * unlikely case of a double rounding error, wherein the conversion 0N/A * to double has one rounding error, and the conversion of that double 0N/A * to a float has another rounding error, IN THE WRONG DIRECTION, 0N/A * ( because of the preference to a zero low-order bit ). 0N/A // First, check for NaN and Infinity values 0N/A * convert the lead kDigits to an integer. 0N/A * iValue now contains an integer with the value of 0N/A * the first kDigits digits of the number. 0N/A * fValue contains the (float) of the same. 0N/A * possibly an easy case. 0N/A * We know that the digits can be represented 0N/A * exactly. And if the exponent isn't too outrageous, 0N/A * the whole thing can be done with one operation, 0N/A * thus one rounding error. 0N/A * Note that all our constructors trim all leading and 0N/A * trailing zeros, so simple values (including zero) 0N/A * will always end up here. 0N/A * Can get the answer with one operation, 0N/A * thus one roundoff. 0N/A * We can multiply dValue by 10^(slop) 0N/A * and it is still "small" and exact. 0N/A * Then we can multiply by 10^(exp-slop) 0N/A * with one rounding. 0N/A * Else we have a hard case with a positive exp. 0N/A * Can get the answer in one division. 0N/A * Else we have a hard case with a negative exp. 0N/A * In double-precision, this is an exact floating integer. 0N/A * So we can compute to double, then shorten to float 0N/A * with one round, and get the right answer. 0N/A * First, finish accumulating digits. 0N/A * Then convert that integer to a double, multiply 0N/A * by the appropriate power of ten, and convert to float. 0N/A * The sum of digits plus exponent is greater than 0N/A * what we think we can do with one error. 0N/A * Start by weeding out obviously out-of-range 0N/A * results, then convert to double and go to 0N/A * common hard-case code. 0N/A * Lets face it. This is going to be 0N/A * Infinity. Cut to the chase. 0N/A * Lets face it. This is going to be 0N/A * zero. Cut to the chase. 0N/A * Here, we do 'way too much work, but throwing away 0N/A * our partial results, and going and doing the whole 0N/A * thing as double, then throwing away half the bits that computes 0N/A * when we convert back to float. 0N/A * The alternative is to reproduce the whole multiple-precision 0N/A * algorithm for float precision, or to try to parameterize it 0N/A * for common usage. The former will take about 400 lines of code, 0N/A * and the latter I tried without success. Thus the semi-hack 0N/A * All the positive powers of 10 that can be 0N/A 1.0e1,
1.0e2,
1.0e3,
1.0e4,
1.0e5,
0N/A 1.0e6,
1.0e7,
1.0e8,
1.0e9,
1.0e10,
0N/A 1.0e11,
1.0e12,
1.0e13,
1.0e14,
1.0e15,
0N/A 1.0e16,
1.0e17,
1.0e18,
1.0e19,
1.0e20,
0N/A 1.0e1f,
1.0e2f,
1.0e3f,
1.0e4f,
1.0e5f,
0N/A 1.0e6f,
1.0e7f,
1.0e8f,
1.0e9f,
1.0e10f 0N/A 1e16,
1e32,
1e64,
1e128,
1e256 };
0N/A 1e-16,
1e-32,
1e-64,
1e-128,
1e-256 };
0N/A 5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5 0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A 5L*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5*
5,
0N/A // approximately ceil( log2( long5pow[i] ) ) 0N/A private static final char infinity[] = {
'I',
'n',
'f',
'i',
'n',
'i',
't',
'y' };
0N/A private static final char zero[] = {
'0',
'0',
'0',
'0',
'0',
'0',
'0',
'0' };
0N/A * Grammar is compatible with hexadecimal floating-point constants 0N/A * described in section 6.4.4.2 of the C99 specification. 0N/A "([-+])?0[xX](((\\p{XDigit}+)\\.?)|((\\p{XDigit}*)\\.(\\p{XDigit}+)))[pP]([-+])?(\\p{Digit}+)[fFdD]?" 0N/A * Convert string s to a suitable floating decimal; uses the 0N/A * double constructor and set the roundDir variable appropriately 0N/A * in case the value is later converted to a float. 0N/A // Verify string is a member of the hexadecimal floating-point 0N/A // Input does not match pattern 0N/A }
else {
// validInput 0N/A * We must isolate the sign, significand, and exponent 0N/A * fields. The sign value is straightforward. Since 0N/A * floating-point numbers are stored with a normalized 0N/A * representation, the significand and exponent are 0N/A * After extracting the sign, we normalized the 0N/A * significand as a hexadecimal value, calculating an 0N/A * exponent adjust for any shifts made during 0N/A * normalization. If the significand is zero, the 0N/A * exponent doesn't need to be examined since the output 0N/A * Next the exponent in the input string is extracted. 0N/A * Afterwards, the significand is normalized as a *binary* 0N/A * value and the input value's normalized exponent can be 0N/A * computed. The significand bits are copied into a 0N/A * double significand; if the string has more logical bits 0N/A * than can fit in a double, the extra bits affect the 0N/A * round and sticky bits which are used to round the final 0N/A // Extract significand sign 0N/A // Extract Significand magnitude 0N/A * Based on the form of the significand, calculate how the 0N/A * binary exponent needs to be adjusted to create a 0N/A * normalized *hexadecimal* floating-point number; that 0N/A * is, a number where there is one nonzero hex digit to 0N/A * the left of the (hexa)decimal point. Since we are 0N/A * adjusting a binary, not hexadecimal exponent, the 0N/A * exponent is adjusted by a multiple of 4. 0N/A * There are a number of significand scenarios to consider; 0N/A * letters are used in indicate nonzero digits: 0N/A * 1. 000xxxx => x.xxx normalized 0N/A * increase exponent by (number of x's - 1)*4 0N/A * 2. 000xxx.yyyy => x.xxyyyy normalized 0N/A * increase exponent by (number of x's - 1)*4 0N/A * 3. .000yyy => y.yy normalized 0N/A * decrease exponent by (number of zeros + 1)*4 0N/A * 4. 000.00000yyy => y.yy normalized 0N/A * decrease exponent by (number of zeros to right of point + 1)*4 0N/A * If the significand is exactly zero, return a properly 0N/A // left of "decimal" point 0N/A // (leading zeros stripped) 0N/A // "decimal" point; leading zeros 0N/A // must always be accounted for 0N/A * The significand is made up of either 0N/A * 1. group 4 entirely (integer portion only) 0N/A * 2. the fractional portion from group 7 plus any 0N/A * (optional) integer portions from group 6. 0N/A // Leading zeros never matter on the integer portion 0N/A // Group 6 is the optional integer; leading zeros 0N/A // never matter on the integer portion 0N/A // Turn "integer.fraction" into "integer"+"fraction" 0N/A * Adjust exponent as described above 0N/A }
else {
// Cases 3 and 4 0N/A // If the significand is zero, the exponent doesn't 0N/A // matter; return a properly signed zero. 0N/A * Use an int to read in the exponent value; this should 0N/A * provide more than sufficient range for non-contrived 0N/A * inputs. If reading the exponent in as an int does 0N/A * overflow, examine the sign of the exponent and 0N/A * significand to determine what to do. 0N/A // At this point, we know the exponent is 0N/A // syntactically well-formed as a sequence of 0N/A // digits. Therefore, if an NumberFormatException 0N/A // is thrown, it must be due to overflowing int's 0N/A // range. Also, at this point, we have already 0N/A // checked for a zero significand. Thus the signs 0N/A // of the exponent and significand determine the 0N/A // exponent + +infinity -infinity 0N/A // Calculate partially adjusted exponent 0N/A // Starting copying non-zero bits into proper position in 0N/A // a long; copy explicit bit too; this will be masked 0N/A // later for normal values. 0N/A // First iteration is different, since we only copy 0N/A // from the leading significand bit; one more exponent 0N/A // adjust will be needed... 0N/A // IMPORTANT: make leadingDigit a long to avoid 0N/A // surprising shift semantics! 0N/A * Left shift the leading digit (53 - (bit position of 0N/A * leading 1 in digit)); this sets the top bit of the 0N/A * significand to 1. The nextShift value is adjusted 0N/A * to take into account the number of bit positions of 0N/A * the leadingDigit actually used. Finally, the 0N/A * exponent is adjusted to normalize the significand 0N/A * as a binary value, not just a hex value. 0N/A /* exponent += 0 */ }
0N/A // The preceding if-else could be replaced by a single 0N/A // code block based on the high-order bit set in 0N/A // leadingDigit. Given leadingOnePosition, 0N/A // significand |= leadingDigit << (SIGNIFICAND_WIDTH - leadingOnePosition); 0N/A // nextShift = 52 - (3 + leadingOnePosition); 0N/A // exponent += (leadingOnePosition-1); 0N/A * Now the exponent variable is equal to the normalized 0N/A * binary exponent. Code below will make representation 0N/A * adjustments if the exponent is incremented after 0N/A * rounding (includes overflows to infinity) or if the 0N/A * result is subnormal. 0N/A // Copy digit into significand until the significand can't 0N/A // hold another full hex digit or there are no more input 0N/A // After the above loop, the bulk of the string is copied. 0N/A // Now, we must copy any partial hex digits into the 0N/A // significand AND compute the round bit and start computing 0N/A // from nextShift, figure out how many bits need 0N/A // to be copied, if any 0N/A // three bits need to be copied in; can 0N/A // two bits need to be copied in; can 0N/A // set round and start sticky 0N/A // one bit needs to be copied in 0N/A // Now set round and start sticky, if possible 0N/A // all bits copied into significand; set 0N/A // round and start sticky 0N/A // nonzeros in three low order bits? 0N/A // Round is set; sticky might be set. 0N/A // For the sticky bit, it suffices to check the 0N/A // current digit and test for any nonzero digits in 0N/A // the remaining unprocessed input. 0N/A // else all of string was seen, round and sticky are 0N/A // correct as false. 0N/A // Check for overflow and update exponent accordingly. 0N/A // overflow to properly signed infinity 0N/A }
else {
// Finite return value 0N/A // The result returned in this block cannot be a 0N/A // zero or subnormal; however after the 0N/A // significand is adjusted from rounding, we could 0N/A // still overflow in infinity. 0N/A // AND exponent bits into significand; if the 0N/A // significand is incremented and overflows from 0N/A // rounding, this combination will update the 0N/A // exponent correctly, even in the case of 0N/A // Double.MAX_VALUE overflowing to infinity. 0N/A }
else {
// Subnormal or zero 0N/A // (exponent < DoubleConsts.MIN_EXPONENT) 0N/A // No way to round back to nonzero value 0N/A // regardless of significand if the exponent is 0N/A }
else {
// -1075 <= exponent <= MIN_EXPONENT -1 = -1023 0N/A * Find bit position to round to; recompute 0N/A * round and sticky bits, and shift 0N/A * significand right appropriately. 0N/A // Number of bits of significand to preserve is 0N/A // exponent - abs_min_exp +1 0N/A // -1075 +1074 + 1 = 0 0N/A // -1023 +1074 + 1 = 52 0N/A // First, isolate the new round bit 0N/A // create mask to update sticky bits; low 0N/A // order bitsDiscarded bits should be 1 0N/A // Now, discard the bits 0N/A // The significand variable now contains the currently 0N/A // appropriate exponent bits too. 0N/A * Determine if significand should be incremented; 0N/A * making this determination depends on the least 0N/A * significant bit and the round and sticky bits. 0N/A * Round to nearest even rounding table, adapted from 0N/A * table 4.7 in "Computer Arithmetic" by IsraelKoren. 0N/A * The digit to the left of the "decimal" point is the 0N/A * least significant bit, the digits to the right of 0N/A * the point are the round and sticky bits 0N/A * x0.11 x1. = x0. +1 0N/A * Set roundingDir variable field of fd properly so 0N/A * that the input string can be properly rounded to a 0N/A * float value. There are two cases to consider: 0N/A * 1. rounding to double discards sticky bit 0N/A * information that would change the result of a float 0N/A * rounding (near halfway case between two floats) 0N/A * 2. rounding to double rounds up when rounding up 0N/A * would not occur when rounding to float. 0N/A * For former case only needs to be considered when 0N/A * the bits rounded away when casting to float are all 0N/A * zero; otherwise, float round bit is properly set 0N/A * and sticky will already be true. 0N/A * The lower exponent bound for the code below is the 0N/A * minimum (normalized) subnormal exponent - 1 since a 0N/A * value with that exponent can round up to the 0N/A * minimum subnormal value and the sticky bit 0N/A * information must be preserved (i.e. case 1). 0N/A // Outside above exponent range, the float value 0N/A // will be zero or infinity. 0N/A * If the low-order 28 bits of a rounded double 0N/A * significand are 0, the double could be a 0N/A * half-way case for a rounding to float. If the 0N/A * double value is a half-way case, the double 0N/A * significand may have to be modified to round 0N/A * the the right float value (see the stickyRound 0N/A * method). If the rounding to double has lost 0N/A * what would be float sticky bit information, the 0N/A * double significand must be incremented. If the 0N/A * double value's significand was itself 0N/A * incremented, the float value may end up too 0N/A * large so the increment should be undone. 0N/A // For negative values, the sign of the 0N/A // roundDir is the same as for positive values 0N/A // since adding 1 increasing the significand's 0N/A // magnitude and subtracting 1 decreases the 0N/A // significand's magnitude. If neither round 0N/A // nor sticky is true, the double value is 0N/A // exact and no adjustment is required for a 0N/A // proper float rounding. 0N/A // If round and sticky were both true, 0N/A // and the least significant 0N/A // significand bit were 0, the rounded 0N/A // significand would not have its 0N/A // low-order bits be zero. Therefore, 0N/A // we only need to adjust the 0N/A // significand if round XOR sticky is 0N/A else {
// prerounding lsb is 1 0N/A // If the prerounding lsb is 1 and the 0N/A // resulting significand has its 0N/A // low-order bits zero, the significand 0N/A // was incremented. Here, we undo the 0N/A // increment, which will ensure the 0N/A // right guard and sticky bits for the 0N/A * Return <code>s</code> with any leading zeros removed. 0N/A * Extract a hexadecimal digit from position <code>position</code> 0N/A * of string <code>s</code>. 0N/A * A really, really simple bigint package 0N/A * tailored to the needs of floating base conversion. 0N/A int data[];
// value: data[0] is least significant 0N/A int n= (
nd+
8)/
9;
// estimate size needed. 0N/A data =
new int[n];
// allocate enough space 0N/A * Left shift by c bits. 0N/A * Shifts this in place. 0N/A // special hack, since an anticount of 32 won't go! 0N/A // may have constructed high-order word of 0. 0N/A * normalize this number by shifting until 0N/A * the MSB of the number is at 0x08000000. 0N/A * This is in preparation for quoRemIteration, below. 0N/A * The idea is that, to make division easier, we want the 0N/A * divisor to be "normalized" -- usually this means shifting 0N/A * the MSB into the high words sign bit. But because we know that 0N/A * the quotient will be 0 < q < 10, we would like to arrange that 0N/A * the dividend not span up into another word of precision. 0N/A * (This needs to be explained more clearly!) 0N/A // oops. Value is zero. Cannot normalize it! 0N/A * In most cases, we assume that wordcount is zero. This only 0N/A * makes sense, as we try not to maintain any high-order 0N/A * words full of zeros. In fact, if there are zeros, we will 0N/A * simply SHORTEN our number at this point. Watch closely... 0N/A * Compute how far left we have to shift v s.t. its highest- 0N/A * order bit is in the right place. Then call lshiftMe to 0N/A if ( (v &
0xf0000000) !=
0 ){
0N/A // will have to shift up into the next word. 0N/A while ( v <=
0x000fffff ){
0N/A // hack: byte-at-a-time shifting 0N/A while ( v <=
0x07ffffff ){
0N/A * Multiply a FDBigInt by an int. 0N/A * Result is a new FDBigInt. 0N/A // guess adequate size of r. 0N/A p += v * ((
long)
data[i]&
0xffffffffL);
0N/A * Multiply a FDBigInt by an int and add another int. 0N/A * Result is computed in place. 0N/A // unroll 0th iteration, doing addition. 0N/A p = v * ((
long)
data[
0]&
0xffffffffL) + ((
long)
addend&
0xffffffffL);
0N/A p += v * ((
long)
data[i]&
0xffffffffL);
0N/A * Multiply a FDBigInt by another FDBigInt. 0N/A * Result is a new FDBigInt. 0N/A // crudely guess adequate size for r 0N/A // I think I am promised zeros... 0N/A long v = (
long)
this.
data[i] &
0xffffffffL;
// UNSIGNED CONVERSION 0N/A p += ((
long)r[i+j]&
0xffffffffL) + v*((
long)
other.
data[j]&
0xffffffffL);
// UNSIGNED CONVERSIONS ALL 'ROUND. 0N/A // compute how much of r we actually needed for all that. 0N/A * Add one FDBigInt to another. Return a FDBigInt 0N/A // arrange such that a.nWords >= b.nWords; 0N/A // n = a.nWords, m = b.nWords 0N/A int r[] =
new int[ n ];
0N/A for ( i =
0; i < n; i++ ){
0N/A c += (
long)a[i] &
0xffffffffL;
0N/A c += (
long)b[i] &
0xffffffffL;
0N/A c >>=
32;
// signed shift. 0N/A // oops -- carry out -- need longer result. 0N/A * Subtract one FDBigInt from another. Return a FDBigInt 0N/A * Assert that the result is positive. 0N/A for ( i =
0; i < n; i++ ){
0N/A c += (
long)
this.
data[i] &
0xffffffffL;
0N/A if ( ( r[i] = (
int) c ) ==
0 )
0N/A c >>=
32;
// signed shift 0N/A assert c ==
0L : c;
// borrow out of subtract 0N/A * Compare FDBigInt with another FDBigInt. Return an integer 0N/A // if any of my high-order words is non-zero, 0N/A // then the answer is evident 0N/A // if any of other's high-order words is non-zero, 0N/A // then the answer is evident 0N/A for ( ; i >
0 ; i-- )
0N/A // careful! want unsigned compare! 0N/A // use brute force here. 0N/A // a is really big, unsigned 0N/A return a-b;
// both big, negative 0N/A return 1;
// b not big, answer is obvious; 0N/A // a is not really big 0N/A // but b is really big 0N/A * q = (int)( this / S ) 0N/A * this = 10 * ( this mod S ) 0N/A * This is the iteration step of digit development for output. 0N/A * We assume that S has been normalized, as above, and that 0N/A * "this" has been lshift'ed accordingly. 0N/A * Also assume, of course, that the result, q, can be expressed 0N/A * as an integer, 0 <= q < 10. 0N/A // ensure that this and S have the same number of 0N/A // digits. If S is properly normalized and q < 10 then 0N/A // estimate q the obvious way. We will usually be 0N/A // right. If not, then we're only off by a little and 0N/A long q = ((
long)
data[n]&
0xffffffffL) / (
long)S.
data[n];
0N/A for (
int i =
0; i <= n ; i++ ){
0N/A // damn, damn, damn. q is too big. 0N/A // add S back in until this turns +. This should 0N/A // not be very many times! 0N/A for (
int i =
0; i <= n; i++ ){
0N/A sum >>=
32;
// Signed or unsigned, answer is 0 or 1 0N/A * Originally the following line read 0N/A * "if ( sum !=0 && sum != -1 )" 0N/A * but that would be wrong, because of the 0N/A * treatment of the two values as entirely unsigned, 0N/A * it would be impossible for a carry-out to be interpreted 0N/A * as -1 -- it would have to be a single-bit carry-out, or 0N/A assert sum ==
0 ||
sum ==
1 :
sum;
// carry out of division correction 0N/A // finally, we can multiply this by 10. 0N/A // it cannot overflow, right, as the high-order word has 0N/A // at least 4 high-order zeros! 0N/A for (
int i =
0; i <= n; i++ ){
0N/A p +=
10*((
long)
data[i]&
0xffffffffL);
0N/A p >>=
32;
// SIGNED shift. 0N/A assert p ==
0L : p;
// Carry out of *10 0N/A // if this can be represented as a long, return the value 0N/A return ((
long)
data[
0]&
0xffffffffL);
0N/A assert data[
1] >=
0;
// value too big 0N/A return ((
long)(
data[
1]) <<
32) | ((
long)
data[
0]&
0xffffffffL);
0N/A for( ; i>
0 ; i-- ){