/*
 * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

/*
 * Cleaned-up and optimized version of MD5, based on the reference
 * implementation provided in RFC 1321.  See RSA Copyright information
 * below.
 */

#pragma ident	"%Z%%M%	%I%	%E% SMI"

/*
 * MD5C.C - RSA Data Security, Inc., MD5 message-digest algorithm
 *
 * Copyright (C) 1991-2, RSA Data Security, Inc. Created 1991. All
 * rights reserved.
 *
 * License to copy and use this software is granted provided that it
 * is identified as the "RSA Data Security, Inc. MD5 Message-Digest
 * Algorithm" in all material mentioning or referencing this software
 * or this function.
 *
 * License is also granted to make and use derivative works provided
 * that such works are identified as "derived from the RSA Data
 * Security, Inc. MD5 Message-Digest Algorithm" in all material
 * mentioning or referencing the derived work.
 *
 * RSA Data Security, Inc. makes no representations concerning either
 * the merchantability of this software or the suitability of this
 * software for any particular purpose. It is provided "as is"
 * without express or implied warranty of any kind.
 *
 * These notices must be retained in any copies of any part of this
 * documentation and/or software.
 */

#endif	/* !_KERNEL || _BOOT */

/*
 * F, G, H and I are the basic MD5 functions.
 */
#define	F(b, c, d)	(((b) & (c)) | ((~b) & (d)))
#define	G(b, c, d)	(((b) & (d)) | ((c) & (~d)))
#define	H(b, c, d)	((b) ^ (c) ^ (d))
#define	I(b, c, d)	((c) ^ ((b) | (~d)))
/*
 * ROTATE_LEFT rotates x left n bits.
 */
#define	ROTATE_LEFT(x, n)	\
	(((x) << (n)) | ((x) >> ((sizeof (x) << 3) - (n))))
/*
 * FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4.
 * Rotation is separate from addition to prevent recomputation.
 */
#define	FF(a, b, c, d, x, s, ac) { \
	(a) += F((b), (c), (d)) + (x) + ((unsigned long long)(ac)); \
	(a) = ROTATE_LEFT((a), (s)); \
	(a) += (b); \
	}
#define	GG(a, b, c, d, x, s, ac) { \
	(a) += G((b), (c), (d)) + (x) + ((unsigned long long)(ac)); \
	(a) = ROTATE_LEFT((a), (s)); \
	(a) += (b); \
	}
#define	HH(a, b, c, d, x, s, ac) { \
	(a) += H((b), (c), (d)) + (x) + ((unsigned long long)(ac)); \
	(a) = ROTATE_LEFT((a), (s)); \
	(a) += (b); \
	}
#define	II(a, b, c, d, x, s, ac) { \
	(a) += I((b), (c), (d)) + (x) + ((unsigned long long)(ac)); \
	(a) = ROTATE_LEFT((a), (s)); \
	(a) += (b); \
	}
/*
 * Loading 32-bit constants on a RISC is expensive since it involves both a
 * `sethi' and an `or'.  thus, we instead have the compiler generate `ld's to
 * load the constants from an array called `md5_consts'.  however, on intel
 * (and other CISC processors), it is cheaper to load the constant
 * directly.  thus, the c code in MD5Transform() uses the macro MD5_CONST()
 * which either expands to a constant or an array reference, depending on the
 * architecture the code is being compiled for.
 *
 * Right now, i386 and amd64 are the CISC exceptions.
 * If we get another CISC ISA, we'll have to change the ifdef.
 */

/*
 * while it is somewhat counter-intuitive, on sparc (and presumably other RISC
 * machines), it is more efficient to place all the constants used in this
 * function in an array and load the values out of the array than to manually
 * load the constants.  this is because setting a register to a 32-bit value
 * takes two ops in most cases: a `sethi' and an `or', but loading a 32-bit
 * value from memory only takes one `ld' (or `lduw' on v9).  while this
 * increases memory usage, the compiler can find enough other things to do
 * while waiting so that the pipeline does not stall.  additionally, it is
 * likely that many of these constants are cached so that later accesses do
 * not even go out to the bus.
 *
 * this array is declared `static' to keep the compiler from having to
 * bcopy() this array onto the stack frame of MD5Transform() each time it is
 * called -- which is unacceptably expensive.
 *
 * the `const' is to ensure that callers are good citizens and do not try to
 * munge the array.  since these routines are going to be called from inside
 * multithreaded kernelland, this is a good safety check -- `constants' will
 * end up in .rodata.
 */
/*
 * unfortunately, loading from an array in this manner hurts performance under
 * intel (and presumably other CISC machines).  so, there is a macro,
 * MD5_CONST(), used in MD5Transform(), that either expands to a reference to
 * this array, or to the actual constant, depending on what platform this code
 * is being compiled for.
 */

/*
 * Going to load these consts in 8B chunks, so need to enforce 8B alignment
 */

	/*
	 * To reduce the number of loads, load consts in 64-bit
	 * chunks and then split.
	 *
	 * No need to mask upper 32-bits, as just interested in
	 * low 32-bits (saves an & operation and means that this
	 * optimization doesn't increase the icount).
	 */

/*
 * MD5Init()
 *
 * purpose: initializes the md5 context and begins an md5 digest operation
 *   input: MD5_CTX *	: the context to initialize.
 */

	/* load magic initialization constants */

/*
 * MD5Update()
 *
 * purpose: continues an md5 digest operation, using the message block
 *          to update the context.
 *   input: MD5_CTX *	: the context to update
 *          uint8_t *	: the message block
 *          uint32_t	: the length of the message block in bytes
 *
 * MD5 crunches in 64-byte blocks.  All numeric constants here are related to
 * that property of MD5.
 */

	const unsigned char *input = (const unsigned char *)inpp;
	/* compute (number of bytes computed so far) mod 64 */

	/* update number of bits hashed into this MD5 computation so far */

	/* transform as many times as possible */

		/*
		 * only do initial bcopy() and MD5Transform() if
		 * buf_index != 0.  if buf_index == 0, we're just
		 * wasting our time doing the bcopy() since there
		 * wasn't any data left over from a previous call to
		 * MD5Update().
		 */

		/*
		 * For N1 use %asi register.  However, costly to repeatedly set
		 * in MD5Transform.  Therefore, set once here.
		 * Should probably restore the old value afterwards...
		 */

		/*
		 * if i and input_len are the same, return now instead
		 * of calling bcopy(), since the bcopy() in this
		 * case will be an expensive nop.
		 */

	/* buffer remaining input */

/*
 * MD5Final()
 *
 * purpose: ends an md5 digest operation, finalizing the message digest and
 *          zeroing the context.
 *   input: uint8_t *	: a buffer to store the digest in
 *          MD5_CTX *	: the context to finalize, save, and zero
 */

	/* store bit count, little endian */

	/* pad out to 56 mod 64 */

	/* append length (before padding) */

	/* store state in digest */

	/* zeroize sensitive information */

/*
 * sparc register window optimization:
 *
 * `a', `b', `c', and `d' are passed into MD5Transform explicitly
 * since it increases the number of registers available to the
 * compiler.  under this scheme, these variables can be held in
 * %i0 - %i3, which leaves more local and out registers available.
 *
 * MD5Transform()
 *
 * purpose: md5 transformation -- updates the digest based on `block'
 *   input: uint32_t	: bytes  1 -  4 of the digest
 *          uint32_t	: bytes  5 -  8 of the digest
 *          uint32_t	: bytes  9 - 12 of the digest
 *          uint32_t	: bytes 13 - 16 of the digest
 *          MD5_CTX *	: the context to update
 *          uint8_t [64]: the block to use to update the digest
 */

	/*
	 * use individual integers instead of using an array.  this is a
	 * win, although the amount it wins by seems to vary quite a bit.
	 */

	/* LINTED E_BAD_PTR_CAST_ALIGN */

	/*
	 * the compiler (at least SC4.2/5.x) generates better code if
	 * variable use is localized.  in this case, swapping the integers in
	 * this order allows `x_0' to be swapped nearest to its first use in
	 * FF(), and likewise for `x_1' and up.  note that the compiler
	 * prefers this to doing each swap right before the FF() that
	 * consumes it.
	 */

	/*
	 * if `block' is already aligned on a 4-byte boundary, use the
	 * optimized load_little_32() directly.  otherwise, bcopy()
	 * into a buffer that *is* aligned on a 4-byte boundary and
	 * then do the load_little_32() on that buffer.  benchmarks
	 * have shown that using the bcopy() is better than loading
	 * the bytes individually and doing the endian-swap by hand.
	 *
	 * even though it's quite tempting to do:
	 *   blk = bcopy(blk, ctx->buf_un.buf32, sizeof (ctx->buf_un.buf32));
	 * and only have one set of LOAD_LITTLE_32()'s, the compiler (at least
	 * SC4.2/5.x) *does not* like that, so please resist the urge.
	 */

	/* LINTED E_BAD_PTR_CAST_ALIGN */

	/*
	 * zeroize sensitive information -- compiler will optimize
	 * this out if everything is kept in registers
	 */

/*
 * Encode()
 *
 * purpose: to convert a list of numbers from big endian to little endian
 *   input: uint8_t *	: place to store the converted little endian numbers
 *          uint32_t *	: place to get numbers to convert from
 *          size_t	: the length of the input in bytes
 */

		/* LINTED E_BAD_PTR_CAST_ALIGN */

#endif	/* _MD5_CHECK_ALIGNMENT */
#else	/* big endian -- will work on little endian, but slowly */