# ====================================================================
# Written by Andy Polyakov <appro@fy.chalmers.se> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# Hardware SPARC T4 support by David S. Miller <davem@davemloft.net>.
# ====================================================================

# Performance improvement is not really impressive on pre-T1 CPU: +8%
# over Sun C and +25% over gcc [3.3]. While on T1, a.k.a. Niagara, it
# turned out to be 40% faster than 64-bit code generated by Sun C 5.8
# and >2x faster than 64-bit code generated by gcc 3.4. And there is a
# gimmick. X[16] vector is packed to 8 64-bit registers and as a result
# nothing is spilled on stack. In addition input data is loaded in
# compact instruction sequence, thus minimizing the window when the
# code is subject to [inter-thread] cache-thrashing hazard. The goal is
# to ensure scalability on UltraSPARC T1, or rather to avoid decay when
# the number of active threads exceeds the number of physical cores.

# SPARC T4 SHA1 hardware achieves 3.72 cycles per byte, which is 3.1x
# faster than software. Multi-process benchmark saturates at 11x
# single-process result on 8-core processor, or ~9GBps per 2.85GHz

@X=("%o0","%o1","%o2","%o3","%o4","%o5","%g1","%o7");
	"	srlx	@X[(($i+1)/2)%8],32,$Xi\n";
	xor	@X[($j+1)%8],@X[$j%8],@X[$j%8]
	xor	@X[($j+4)%8],@X[$j%8],@X[$j%8]
.asciz	"SHA1 block transform for SPARCv9, CRYPTOGAMS by <appro\@openssl.org>"

# Purpose of these subroutines is to explicitly encode VIS instructions,
# so that one can compile the module without having to specify VIS
# extensions on compiler command line, e.g. -xarch=v9 vs. -xarch=v9a.
# The idea is to reserve the option to produce a "universal" binary and
# let the programmer detect at run-time if the current CPU is VIS capable.
	# re-encode for upper double register addressing
	return sprintf ".word\t0x%08x !%s",
my %bias = ( "g" => 0, "o" => 8, "l" => 16, "i" => 24 );
	return sprintf ".word\t0x%08x !%s",
	s/\b(f[^\s]*)\s+(%f[0-9]{1,2}),\s*(%f[0-9]{1,2}),\s*(%f[0-9]{1,2})/