NEX-16819 loader UEFI support
Includes work by Toomas Soome <tsoome@me.com>
Upstream commits:
loader: pxe receive cleanup
9475 libefi: Do not return only if ReceiveFilter
installboot: should support efi system partition
8931 boot1.efi: scan all display modes rather than
loader: spinconsole updates
loader: gfx experiment to try GOP Blt() function.
sha1 build test
loader: add sha1 hash calculation
common/sha1: update for loader build
loader: biosdisk rework
uts: 32-bit kernel FB needs mapping in low memory
uts: add diag-device
uts: boot console mirror with diag-device
uts: enable very early console on ttya
kmdb: add diag-device as input/output device
uts: test VGA memory exclusion from mapping
uts: clear boot mapping and protect boot pages test
uts: add dboot map debug printf
uts: need to release FB pages in release_bootstrap()
uts: add screenmap ioctl
uts: update sys/queue.h
loader: add illumos uts/common to include path
loader: tem/gfx font cleanup
loader: vbe checks
uts: gfx_private set KD_TEXT when KD_RESETTEXT is
uts: gfx 8-bit update
loader: gfx 8-bit fix
loader: always set media size from partition.
uts: MB2 support for 32-bit kernel
loader: x86 should have tem 80x25
uts: x86 should have tem 80x25
uts: font update
loader: font update
uts: tem attributes
loader: tem.c comment added
uts: use font module
loader: add font module
loader: build rules for new font setup
uts: gfx_private update for new font structure
uts: early boot update for new font structure
uts: font update
uts: font build rules update for new fonts
uts: tem update to new font structure
loader: module.c needs to include tem_impl.h
uts: gfx_private 8x16 font rework
uts: make font_lookup public
loader: font rework
uts: font rework
9259 libefi: efi_alloc_and_read should check for PMBR
uts: tem utf-8 support
loader: implement tem utf-8 support
loader: tem should be able to display UTF-8
7784 uts: console input should support utf-8
7796 uts: ldterm default to utf-8
uts: do not reset serial console
uts: set up colors even if tem is not console
uts: add type for early boot properties
uts: gfx_private experiment with drm and vga
uts: gfx_private should use setmode drm callback.
uts: identify FB types and set up gfx_private based
loader: replace gop and vesa with framebuffer
uts: boot needs simple tem to support mdb
uts: boot_keyboard should emit esc sequences for
uts: gfx_private FB should be written by line
kmdb: set terminal window size
uts: gfx_private needs to keep track of early boot FB
pnglite: move pnglite to usr/src/common
loader: gfx_fb
ficl-sys: add gfx primitives
loader: add illumos.png logo
ficl: add fb-putimage
loader: add png support
loader: add alpha blending for gfx_fb
loader: use term-drawrect for menu frame
ficl: add simple gfx words
uts: provide fb_info via fbgattr dev_specific array.
uts: gfx_private add alpha blending
uts: update sys/ascii.h
uts: tem OSC support (incomplete)
uts: implement env module support and use data from
uts: tem get colors from early boot data
loader: use crc32 from libstand (libz)
loader: optimize for size
loader: pass tem info to the environment
loader: import tem for loader console
loader: UEFI loader needs to set ISADIR based on
loader: need UEFI32 support
8918 loader.efi: add vesa edid support
uts: tem_safe_pix_clear_prom_output() should only
uts: tem_safe_pix_clear_entire_screen() should use
uts: tem_safe_check_first_time() should query cursor
uts: tem implement cls callback & visual_io v4
uts: gfx_vgatext use block cursor for vgatext
uts: gfx_private implement cls callback & visual_io
uts: gfx_private bitmap framebuffer implementation
uts: early start frame buffer console support
uts: font functions should check the input char
uts: font rendering should support 16/24/32bit depths
uts: use smallest font as fallback default.
uts: update terminal dimensions based on selected
7834 uts: vgatext should use gfx_private
uts: add spacing property to 8859-1.bdf
terminfo: add underline for sun-color
terminfo: sun-color has 16 colors
uts: add font load callback type
loader: do not repeat int13 calls with error 0x20 and
8905 loader: add skein/edonr support
8904 common/crypto: make skein and edonr loader
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
Revert "NEX-16819 loader UEFI support"
This reverts commit ec06b9fc617b99234e538bf2e7e4d02a24993e0c.
Reverting due to failures in the zfs-tests and the sharefs-tests
NEX-16819 loader UEFI support
Includes work by Toomas Soome <tsoome@me.com>
Upstream commits:
loader: pxe receive cleanup
9475 libefi: Do not return only if ReceiveFilter
installboot: should support efi system partition
8931 boot1.efi: scan all display modes rather than
loader: spinconsole updates
loader: gfx experiment to try GOP Blt() function.
sha1 build test
loader: add sha1 hash calculation
common/sha1: update for loader build
loader: biosdisk rework
uts: 32-bit kernel FB needs mapping in low memory
uts: add diag-device
uts: boot console mirror with diag-device
uts: enable very early console on ttya
kmdb: add diag-device as input/output device
uts: test VGA memory exclusion from mapping
uts: clear boot mapping and protect boot pages test
uts: add dboot map debug printf
uts: need to release FB pages in release_bootstrap()
uts: add screenmap ioctl
uts: update sys/queue.h
loader: add illumos uts/common to include path
loader: tem/gfx font cleanup
loader: vbe checks
uts: gfx_private set KD_TEXT when KD_RESETTEXT is
uts: gfx 8-bit update
loader: gfx 8-bit fix
loader: always set media size from partition.
uts: MB2 support for 32-bit kernel
loader: x86 should have tem 80x25
uts: x86 should have tem 80x25
uts: font update
loader: font update
uts: tem attributes
loader: tem.c comment added
uts: use font module
loader: add font module
loader: build rules for new font setup
uts: gfx_private update for new font structure
uts: early boot update for new font structure
uts: font update
uts: font build rules update for new fonts
uts: tem update to new font structure
loader: module.c needs to include tem_impl.h
uts: gfx_private 8x16 font rework
uts: make font_lookup public
loader: font rework
uts: font rework
libefi: efi_alloc_and_read should check for PMBR
uts: tem utf-8 support
loader: implement tem utf-8 support
loader: tem should be able to display UTF-8
7784 uts: console input should support utf-8
7796 uts: ldterm default to utf-8
uts: do not reset serial console
uts: set up colors even if tem is not console
uts: add type for early boot properties
uts: gfx_private experiment with drm and vga
uts: gfx_private should use setmode drm callback.
uts: identify FB types and set up gfx_private based
loader: replace gop and vesa with framebuffer
uts: boot needs simple tem to support mdb
uts: boot_keyboard should emit esc sequences for
uts: gfx_private FB should be written by line
kmdb: set terminal window size
uts: gfx_private needs to keep track of early boot FB
pnglite: move pnglite to usr/src/common
loader: gfx_fb
ficl-sys: add gfx primitives
loader: add illumos.png logo
ficl: add fb-putimage
loader: add png support
loader: add alpha blending for gfx_fb
loader: use term-drawrect for menu frame
ficl: add simple gfx words
uts: provide fb_info via fbgattr dev_specific array.
uts: gfx_private add alpha blending
uts: update sys/ascii.h
uts: tem OSC support (incomplete)
uts: implement env module support and use data from
uts: tem get colors from early boot data
loader: use crc32 from libstand (libz)
loader: optimize for size
loader: pass tem info to the environment
loader: import tem for loader console
loader: UEFI loader needs to set ISADIR based on
loader: need UEFI32 support
8918 loader.efi: add vesa edid support
uts: tem_safe_pix_clear_prom_output() should only
uts: tem_safe_pix_clear_entire_screen() should use
uts: tem_safe_check_first_time() should query cursor
uts: tem implement cls callback & visual_io v4
uts: gfx_vgatext use block cursor for vgatext
uts: gfx_private implement cls callback & visual_io
uts: gfx_private bitmap framebuffer implementation
uts: early start frame buffer console support
uts: font functions should check the input char
uts: font rendering should support 16/24/32bit depths
uts: use smallest font as fallback default.
uts: update terminal dimensions based on selected
7834 uts: vgatext should use gfx_private
uts: add spacing property to 8859-1.bdf
terminfo: add underline for sun-color
terminfo: sun-color has 16 colors
uts: add font load callback type
loader: do not repeat int13 calls with error 0x20 and
8905 loader: add skein/edonr support
8904 common/crypto: make skein and edonr loader
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
--- old/usr/src/common/crypto/sha1/sha1.c
+++ new/usr/src/common/crypto/sha1/sha1.c
1 1 /*
2 2 * Copyright 2009 Sun Microsystems, Inc. All rights reserved.
3 3 * Use is subject to license terms.
4 4 */
5 5
6 6 /*
7 7 * The basic framework for this code came from the reference
8 8 * implementation for MD5. That implementation is Copyright (C)
9 9 * 1991-2, RSA Data Security, Inc. Created 1991. All rights reserved.
10 10 *
11 11 * License to copy and use this software is granted provided that it
12 12 * is identified as the "RSA Data Security, Inc. MD5 Message-Digest
13 13 * Algorithm" in all material mentioning or referencing this software
14 14 * or this function.
15 15 *
16 16 * License is also granted to make and use derivative works provided
17 17 * that such works are identified as "derived from the RSA Data
18 18 * Security, Inc. MD5 Message-Digest Algorithm" in all material
19 19 * mentioning or referencing the derived work.
20 20 *
21 21 * RSA Data Security, Inc. makes no representations concerning either
22 22 * the merchantability of this software or the suitability of this
23 23 * software for any particular purpose. It is provided "as is"
24 24 * without express or implied warranty of any kind.
25 25 *
26 26 * These notices must be retained in any copies of any part of this
27 27 * documentation and/or software.
28 28 *
29 29 * NOTE: Cleaned-up and optimized version of SHA1, based on the FIPS 180-1
30 30 * standard, available at http://www.itl.nist.gov/fipspubs/fip180-1.htm
31 31 * Not as fast as one would like -- further optimizations are encouraged
32 32 * and appreciated.
33 33 */
34 34
35 +#if defined(_STANDALONE)
36 +#include <sys/cdefs.h>
37 +#define _RESTRICT_KYWD restrict
38 +#else
35 39 #if !defined(_KERNEL) && !defined(_BOOT)
36 40 #include <stdint.h>
37 41 #include <strings.h>
38 42 #include <stdlib.h>
39 43 #include <errno.h>
40 44 #include <sys/systeminfo.h>
41 45 #endif /* !_KERNEL && !_BOOT */
46 +#endif /* _STANDALONE */
42 47
43 48 #include <sys/types.h>
44 49 #include <sys/param.h>
45 50 #include <sys/systm.h>
46 51 #include <sys/sysmacros.h>
47 52 #include <sys/sha1.h>
48 53 #include <sys/sha1_consts.h>
49 54
55 +#if defined(_STANDALONE)
56 +#include <sys/endian.h>
57 +#define HAVE_HTONL
58 +#if _BYTE_ORDER == _LITTLE_ENDIAN
59 +#undef _BIG_ENDIAN
60 +#else
61 +#undef _LITTLE_ENDIAN
62 +#endif
63 +#else
50 64 #ifdef _LITTLE_ENDIAN
51 65 #include <sys/byteorder.h>
52 66 #define HAVE_HTONL
53 67 #endif
68 +#endif
54 69
55 70 #ifdef _BOOT
56 71 #define bcopy(_s, _d, _l) ((void) memcpy((_d), (_s), (_l)))
57 72 #define bzero(_m, _l) ((void) memset((_m), 0, (_l)))
58 73 #endif
59 74
60 75 static void Encode(uint8_t *, const uint32_t *, size_t);
61 76
62 77 #if defined(__sparc)
63 78
64 79 #define SHA1_TRANSFORM(ctx, in) \
65 80 SHA1Transform((ctx)->state[0], (ctx)->state[1], (ctx)->state[2], \
66 81 (ctx)->state[3], (ctx)->state[4], (ctx), (in))
67 82
68 83 static void SHA1Transform(uint32_t, uint32_t, uint32_t, uint32_t, uint32_t,
69 84 SHA1_CTX *, const uint8_t *);
70 85
71 -#elif defined(__amd64)
86 +#elif defined(__amd64) && !defined(_STANDALONE)
72 87
73 88 #define SHA1_TRANSFORM(ctx, in) sha1_block_data_order((ctx), (in), 1)
74 89 #define SHA1_TRANSFORM_BLOCKS(ctx, in, num) sha1_block_data_order((ctx), \
75 90 (in), (num))
76 91
77 92 void sha1_block_data_order(SHA1_CTX *ctx, const void *inpp, size_t num_blocks);
78 93
79 94 #else
80 95
81 96 #define SHA1_TRANSFORM(ctx, in) SHA1Transform((ctx), (in))
82 97
83 98 static void SHA1Transform(SHA1_CTX *, const uint8_t *);
84 99
85 100 #endif
86 101
87 102
88 103 static uint8_t PADDING[64] = { 0x80, /* all zeros */ };
89 104
90 105 /*
91 106 * F, G, and H are the basic SHA1 functions.
92 107 */
93 108 #define F(b, c, d) (((b) & (c)) | ((~b) & (d)))
94 109 #define G(b, c, d) ((b) ^ (c) ^ (d))
95 110 #define H(b, c, d) (((b) & (c)) | (((b)|(c)) & (d)))
96 111
97 112 /*
98 113 * ROTATE_LEFT rotates x left n bits.
99 114 */
100 115
101 116 #if defined(__GNUC__) && defined(_LP64)
102 117 static __inline__ uint64_t
103 118 ROTATE_LEFT(uint64_t value, uint32_t n)
104 119 {
105 120 uint32_t t32;
106 121
107 122 t32 = (uint32_t)value;
108 123 return ((t32 << n) | (t32 >> (32 - n)));
109 124 }
110 125
111 126 #else
112 127
113 128 #define ROTATE_LEFT(x, n) \
114 129 (((x) << (n)) | ((x) >> ((sizeof (x) * NBBY)-(n))))
115 130
116 131 #endif
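For reference, a sketch of a 32-bit rotate that does not depend on `NBBY` or on the width of the argument type, as the macro form above does. `rotl32` is a hypothetical name, not part of sha1.c, and the range check is an added assumption so that neither shift count is 0 or 32.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: rotate a 32-bit word left by n bits.
 * SHA-1 only uses n = 1, 5 and 30, so n is restricted to 1..31 here,
 * which keeps both shift counts inside the defined range for uint32_t.
 */
static uint32_t
rotl32(uint32_t x, unsigned int n)
{
	assert(n >= 1 && n <= 31);
	return ((x << n) | (x >> (32 - n)));
}
```

The macro variant in the listing relies on `x` being exactly 32 bits wide; with a wider argument the right shift would use the wrong distance, which is presumably why the `_LP64` GCC path truncates to `uint32_t` first.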
117 132
118 133
119 134 /*
120 135 * SHA1Init()
121 136 *
122 137 * purpose: initializes the sha1 context and begins an sha1 digest operation
123 138 * input: SHA1_CTX * : the context to initialize.
124 139 * output: void
125 140 */
126 141
127 142 void
128 143 SHA1Init(SHA1_CTX *ctx)
129 144 {
130 145 ctx->count[0] = ctx->count[1] = 0;
131 146
132 147 /*
133 148 * load magic initialization constants. Tell lint
134 149 * that these constants are unsigned by using U.
135 150 */
136 151
137 152 ctx->state[0] = 0x67452301U;
138 153 ctx->state[1] = 0xefcdab89U;
139 154 ctx->state[2] = 0x98badcfeU;
140 155 ctx->state[3] = 0x10325476U;
141 156 ctx->state[4] = 0xc3d2e1f0U;
142 157 }
143 158
144 159 #ifdef VIS_SHA1
145 160 #ifdef _KERNEL
146 161
147 162 #include <sys/regset.h>
148 163 #include <sys/vis.h>
149 164 #include <sys/fpu/fpusystm.h>
150 165
151 166 /* the alignment for block stores to save fp registers */
152 167 #define VIS_ALIGN (64)
153 168
154 169 extern int sha1_savefp(kfpu_t *, int);
155 170 extern void sha1_restorefp(kfpu_t *);
156 171
157 172 uint32_t vis_sha1_svfp_threshold = 128;
158 173
159 174 #endif /* _KERNEL */
160 175
161 176 /*
162 177 * VIS SHA-1 consts.
163 178 */
164 179 static uint64_t VIS[] = {
165 180 0x8000000080000000ULL,
166 181 0x0002000200020002ULL,
167 182 0x5a8279996ed9eba1ULL,
168 183 0x8f1bbcdcca62c1d6ULL,
169 184 0x012389ab456789abULL};
170 185
171 186 extern void SHA1TransformVIS(uint64_t *, uint32_t *, uint32_t *, uint64_t *);
172 187
173 188
174 189 /*
175 190 * SHA1Update()
176 191 *
177 192 * purpose: continues an sha1 digest operation, using the message block
178 193 * to update the context.
179 194 * input: SHA1_CTX * : the context to update
180 195 * void * : the message block
181 196 * size_t : the length of the message block in bytes
182 197 * output: void
183 198 */
184 199
185 200 void
186 201 SHA1Update(SHA1_CTX *ctx, const void *inptr, size_t input_len)
187 202 {
188 203 uint32_t i, buf_index, buf_len;
189 204 uint64_t X0[40], input64[8];
190 205 const uint8_t *input = inptr;
191 206 #ifdef _KERNEL
192 207 int usevis = 0;
193 208 #else
194 209 int usevis = 1;
195 210 #endif /* _KERNEL */
196 211
197 212 /* check for noop */
198 213 if (input_len == 0)
199 214 return;
200 215
201 216 /* compute number of bytes mod 64 */
202 217 buf_index = (ctx->count[1] >> 3) & 0x3F;
203 218
204 219 /* update number of bits */
205 220 if ((ctx->count[1] += (input_len << 3)) < (input_len << 3))
206 221 ctx->count[0]++;
207 222
208 223 ctx->count[0] += (input_len >> 29);
209 224
210 225 buf_len = 64 - buf_index;
211 226
212 227 /* transform as many times as possible */
213 228 i = 0;
214 229 if (input_len >= buf_len) {
215 230 #ifdef _KERNEL
216 231 kfpu_t *fpu;
217 232 if (fpu_exists) {
218 233 uint8_t fpua[sizeof (kfpu_t) + GSR_SIZE + VIS_ALIGN];
219 234 uint32_t len = (input_len + buf_index) & ~0x3f;
220 235 int svfp_ok;
221 236
222 237 fpu = (kfpu_t *)P2ROUNDUP((uintptr_t)fpua, 64);
223 238 svfp_ok = ((len >= vis_sha1_svfp_threshold) ? 1 : 0);
224 239 usevis = fpu_exists && sha1_savefp(fpu, svfp_ok);
225 240 } else {
226 241 usevis = 0;
227 242 }
228 243 #endif /* _KERNEL */
229 244
230 245 /*
231 246 * general optimization:
232 247 *
233 248 * only do initial bcopy() and SHA1Transform() if
234 249 * buf_index != 0. if buf_index == 0, we're just
235 250 * wasting our time doing the bcopy() since there
236 251 * wasn't any data left over from a previous call to
237 252 * SHA1Update().
238 253 */
239 254
240 255 if (buf_index) {
241 256 bcopy(input, &ctx->buf_un.buf8[buf_index], buf_len);
242 257 if (usevis) {
243 258 SHA1TransformVIS(X0,
244 259 ctx->buf_un.buf32,
245 260 &ctx->state[0], VIS);
246 261 } else {
247 262 SHA1_TRANSFORM(ctx, ctx->buf_un.buf8);
248 263 }
249 264 i = buf_len;
250 265 }
251 266
252 267 /*
253 268 * VIS SHA-1: uses the VIS 1.0 instructions to accelerate
254 269 * SHA-1 processing. This is achieved by "offloading" the
255 270 * computation of the message schedule (MS) to the VIS units.
256 271 * This allows the VIS computation of the message schedule
257 272 * to be performed in parallel with the standard integer
258 273 * processing of the remainder of the SHA-1 computation. This improves
259 274 * performance by up to around 1.37X, compared to an optimized
260 275 * integer-only implementation.
261 276 *
262 277 * The VIS implementation of SHA1Transform has a different API
263 278 * to the standard integer version:
264 279 *
265 280 * void SHA1TransformVIS(
266 281 * uint64_t *, // Pointer to MS for ith block
267 282 * uint32_t *, // Pointer to ith block of message data
268 283 * uint32_t *, // Pointer to SHA state i.e ctx->state
269 284 * uint64_t *, // Pointer to various VIS constants
270 285 * )
271 286 *
272 287 * Note: the message data must be 4-byte aligned.
273 288 *
274 289 * Function requires VIS 1.0 support.
275 290 *
276 291 * Handling is provided to deal with arbitrary byte alignment
277 292 * of the input data but the performance gains are reduced
278 293 * for alignments other than 4-bytes.
279 294 */
280 295 if (usevis) {
281 296 if (!IS_P2ALIGNED(&input[i], sizeof (uint32_t))) {
282 297 /*
283 298 * Main processing loop - input misaligned
284 299 */
285 300 for (; i + 63 < input_len; i += 64) {
286 301 bcopy(&input[i], input64, 64);
287 302 SHA1TransformVIS(X0,
288 303 (uint32_t *)input64,
289 304 &ctx->state[0], VIS);
290 305 }
291 306 } else {
292 307 /*
293 308 * Main processing loop - input 4-byte aligned
294 309 */
295 310 for (; i + 63 < input_len; i += 64) {
296 311 SHA1TransformVIS(X0,
297 312 /* LINTED E_BAD_PTR_CAST_ALIGN */
298 313 (uint32_t *)&input[i], /* CSTYLED */
299 314 &ctx->state[0], VIS);
300 315 }
301 316
302 317 }
303 318 #ifdef _KERNEL
304 319 sha1_restorefp(fpu);
305 320 #endif /* _KERNEL */
306 321 } else {
307 322 for (; i + 63 < input_len; i += 64) {
308 323 SHA1_TRANSFORM(ctx, &input[i]);
309 324 }
310 325 }
311 326
312 327 /*
313 328 * general optimization:
314 329 *
315 330 * if i and input_len are the same, return now instead
316 331 * of calling bcopy(), since the bcopy() in this case
317 332 * will be an expensive nop.
318 333 */
319 334
320 335 if (input_len == i)
321 336 return;
322 337
323 338 buf_index = 0;
324 339 }
325 340
326 341 /* buffer remaining input */
327 342 bcopy(&input[i], &ctx->buf_un.buf8[buf_index], input_len - i);
328 343 }
329 344
330 345 #else /* VIS_SHA1 */
331 346
332 347 void
333 348 SHA1Update(SHA1_CTX *ctx, const void *inptr, size_t input_len)
334 349 {
335 350 uint32_t i, buf_index, buf_len;
336 351 const uint8_t *input = inptr;
337 -#if defined(__amd64)
352 +#if defined(__amd64) && !defined(_STANDALONE)
338 353 uint32_t block_count;
339 354 #endif /* __amd64 */
340 355
341 356 /* check for noop */
342 357 if (input_len == 0)
343 358 return;
344 359
345 360 /* compute number of bytes mod 64 */
346 361 buf_index = (ctx->count[1] >> 3) & 0x3F;
347 362
348 363 /* update number of bits */
349 364 if ((ctx->count[1] += (input_len << 3)) < (input_len << 3))
350 365 ctx->count[0]++;
351 366
352 367 ctx->count[0] += (input_len >> 29);
353 368
354 369 buf_len = 64 - buf_index;
355 370
356 371 /* transform as many times as possible */
357 372 i = 0;
358 373 if (input_len >= buf_len) {
359 374
360 375 /*
361 376 * general optimization:
362 377 *
363 378 * only do initial bcopy() and SHA1Transform() if
364 379 * buf_index != 0. if buf_index == 0, we're just
365 380 * wasting our time doing the bcopy() since there
366 381 * wasn't any data left over from a previous call to
367 382 * SHA1Update().
368 383 */
369 384
370 385 if (buf_index) {
371 386 bcopy(input, &ctx->buf_un.buf8[buf_index], buf_len);
372 387 SHA1_TRANSFORM(ctx, ctx->buf_un.buf8);
373 388 i = buf_len;
374 389 }
375 390
376 -#if !defined(__amd64)
391 +#if !defined(__amd64) || defined(_STANDALONE)
377 392 for (; i + 63 < input_len; i += 64)
378 393 SHA1_TRANSFORM(ctx, &input[i]);
379 394 #else
380 395 block_count = (input_len - i) >> 6;
381 396 if (block_count > 0) {
382 397 SHA1_TRANSFORM_BLOCKS(ctx, &input[i], block_count);
383 398 i += block_count << 6;
384 399 }
385 400 #endif /* !__amd64 */
386 401
387 402 /*
388 403 * general optimization:
389 404 *
390 405 * if i and input_len are the same, return now instead
391 406 * of calling bcopy(), since the bcopy() in this case
392 407 * will be an expensive nop.
393 408 */
394 409
395 410 if (input_len == i)
396 411 return;
397 412
398 413 buf_index = 0;
399 414 }
400 415
401 416 /* buffer remaining input */
402 417 bcopy(&input[i], &ctx->buf_un.buf8[buf_index], input_len - i);
403 418 }
404 419
405 420 #endif /* VIS_SHA1 */
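The bookkeeping at the top of both SHA1Update() variants can be sketched in isolation. This is an illustration under stated assumptions, not the code under review: `struct bitcount`, `buffered_bytes` and `add_input` are hypothetical names, and the low-word increment is truncated to 32 bits explicitly here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the count[] member of SHA1_CTX. */
struct bitcount {
	uint32_t count[2];	/* [0] = high 32 bits, [1] = low 32 bits */
};

/* Bytes already sitting in the 64-byte block buffer. */
static uint32_t
buffered_bytes(const struct bitcount *bc)
{
	return ((bc->count[1] >> 3) & 0x3F);
}

/* Add input_len bytes to the 64-bit running bit count. */
static void
add_input(struct bitcount *bc, size_t input_len)
{
	uint32_t lo_bits = (uint32_t)(input_len << 3);

	assert(bc != NULL);
	bc->count[1] += lo_bits;
	if (bc->count[1] < lo_bits)	/* low word wrapped: carry */
		bc->count[0]++;
	/* bits of (input_len * 8) that do not fit in the low word */
	bc->count[0] += (uint32_t)(input_len >> 29);
}
```

Because the count is kept in bits, shifting the low word right by 3 and masking with 0x3F yields the byte offset within the current 64-byte block, which is the `buf_index` used in the listing.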
406 421
407 422 /*
408 423 * SHA1Final()
409 424 *
410 425 * purpose: ends an sha1 digest operation, finalizing the message digest and
411 426 * zeroing the context.
412 427 * input: uchar_t * : A buffer to store the digest.
413 428 * : The function actually uses void* because many
414 429 * : callers pass things other than uchar_t here.
415 430 * SHA1_CTX * : the context to finalize, save, and zero
416 431 * output: void
417 432 */
418 433
419 434 void
420 435 SHA1Final(void *digest, SHA1_CTX *ctx)
421 436 {
422 437 uint8_t bitcount_be[sizeof (ctx->count)];
423 438 uint32_t index = (ctx->count[1] >> 3) & 0x3f;
424 439
425 440 /* store bit count, big endian */
426 441 Encode(bitcount_be, ctx->count, sizeof (bitcount_be));
427 442
428 443 /* pad out to 56 mod 64 */
429 444 SHA1Update(ctx, PADDING, ((index < 56) ? 56 : 120) - index);
430 445
431 446 /* append length (before padding) */
432 447 SHA1Update(ctx, bitcount_be, sizeof (bitcount_be));
433 448
434 449 /* store state in digest */
435 450 Encode(digest, ctx->state, sizeof (ctx->state));
436 451
437 452 /* zeroize sensitive information */
438 453 bzero(ctx, sizeof (*ctx));
439 454 }
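The padding length passed to SHA1Update() in SHA1Final() can be checked on its own. A minimal sketch, with `sha1_pad_len` as a hypothetical helper name; the expression is the one from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of padding bytes needed so that the
 * message length becomes 56 mod 64, leaving 8 bytes for the bit count.
 * index is the number of bytes already buffered (0..63).
 */
static size_t
sha1_pad_len(uint32_t index)
{
	assert(index < 64);
	return ((size_t)(((index < 56) ? 56 : 120) - index));
}
```

At least one padding byte (the 0x80 marker) is always emitted, so when 56 or more bytes are already buffered the padding spills into a second block; that is the `120` branch.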
440 455
441 456
442 -#if !defined(__amd64)
457 +#if !defined(__amd64) || defined(_STANDALONE)
443 458
444 459 typedef uint32_t sha1word;
445 460
446 461 /*
447 462 * sparc optimization:
448 463 *
449 464 * on the sparc, we can load big endian 32-bit data easily. note that
450 465 * special care must be taken to ensure the address is 32-bit aligned.
451 466 * in the interest of speed, we don't check to make sure, since
452 467 * careful programming can guarantee this for us.
453 468 */
454 469
455 470 #if defined(_BIG_ENDIAN)
456 471 #define LOAD_BIG_32(addr) (*(uint32_t *)(addr))
457 472
458 473 #elif defined(HAVE_HTONL)
459 474 #define LOAD_BIG_32(addr) htonl(*((uint32_t *)(addr)))
460 475
461 476 #else
462 477 /* little endian -- will work on big endian, but slowly */
463 478 #define LOAD_BIG_32(addr) \
464 479 (((addr)[0] << 24) | ((addr)[1] << 16) | ((addr)[2] << 8) | (addr)[3])
465 480 #endif /* _BIG_ENDIAN */
466 481
467 482 /*
468 483 * SHA1Transform()
469 484 */
470 485 #if defined(W_ARRAY)
471 486 #define W(n) w[n]
472 487 #else /* !defined(W_ARRAY) */
473 488 #define W(n) w_ ## n
474 489 #endif /* !defined(W_ARRAY) */
475 490
476 491
477 492 #if defined(__sparc)
478 493
479 494 /*
480 495 * sparc register window optimization:
481 496 *
482 497 * `a', `b', `c', `d', and `e' are passed into SHA1Transform
483 498 * explicitly since it increases the number of registers available to
484 499 * the compiler. under this scheme, these variables can be held in
485 500 * %i0 - %i4, which leaves more local and out registers available.
486 501 *
487 502 * purpose: sha1 transformation -- updates the digest based on `block'
488 503 * input: uint32_t : bytes 1 - 4 of the digest
489 504 * uint32_t : bytes 5 - 8 of the digest
490 505 * uint32_t : bytes 9 - 12 of the digest
491 506 * uint32_t : bytes 13 - 16 of the digest
492 507 * uint32_t : bytes 17 - 20 of the digest
493 508 * SHA1_CTX * : the context to update
494 509 * uint8_t [64]: the block to use to update the digest
495 510 * output: void
496 511 */
497 512
498 513 void
499 514 SHA1Transform(uint32_t a, uint32_t b, uint32_t c, uint32_t d, uint32_t e,
500 515 SHA1_CTX *ctx, const uint8_t blk[64])
501 516 {
502 517 /*
503 518 * sparc optimization:
504 519 *
505 520 * while it is somewhat counter-intuitive, on sparc, it is
506 521 * more efficient to place all the constants used in this
507 522 * function in an array and load the values out of the array
508 523 * than to manually load the constants. this is because
509 524 * setting a register to a 32-bit value takes two ops in most
510 525 * cases: a `sethi' and an `or', but loading a 32-bit value
511 526 * from memory only takes one `ld' (or `lduw' on v9). while
512 527 * this increases memory usage, the compiler can find enough
513 528 * other things to do while waiting so that the pipeline does
514 529 * not stall. additionally, it is likely that many of these
515 530 * constants are cached so that later accesses do not even go
516 531 * out to the bus.
517 532 *
518 533 * this array is declared `static' to keep the compiler from
519 534 * having to bcopy() this array onto the stack frame of
520 535 * SHA1Transform() each time it is called -- which is
521 536 * unacceptably expensive.
522 537 *
523 538 * the `const' is to ensure that callers are good citizens and
524 539 * do not try to munge the array. since these routines are
525 540 * going to be called from inside multithreaded kernelland,
526 541 * this is a good safety check. -- `sha1_consts' will end up in
527 542 * .rodata.
528 543 *
529 544 * unfortunately, loading from an array in this manner hurts
530 545 * performance under Intel. So, there is a macro,
531 546 * SHA1_CONST(), used in SHA1Transform(), that either expands to
532 547 * a reference to this array, or to the actual constant,
533 548 * depending on what platform this code is compiled for.
534 549 */
535 550
536 551 static const uint32_t sha1_consts[] = {
537 552 SHA1_CONST_0, SHA1_CONST_1, SHA1_CONST_2, SHA1_CONST_3
538 553 };
539 554
540 555 /*
541 556 * general optimization:
542 557 *
543 558 * use individual integers instead of using an array. this is a
544 559 * win, although the amount it wins by seems to vary quite a bit.
545 560 */
546 561
547 562 uint32_t w_0, w_1, w_2, w_3, w_4, w_5, w_6, w_7;
548 563 uint32_t w_8, w_9, w_10, w_11, w_12, w_13, w_14, w_15;
549 564
550 565 /*
551 566 * sparc optimization:
552 567 *
553 568 * if `block' is already aligned on a 4-byte boundary, use
554 569 * LOAD_BIG_32() directly. otherwise, bcopy() into a
555 570 * buffer that *is* aligned on a 4-byte boundary and then do
556 571 * the LOAD_BIG_32() on that buffer. benchmarks have shown
557 572 * that using the bcopy() is better than loading the bytes
558 573 * individually and doing the endian-swap by hand.
559 574 *
560 575 * even though it's quite tempting to do:
561 576 *
562 577 * blk = bcopy(ctx->buf_un.buf32, blk, sizeof (ctx->buf_un.buf32));
563 578 *
564 579 * and only have one set of LOAD_BIG_32()'s, the compiler
565 580 * *does not* like that, so please resist the urge.
566 581 */
567 582
568 583 if ((uintptr_t)blk & 0x3) { /* not 4-byte aligned? */
569 584 bcopy(blk, ctx->buf_un.buf32, sizeof (ctx->buf_un.buf32));
570 585 w_15 = LOAD_BIG_32(ctx->buf_un.buf32 + 15);
571 586 w_14 = LOAD_BIG_32(ctx->buf_un.buf32 + 14);
572 587 w_13 = LOAD_BIG_32(ctx->buf_un.buf32 + 13);
573 588 w_12 = LOAD_BIG_32(ctx->buf_un.buf32 + 12);
574 589 w_11 = LOAD_BIG_32(ctx->buf_un.buf32 + 11);
575 590 w_10 = LOAD_BIG_32(ctx->buf_un.buf32 + 10);
576 591 w_9 = LOAD_BIG_32(ctx->buf_un.buf32 + 9);
577 592 w_8 = LOAD_BIG_32(ctx->buf_un.buf32 + 8);
578 593 w_7 = LOAD_BIG_32(ctx->buf_un.buf32 + 7);
579 594 w_6 = LOAD_BIG_32(ctx->buf_un.buf32 + 6);
580 595 w_5 = LOAD_BIG_32(ctx->buf_un.buf32 + 5);
581 596 w_4 = LOAD_BIG_32(ctx->buf_un.buf32 + 4);
582 597 w_3 = LOAD_BIG_32(ctx->buf_un.buf32 + 3);
583 598 w_2 = LOAD_BIG_32(ctx->buf_un.buf32 + 2);
584 599 w_1 = LOAD_BIG_32(ctx->buf_un.buf32 + 1);
585 600 w_0 = LOAD_BIG_32(ctx->buf_un.buf32 + 0);
586 601 } else {
587 602 /* LINTED E_BAD_PTR_CAST_ALIGN */
588 603 w_15 = LOAD_BIG_32(blk + 60);
589 604 /* LINTED E_BAD_PTR_CAST_ALIGN */
590 605 w_14 = LOAD_BIG_32(blk + 56);
591 606 /* LINTED E_BAD_PTR_CAST_ALIGN */
592 607 w_13 = LOAD_BIG_32(blk + 52);
593 608 /* LINTED E_BAD_PTR_CAST_ALIGN */
594 609 w_12 = LOAD_BIG_32(blk + 48);
595 610 /* LINTED E_BAD_PTR_CAST_ALIGN */
596 611 w_11 = LOAD_BIG_32(blk + 44);
597 612 /* LINTED E_BAD_PTR_CAST_ALIGN */
598 613 w_10 = LOAD_BIG_32(blk + 40);
599 614 /* LINTED E_BAD_PTR_CAST_ALIGN */
600 615 w_9 = LOAD_BIG_32(blk + 36);
601 616 /* LINTED E_BAD_PTR_CAST_ALIGN */
602 617 w_8 = LOAD_BIG_32(blk + 32);
603 618 /* LINTED E_BAD_PTR_CAST_ALIGN */
604 619 w_7 = LOAD_BIG_32(blk + 28);
605 620 /* LINTED E_BAD_PTR_CAST_ALIGN */
606 621 w_6 = LOAD_BIG_32(blk + 24);
607 622 /* LINTED E_BAD_PTR_CAST_ALIGN */
608 623 w_5 = LOAD_BIG_32(blk + 20);
609 624 /* LINTED E_BAD_PTR_CAST_ALIGN */
610 625 w_4 = LOAD_BIG_32(blk + 16);
611 626 /* LINTED E_BAD_PTR_CAST_ALIGN */
612 627 w_3 = LOAD_BIG_32(blk + 12);
613 628 /* LINTED E_BAD_PTR_CAST_ALIGN */
614 629 w_2 = LOAD_BIG_32(blk + 8);
615 630 /* LINTED E_BAD_PTR_CAST_ALIGN */
616 631 w_1 = LOAD_BIG_32(blk + 4);
617 632 /* LINTED E_BAD_PTR_CAST_ALIGN */
618 633 w_0 = LOAD_BIG_32(blk + 0);
619 634 }
620 635 #else /* !defined(__sparc) */
621 636
622 637 void /* CSTYLED */
623 638 SHA1Transform(SHA1_CTX *ctx, const uint8_t blk[64])
624 639 {
625 640 /* CSTYLED */
626 641 sha1word a = ctx->state[0];
627 642 sha1word b = ctx->state[1];
628 643 sha1word c = ctx->state[2];
629 644 sha1word d = ctx->state[3];
630 645 sha1word e = ctx->state[4];
631 646
632 647 #if defined(W_ARRAY)
633 648 sha1word w[16];
634 649 #else /* !defined(W_ARRAY) */
635 650 sha1word w_0, w_1, w_2, w_3, w_4, w_5, w_6, w_7;
636 651 sha1word w_8, w_9, w_10, w_11, w_12, w_13, w_14, w_15;
637 652 #endif /* !defined(W_ARRAY) */
638 653
639 654 W(0) = LOAD_BIG_32((void *)(blk + 0));
640 655 W(1) = LOAD_BIG_32((void *)(blk + 4));
641 656 W(2) = LOAD_BIG_32((void *)(blk + 8));
642 657 W(3) = LOAD_BIG_32((void *)(blk + 12));
643 658 W(4) = LOAD_BIG_32((void *)(blk + 16));
644 659 W(5) = LOAD_BIG_32((void *)(blk + 20));
645 660 W(6) = LOAD_BIG_32((void *)(blk + 24));
646 661 W(7) = LOAD_BIG_32((void *)(blk + 28));
647 662 W(8) = LOAD_BIG_32((void *)(blk + 32));
648 663 W(9) = LOAD_BIG_32((void *)(blk + 36));
649 664 W(10) = LOAD_BIG_32((void *)(blk + 40));
650 665 W(11) = LOAD_BIG_32((void *)(blk + 44));
651 666 W(12) = LOAD_BIG_32((void *)(blk + 48));
652 667 W(13) = LOAD_BIG_32((void *)(blk + 52));
653 668 W(14) = LOAD_BIG_32((void *)(blk + 56));
654 669 W(15) = LOAD_BIG_32((void *)(blk + 60));
655 670
656 671 #endif /* !defined(__sparc) */
657 672
658 673 /*
659 674 * general optimization:
660 675 *
661 676 * even though this approach is described in the standard as
662 677 * being slower algorithmically, it is 30-40% faster than the
663 678 * "faster" version under SPARC, because this version has more
664 679 * of the constraints specified at compile-time and uses fewer
665 680 * variables (and therefore has better register utilization)
666 681 * than its "speedier" brother. (i've tried both, trust me)
667 682 *
668 683 * for either method given in the spec, there is an "assignment"
669 684 * phase where the following takes place:
670 685 *
671 686 * tmp = (main_computation);
672 687 * e = d; d = c; c = rotate_left(b, 30); b = a; a = tmp;
673 688 *
674 689 * we can make the algorithm go faster by not doing this work,
675 690 * but just pretending that `d' is now `e', etc. this works
676 691 * really well and obviates the need for a temporary variable.
677 692 * however, we still explicitly perform the rotate action,
678 693 * since it is cheaper on SPARC to do it once than to have to
679 694 * do it over and over again.
680 695 */
681 696
682 697 /* round 1 */
683 698 e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(0) + SHA1_CONST(0); /* 0 */
684 699 b = ROTATE_LEFT(b, 30);
685 700
686 701 d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(1) + SHA1_CONST(0); /* 1 */
687 702 a = ROTATE_LEFT(a, 30);
688 703
689 704 c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(2) + SHA1_CONST(0); /* 2 */
690 705 e = ROTATE_LEFT(e, 30);
691 706
692 707 b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(3) + SHA1_CONST(0); /* 3 */
693 708 d = ROTATE_LEFT(d, 30);
694 709
695 710 a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(4) + SHA1_CONST(0); /* 4 */
696 711 c = ROTATE_LEFT(c, 30);
697 712
698 713 e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(5) + SHA1_CONST(0); /* 5 */
699 714 b = ROTATE_LEFT(b, 30);
700 715
701 716 d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(6) + SHA1_CONST(0); /* 6 */
702 717 a = ROTATE_LEFT(a, 30);
703 718
704 719 c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(7) + SHA1_CONST(0); /* 7 */
705 720 e = ROTATE_LEFT(e, 30);
706 721
707 722 b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(8) + SHA1_CONST(0); /* 8 */
708 723 d = ROTATE_LEFT(d, 30);
709 724
710 725 a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(9) + SHA1_CONST(0); /* 9 */
711 726 c = ROTATE_LEFT(c, 30);
712 727
713 728 e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(10) + SHA1_CONST(0); /* 10 */
714 729 b = ROTATE_LEFT(b, 30);
715 730
716 731 d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(11) + SHA1_CONST(0); /* 11 */
717 732 a = ROTATE_LEFT(a, 30);
718 733
719 734 c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(12) + SHA1_CONST(0); /* 12 */
720 735 e = ROTATE_LEFT(e, 30);
721 736
722 737 b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(13) + SHA1_CONST(0); /* 13 */
723 738 d = ROTATE_LEFT(d, 30);
724 739
725 740 a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(14) + SHA1_CONST(0); /* 14 */
726 741 c = ROTATE_LEFT(c, 30);
727 742
728 743 e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(15) + SHA1_CONST(0); /* 15 */
729 744 b = ROTATE_LEFT(b, 30);
730 745
731 746 W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1); /* 16 */
732 747 d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(0) + SHA1_CONST(0);
733 748 a = ROTATE_LEFT(a, 30);
734 749
735 750 W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1); /* 17 */
736 751 c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(1) + SHA1_CONST(0);
737 752 e = ROTATE_LEFT(e, 30);
738 753
739 754 W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1); /* 18 */
740 755 b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(2) + SHA1_CONST(0);
741 756 d = ROTATE_LEFT(d, 30);
742 757
743 758 W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1); /* 19 */
744 759 a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(3) + SHA1_CONST(0);
745 760 c = ROTATE_LEFT(c, 30);
746 761
747 762 /* round 2 */
748 763 W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1); /* 20 */
749 764 e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(4) + SHA1_CONST(1);
750 765 b = ROTATE_LEFT(b, 30);
751 766
752 767 W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1); /* 21 */
753 768 d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(5) + SHA1_CONST(1);
754 769 a = ROTATE_LEFT(a, 30);
755 770
756 771 W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1); /* 22 */
757 772 c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(6) + SHA1_CONST(1);
758 773 e = ROTATE_LEFT(e, 30);
759 774
760 775 W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1); /* 23 */
761 776 b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(7) + SHA1_CONST(1);
762 777 d = ROTATE_LEFT(d, 30);
763 778
764 779 W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1); /* 24 */
765 780 a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(8) + SHA1_CONST(1);
766 781 c = ROTATE_LEFT(c, 30);
767 782
768 783 W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1); /* 25 */
769 784 e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(9) + SHA1_CONST(1);
770 785 b = ROTATE_LEFT(b, 30);
771 786
772 787 W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1); /* 26 */
773 788 d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(10) + SHA1_CONST(1);
774 789 a = ROTATE_LEFT(a, 30);
775 790
776 791 W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1); /* 27 */
777 792 c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(11) + SHA1_CONST(1);
778 793 e = ROTATE_LEFT(e, 30);
779 794
780 795 W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1); /* 28 */
781 796 b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(12) + SHA1_CONST(1);
782 797 d = ROTATE_LEFT(d, 30);
783 798
784 799 W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 29 */
785 800 a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(13) + SHA1_CONST(1);
786 801 c = ROTATE_LEFT(c, 30);
787 802
788 803 W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1); /* 30 */
789 804 e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(14) + SHA1_CONST(1);
790 805 b = ROTATE_LEFT(b, 30);
791 806
792 807 W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1); /* 31 */
793 808 d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(15) + SHA1_CONST(1);
794 809 a = ROTATE_LEFT(a, 30);
795 810
796 811 W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1); /* 32 */
797 812 c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(0) + SHA1_CONST(1);
798 813 e = ROTATE_LEFT(e, 30);
799 814
800 815 W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1); /* 33 */
801 816 b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(1) + SHA1_CONST(1);
802 817 d = ROTATE_LEFT(d, 30);
803 818
804 819 W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1); /* 34 */
805 820 a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(2) + SHA1_CONST(1);
806 821 c = ROTATE_LEFT(c, 30);
807 822
808 823 W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1); /* 35 */
809 824 e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(3) + SHA1_CONST(1);
810 825 b = ROTATE_LEFT(b, 30);
811 826
812 827 W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1); /* 36 */
813 828 d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(4) + SHA1_CONST(1);
814 829 a = ROTATE_LEFT(a, 30);
815 830
816 831 W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1); /* 37 */
817 832 c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(5) + SHA1_CONST(1);
818 833 e = ROTATE_LEFT(e, 30);
819 834
820 835 W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1); /* 38 */
821 836 b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(6) + SHA1_CONST(1);
822 837 d = ROTATE_LEFT(d, 30);
823 838
824 839 W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1); /* 39 */
825 840 a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(7) + SHA1_CONST(1);
826 841 c = ROTATE_LEFT(c, 30);
827 842
828 843 /* round 3 */
829 844 W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1); /* 40 */
830 845 e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(8) + SHA1_CONST(2);
831 846 b = ROTATE_LEFT(b, 30);
832 847
833 848 W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1); /* 41 */
834 849 d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(9) + SHA1_CONST(2);
835 850 a = ROTATE_LEFT(a, 30);
836 851
837 852 W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1); /* 42 */
838 853 c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(10) + SHA1_CONST(2);
839 854 e = ROTATE_LEFT(e, 30);
840 855
841 856 W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1); /* 43 */
842 857 b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(11) + SHA1_CONST(2);
843 858 d = ROTATE_LEFT(d, 30);
844 859
845 860 W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1); /* 44 */
846 861 a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(12) + SHA1_CONST(2);
847 862 c = ROTATE_LEFT(c, 30);
848 863
849 864 W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 45 */
850 865 e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(13) + SHA1_CONST(2);
851 866 b = ROTATE_LEFT(b, 30);
852 867
853 868 W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1); /* 46 */
854 869 d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(14) + SHA1_CONST(2);
855 870 a = ROTATE_LEFT(a, 30);
856 871
857 872 W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1); /* 47 */
858 873 c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(15) + SHA1_CONST(2);
859 874 e = ROTATE_LEFT(e, 30);
860 875
861 876 W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1); /* 48 */
862 877 b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(0) + SHA1_CONST(2);
863 878 d = ROTATE_LEFT(d, 30);
864 879
865 880 W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1); /* 49 */
866 881 a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(1) + SHA1_CONST(2);
867 882 c = ROTATE_LEFT(c, 30);
868 883
869 884 W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1); /* 50 */
870 885 e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(2) + SHA1_CONST(2);
871 886 b = ROTATE_LEFT(b, 30);
872 887
873 888 W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1); /* 51 */
874 889 d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(3) + SHA1_CONST(2);
875 890 a = ROTATE_LEFT(a, 30);
876 891
877 892 W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1); /* 52 */
878 893 c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(4) + SHA1_CONST(2);
879 894 e = ROTATE_LEFT(e, 30);
880 895
881 896 W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1); /* 53 */
882 897 b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(5) + SHA1_CONST(2);
883 898 d = ROTATE_LEFT(d, 30);
884 899
885 900 W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1); /* 54 */
886 901 a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(6) + SHA1_CONST(2);
887 902 c = ROTATE_LEFT(c, 30);
888 903
889 904 W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1); /* 55 */
890 905 e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(7) + SHA1_CONST(2);
891 906 b = ROTATE_LEFT(b, 30);
892 907
893 908 W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1); /* 56 */
894 909 d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(8) + SHA1_CONST(2);
895 910 a = ROTATE_LEFT(a, 30);
896 911
897 912 W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1); /* 57 */
898 913 c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(9) + SHA1_CONST(2);
899 914 e = ROTATE_LEFT(e, 30);
900 915
901 916 W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1); /* 58 */
902 917 b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(10) + SHA1_CONST(2);
903 918 d = ROTATE_LEFT(d, 30);
904 919
905 920 W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1); /* 59 */
906 921 a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(11) + SHA1_CONST(2);
907 922 c = ROTATE_LEFT(c, 30);
908 923
909 924 /* round 4 */
910 925 W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1); /* 60 */
911 926 e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(12) + SHA1_CONST(3);
912 927 b = ROTATE_LEFT(b, 30);
913 928
914 929 W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 61 */
915 930 d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(13) + SHA1_CONST(3);
916 931 a = ROTATE_LEFT(a, 30);
917 932
918 933 W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1); /* 62 */
919 934 c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(14) + SHA1_CONST(3);
920 935 e = ROTATE_LEFT(e, 30);
921 936
922 937 W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1); /* 63 */
923 938 b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(15) + SHA1_CONST(3);
924 939 d = ROTATE_LEFT(d, 30);
925 940
926 941 W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1); /* 64 */
927 942 a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(0) + SHA1_CONST(3);
928 943 c = ROTATE_LEFT(c, 30);
929 944
930 945 W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1); /* 65 */
931 946 e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(1) + SHA1_CONST(3);
932 947 b = ROTATE_LEFT(b, 30);
933 948
934 949 W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1); /* 66 */
935 950 d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(2) + SHA1_CONST(3);
936 951 a = ROTATE_LEFT(a, 30);
937 952
938 953 W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1); /* 67 */
939 954 c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(3) + SHA1_CONST(3);
940 955 e = ROTATE_LEFT(e, 30);
941 956
942 957 W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1); /* 68 */
943 958 b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(4) + SHA1_CONST(3);
944 959 d = ROTATE_LEFT(d, 30);
945 960
946 961 W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1); /* 69 */
947 962 a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(5) + SHA1_CONST(3);
948 963 c = ROTATE_LEFT(c, 30);
949 964
950 965 W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1); /* 70 */
951 966 e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(6) + SHA1_CONST(3);
952 967 b = ROTATE_LEFT(b, 30);
953 968
954 969 W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1); /* 71 */
955 970 d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(7) + SHA1_CONST(3);
956 971 a = ROTATE_LEFT(a, 30);
957 972
958 973 W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1); /* 72 */
959 974 c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(8) + SHA1_CONST(3);
960 975 e = ROTATE_LEFT(e, 30);
961 976
962 977 W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1); /* 73 */
963 978 b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(9) + SHA1_CONST(3);
964 979 d = ROTATE_LEFT(d, 30);
965 980
966 981 W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1); /* 74 */
967 982 a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(10) + SHA1_CONST(3);
968 983 c = ROTATE_LEFT(c, 30);
969 984
970 985 W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1); /* 75 */
971 986 e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(11) + SHA1_CONST(3);
972 987 b = ROTATE_LEFT(b, 30);
973 988
974 989 W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1); /* 76 */
975 990 d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(12) + SHA1_CONST(3);
976 991 a = ROTATE_LEFT(a, 30);
977 992
978 993 W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 77 */
979 994 c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(13) + SHA1_CONST(3);
980 995 e = ROTATE_LEFT(e, 30);
981 996
982 997 W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1); /* 78 */
983 998 b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(14) + SHA1_CONST(3);
984 999 d = ROTATE_LEFT(d, 30);
985 1000
986 1001 W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1); /* 79 */
987 1002
988 1003 ctx->state[0] += ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(15) +
989 1004 SHA1_CONST(3);
990 1005 ctx->state[1] += b;
991 1006 ctx->state[2] += ROTATE_LEFT(c, 30);
992 1007 ctx->state[3] += d;
993 1008 ctx->state[4] += e;
994 1009
995 1010 /* zeroize sensitive information */
996 1011 W(0) = W(1) = W(2) = W(3) = W(4) = W(5) = W(6) = W(7) = W(8) = 0;
997 1012 W(9) = W(10) = W(11) = W(12) = W(13) = W(14) = W(15) = 0;
998 1013 }
999 1014 #endif /* !__amd64 */
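
Reviewer note: the "general optimization" comment in SHA1Transform() describes skipping the per-step shuffle of a..e and rotating the roles of the variables instead. The sketch below illustrates that idea on five steps of round 1 only. The names ROTL32, F_SEL, K_ROUND1, steps_textbook, steps_renamed and rename_trick_agrees are made up for this illustration and are not part of sha1.c. After five steps the variable roles are back where they started, so the two forms can be compared directly.

```c
#include <stdint.h>

/* Hypothetical helpers for this sketch; sha1.c has its own macros. */
#define	ROTL32(x, n)	(((x) << (n)) | ((x) >> (32 - (n))))
#define	F_SEL(b, c, d)	(((b) & (c)) | (~(b) & (d)))
#define	K_ROUND1	0x5a827999U

/* Textbook form: compute tmp, then shuffle all five variables. */
static void
steps_textbook(uint32_t s[5], const uint32_t w[5])
{
	uint32_t a = s[0], b = s[1], c = s[2], d = s[3], e = s[4];
	int t;

	for (t = 0; t < 5; t++) {
		uint32_t tmp = ROTL32(a, 5) + F_SEL(b, c, d) + e + w[t] +
		    K_ROUND1;
		e = d; d = c; c = ROTL32(b, 30); b = a; a = tmp;
	}
	s[0] = a; s[1] = b; s[2] = c; s[3] = d; s[4] = e;
}

/* Renamed form: no tmp; each step writes into the variable playing `e'. */
static void
steps_renamed(uint32_t s[5], const uint32_t w[5])
{
	uint32_t a = s[0], b = s[1], c = s[2], d = s[3], e = s[4];

	e = ROTL32(a, 5) + F_SEL(b, c, d) + e + w[0] + K_ROUND1;
	b = ROTL32(b, 30);
	d = ROTL32(e, 5) + F_SEL(a, b, c) + d + w[1] + K_ROUND1;
	a = ROTL32(a, 30);
	c = ROTL32(d, 5) + F_SEL(e, a, b) + c + w[2] + K_ROUND1;
	e = ROTL32(e, 30);
	b = ROTL32(c, 5) + F_SEL(d, e, a) + b + w[3] + K_ROUND1;
	d = ROTL32(d, 30);
	a = ROTL32(b, 5) + F_SEL(c, d, e) + a + w[4] + K_ROUND1;
	c = ROTL32(c, 30);

	s[0] = a; s[1] = b; s[2] = c; s[3] = d; s[4] = e;
}

/* Returns 1 if both forms yield the same state for a sample input. */
static int
rename_trick_agrees(void)
{
	uint32_t s1[5] = { 0x67452301U, 0xefcdab89U, 0x98badcfeU,
	    0x10325476U, 0xc3d2e1f0U };
	uint32_t s2[5] = { 0x67452301U, 0xefcdab89U, 0x98badcfeU,
	    0x10325476U, 0xc3d2e1f0U };
	const uint32_t w[5] = { 1, 2, 3, 4, 5 };
	int i;

	steps_textbook(s1, w);
	steps_renamed(s2, w);
	for (i = 0; i < 5; i++) {
		if (s1[i] != s2[i])
			return (0);
	}
	return (1);
}
```

The listing above applies the same pattern to all 80 steps, which is 16 full cycles of five, so the final additions into ctx->state[] line up with the original variable names.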
1000 1015
1001 1016
1002 1017 /*
1003 1018 * Encode()
1004 1019 *
1005 1020 * purpose: to convert a list of numbers from little endian to big endian
1006 1021 * input: uint8_t * : place to store the converted big endian numbers
1007 1022 * uint32_t * : place to get numbers to convert from
1008 1023 * size_t : the length of the input in bytes
1009 1024 * output: void
1010 1025 */
1011 1026
1012 1027 static void
1013 1028 Encode(uint8_t *_RESTRICT_KYWD output, const uint32_t *_RESTRICT_KYWD input,
1014 1029 size_t len)
1015 1030 {
1016 1031 size_t i, j;
1017 1032
1018 1033 #if defined(__sparc)
1019 1034 if (IS_P2ALIGNED(output, sizeof (uint32_t))) {
1020 1035 for (i = 0, j = 0; j < len; i++, j += 4) {
1021 1036 /* LINTED E_BAD_PTR_CAST_ALIGN */
1022 1037 *((uint32_t *)(output + j)) = input[i];
1023 1038 }
1024 1039 } else {
1025 1040 #endif /* little endian -- will work on big endian, but slowly */
1026 1041 for (i = 0, j = 0; j < len; i++, j += 4) {
1027 1042 output[j] = (input[i] >> 24) & 0xff;
1028 1043 output[j + 1] = (input[i] >> 16) & 0xff;
1029 1044 output[j + 2] = (input[i] >> 8) & 0xff;
1030 1045 output[j + 3] = input[i] & 0xff;
1031 1046 }
1032 1047 #if defined(__sparc)
1033 1048 }
1034 1049 #endif
1035 1050 }
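
Reviewer note: the portable branch of Encode() writes each 32-bit word most significant byte first, so the output is big endian regardless of host byte order. A minimal standalone sketch of that branch follows. The names encode_be32 and encode_example_ok are hypothetical, and the sample words are arbitrary test values.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the portable loop in Encode(): `len' is the
 * number of output bytes and is assumed to be a multiple of 4.
 */
static void
encode_be32(uint8_t *out, const uint32_t *in, size_t len)
{
	size_t i, j;

	for (i = 0, j = 0; j < len; i++, j += 4) {
		out[j] = (in[i] >> 24) & 0xff;		/* most significant */
		out[j + 1] = (in[i] >> 16) & 0xff;
		out[j + 2] = (in[i] >> 8) & 0xff;
		out[j + 3] = in[i] & 0xff;		/* least significant */
	}
}

/* Returns 1 if two sample words serialize to the expected byte string. */
static int
encode_example_ok(void)
{
	const uint32_t words[2] = { 0x67452301U, 0xefcdab89U };
	const uint8_t expect[8] = {
		0x67, 0x45, 0x23, 0x01, 0xef, 0xcd, 0xab, 0x89
	};
	uint8_t bytes[8];
	size_t i;

	encode_be32(bytes, words, sizeof (bytes));
	for (i = 0; i < sizeof (bytes); i++) {
		if (bytes[i] != expect[i])
			return (0);
	}
	return (1);
}
```

On a big-endian host the shifts produce the same bytes as a direct word store, which is why the listing can take the aligned store shortcut under __sparc.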