big-one Wdiff usr/src/common/crypto/sha1/sha1.c

NEX-16819 loader UEFI support
Includes work by Toomas Soome <tsoome@me.com>
Upstream commits:
    loader: pxe receive cleanup
    9475 libefi: Do not return only if ReceiveFilter
    installboot: should support efi system partition
    8931 boot1.efi: scan all display modes rather than
    loader: spinconsole updates
    loader: gfx experiment to try GOP Blt() function.
    sha1 build test
    loader: add sha1 hash calculation
    common/sha1: update for loader build
    loader: biosdisk rework
    uts: 32-bit kernel FB needs mapping in low memory
    uts: add diag-device
    uts: boot console mirror with diag-device
    uts: enable very early console on ttya
    kmdb: add diag-device as input/output device
    uts: test VGA memory exclusion from mapping
    uts: clear boot mapping and protect boot pages test
    uts: add dboot map debug printf
    uts: need to release FB pages in release_bootstrap()
    uts: add screenmap ioctl
    uts: update sys/queue.h
    loader: add illumos uts/common to include path
    loader: tem/gfx font cleanup
    loader: vbe checks
    uts: gfx_private set KD_TEXT when KD_RESETTEXT is
    uts: gfx 8-bit update
    loader: gfx 8-bit fix
    loader: always set media size from partition.
    uts: MB2 support for 32-bit kernel
    loader: x86 should have tem 80x25
    uts: x86 should have tem 80x25
    uts: font update
    loader: font update
    uts: tem attributes
    loader: tem.c comment added
    uts: use font module
    loader: add font module
    loader: build rules for new font setup
    uts: gfx_private update for new font structure
    uts: early boot update for new font structure
    uts: font update
    uts: font build rules update for new fonts
    uts: tem update to new font structure
    loader: module.c needs to include tem_impl.h
    uts: gfx_private 8x16 font rework
    uts: make font_lookup public
    loader: font rework
    uts: font rework
    9259 libefi: efi_alloc_and_read should check for PMBR
    uts: tem utf-8 support
    loader: implement tem utf-8 support
    loader: tem should be able to display UTF-8
    7784 uts: console input should support utf-8
    7796 uts: ldterm default to utf-8
    uts: do not reset serial console
    uts: set up colors even if tem is not console
    uts: add type for early boot properties
    uts: gfx_private experiment with drm and vga
    uts: gfx_private should use setmode drm callback.
    uts: identify FB types and set up gfx_private based
    loader: replace gop and vesa with framebuffer
    uts: boot needs simple tem to support mdb
    uts: boot_keyboard should emit esc sequences for
    uts: gfx_private FB should be written by line
    kmdb: set terminal window size
    uts: gfx_private needs to keep track of early boot FB
    pnglite: move pnglite to usr/src/common
    loader: gfx_fb
    ficl-sys: add gfx primitives
    loader: add illumos.png logo
    ficl: add fb-putimage
    loader: add png support
    loader: add alpha blending for gfx_fb
    loader: use term-drawrect for menu frame
    ficl: add simple gfx words
    uts: provide fb_info via fbgattr dev_specific array.
    uts: gfx_private add alpha blending
    uts: update sys/ascii.h
    uts: tem OSC support (incomplete)
    uts: implement env module support and use data from
    uts: tem get colors from early boot data
    loader: use crc32 from libstand (libz)
    loader: optimize for size
    loader: pass tem info to the environment
    loader: import tem for loader console
    loader: UEFI loader needs to set ISADIR based on
    loader: need UEFI32 support
    8918 loader.efi: add vesa edid support
    uts: tem_safe_pix_clear_prom_output() should only
    uts: tem_safe_pix_clear_entire_screen() should use
    uts: tem_safe_check_first_time() should query cursor
    uts: tem implement cls callback & visual_io v4
    uts: gfx_vgatext use block cursor for vgatext
    uts: gfx_private implement cls callback & visual_io
    uts: gfx_private bitmap framebuffer implementation
    uts: early start frame buffer console support
    uts: font functions should check the input char
    uts: font rendering should support 16/24/32bit depths
    uts: use smallest font as fallback default.
    uts: update terminal dimensions based on selected
    7834 uts: vgatext should use gfx_private
    uts: add spacing property to 8859-1.bdf
    terminfo: add underline for sun-color
    terminfo: sun-color has 16 colors
    uts: add font load callback type
    loader: do not repeat int13 calls with error 0x20 and
    8905 loader: add skein/edonr support
    8904 common/crypto: make skein and edonr loader
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
Revert "NEX-16819 loader UEFI support"
This reverts commit ec06b9fc617b99234e538bf2e7e4d02a24993e0c.
Reverting due to failures in the zfs-tests and the sharefs-tests
NEX-16819 loader UEFI support

          --- old/usr/src/common/crypto/sha1/sha1.c
          +++ new/usr/src/common/crypto/sha1/sha1.c

   1    1  /*
   2    2   * Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
   3    3   * Use is subject to license terms.
   4    4   */
   5    5  
   6    6  /*
   7    7   * The basic framework for this code came from the reference
   8    8   * implementation for MD5.  That implementation is Copyright (C)
   9    9   * 1991-2, RSA Data Security, Inc. Created 1991. All rights reserved.
  10   10   *
  11   11   * License to copy and use this software is granted provided that it
  12   12   * is identified as the "RSA Data Security, Inc. MD5 Message-Digest
  13   13   * Algorithm" in all material mentioning or referencing this software
  14   14   * or this function.
  15   15   *
  16   16   * License is also granted to make and use derivative works provided
  17   17   * that such works are identified as "derived from the RSA Data
  18   18   * Security, Inc. MD5 Message-Digest Algorithm" in all material
  19   19   * mentioning or referencing the derived work.
  20   20   *
  21   21   * RSA Data Security, Inc. makes no representations concerning either
  22   22   * the merchantability of this software or the suitability of this
  23   23   * software for any particular purpose. It is provided "as is"
  24   24   * without express or implied warranty of any kind.

  25   25   *
  26   26   * These notices must be retained in any copies of any part of this
  27   27   * documentation and/or software.
  28   28   *
  29   29   * NOTE: Cleaned-up and optimized version of SHA1, based on the FIPS 180-1
  30   30   * standard, available at http://www.itl.nist.gov/fipspubs/fip180-1.htm
  31   31   * Not as fast as one would like -- further optimizations are encouraged
  32   32   * and appreciated.
  33   33   */
  34   34  
       35 +#if defined(_STANDALONE)
       36 +#include <sys/cdefs.h>
       37 +#define _RESTRICT_KYWD  restrict
       38 +#else
  35   39  #if !defined(_KERNEL) && !defined(_BOOT)
  36   40  #include <stdint.h>
  37   41  #include <strings.h>
  38   42  #include <stdlib.h>
  39   43  #include <errno.h>
  40   44  #include <sys/systeminfo.h>
  41   45  #endif  /* !_KERNEL && !_BOOT */
       46 +#endif  /* _STANDALONE */
  42   47  
  43   48  #include <sys/types.h>
  44   49  #include <sys/param.h>
  45   50  #include <sys/systm.h>
  46   51  #include <sys/sysmacros.h>
  47   52  #include <sys/sha1.h>
  48   53  #include <sys/sha1_consts.h>
  49   54  
       55 +#if defined(_STANDALONE)
       56 +#include <sys/endian.h>
       57 +#define HAVE_HTONL
       58 +#if _BYTE_ORDER == _LITTLE_ENDIAN
       59 +#undef _BIG_ENDIAN
       60 +#else
       61 +#undef _LITTLE_ENDIAN
       62 +#endif
       63 +#else
  50   64  #ifdef _LITTLE_ENDIAN
  51   65  #include <sys/byteorder.h>
  52   66  #define HAVE_HTONL
  53   67  #endif
       68 +#endif
  54   69  
  55   70  #ifdef  _BOOT
  56   71  #define bcopy(_s, _d, _l)       ((void) memcpy((_d), (_s), (_l)))
  57   72  #define bzero(_m, _l)           ((void) memset((_m), 0, (_l)))
  58   73  #endif
  59   74  
  60   75  static void Encode(uint8_t *, const uint32_t *, size_t);
  61   76  
  62   77  #if     defined(__sparc)
  63   78  
  64   79  #define SHA1_TRANSFORM(ctx, in) \
  65   80          SHA1Transform((ctx)->state[0], (ctx)->state[1], (ctx)->state[2], \
  66   81                  (ctx)->state[3], (ctx)->state[4], (ctx), (in))
  67   82  
  68   83  static void SHA1Transform(uint32_t, uint32_t, uint32_t, uint32_t, uint32_t,
  69   84      SHA1_CTX *, const uint8_t *);
  70   85  
  71      -#elif   defined(__amd64)
       86 +#elif   defined(__amd64) && !defined(_STANDALONE)
  72   87  
  73   88  #define SHA1_TRANSFORM(ctx, in) sha1_block_data_order((ctx), (in), 1)
  74   89  #define SHA1_TRANSFORM_BLOCKS(ctx, in, num) sha1_block_data_order((ctx), \
  75   90                  (in), (num))
  76   91  
  77   92  void sha1_block_data_order(SHA1_CTX *ctx, const void *inpp, size_t num_blocks);
  78   93  
  79   94  #else
  80   95  
  81   96  #define SHA1_TRANSFORM(ctx, in) SHA1Transform((ctx), (in))

  82   97  
  83   98  static void SHA1Transform(SHA1_CTX *, const uint8_t *);
  84   99  
  85  100  #endif
  86  101  
  87  102  
  88  103  static uint8_t PADDING[64] = { 0x80, /* all zeros */ };
  89  104  
  90  105  /*
  91  106   * F, G, and H are the basic SHA1 functions.
  92  107   */
  93  108  #define F(b, c, d)      (((b) & (c)) | ((~b) & (d)))
  94  109  #define G(b, c, d)      ((b) ^ (c) ^ (d))
  95  110  #define H(b, c, d)      (((b) & (c)) | (((b)|(c)) & (d)))
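
Note (illustrative, not part of the diff): F, G and H above are the "choose", "parity" and "majority" functions of FIPS 180-1. The sketch below restates them as functions and shows which one applies in each group of 20 rounds. The names sha1_ch, sha1_parity, sha1_maj and sha1_round_fn are hypothetical and do not appear in sha1.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; sha1.c itself uses the F, G and H macros. */
static uint32_t
sha1_ch(uint32_t b, uint32_t c, uint32_t d)
{
	return ((b & c) | (~b & d));	/* per bit: b ? c : d */
}

static uint32_t
sha1_parity(uint32_t b, uint32_t c, uint32_t d)
{
	return (b ^ c ^ d);		/* per bit: odd number of ones */
}

static uint32_t
sha1_maj(uint32_t b, uint32_t c, uint32_t d)
{
	return ((b & c) | ((b | c) & d));	/* per bit: at least two ones */
}

/* Select the round function for round t (0..79) as FIPS 180-1 defines it. */
static uint32_t
sha1_round_fn(unsigned t, uint32_t b, uint32_t c, uint32_t d)
{
	assert(t < 80);
	if (t < 20)
		return (sha1_ch(b, c, d));
	if (t < 40 || t >= 60)
		return (sha1_parity(b, c, d));
	return (sha1_maj(b, c, d));
}
```

Writing these as functions evaluates each argument once, which sidesteps the unparenthesized `~b` in the F macro when an expression is passed as `b`.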
  96  111  
  97  112  /*
  98  113   * ROTATE_LEFT rotates x left n bits.
  99  114   */
 100  115  
 101  116  #if     defined(__GNUC__) && defined(_LP64)
 102  117  static __inline__ uint64_t
 103  118  ROTATE_LEFT(uint64_t value, uint32_t n)
 104  119  {
 105  120          uint32_t t32;
 106  121  
 107  122          t32 = (uint32_t)value;
 108  123          return ((t32 << n) | (t32 >> (32 - n)));
 109  124  }
 110  125  
 111  126  #else
 112  127  
 113  128  #define ROTATE_LEFT(x, n)       \
 114  129          (((x) << (n)) | ((x) >> ((sizeof (x) * NBBY)-(n))))
 115  130  
 116  131  #endif
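
Note (illustrative, not part of the diff): the macro form of ROTATE_LEFT derives the word width from `sizeof (x) * NBBY`, so it is a 32-bit rotate only when `x` has a 32-bit type, while the inline version truncates to uint32_t first. A minimal fixed-width sketch, assuming a hypothetical helper name rotl32 and a shift count strictly between 0 and 32:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of sha1.c: rotate a 32-bit word left. */
static inline uint32_t
rotl32(uint32_t x, unsigned n)
{
	assert(n > 0 && n < 32);	/* a shift by 32 would be undefined */
	return ((x << n) | (x >> (32 - n)));
}
```

SHA-1 only rotates by 1, 5 and 30 bits, so the restriction on `n` costs nothing here.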
 117  132  
 118  133  
 119  134  /*
 120  135   * SHA1Init()
 121  136   *
 122  137   * purpose: initializes the sha1 context and begins an sha1 digest operation
 123  138   *   input: SHA1_CTX *  : the context to initialize.
 124  139   *  output: void
 125  140   */
 126  141  
 127  142  void
 128  143  SHA1Init(SHA1_CTX *ctx)
 129  144  {
 130  145          ctx->count[0] = ctx->count[1] = 0;
 131  146  
 132  147          /*
 133  148           * load magic initialization constants. Tell lint
 134  149           * that these constants are unsigned by using U.
 135  150           */
 136  151  
 137  152          ctx->state[0] = 0x67452301U;
 138  153          ctx->state[1] = 0xefcdab89U;
 139  154          ctx->state[2] = 0x98badcfeU;
 140  155          ctx->state[3] = 0x10325476U;
 141  156          ctx->state[4] = 0xc3d2e1f0U;
 142  157  }
 143  158  
 144  159  #ifdef VIS_SHA1
 145  160  #ifdef _KERNEL
 146  161  
 147  162  #include <sys/regset.h>
 148  163  #include <sys/vis.h>
 149  164  #include <sys/fpu/fpusystm.h>
 150  165  
 151  166  /* the alignment for block stores to save fp registers */
 152  167  #define VIS_ALIGN       (64)
 153  168  
 154  169  extern int sha1_savefp(kfpu_t *, int);
 155  170  extern void sha1_restorefp(kfpu_t *);
 156  171  
 157  172  uint32_t        vis_sha1_svfp_threshold = 128;
 158  173  
 159  174  #endif /* _KERNEL */
 160  175  
 161  176  /*
 162  177   * VIS SHA-1 consts.
 163  178   */
 164  179  static uint64_t VIS[] = {
 165  180          0x8000000080000000ULL,
 166  181          0x0002000200020002ULL,
 167  182          0x5a8279996ed9eba1ULL,
 168  183          0x8f1bbcdcca62c1d6ULL,
 169  184          0x012389ab456789abULL};
 170  185  
 171  186  extern void SHA1TransformVIS(uint64_t *, uint32_t *, uint32_t *, uint64_t *);
 172  187  
 173  188  
 174  189  /*
 175  190   * SHA1Update()
 176  191   *
 177  192   * purpose: continues an sha1 digest operation, using the message block
 178  193   *          to update the context.
 179  194   *   input: SHA1_CTX *  : the context to update
 180  195   *          void *      : the message block
 181  196   *          size_t    : the length of the message block in bytes
 182  197   *  output: void
 183  198   */
 184  199  
 185  200  void
 186  201  SHA1Update(SHA1_CTX *ctx, const void *inptr, size_t input_len)
 187  202  {
 188  203          uint32_t i, buf_index, buf_len;
 189  204          uint64_t X0[40], input64[8];
 190  205          const uint8_t *input = inptr;
 191  206  #ifdef _KERNEL
 192  207          int usevis = 0;
 193  208  #else
 194  209          int usevis = 1;
 195  210  #endif /* _KERNEL */
 196  211  
 197  212          /* check for noop */
 198  213          if (input_len == 0)
 199  214                  return;
 200  215  
 201  216          /* compute number of bytes mod 64 */
 202  217          buf_index = (ctx->count[1] >> 3) & 0x3F;
 203  218  
 204  219          /* update number of bits */
 205  220          if ((ctx->count[1] += (input_len << 3)) < (input_len << 3))
 206  221                  ctx->count[0]++;
 207  222  
 208  223          ctx->count[0] += (input_len >> 29);
 209  224  
 210  225          buf_len = 64 - buf_index;
 211  226  
 212  227          /* transform as many times as possible */
 213  228          i = 0;
 214  229          if (input_len >= buf_len) {
 215  230  #ifdef _KERNEL
 216  231                  kfpu_t *fpu;
 217  232                  if (fpu_exists) {
 218  233                          uint8_t fpua[sizeof (kfpu_t) + GSR_SIZE + VIS_ALIGN];
 219  234                          uint32_t len = (input_len + buf_index) & ~0x3f;
 220  235                          int svfp_ok;
 221  236  
 222  237                          fpu = (kfpu_t *)P2ROUNDUP((uintptr_t)fpua, 64);
 223  238                          svfp_ok = ((len >= vis_sha1_svfp_threshold) ? 1 : 0);
 224  239                          usevis = fpu_exists && sha1_savefp(fpu, svfp_ok);
 225  240                  } else {
 226  241                          usevis = 0;
 227  242                  }
 228  243  #endif /* _KERNEL */
 229  244  
 230  245                  /*
 231  246                   * general optimization:
 232  247                   *
 233  248                   * only do initial bcopy() and SHA1Transform() if
 234  249                   * buf_index != 0.  if buf_index == 0, we're just
 235  250                   * wasting our time doing the bcopy() since there
 236  251                   * wasn't any data left over from a previous call to
 237  252                   * SHA1Update().
 238  253                   */
 239  254  
 240  255                  if (buf_index) {
 241  256                          bcopy(input, &ctx->buf_un.buf8[buf_index], buf_len);
 242  257                          if (usevis) {
 243  258                                  SHA1TransformVIS(X0,
 244  259                                      ctx->buf_un.buf32,
 245  260                                      &ctx->state[0], VIS);
 246  261                          } else {
 247  262                                  SHA1_TRANSFORM(ctx, ctx->buf_un.buf8);
 248  263                          }
 249  264                          i = buf_len;
 250  265                  }
 251  266  
 252  267                  /*
 253  268                   * VIS SHA-1: uses the VIS 1.0 instructions to accelerate
 254  269                   * SHA-1 processing. This is achieved by "offloading" the
 255  270                   * computation of the message schedule (MS) to the VIS units.
 256  271                   * This allows the VIS computation of the message schedule
 257  272                   * to be performed in parallel with the standard integer
 258  273                   * processing of the remainder of the SHA-1 computation.
 259  274                   * This improves performance by up to around 1.37X, compared
 260  275                   * to an optimized integer-only implementation.
 261  276                   *
 262  277                   * The VIS implementation of SHA1Transform has a different API
 263  278                   * to the standard integer version:
 264  279                   *
 265  280                   * void SHA1TransformVIS(
 266  281                   *       uint64_t *, // Pointer to MS for ith block
 267  282                   *       uint32_t *, // Pointer to ith block of message data
 268  283                   *       uint32_t *, // Pointer to SHA state i.e ctx->state
 269  284                   *       uint64_t *, // Pointer to various VIS constants
 270  285                   * )
 271  286                   *
 272  287                   * Note: the message data must be 4-byte aligned.
 273  288                   *
 274  289                   * Function requires VIS 1.0 support.
 275  290                   *
 276  291                   * Handling is provided to deal with arbitrary byte alignment
 277  292                   * of the input data but the performance gains are reduced
 278  293                   * for alignments other than 4-bytes.
 279  294                   */
 280  295                  if (usevis) {
 281  296                          if (!IS_P2ALIGNED(&input[i], sizeof (uint32_t))) {
 282  297                                  /*
 283  298                                   * Main processing loop - input misaligned
 284  299                                   */
 285  300                                  for (; i + 63 < input_len; i += 64) {
 286  301                                          bcopy(&input[i], input64, 64);
 287  302                                          SHA1TransformVIS(X0,
 288  303                                              (uint32_t *)input64,
 289  304                                              &ctx->state[0], VIS);
 290  305                                  }
 291  306                          } else {
 292  307                                  /*
 293  308                                   * Main processing loop - input 4-byte aligned
 294  309                                   */
 295  310                                  for (; i + 63 < input_len; i += 64) {
 296  311                                          SHA1TransformVIS(X0,
 297  312                                              /* LINTED E_BAD_PTR_CAST_ALIGN */
 298  313                                              (uint32_t *)&input[i], /* CSTYLED */
 299  314                                              &ctx->state[0], VIS);
 300  315                                  }
 301  316  
 302  317                          }
 303  318  #ifdef _KERNEL
 304  319                          sha1_restorefp(fpu);
 305  320  #endif /* _KERNEL */
 306  321                  } else {
 307  322                          for (; i + 63 < input_len; i += 64) {
 308  323                                  SHA1_TRANSFORM(ctx, &input[i]);
 309  324                          }
 310  325                  }
 311  326  
 312  327                  /*
 313  328                   * general optimization:
 314  329                   *
 315  330                   * if i and input_len are the same, return now instead
 316  331                   * of calling bcopy(), since the bcopy() in this case
 317  332                   * will be an expensive nop.
 318  333                   */
 319  334  
 320  335                  if (input_len == i)
 321  336                          return;
 322  337  
 323  338                  buf_index = 0;
 324  339          }
 325  340  
 326  341          /* buffer remaining input */

 327  342          bcopy(&input[i], &ctx->buf_un.buf8[buf_index], input_len - i);
 328  343  }
 329  344  
 330  345  #else /* VIS_SHA1 */
 331  346  
 332  347  void
 333  348  SHA1Update(SHA1_CTX *ctx, const void *inptr, size_t input_len)
 334  349  {
 335  350          uint32_t i, buf_index, buf_len;
 336  351          const uint8_t *input = inptr;
 337      -#if defined(__amd64)
      352 +#if defined(__amd64) && !defined(_STANDALONE)
 338  353          uint32_t        block_count;
 339  354  #endif  /* __amd64 && !_STANDALONE */
 340  355  
 341  356          /* check for noop */
 342  357          if (input_len == 0)
 343  358                  return;
 344  359  
 345  360          /* compute number of bytes mod 64 */
 346  361          buf_index = (ctx->count[1] >> 3) & 0x3F;
 347  362

 348  363          /* update number of bits */
 349  364          if ((ctx->count[1] += (input_len << 3)) < (input_len << 3))
 350  365                  ctx->count[0]++;
 351  366  
 352  367          ctx->count[0] += (input_len >> 29);
 353  368  
 354  369          buf_len = 64 - buf_index;
 355  370  
 356  371          /* transform as many times as possible */
 357  372          i = 0;
 358  373          if (input_len >= buf_len) {
 359  374  
 360  375                  /*
 361  376                   * general optimization:
 362  377                   *
 363  378                   * only do initial bcopy() and SHA1Transform() if
 364  379                   * buf_index != 0.  if buf_index == 0, we're just
 365  380                   * wasting our time doing the bcopy() since there

 366  381                   * wasn't any data left over from a previous call to
 367  382                   * SHA1Update().
 368  383                   */
 369  384  
 370  385                  if (buf_index) {
 371  386                          bcopy(input, &ctx->buf_un.buf8[buf_index], buf_len);
 372  387                          SHA1_TRANSFORM(ctx, ctx->buf_un.buf8);
 373  388                          i = buf_len;
 374  389                  }
 375  390  
 376      -#if !defined(__amd64)
      391 +#if !defined(__amd64) || defined(_STANDALONE)
 377  392                  for (; i + 63 < input_len; i += 64)
 378  393                          SHA1_TRANSFORM(ctx, &input[i]);
 379  394  #else
 380  395                  block_count = (input_len - i) >> 6;
 381  396                  if (block_count > 0) {
 382  397                          SHA1_TRANSFORM_BLOCKS(ctx, &input[i], block_count);
 383  398                          i += block_count << 6;
 384  399                  }
 385  400  #endif  /* !__amd64 || _STANDALONE */
 386  401

 387  402                  /*
 388  403                   * general optimization:
 389  404                   *
 390  405                   * if i and input_len are the same, return now instead
 391  406                   * of calling bcopy(), since the bcopy() in this case
 392  407                   * will be an expensive nop.
 393  408                   */
 394  409  
 395  410                  if (input_len == i)
 396  411                          return;
 397  412  
 398  413                  buf_index = 0;
 399  414          }
 400  415  
 401  416          /* buffer remaining input */
 402  417          bcopy(&input[i], &ctx->buf_un.buf8[buf_index], input_len - i);
 403  418  }
 404  419  
 405  420  #endif /* VIS_SHA1 */
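
Note (illustrative, not part of the diff): both SHA1Update() variants keep the message length as a 64-bit bit count split across count[0] (high word) and count[1] (low word), and derive the buffer offset from the low word. The sketch below isolates that arithmetic for a 32-bit length. The struct and function names are hypothetical stand-ins, not the SHA1_CTX layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the count[0] (hi) / count[1] (lo) pair. */
struct bitcount {
	uint32_t hi;
	uint32_t lo;
};

/* Add len bytes, expressed in bits, to the two-word counter. */
static void
bitcount_add(struct bitcount *bc, uint32_t len)
{
	uint32_t bits = len << 3;	/* low 32 bits of len * 8 */

	assert(bc != NULL);
	bc->lo += bits;
	if (bc->lo < bits)		/* unsigned wraparound means a carry */
		bc->hi++;
	bc->hi += len >> 29;		/* bits shifted out of len << 3 */
}

/* Bytes already sitting in the 64-byte block buffer (byte count mod 64). */
static uint32_t
bitcount_buf_index(const struct bitcount *bc)
{
	assert(bc != NULL);
	return ((bc->lo >> 3) & 0x3f);
}
```

For example, after 70 bytes the low word holds 560 bits and the buffer index is 6, so the next call has 58 bytes of room before a transform is due.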
 406  421  
 407  422  /*
 408  423   * SHA1Final()
 409  424   *
 410  425   * purpose: ends an sha1 digest operation, finalizing the message digest and
 411  426   *          zeroing the context.
 412  427   *   input: uchar_t *   : A buffer to store the digest.
 413  428   *                      : The function actually uses void* because many
 414  429   *                      : callers pass things other than uchar_t here.
 415  430   *          SHA1_CTX *  : the context to finalize, save, and zero
 416  431   *  output: void
 417  432   */
 418  433  
 419  434  void
 420  435  SHA1Final(void *digest, SHA1_CTX *ctx)
 421  436  {
 422  437          uint8_t         bitcount_be[sizeof (ctx->count)];
 423  438          uint32_t        index = (ctx->count[1] >> 3) & 0x3f;
 424  439  
 425  440          /* store bit count, big endian */
 426  441          Encode(bitcount_be, ctx->count, sizeof (bitcount_be));
 427  442  
 428  443          /* pad out to 56 mod 64 */
 429  444          SHA1Update(ctx, PADDING, ((index < 56) ? 56 : 120) - index);
 430  445  
 431  446          /* append length (before padding) */

 432  447          SHA1Update(ctx, bitcount_be, sizeof (bitcount_be));
 433  448  
 434  449          /* store state in digest */
 435  450          Encode(digest, ctx->state, sizeof (ctx->state));
 436  451  
 437  452          /* zeroize sensitive information */
 438  453          bzero(ctx, sizeof (*ctx));
 439  454  }
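
Note (illustrative, not part of the diff): SHA1Final() pads the message so that the buffered byte count becomes 56 mod 64, leaving exactly 8 bytes for the big-endian bit count. The expression `((index < 56) ? 56 : 120) - index` always yields between 1 and 64 padding bytes. A small sketch of that calculation, using the hypothetical name sha1_pad_len:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the padding-length expression in
 * SHA1Final(): index is the number of bytes currently buffered (0..63).
 */
static uint32_t
sha1_pad_len(uint32_t index)
{
	assert(index < 64);
	return (((index < 56) ? 56 : 120) - index);
}
```

When 56 or more bytes are already buffered there is no room for the length in the current block, so the padding spills into a second block (120 = 64 + 56).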
 440  455  
 441  456  
 442      -#if !defined(__amd64)
      457 +#if !defined(__amd64) || defined(_STANDALONE)
 443  458  
 444  459  typedef uint32_t sha1word;
 445  460  
 446  461  /*
 447  462   * sparc optimization:
 448  463   *
 449  464   * on the sparc, we can load big endian 32-bit data easily.  note that
 450  465   * special care must be taken to ensure the address is 32-bit aligned.
 451  466   * in the interest of speed, we don't check to make sure, since
 452  467   * careful programming can guarantee this for us.

 453  468   */
 454  469  
 455  470  #if     defined(_BIG_ENDIAN)
 456  471  #define LOAD_BIG_32(addr)       (*(uint32_t *)(addr))
 457  472  
 458  473  #elif   defined(HAVE_HTONL)
 459  474  #define LOAD_BIG_32(addr) htonl(*((uint32_t *)(addr)))
 460  475  
 461  476  #else
 462  477  /* little endian -- will work on big endian, but slowly */
 463  478  #define LOAD_BIG_32(addr)       \
 464  479          (((addr)[0] << 24) | ((addr)[1] << 16) | ((addr)[2] << 8) | (addr)[3])
 465  480  #endif  /* _BIG_ENDIAN */
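
Note (illustrative, not part of the diff): LOAD_BIG_32 has three forms: a direct load on big-endian hosts, a load plus htonl() where HAVE_HTONL is defined, and a byte-wise fallback. Only the byte-wise form is independent of both alignment and host byte order. The sketch below is a function version of that fallback, with each byte cast to uint32_t before shifting so the shifts are done on an unsigned type. The name load_be32 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a big-endian 32-bit word from any address. */
static uint32_t
load_be32(const uint8_t *p)
{
	assert(p != NULL);
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}
```

The other two forms trade this portability for speed and rely on the caller passing a 32-bit aligned address, as the comment above the macro warns.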
 466  481  
 467  482  /*
 468  483   * SHA1Transform()
 469  484   */
 470  485  #if     defined(W_ARRAY)
 471  486  #define W(n) w[n]
 472  487  #else   /* !defined(W_ARRAY) */
 473  488  #define W(n) w_ ## n
 474  489  #endif  /* !defined(W_ARRAY) */
 475  490  
 476  491  
 477  492  #if     defined(__sparc)
 478  493  
 479  494  /*
 480  495   * sparc register window optimization:
 481  496   *
 482  497   * `a', `b', `c', `d', and `e' are passed into SHA1Transform
 483  498   * explicitly since it increases the number of registers available to
 484  499   * the compiler.  under this scheme, these variables can be held in
 485  500   * %i0 - %i4, which leaves more local and out registers available.
 486  501   *
 487  502   * purpose: sha1 transformation -- updates the digest based on `block'
 488  503   *   input: uint32_t    : bytes  1 -  4 of the digest
 489  504   *          uint32_t    : bytes  5 -  8 of the digest
 490  505   *          uint32_t    : bytes  9 - 12 of the digest
 491  506   *          uint32_t    : bytes 13 - 16 of the digest
 492  507   *          uint32_t    : bytes 17 - 20 of the digest
 493  508   *          SHA1_CTX *  : the context to update
 494  509   *          uint8_t [64]: the block to use to update the digest
 495  510   *  output: void
 496  511   */
 497  512  
 498  513  void
 499  514  SHA1Transform(uint32_t a, uint32_t b, uint32_t c, uint32_t d, uint32_t e,
 500  515      SHA1_CTX *ctx, const uint8_t blk[64])
 501  516  {
 502  517          /*
 503  518           * sparc optimization:
 504  519           *
 505  520           * while it is somewhat counter-intuitive, on sparc, it is
 506  521           * more efficient to place all the constants used in this
 507  522           * function in an array and load the values out of the array
 508  523           * than to manually load the constants.  this is because
 509  524           * setting a register to a 32-bit value takes two ops in most
 510  525           * cases: a `sethi' and an `or', but loading a 32-bit value
 511  526           * from memory only takes one `ld' (or `lduw' on v9).  while
 512  527           * this increases memory usage, the compiler can find enough
 513  528           * other things to do while waiting to keep the pipeline from
 514  529           * stalling.  additionally, it is likely that many of these
 515  530           * constants are cached so that later accesses do not even go
 516  531           * out to the bus.
 517  532           *
 518  533           * this array is declared `static' to keep the compiler from
 519  534           * having to bcopy() this array onto the stack frame of
 520  535           * SHA1Transform() each time it is called -- which is
 521  536           * unacceptably expensive.
 522  537           *
 523  538           * the `const' is to ensure that callers are good citizens and
 524  539           * do not try to munge the array.  since these routines are
 525  540           * going to be called from inside multithreaded kernelland,
 526  541           * this is a good safety check. -- `sha1_consts' will end up in
 527  542           * .rodata.
 528  543           *
 529  544           * unfortunately, loading from an array in this manner hurts
 530  545           * performance under Intel.  So, there is a macro,
 531  546           * SHA1_CONST(), used in SHA1Transform(), that either expands to
 532  547           * a reference to this array, or to the actual constant,
 533  548           * depending on what platform this code is compiled for.
 534  549           */
 535  550  
 536  551          static const uint32_t sha1_consts[] = {
 537  552                  SHA1_CONST_0, SHA1_CONST_1, SHA1_CONST_2, SHA1_CONST_3
 538  553          };
 539  554  
 540  555          /*
 541  556           * general optimization:
 542  557           *
 543  558           * use individual integers instead of using an array.  this is a
 544  559           * win, although the amount it wins by seems to vary quite a bit.
 545  560           */
 546  561  
 547  562          uint32_t        w_0, w_1, w_2,  w_3,  w_4,  w_5,  w_6,  w_7;
 548  563          uint32_t        w_8, w_9, w_10, w_11, w_12, w_13, w_14, w_15;
 549  564  
 550  565          /*
 551  566           * sparc optimization:
 552  567           *
 553  568           * if `block' is already aligned on a 4-byte boundary, use
 554  569           * LOAD_BIG_32() directly.  otherwise, bcopy() into a
 555  570           * buffer that *is* aligned on a 4-byte boundary and then do
 556  571           * the LOAD_BIG_32() on that buffer.  benchmarks have shown
 557  572           * that using the bcopy() is better than loading the bytes
 558  573           * individually and doing the endian-swap by hand.
 559  574           *
 560  575           * even though it's quite tempting to do:
 561  576           *
 562  577           * blk = bcopy(ctx->buf_un.buf32, blk, sizeof (ctx->buf_un.buf32));
 563  578           *
 564  579           * and only have one set of LOAD_BIG_32()'s, the compiler
 565  580           * *does not* like that, so please resist the urge.
 566  581           */
 567  582  
 568  583          if ((uintptr_t)blk & 0x3) {             /* not 4-byte aligned? */
 569  584                  bcopy(blk, ctx->buf_un.buf32,  sizeof (ctx->buf_un.buf32));
 570  585                  w_15 = LOAD_BIG_32(ctx->buf_un.buf32 + 15);
 571  586                  w_14 = LOAD_BIG_32(ctx->buf_un.buf32 + 14);
 572  587                  w_13 = LOAD_BIG_32(ctx->buf_un.buf32 + 13);
 573  588                  w_12 = LOAD_BIG_32(ctx->buf_un.buf32 + 12);
 574  589                  w_11 = LOAD_BIG_32(ctx->buf_un.buf32 + 11);
 575  590                  w_10 = LOAD_BIG_32(ctx->buf_un.buf32 + 10);
 576  591                  w_9  = LOAD_BIG_32(ctx->buf_un.buf32 +  9);
 577  592                  w_8  = LOAD_BIG_32(ctx->buf_un.buf32 +  8);
 578  593                  w_7  = LOAD_BIG_32(ctx->buf_un.buf32 +  7);
 579  594                  w_6  = LOAD_BIG_32(ctx->buf_un.buf32 +  6);
 580  595                  w_5  = LOAD_BIG_32(ctx->buf_un.buf32 +  5);
 581  596                  w_4  = LOAD_BIG_32(ctx->buf_un.buf32 +  4);
 582  597                  w_3  = LOAD_BIG_32(ctx->buf_un.buf32 +  3);
 583  598                  w_2  = LOAD_BIG_32(ctx->buf_un.buf32 +  2);
 584  599                  w_1  = LOAD_BIG_32(ctx->buf_un.buf32 +  1);
 585  600                  w_0  = LOAD_BIG_32(ctx->buf_un.buf32 +  0);
 586  601          } else {
 587  602                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 588  603                  w_15 = LOAD_BIG_32(blk + 60);
 589  604                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 590  605                  w_14 = LOAD_BIG_32(blk + 56);
 591  606                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 592  607                  w_13 = LOAD_BIG_32(blk + 52);
 593  608                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 594  609                  w_12 = LOAD_BIG_32(blk + 48);
 595  610                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 596  611                  w_11 = LOAD_BIG_32(blk + 44);
 597  612                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 598  613                  w_10 = LOAD_BIG_32(blk + 40);
 599  614                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 600  615                  w_9  = LOAD_BIG_32(blk + 36);
 601  616                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 602  617                  w_8  = LOAD_BIG_32(blk + 32);
 603  618                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 604  619                  w_7  = LOAD_BIG_32(blk + 28);
 605  620                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 606  621                  w_6  = LOAD_BIG_32(blk + 24);
 607  622                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 608  623                  w_5  = LOAD_BIG_32(blk + 20);
 609  624                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 610  625                  w_4  = LOAD_BIG_32(blk + 16);
 611  626                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 612  627                  w_3  = LOAD_BIG_32(blk + 12);
 613  628                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 614  629                  w_2  = LOAD_BIG_32(blk +  8);
 615  630                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 616  631                  w_1  = LOAD_BIG_32(blk +  4);
 617  632                  /* LINTED E_BAD_PTR_CAST_ALIGN */
 618  633                  w_0  = LOAD_BIG_32(blk +  0);
 619  634          }
 620  635  #else   /* !defined(__sparc) */
 621  636  
 622  637  void /* CSTYLED */
 623  638  SHA1Transform(SHA1_CTX *ctx, const uint8_t blk[64])
 624  639  {
 625  640          /* CSTYLED */
 626  641          sha1word a = ctx->state[0];
 627  642          sha1word b = ctx->state[1];
 628  643          sha1word c = ctx->state[2];
 629  644          sha1word d = ctx->state[3];
 630  645          sha1word e = ctx->state[4];
 631  646  
 632  647  #if     defined(W_ARRAY)
 633  648          sha1word        w[16];
 634  649  #else   /* !defined(W_ARRAY) */
 635  650          sha1word        w_0, w_1, w_2,  w_3,  w_4,  w_5,  w_6,  w_7;
 636  651          sha1word        w_8, w_9, w_10, w_11, w_12, w_13, w_14, w_15;
 637  652  #endif  /* !defined(W_ARRAY) */
 638  653  
 639  654          W(0)  = LOAD_BIG_32((void *)(blk +  0));
 640  655          W(1)  = LOAD_BIG_32((void *)(blk +  4));
 641  656          W(2)  = LOAD_BIG_32((void *)(blk +  8));
 642  657          W(3)  = LOAD_BIG_32((void *)(blk + 12));
 643  658          W(4)  = LOAD_BIG_32((void *)(blk + 16));
 644  659          W(5)  = LOAD_BIG_32((void *)(blk + 20));
 645  660          W(6)  = LOAD_BIG_32((void *)(blk + 24));
 646  661          W(7)  = LOAD_BIG_32((void *)(blk + 28));
 647  662          W(8)  = LOAD_BIG_32((void *)(blk + 32));
 648  663          W(9)  = LOAD_BIG_32((void *)(blk + 36));
 649  664          W(10) = LOAD_BIG_32((void *)(blk + 40));
 650  665          W(11) = LOAD_BIG_32((void *)(blk + 44));
 651  666          W(12) = LOAD_BIG_32((void *)(blk + 48));
 652  667          W(13) = LOAD_BIG_32((void *)(blk + 52));
 653  668          W(14) = LOAD_BIG_32((void *)(blk + 56));
 654  669          W(15) = LOAD_BIG_32((void *)(blk + 60));
 655  670  
 656  671  #endif  /* !defined(__sparc) */
 657  672  
 658  673          /*
 659  674           * general optimization:
 660  675           *
 661  676           * even though this approach is described in the standard as
 662  677           * being slower algorithmically, it is 30-40% faster than the
 663  678           * "faster" version under SPARC, because this version has more
 664  679           * of the constraints specified at compile-time and uses fewer
 665  680           * variables (and therefore has better register utilization)
 666  681           * than its "speedier" brother.  (i've tried both, trust me)
 667  682           *
 668  683           * for either method given in the spec, there is an "assignment"
 669  684           * phase where the following takes place:
 670  685           *
 671  686           *      tmp = (main_computation);
 672  687           *      e = d; d = c; c = rotate_left(b, 30); b = a; a = tmp;
 673  688           *
 674  689           * we can make the algorithm go faster by not doing this work,
 675  690           * but just pretending that `d' is now `e', etc. this works
 676  691           * really well and obviates the need for a temporary variable.
 677  692           * however, we still explicitly perform the rotate action,
 678  693           * since it is cheaper on SPARC to do it once than to have to
 679  694           * do it over and over again.
 680  695           */
 681  696  
 682  697          /* round 1 */
 683  698          e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(0) + SHA1_CONST(0); /* 0 */
 684  699          b = ROTATE_LEFT(b, 30);
 685  700  
 686  701          d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(1) + SHA1_CONST(0); /* 1 */
 687  702          a = ROTATE_LEFT(a, 30);
 688  703  
 689  704          c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(2) + SHA1_CONST(0); /* 2 */
 690  705          e = ROTATE_LEFT(e, 30);
 691  706  
 692  707          b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(3) + SHA1_CONST(0); /* 3 */
 693  708          d = ROTATE_LEFT(d, 30);
 694  709  
 695  710          a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(4) + SHA1_CONST(0); /* 4 */
 696  711          c = ROTATE_LEFT(c, 30);
 697  712  
 698  713          e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(5) + SHA1_CONST(0); /* 5 */
 699  714          b = ROTATE_LEFT(b, 30);
 700  715  
 701  716          d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(6) + SHA1_CONST(0); /* 6 */
 702  717          a = ROTATE_LEFT(a, 30);
 703  718  
 704  719          c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(7) + SHA1_CONST(0); /* 7 */
 705  720          e = ROTATE_LEFT(e, 30);
 706  721  
 707  722          b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(8) + SHA1_CONST(0); /* 8 */
 708  723          d = ROTATE_LEFT(d, 30);
 709  724  
 710  725          a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(9) + SHA1_CONST(0); /* 9 */
 711  726          c = ROTATE_LEFT(c, 30);
 712  727  
 713  728          e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(10) + SHA1_CONST(0); /* 10 */
 714  729          b = ROTATE_LEFT(b, 30);
 715  730  
 716  731          d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(11) + SHA1_CONST(0); /* 11 */
 717  732          a = ROTATE_LEFT(a, 30);
 718  733  
 719  734          c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(12) + SHA1_CONST(0); /* 12 */
 720  735          e = ROTATE_LEFT(e, 30);
 721  736  
 722  737          b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(13) + SHA1_CONST(0); /* 13 */
 723  738          d = ROTATE_LEFT(d, 30);
 724  739  
 725  740          a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(14) + SHA1_CONST(0); /* 14 */
 726  741          c = ROTATE_LEFT(c, 30);
 727  742  
 728  743          e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(15) + SHA1_CONST(0); /* 15 */
 729  744          b = ROTATE_LEFT(b, 30);
 730  745  
 731  746          W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1);            /* 16 */
 732  747          d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(0) + SHA1_CONST(0);
 733  748          a = ROTATE_LEFT(a, 30);
 734  749  
 735  750          W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1);            /* 17 */
 736  751          c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(1) + SHA1_CONST(0);
 737  752          e = ROTATE_LEFT(e, 30);
 738  753  
 739  754          W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1);   /* 18 */
 740  755          b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(2) + SHA1_CONST(0);
 741  756          d = ROTATE_LEFT(d, 30);
 742  757  
 743  758          W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1);            /* 19 */
 744  759          a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(3) + SHA1_CONST(0);
 745  760          c = ROTATE_LEFT(c, 30);
 746  761  
 747  762          /* round 2 */
 748  763          W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1);            /* 20 */
 749  764          e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(4) + SHA1_CONST(1);
 750  765          b = ROTATE_LEFT(b, 30);
 751  766  
 752  767          W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1);            /* 21 */
 753  768          d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(5) + SHA1_CONST(1);
 754  769          a = ROTATE_LEFT(a, 30);
 755  770  
 756  771          W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1);            /* 22 */
 757  772          c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(6) + SHA1_CONST(1);
 758  773          e = ROTATE_LEFT(e, 30);
 759  774  
 760  775          W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1);            /* 23 */
 761  776          b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(7) + SHA1_CONST(1);
 762  777          d = ROTATE_LEFT(d, 30);
 763  778  
 764  779          W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1);            /* 24 */
 765  780          a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(8) + SHA1_CONST(1);
 766  781          c = ROTATE_LEFT(c, 30);
 767  782  
 768  783          W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1);            /* 25 */
 769  784          e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(9) + SHA1_CONST(1);
 770  785          b = ROTATE_LEFT(b, 30);
 771  786  
 772  787          W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1);  /* 26 */
 773  788          d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(10) + SHA1_CONST(1);
 774  789          a = ROTATE_LEFT(a, 30);
 775  790  
 776  791          W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1);  /* 27 */
 777  792          c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(11) + SHA1_CONST(1);
 778  793          e = ROTATE_LEFT(e, 30);
 779  794  
 780  795          W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1);  /* 28 */
 781  796          b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(12) + SHA1_CONST(1);
 782  797          d = ROTATE_LEFT(d, 30);
 783  798  
 784  799          W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 29 */
 785  800          a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(13) + SHA1_CONST(1);
 786  801          c = ROTATE_LEFT(c, 30);
 787  802  
 788  803          W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1);  /* 30 */
 789  804          e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(14) + SHA1_CONST(1);
 790  805          b = ROTATE_LEFT(b, 30);
 791  806  
 792  807          W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1);  /* 31 */
 793  808          d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(15) + SHA1_CONST(1);
 794  809          a = ROTATE_LEFT(a, 30);
 795  810  
 796  811          W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1);            /* 32 */
 797  812          c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(0) + SHA1_CONST(1);
 798  813          e = ROTATE_LEFT(e, 30);
 799  814  
 800  815          W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1);            /* 33 */
 801  816          b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(1) + SHA1_CONST(1);
 802  817          d = ROTATE_LEFT(d, 30);
 803  818  
 804  819          W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1);   /* 34 */
 805  820          a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(2) + SHA1_CONST(1);
 806  821          c = ROTATE_LEFT(c, 30);
 807  822  
 808  823          W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1);            /* 35 */
 809  824          e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(3) + SHA1_CONST(1);
 810  825          b = ROTATE_LEFT(b, 30);
 811  826  
 812  827          W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1);            /* 36 */
 813  828          d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(4) + SHA1_CONST(1);
 814  829          a = ROTATE_LEFT(a, 30);
 815  830  
 816  831          W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1);            /* 37 */
 817  832          c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(5) + SHA1_CONST(1);
 818  833          e = ROTATE_LEFT(e, 30);
 819  834  
 820  835          W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1);            /* 38 */
 821  836          b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(6) + SHA1_CONST(1);
 822  837          d = ROTATE_LEFT(d, 30);
 823  838  
 824  839          W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1);            /* 39 */
 825  840          a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(7) + SHA1_CONST(1);
 826  841          c = ROTATE_LEFT(c, 30);
 827  842  
 828  843          /* round 3 */
 829  844          W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1);            /* 40 */
 830  845          e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(8) + SHA1_CONST(2);
 831  846          b = ROTATE_LEFT(b, 30);
 832  847  
 833  848          W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1);            /* 41 */
 834  849          d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(9) + SHA1_CONST(2);
 835  850          a = ROTATE_LEFT(a, 30);
 836  851  
 837  852          W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1);  /* 42 */
 838  853          c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(10) + SHA1_CONST(2);
 839  854          e = ROTATE_LEFT(e, 30);
 840  855  
 841  856          W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1);  /* 43 */
 842  857          b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(11) + SHA1_CONST(2);
 843  858          d = ROTATE_LEFT(d, 30);
 844  859  
 845  860          W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1);  /* 44 */
 846  861          a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(12) + SHA1_CONST(2);
 847  862          c = ROTATE_LEFT(c, 30);
 848  863  
 849  864          W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 45 */
 850  865          e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(13) + SHA1_CONST(2);
 851  866          b = ROTATE_LEFT(b, 30);
 852  867  
 853  868          W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1);  /* 46 */
 854  869          d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(14) + SHA1_CONST(2);
 855  870          a = ROTATE_LEFT(a, 30);
 856  871  
 857  872          W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1);  /* 47 */
 858  873          c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(15) + SHA1_CONST(2);
 859  874          e = ROTATE_LEFT(e, 30);
 860  875  
 861  876          W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1);            /* 48 */
 862  877          b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(0) + SHA1_CONST(2);
 863  878          d = ROTATE_LEFT(d, 30);
 864  879  
 865  880          W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1);            /* 49 */
 866  881          a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(1) + SHA1_CONST(2);
 867  882          c = ROTATE_LEFT(c, 30);
 868  883  
 869  884          W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1);   /* 50 */
 870  885          e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(2) + SHA1_CONST(2);
 871  886          b = ROTATE_LEFT(b, 30);
 872  887  
 873  888          W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1);            /* 51 */
 874  889          d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(3) + SHA1_CONST(2);
 875  890          a = ROTATE_LEFT(a, 30);
 876  891  
 877  892          W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1);            /* 52 */
 878  893          c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(4) + SHA1_CONST(2);
 879  894          e = ROTATE_LEFT(e, 30);
 880  895  
 881  896          W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1);            /* 53 */
 882  897          b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(5) + SHA1_CONST(2);
 883  898          d = ROTATE_LEFT(d, 30);
 884  899  
 885  900          W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1);            /* 54 */
 886  901          a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(6) + SHA1_CONST(2);
 887  902          c = ROTATE_LEFT(c, 30);
 888  903  
 889  904          W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1);            /* 55 */
 890  905          e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(7) + SHA1_CONST(2);
 891  906          b = ROTATE_LEFT(b, 30);
 892  907  
 893  908          W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1);            /* 56 */
 894  909          d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(8) + SHA1_CONST(2);
 895  910          a = ROTATE_LEFT(a, 30);
 896  911  
 897  912          W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1);            /* 57 */
 898  913          c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(9) + SHA1_CONST(2);
 899  914          e = ROTATE_LEFT(e, 30);
 900  915  
 901  916          W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1);  /* 58 */
 902  917          b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(10) + SHA1_CONST(2);
 903  918          d = ROTATE_LEFT(d, 30);
 904  919  
 905  920          W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1);  /* 59 */
 906  921          a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(11) + SHA1_CONST(2);
 907  922          c = ROTATE_LEFT(c, 30);
 908  923  
 909  924          /* round 4 */
 910  925          W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1);  /* 60 */
 911  926          e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(12) + SHA1_CONST(3);
 912  927          b = ROTATE_LEFT(b, 30);
 913  928  
 914  929          W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 61 */
 915  930          d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(13) + SHA1_CONST(3);
 916  931          a = ROTATE_LEFT(a, 30);
 917  932  
 918  933          W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1);  /* 62 */
 919  934          c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(14) + SHA1_CONST(3);
 920  935          e = ROTATE_LEFT(e, 30);
 921  936  
 922  937          W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1);  /* 63 */
 923  938          b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(15) + SHA1_CONST(3);
 924  939          d = ROTATE_LEFT(d, 30);
 925  940  
 926  941          W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1);            /* 64 */
 927  942          a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(0) + SHA1_CONST(3);
 928  943          c = ROTATE_LEFT(c, 30);
 929  944  
 930  945          W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1);            /* 65 */
 931  946          e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(1) + SHA1_CONST(3);
 932  947          b = ROTATE_LEFT(b, 30);
 933  948  
 934  949          W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1);   /* 66 */
 935  950          d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(2) + SHA1_CONST(3);
 936  951          a = ROTATE_LEFT(a, 30);
 937  952  
 938  953          W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1);            /* 67 */
 939  954          c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(3) + SHA1_CONST(3);
 940  955          e = ROTATE_LEFT(e, 30);
 941  956  
 942  957          W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1);            /* 68 */
 943  958          b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(4) + SHA1_CONST(3);
 944  959          d = ROTATE_LEFT(d, 30);
 945  960  
 946  961          W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1);            /* 69 */
 947  962          a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(5) + SHA1_CONST(3);
 948  963          c = ROTATE_LEFT(c, 30);
 949  964  
 950  965          W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1);            /* 70 */
 951  966          e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(6) + SHA1_CONST(3);
 952  967          b = ROTATE_LEFT(b, 30);
 953  968  
 954  969          W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1);            /* 71 */
 955  970          d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(7) + SHA1_CONST(3);
 956  971          a = ROTATE_LEFT(a, 30);
 957  972  
 958  973          W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1);            /* 72 */
 959  974          c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(8) + SHA1_CONST(3);
 960  975          e = ROTATE_LEFT(e, 30);
 961  976  
 962  977          W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1);            /* 73 */
 963  978          b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(9) + SHA1_CONST(3);
 964  979          d = ROTATE_LEFT(d, 30);
 965  980  
 966  981          W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1);  /* 74 */
 967  982          a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(10) + SHA1_CONST(3);
 968  983          c = ROTATE_LEFT(c, 30);
 969  984  
 970  985          W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1);  /* 75 */
 971  986          e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(11) + SHA1_CONST(3);
 972  987          b = ROTATE_LEFT(b, 30);
 973  988  
 974  989          W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1);  /* 76 */
 975  990          d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(12) + SHA1_CONST(3);
 976  991          a = ROTATE_LEFT(a, 30);
 977  992  
 978  993          W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 77 */
 979  994          c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(13) + SHA1_CONST(3);
 980  995          e = ROTATE_LEFT(e, 30);
 981  996  
 982  997          W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1);  /* 78 */
 983  998          b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(14) + SHA1_CONST(3);
 984  999          d = ROTATE_LEFT(d, 30);
 985 1000  
 986 1001          W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1);  /* 79 */
 987 1002  
 988 1003          ctx->state[0] += ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(15) +
 989 1004              SHA1_CONST(3);
 990 1005          ctx->state[1] += b;
 991 1006          ctx->state[2] += ROTATE_LEFT(c, 30);
 992 1007          ctx->state[3] += d;
 993 1008          ctx->state[4] += e;
 994 1009  
 995 1010          /* zeroize sensitive information */
 996 1011          W(0) = W(1) = W(2) = W(3) = W(4) = W(5) = W(6) = W(7) = W(8) = 0;
 997 1012          W(9) = W(10) = W(11) = W(12) = W(13) = W(14) = W(15) = 0;
 998 1013  }
 999 1014  #endif  /* !__amd64 */
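
The "pretend that `d' is now `e'" trick described in the general optimization comment can be checked in isolation. The sketch below is an illustration under stated assumptions, not code from sha1.c: the names `steps_textbook`, `steps_renamed`, `CH` and `K0` are hypothetical, `CH` is the usual round-1 choice function, and `K0` is the standard round-1 constant 0x5a827999. It runs five round-1 steps both ways; after five steps every variable is back in its textbook role, so the two states can be compared directly.

```c
#include <stdint.h>

#define	ROTL32(x, n)	(((x) << (n)) | ((x) >> (32 - (n))))
#define	CH(b, c, d)	(((b) & (c)) | (~(b) & (d)))
#define	K0		0x5a827999u

/* Textbook form: compute tmp, then shuffle all five variables. */
static void
steps_textbook(uint32_t s[5], const uint32_t w[5])
{
	uint32_t a = s[0], b = s[1], c = s[2], d = s[3], e = s[4], tmp;
	int t;

	for (t = 0; t < 5; t++) {
		tmp = ROTL32(a, 5) + CH(b, c, d) + e + w[t] + K0;
		e = d; d = c; c = ROTL32(b, 30); b = a; a = tmp;
	}
	s[0] = a; s[1] = b; s[2] = c; s[3] = d; s[4] = e;
}

/* Renamed form: no shuffle, the variable roles rotate instead. */
static void
steps_renamed(uint32_t s[5], const uint32_t w[5])
{
	uint32_t a = s[0], b = s[1], c = s[2], d = s[3], e = s[4];

	e = ROTL32(a, 5) + CH(b, c, d) + e + w[0] + K0;
	b = ROTL32(b, 30);
	d = ROTL32(e, 5) + CH(a, b, c) + d + w[1] + K0;
	a = ROTL32(a, 30);
	c = ROTL32(d, 5) + CH(e, a, b) + c + w[2] + K0;
	e = ROTL32(e, 30);
	b = ROTL32(c, 5) + CH(d, e, a) + b + w[3] + K0;
	d = ROTL32(d, 30);
	a = ROTL32(b, 5) + CH(c, d, e) + a + w[4] + K0;
	c = ROTL32(c, 30);
	s[0] = a; s[1] = b; s[2] = c; s[3] = d; s[4] = e;
}
```

The renamed form needs no temporary and performs one rotate per step, which matches the shape of the unrolled rounds in SHA1Transform().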
1000 1015  
1001 1016  
1002 1017  /*
1003 1018   * Encode()
1004 1019   *
1005 1020   * purpose: to convert a list of numbers from host byte order to big endian
1006 1021   *   input: uint8_t *   : place to store the converted big endian numbers
1007 1022   *          uint32_t *  : place to get numbers to convert from
1008 1023   *          size_t      : the length of the input in bytes
1009 1024   *  output: void
1010 1025   */
1011 1026  
1012 1027  static void
1013 1028  Encode(uint8_t *_RESTRICT_KYWD output, const uint32_t *_RESTRICT_KYWD input,
1014 1029      size_t len)
1015 1030  {
1016 1031          size_t          i, j;
1017 1032  
1018 1033  #if     defined(__sparc)
1019 1034          if (IS_P2ALIGNED(output, sizeof (uint32_t))) {
1020 1035                  for (i = 0, j = 0; j < len; i++, j += 4) {
1021 1036                          /* LINTED E_BAD_PTR_CAST_ALIGN */
1022 1037                          *((uint32_t *)(output + j)) = input[i];
1023 1038                  }
1024 1039          } else {
1025 1040  #endif  /* little endian -- will work on big endian, but slowly */
1026 1041                  for (i = 0, j = 0; j < len; i++, j += 4) {
1027 1042                          output[j]       = (input[i] >> 24) & 0xff;
1028 1043                          output[j + 1]   = (input[i] >> 16) & 0xff;
1029 1044                          output[j + 2]   = (input[i] >>  8) & 0xff;
1030 1045                          output[j + 3]   = input[i] & 0xff;
1031 1046                  }
1032 1047  #if     defined(__sparc)
1033 1048          }
1034 1049  #endif
1035 1050  }
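
For reference, the portable branch of Encode() can be exercised on its own. This is a minimal sketch assuming only standard C headers; the name `encode_be32` is hypothetical and the sparc aligned-store fast path is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical standalone version of the portable branch of Encode():
 * write each 32-bit word most-significant byte first.  `len' is the
 * output length in bytes and is expected to be a multiple of 4.
 */
static void
encode_be32(uint8_t *output, const uint32_t *input, size_t len)
{
	size_t i, j;

	for (i = 0, j = 0; j < len; i++, j += 4) {
		output[j]	= (uint8_t)(input[i] >> 24);
		output[j + 1]	= (uint8_t)(input[i] >> 16);
		output[j + 2]	= (uint8_t)(input[i] >> 8);
		output[j + 3]	= (uint8_t)input[i];
	}
}
```

On a big-endian host this produces the same bytes as a plain word copy, which is why the sparc build can take the aligned-store shortcut.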

583 lines elided
