source: project/release/4/modular-crypt/trunk/implementations/SHA-2/README @ 22216

Last change on this file since 22216 was 22216, checked in by sjamaan, 9 years ago

Add initial code for crypt egg

File size: 12.1 KB
Line 
1[ Proposal homepage: http://www.akkadia.org/drepper/sha-crypt.html ]
2
3Unix crypt using SHA-256 and SHA-512
4------------------------------------
5
6Author: Ulrich Drepper <drepper@redhat.com>
7Version: 0.4 2008-4-3
8
9Discussion Group:
10
11Josh Bressers, Red Hat
12Mark Brown, IBM
13David Clissold, IBM
14Don Cragun, Sun
15Casper Dik, Sun
16Ulrich Drepper, Red Hat
17Larry Dwyer, HP
18Steve Grubb, Red Hat
19Ravi A Shankar, IBM
20Borislav Simov, HP
21
22
23Various Unix crypt implementations have been MD5 as an alternative
24method to the traditional DES encryption for the one-way conversion
25needed.  Both DES and MD5 are deemed insecure for their primary
26purpose and by association their use in password encryption is put
27into question.  In addition, the produced output for both DES and MD5
28has a short length which makes it possible to construct rainbow tables.
29
30Requests for a better solution to the problem have been heard for some
31time.  Security departments in companies are trying to phase out all
32uses of MD5.  They demand a method which is officially sanctioned.
33For US-based users this means tested by the NIST.
34
35This rules out the use of another already implemented method with
36limited spread: the use of the Blowfish encryption method.  The choice
37comes down to tested encryption (3DES, AES) or hash sums (the SHA
38family).
39
40Encryption-based solution do not seem to provide better security (no
41proof to the contrary) and the higher CPU requirements can be
42compensated for by adding more complexity to a hash sum-based
43solution.  This is why the decision has been made by a group of Unix
44and Linux vendors to persue this route.
45
46The SHA hash sum functions are well tested.  By choosing the SHA-256
47and SHA-512 algorithms the produced output is 32 or 64 bytes
48respectively in size.  This fulfills the requirement for a large
49output set which makes rainbow tables less useful to impossible, at
50least for the next years.
51
52The algorithm used by the MD5-based password hashing is generally
53deemed safe as well so there is no big problem with using a similar
54algorithm for the SHA-based password hashing solutions.  Parts of the
55algorithm have been changed and in one instance what is thought to be
56a mistake in the MD5-based implementation has been fixed.
57
58The integration into existing systems is easy if those systems already
59support the MD5-based solution.  Ever since the introduction of the
60MD5-based method an extended password format is in used:
61
62   $<ID>$<SALT>$<PWD>
63
64If the password is not of this form it is an old-style DES-encrypted
65password.  If the password has this form the ID identifies the method
66used and this then determines how the rest of the password string is
67interpreted.  So far the following ID values are in use:
68
69
70     ID       |    Method
71  -------------------------------
72     1        |  MD5 (Linux, BSD)
73     2a       |  Blowfish (OpenBSD)
74     md5      |  Sun MD5
75
76
77For the new SHA-256 and SHA-512 methods the following values are
78selected:
79
80
81     ID       |    Method
82  -------------------------------
83     5        |  SHA-256
84     6        |  SHA-512
85
86
87For the SHA-based methods the SALT string can be a simple string of
88which up to 16 characters are used.  The MD5-based implementation used
89up to eight characters..  It was decided to allow one extension which
90follows an invention Sun implemented in their pluggable crypt
91implementation.  If the SALT strings starts with
92
93   rounds=<N>$
94
95where N is an unsigned decimal number the numeric value of N is used
96to modify the algorithm used.  As will be explained later, the
97SHA-based algorithm contains a loop which can be run an arbitrary
98number of times.  The more rounds are performed the higher the CPU
99requirements are.  This is a safety mechanism which might help
100countering brute-force attacks in the face of increasing computing
101power.
102
103The default number of rounds for both algorithms is 5000.  To ensure
104minimal security and stability on the other hand minimum and maximum
105values for N are enforced:
106
107   minimum for N = 1,000
108   maximum for N = 999,999,999
109
110Any selection of N below the minimum will cause the use of 1,000
111rounds and a value of 1 billion and higher will cause 999,999,999
112rounds to be used.  In these cases the output string produced by the
113crypt function will not have the same salt as the input salt string
114since the correct, limited rounds value is used in the output.
115
116The PWD part of the password string is the actual computed password.
117The size of this string is fixed:
118
119      SHA-256    43 characters
120      SHA-512    86 characters
121
122The output consists of the base64-encoded digest.  The maximum length
123of a password string is therefore (excluding final NUL byte in the C
124representation):
125
126     SHA-256     80 characters
127     SHA-512     123 characters
128
129The input string used for the salt parameter of the crypt function can
130potentially be much longer.  But since the salt string is truncated to
131at most 16 characters the size of the output string is limited.
132
133The algorithm used for the password hashing follows the one used in
134the Linux/BSD MD5 implementation.  The following is a description of
135the algorithm where the differences are explicitly pointed out.  Both,
136the SHA-256 and the SHA-512 method, use the same algorithm.  The only
137difference, which is also a difference to the MD5 version, are all the
138cases where an existing digest is used as input for another digest
139computation.  In this case the input size (i.e., the digest size)
140varies.  For MD5 the digest is 16 bytes, for SHA-256 it is 32 bytes,
141and for SHA-512 it is 64 bytes.  The following description will not
142mention this difference further.
143
144The algorithm using three primitives for creating a hash digest:
145
146- start a digest.  This sets up the data structures and initial state
147  as required for the hash function
148
149- add bytes to a digest.  This can happen multiple times.  Only when
150  the required number of bytes for a round of the hash function is
151  added will anything happen.  If the required number of bytes is not
152  yet reached the bytes will simply be queued up.  For SHA-256 and
153  SHA-512 the respective sizes are 64 and 128 bytes.
154
155- finish the context.  This operation causes the currently queued
156  bytes to be padded according to the hash function specification and
157  the result is processed.  The final digest is computed and made
158  available to the use.
159
160When the algorithm talks about adding the salt string this really
161means adding the salt string truncated to 16 characters.
162
163When the algorithm talks about adding a string the terminating NUL
164byte of the C presentation of the string in NOT added.
165
166
167Algorithm for crypt using SHA-256/SHA-512:
168
1691.  start digest A
170
1712.  the password string is added to digest A
172
1733.  the salt string is added to digest A.  This is just the salt string
174    itself without the enclosing '$', without the magic prefix $5$ and
175    $6$ respectively and without the rounds=<N> specification.
176
177    NB: the MD5 algorithm did add the $1$ prefix.  This is not deemed
178    necessary since it is a constant string and does not add security
179    and /possibly/ allows a plain text attack.  Since the rounds=<N>
180    specification should never be added this would also create an
181    inconsistency.
182
1834.  start digest B
184
1855.  add the password to digest B
186
1876.  add the salt string to digest B
188
1897.  add the password again to digest B
190
1918.  finish digest B
192
1939.  For each block of 32 or 64 bytes in the password string (excluding
194    the terminating NUL in the C representation), add digest B to digest A
195
19610. For the remaining N bytes of the password string add the first
197    N bytes of digest B to digest A
198
19911. For each bit of the binary representation of the length of the
200    password string up to and including the highest 1-digit, starting
201    from to lowest bit position (numeric value 1):
202
203    a) for a 1-digit add digest B to digest A
204
205    b) for a 0-digit add the password string
206
207    NB: this step differs significantly from the MD5 algorithm.  It
208    adds more randomness.
209
21012. finish digest A
211
21213. start digest DP
213
21414. for every byte in the password (excluding the terminating NUL byte
215    in the C representation of the string)
216
217      add the password to digest DP
218
21915. finish digest DP
220
22116. produce byte sequence P of the same length as the password where
222
223    a) for each block of 32 or 64 bytes of length of the password string
224       the entire digest DP is used
225
226    b) for the remaining N (up to  31 or 63) bytes use the first N
227       bytes of digest DP
228
22917. start digest DS
230
23118. repeast the following 16+A[0] times, where A[0] represents the first
232    byte in digest A interpreted as an 8-bit unsigned value
233
234      add the salt to digest DS
235
23619. finish digest DS
237
23820. produce byte sequence S of the same length as the salt string where
239
240    a) for each block of 32 or 64 bytes of length of the salt string
241       the entire digest DS is used
242
243    b) for the remaining N (up to  31 or 63) bytes use the first N
244       bytes of digest DS
245
24621. repeat a loop according to the number specified in the rounds=<N>
247    specification in the salt (or the default value if none is
248    present).  Each round is numbered, starting with 0 and up to N-1.
249
250    The loop uses a digest as input.  In the first round it is the
251    digest produced in step 12.  In the latter steps it is the digest
252    produced in step 21.h.  The following text uses the notation
253    "digest A/C" to desribe this behavior.
254
255
256    a) start digest C
257
258    b) for odd round numbers add the byte sequense P to digest C
259
260    c) for even round numbers add digest A/C
261
262    d) for all round numbers not divisible by 3 add the byte sequence S
263
264    e) for all round numbers not divisible by 7 add the byte sequence P
265
266    f) for odd round numbers add digest A/C
267
268    g) for even round numbers add the byte sequence P
269
270    h) finish digest C.
271
27222. Produce the output string.  This is an ASCII string of the maximum
273    size specified above, consisting of multiple pieces:
274
275    a) the salt prefix, $5$ or $6$ respectively
276
277    b) the rounds=<N> specification, if one was present in the input
278       salt string.  A trailing '$' is added in this case to separate
279       the rounds specification from the following text.
280
281    c) the salt string truncated to 16 characters
282
283    d) a '$' character
284
285    e) the base-64 encoded final C digest.  The encoding used is as
286       follows:
287
288                 111111111122222222223333333333444444444455555555556666
289       0123456789012345678901234567890123456789012345678901234567890123
290       ----------------------------------------------------------------
291       ./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
292
293       Each group of three bytes from the digest produces four
294       characters as output:
295
296         1. character: the six low bits of the first byte
297         2. character: the two high bits of the first byte and the
298                       four low bytes from the second byte
299         3. character: the four high butes from the second byte and
300                       the two low bits from the third byte
301         4. character: the six high bits from the third byte
302
303       The groups of three bytes are as follows (in this sequence).
304       These are the indices into the byte array containing the
305       digest, starting with index 0.  For the last group there are
306       not enough bytes left in the digest and the value zero is used
307       in its place.  This group also produces only three or two
308       characters as output for SHA-256 and SHA-512 respecitevely.
309
310       For SHA-256:
311
312         #3   #2   #1   <-- byte number in group
313
314          0 - 10 - 20
315         21 -  1 - 11
316         12 - 22 -  2
317          3 - 13 - 23
318         24 -  4 - 14
319         15 - 25 -  5
320          6 - 16 - 26
321         27 -  7 - 17
322         18 - 28 -  8
323          9 - 19 - 29
324          * - 31 - 30
325
326
327       For SHA-512:
328
329         #3   #2   #1   <-- byte number in group
330
331          0 - 21 - 42
332         22 - 43 -  1
333         44 -  2 - 23
334          3 - 24 - 45
335         25 - 46 -  4
336         47 -  5 - 26
337          6 - 27 - 48
338         28 - 49 -  7
339         50 -  8 - 29
340          9 - 30 - 51
341         31 - 52 - 10
342         53 - 11 - 32
343         12 - 33 - 54
344         34 - 55 - 13
345         56 - 14 - 35
346         15 - 36 - 57
347         37 - 58 - 16
348         59 - 17 - 38
349         18 - 39 - 60
350         40 - 61 - 19
351         62 - 20 - 41
352          * -  * - 63
Note: See TracBrowser for help on using the repository browser.