crc32: move long comment about crc32 fundamentals to Documentation/

Move a long comment from lib/crc32.c to Documentation/crc32.txt where it will more likely get read.

Edited the resulting document to add an explanation of the slicing-by-n algorithm.

[djwong@us.ibm.com: minor changelog tweaks]
[akpm@linux-foundation.org: fix typo, per George]
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Bob Pearson <rpearson@systemfabricworks.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
author: Bob Pearson <rpearson@systemfabricworks.com> 2012-03-23 18:02:22 -0400
committer: Linus Torvalds <torvalds@linux-foundation.org> 2012-03-23 19:58:37 -0400
commit: fbedceb10066430b925cf43fbf926e8abb9e2359 (patch)
tree: ea4f9453fd810c82c106df1e5b5932894ddcadd5 /lib
parent: e30c7a8fcf2d5bba53ea07047b1a0f9161da1078 (diff)
1 files changed, 2 insertions, 127 deletions
diff --git a/lib/crc32.c b/lib/crc32.c
index ffea0c99a1f3..c3ce94a06db8 100644
--- a/lib/crc32.c
+++ b/lib/crc32.c
@@ -20,6 +20,8 @@
 * Version 2.  See the file COPYING for more details.
 */
+/* see: Documentation/crc32.txt for a description of algorithms */
 #include <linux/crc32.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -209,133 +211,6 @@ u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len)
 EXPORT_SYMBOL(crc32_le);
 EXPORT_SYMBOL(crc32_be);
-/*
- * A brief CRC tutorial.
- *
- * A CRC is a long-division remainder.  You add the CRC to the message,
- * and the whole thing (message+CRC) is a multiple of the given
- * CRC polynomial.  To check the CRC, you can either check that the
- * CRC matches the recomputed value, *or* you can check that the
- * remainder computed on the message+CRC is 0.  This latter approach
- * is used by a lot of hardware implementations, and is why so many
- * protocols put the end-of-frame flag after the CRC.
- *
- * It's actually the same long division you learned in school, except that
- * - We're working in binary, so the digits are only 0 and 1, and
- * - When dividing polynomials, there are no carries.  Rather than add and
- *   subtract, we just xor.  Thus, we tend to get a bit sloppy about
- *   the difference between adding and subtracting.
- *
- * A 32-bit CRC polynomial is actually 33 bits long.  But since it's
- * 33 bits long, bit 32 is always going to be set, so usually the CRC
- * is written in hex with the most significant bit omitted.  (If you're
- * familiar with the IEEE 754 floating-point format, it's the same idea.)
- *
- * Note that a CRC is computed over a string of *bits*, so you have
- * to decide on the endianness of the bits within each byte.  To get
- * the best error-detecting properties, this should correspond to the
- * order they're actually sent.  For example, standard RS-232 serial is
- * little-endian; the most significant bit (sometimes used for parity)
- * is sent last.  And when appending a CRC word to a message, you should
- * do it in the right order, matching the endianness.
- *
- * Just like with ordinary division, the remainder is always smaller than
- * the divisor (the CRC polynomial) you're dividing by.  Each step of the
- * division, you take one more digit (bit) of the dividend and append it
- * to the current remainder.  Then you figure out the appropriate multiple
- * of the divisor to subtract to being the remainder back into range.
- * In binary, it's easy - it has to be either 0 or 1, and to make the
- * XOR cancel, it's just a copy of bit 32 of the remainder.
- *
- * When computing a CRC, we don't care about the quotient, so we can
- * throw the quotient bit away, but subtract the appropriate multiple of
- * the polynomial from the remainder and we're back to where we started,
- * ready to process the next bit.
- *
- * A big-endian CRC written this way would be coded like:
- * for (i = 0; i < input_bits; i++) {
- *      multiple = remainder & 0x80000000 ? CRCPOLY : 0;
- *      remainder = (remainder << 1 | next_input_bit()) ^ multiple;
- * }
- * Notice how, to get at bit 32 of the shifted remainder, we look
- * at bit 31 of the remainder *before* shifting it.
- *
- * But also notice how the next_input_bit() bits we're shifting into
- * the remainder don't actually affect any decision-making until
- * 32 bits later.  Thus, the first 32 cycles of this are pretty boring.
- * Also, to add the CRC to a message, we need a 32-bit-long hole for it at
- * the end, so we have to add 32 extra cycles shifting in zeros at the
- * end of every message,
- *
- * So the standard trick is to rearrage merging in the next_input_bit()
- * until the moment it's needed.  Then the first 32 cycles can be precomputed,
- * and merging in the final 32 zero bits to make room for the CRC can be
- * skipped entirely.
- * This changes the code to:
- * for (i = 0; i < input_bits; i++) {
- *      remainder ^= next_input_bit() << 31;
- *      multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
- *      remainder = (remainder << 1) ^ multiple;
- * }
- * With this optimization, the little-endian code is simpler:
- * for (i = 0; i < input_bits; i++) {
- *      remainder ^= next_input_bit();
- *      multiple = (remainder & 1) ? CRCPOLY : 0;
- *      remainder = (remainder >> 1) ^ multiple;
- * }
- *
- * Note that the other details of endianness have been hidden in CRCPOLY
- * (which must be bit-reversed) and next_input_bit().
- *
- * However, as long as next_input_bit is returning the bits in a sensible
- * order, we can actually do the merging 8 or more bits at a time rather
- * than one bit at a time:
- * for (i = 0; i < input_bytes; i++) {
- *      remainder ^= next_input_byte() << 24;
- *      for (j = 0; j < 8; j++) {
- *              multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
- *              remainder = (remainder << 1) ^ multiple;
- *      }
- * }
- * Or in little-endian:
- * for (i = 0; i < input_bytes; i++) {
- *      remainder ^= next_input_byte();
- *      for (j = 0; j < 8; j++) {
- *              multiple = (remainder & 1) ? CRCPOLY : 0;
- *              remainder = (remainder << 1) ^ multiple;
- *      }
- * }
- * If the input is a multiple of 32 bits, you can even XOR in a 32-bit
- * word at a time and increase the inner loop count to 32.
- *
- * You can also mix and match the two loop styles, for example doing the
- * bulk of a message byte-at-a-time and adding bit-at-a-time processing
- * for any fractional bytes at the end.
- *
- * The only remaining optimization is to the byte-at-a-time table method.
- * Here, rather than just shifting one bit of the remainder to decide
- * in the correct multiple to subtract, we can shift a byte at a time.
- * This produces a 40-bit (rather than a 33-bit) intermediate remainder,
- * but again the multiple of the polynomial to subtract depends only on
- * the high bits, the high 8 bits in this case.
- *
- * The multiple we need in that case is the low 32 bits of a 40-bit
- * value whose high 8 bits are given, and which is a multiple of the
- * generator polynomial.  This is simply the CRC-32 of the given
- * one-byte message.
- *
- * Two more details: normally, appending zero bits to a message which
- * is already a multiple of a polynomial produces a larger multiple of that
- * polynomial.  To enable a CRC to detect this condition, it's common to
- * invert the CRC before appending it.  This makes the remainder of the
- * message+crc come out not as zero, but some fixed non-zero value.
- *
- * The same problem applies to zero bits prepended to the message, and
- * a similar solution is used.  Instead of starting with a remainder of
- * 0, an initial remainder of all ones is used.  As long as you start
- * the same way on decoding, it doesn't make a difference.
- */
 #ifdef UNITTEST
 #include <stdlib.h>
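
Note on the removed tutorial: the little-endian byte loop in the comment shifts left in its inner loop, whereas its bit-at-a-time counterpart shifts right. The following is a minimal sketch, not the kernel's implementation, of the little-endian loops and the byte-at-a-time table method the tutorial describes, using the right shift. The polynomial constant `CRCPOLY_LE` (assumed here to be the bit-reversed CRC-32 polynomial), the table name, and all function names are illustrative assumptions and are not taken from lib/crc32.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: CRC-32 polynomial 0x04C11DB7, bit-reversed for LE use. */
#define CRCPOLY_LE 0xEDB88320u

static uint32_t crc32_table[256];

/* Merge one byte at a time, reduce one bit at a time (little-endian). */
static uint32_t crc32_le_bitwise(uint32_t crc, const unsigned char *p,
				 size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int j = 0; j < 8; j++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
	}
	return crc;
}

/* crc32_table[b] is the remainder of the one-byte message b. */
static void crc32_le_init_table(void)
{
	for (unsigned int b = 0; b < 256; b++) {
		unsigned char byte = (unsigned char)b;

		crc32_table[b] = crc32_le_bitwise(0, &byte, 1);
	}
}

/* Table method: the low 8 bits select the multiple to subtract. */
static uint32_t crc32_le_table(uint32_t crc, const unsigned char *p,
			       size_t len)
{
	while (len--)
		crc = (crc >> 8) ^ crc32_table[(crc ^ *p++) & 0xff];
	return crc;
}

/* Hypothetical self-check: returns the raw remainder of message+CRC. */
static uint32_t crc32_self_check(void)
{
	unsigned char frame[13] = "123456789";
	uint32_t crc;

	crc32_le_init_table();

	/* Start from all ones and invert the result before appending. */
	crc = ~crc32_le_table(~(uint32_t)0, frame, 9);
	assert(crc == (uint32_t)~crc32_le_bitwise(~(uint32_t)0, frame, 9));

	/* Append the CRC least significant byte first (little-endian). */
	for (int i = 0; i < 4; i++)
		frame[9 + i] = (unsigned char)(crc >> (8 * i));

	/* Because the CRC was inverted, this is non-zero by design. */
	return crc32_le_table(~(uint32_t)0, frame, 13);
}
```

Under these assumptions the inverted CRC of the ASCII string "123456789" is 0xCBF43926, and `crc32_self_check()` returns 0xDEBB20E3, the fixed non-zero remainder the tutorial mentions for a message with its inverted CRC appended.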