author Mauro Carvalho Chehab <mchehab@s-opensource.com> 2016-10-26 06:14:12 -0400
committer Mauro Carvalho Chehab <mchehab@s-opensource.com> 2016-12-15 05:54:49 -0500
commit b27a2d04feb6969e74942378d5012d84877d3544
tree 6fb726a47822b6cb7bcea749ea75309922870da2
parent 032d0ab743ff8ee340d5fc2a00c833dfe74c49e4
edac.txt: convert EDAC documentation to ReST
Converts the EDAC driver subsystem documentation to ReST:

- Put paragraph titles in lower case;
- Add code blocks where needed;
- Convert tables to ReST markup;
- Mark filesystem and module names as verbatim;
- Adjust document to be properly displayed in html.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
-rw-r--r-- Documentation/edac.txt | 551
1 files changed, 295 insertions, 256 deletions
diff --git a/Documentation/edac.txt b/Documentation/edac.txt
index 502988524519..316456ba2e0a 100644
--- a/Documentation/edac.txt
+++ b/Documentation/edac.txt
@@ -1,29 +1,34 @@
1.. include:: <isonum.txt>
2
3=====================================
1EDAC - Error Detection And Correction 4EDAC - Error Detection And Correction
2===================================== 5=====================================
3 6
4"bluesmoke" was the name for this device driver when it 7.. note::
5was "out-of-tree" and maintained at sourceforge.net -
6bluesmoke.sourceforge.net. That site is mostly archaic now and can be
7used only for historical purposes.
8 8
9When the subsystem was pushed into 2.6.16 for the first time, it was 9 "bluesmoke" was the name for this device driver when it
10renamed to 'EDAC'. 10 was "out-of-tree" and maintained at http://bluesmoke.sourceforge.net.
11 That site is mostly archaic now and can be used only for historical
12 purposes.
11 13
12PURPOSE 14 When the subsystem was pushed into 2.6.16 for the first time, it was
15 renamed to ``EDAC``.
16
17Purpose
13------- 18-------
14 19
15The 'edac' kernel module's goal is to detect and report hardware errors 20The ``edac`` kernel module's goal is to detect and report hardware errors
16that occur within the computer system running under linux. 21that occur within the computer system running under linux.
17 22
18MEMORY 23Memory
19------ 24------
20 25
21Memory Correctable Errors (CE) and Uncorrectable Errors (UE) are the 26Memory Correctable Errors (CE) and Uncorrectable Errors (UE) are the
22primary errors being harvested. These types of errors are harvested by 27primary errors being harvested. These types of errors are harvested by
23the 'edac_mc' device. 28the ``edac_mc`` device.
24 29
25Detecting CE events, then harvesting those events and reporting them, 30Detecting CE events, then harvesting those events and reporting them,
26*can* but must not necessarily be a predictor of future UE events. With 31**can** but must not necessarily be a predictor of future UE events. With
27CE events only, the system can and will continue to operate as no data 32CE events only, the system can and will continue to operate as no data
28has been damaged yet. 33has been damaged yet.
29 34
@@ -31,10 +36,10 @@ However, preventive maintenance and proactive part replacement of memory
31DIMMs exhibiting CEs can reduce the likelihood of the dreaded UE events 36DIMMs exhibiting CEs can reduce the likelihood of the dreaded UE events
32and system panics. 37and system panics.
33 38
34OTHER HARDWARE ELEMENTS 39Other hardware elements
35----------------------- 40-----------------------
36 41
37A new feature for EDAC, the edac_device class of device, was added in 42A new feature for EDAC, the ``edac_device`` class of device, was added in
38the 2.6.23 version of the kernel. 43the 2.6.23 version of the kernel.
39 44
40This new device type allows for non-memory type of ECC hardware detectors 45This new device type allows for non-memory type of ECC hardware detectors
@@ -48,14 +53,14 @@ reports it, then a edac_device device probably can be constructed to
48harvest and present that to userspace. 53harvest and present that to userspace.
49 54
50 55
51PCI BUS SCANNING 56PCI bus scanning
52---------------- 57----------------
53 58
54In addition, PCI devices are scanned for PCI Bus Parity and SERR Errors 59In addition, PCI devices are scanned for PCI Bus Parity and SERR Errors
55in order to determine if errors are occurring during data transfers. 60in order to determine if errors are occurring during data transfers.
56 61
57The presence of PCI Parity errors must be examined with a grain of salt. 62The presence of PCI Parity errors must be examined with a grain of salt.
58There are several add-in adapters that do *not* follow the PCI specification 63There are several add-in adapters that do **not** follow the PCI specification
59with regards to Parity generation and reporting. The specification says 64with regards to Parity generation and reporting. The specification says
60the vendor should tie the parity status bits to 0 if they do not intend 65the vendor should tie the parity status bits to 0 if they do not intend
61to generate parity. Some vendors do not do this, and thus the parity bit 66to generate parity. Some vendors do not do this, and thus the parity bit
@@ -63,62 +68,64 @@ can "float" giving false positives.
63 68
64There is a PCI device attribute located in sysfs that is checked by 69There is a PCI device attribute located in sysfs that is checked by
65the EDAC PCI scanning code. If that attribute is set, PCI parity/error 70the EDAC PCI scanning code. If that attribute is set, PCI parity/error
66scanning is skipped for that device. The attribute is: 71scanning is skipped for that device. The attribute is::
67 72
68 broken_parity_status 73 broken_parity_status
69 74
70and is located in /sys/devices/pci<XXX>/0000:XX:YY.Z directories for 75and is located in ``/sys/devices/pci<XXX>/0000:XX:YY.Z`` directories for
71PCI devices. 76PCI devices.
72 77
73 78
74VERSIONING 79Versioning
75---------- 80----------
76 81
77EDAC is composed of a "core" module (edac_core.ko) and several Memory 82EDAC is composed of a "core" module (``edac_core.ko``) and several Memory
78Controller (MC) driver modules. On a given system, the CORE is loaded 83Controller (MC) driver modules. On a given system, the CORE is loaded
79and one MC driver will be loaded. Both the CORE and the MC driver (or 84and one MC driver will be loaded. Both the CORE and the MC driver (or
80edac_device driver) have individual versions that reflect current 85``edac_device`` driver) have individual versions that reflect current
81release level of their respective modules. 86release level of their respective modules.
82 87
83Thus, to "report" on what version a system is running, one must report 88Thus, to "report" on what version a system is running, one must report
84both the CORE's and the MC driver's versions. 89both the CORE's and the MC driver's versions.
85 90
86 91
87LOADING 92Loading
88------- 93-------
89 94
90If 'edac' was statically linked with the kernel then no loading 95If ``edac`` was statically linked with the kernel then no loading
91is necessary. If 'edac' was built as modules then simply modprobe 96is necessary. If ``edac`` was built as modules then simply modprobe
92the 'edac' pieces that you need. You should be able to modprobe 97the ``edac`` pieces that you need. You should be able to modprobe
93hardware-specific modules and have the dependencies load the necessary 98hardware-specific modules and have the dependencies load the necessary
94core modules. 99core modules.
95 100
96Example: 101Example::
97 102
98$> modprobe amd76x_edac 103 $ modprobe amd76x_edac
99 104
100loads both the amd76x_edac.ko memory controller module and the edac_mc.ko 105loads both the ``amd76x_edac.ko`` memory controller module and the
101core module. 106``edac_mc.ko`` core module.
102 107
103 108
104SYSFS INTERFACE 109Sysfs interface
105--------------- 110---------------
106 111
107EDAC presents a 'sysfs' interface for control and reporting purposes. It 112EDAC presents a ``sysfs`` interface for control and reporting purposes. It
108lives in the /sys/devices/system/edac directory. 113lives in the /sys/devices/system/edac directory.
109 114
110Within this directory there currently reside 2 components: 115Within this directory there currently reside 2 components:
111 116
117 ======= ==============================
112 mc memory controller(s) system 118 mc memory controller(s) system
113 pci PCI control and status system 119 pci PCI control and status system
120 ======= ==============================
114 121
115 122
116 123
117Memory Controller (mc) Model 124Memory Controller (mc) Model
118---------------------------- 125----------------------------
119 126
120Each 'mc' device controls a set of DIMM memory modules. These modules 127Each ``mc`` device controls a set of DIMM memory modules. These modules
121are laid out in a Chip-Select Row (csrowX) and Channel table (chX). 128are laid out in a Chip-Select Row (``csrowX``) and Channel table (``chX``).
122There can be multiple csrows and multiple channels. 129There can be multiple csrows and multiple channels.
123 130
124Memory controllers allow for several csrows, with 8 csrows being a 131Memory controllers allow for several csrows, with 8 csrows being a
@@ -129,28 +136,28 @@ Dual channels allows for 128 bit data transfers to/from the CPU from/to
129memory. Some newer chipsets allow for more than 2 channels, like Fully 136memory. Some newer chipsets allow for more than 2 channels, like Fully
130Buffered DIMMs (FB-DIMMs). The following example will assume 2 channels: 137Buffered DIMMs (FB-DIMMs). The following example will assume 2 channels:
131 138
132 139 +--------+-----------+-----------+
133 Channel 0 Channel 1 140 | | Channel 0 | Channel 1 |
134 =================================== 141 +========+===========+===========+
135 csrow0 | DIMM_A0 | DIMM_B0 | 142 | csrow0 | DIMM_A0 | DIMM_B0 |
136 csrow1 | DIMM_A0 | DIMM_B0 | 143 +--------+ | |
137 =================================== 144 | csrow1 | | |
138 145 +--------+-----------+-----------+
139 =================================== 146 | csrow2 | DIMM_A1 | DIMM_B1 |
140 csrow2 | DIMM_A1 | DIMM_B1 | 147 +--------+ | |
141 csrow3 | DIMM_A1 | DIMM_B1 | 148 | csrow3 | | |
142 =================================== 149 +--------+-----------+-----------+
143 150
144In the above example table there are 4 physical slots on the motherboard 151In the above example table there are 4 physical slots on the motherboard
145for memory DIMMs: 152for memory DIMMs:
146 153
147 DIMM_A0 154 - DIMM_A0
148 DIMM_B0 155 - DIMM_B0
149 DIMM_A1 156 - DIMM_A1
150 DIMM_B1 157 - DIMM_B1
151 158
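Editor's sketch (not part of the patch): the example table above maps each csrow and channel pair to one of the four slot labels. A minimal way to express that mapping, assuming only the layout shown in the table (two csrows per DIMM, channel 0 on the ``A`` slots, channel 1 on the ``B`` slots), could look like this. The function name ``dimm_slot`` is hypothetical, and real boards may be wired differently.

```python
def dimm_slot(csrow: int, channel: int) -> str:
    """Return the silk-screen label for a (csrow, channel) pair.

    Assumption: this encodes only the example table from the text,
    i.e. csrow0/csrow1 share one DIMM, csrow2/csrow3 share the next,
    channel 0 is the 'A' slot and channel 1 is the 'B' slot.
    """
    if csrow not in range(4) or channel not in range(2):
        raise ValueError("the example table only covers csrow0-3 and channel 0-1")
    bank = "AB"[channel]          # channel 0 -> A, channel 1 -> B
    slot_index = csrow // 2       # two csrows per physical DIMM
    return f"DIMM_{bank}{slot_index}"
```

For instance, ``dimm_slot(3, 0)`` falls in the csrow2/csrow3 row of the table under Channel 0.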
152Labels for these slots are usually silk-screened on the motherboard. 159Labels for these slots are usually silk-screened on the motherboard.
153Slots labeled 'A' are channel 0 in this example. Slots labeled 'B' are 160Slots labeled ``A`` are channel 0 in this example. Slots labeled ``B`` are
154channel 1. Notice that there are two csrows possible on a physical DIMM. 161channel 1. Notice that there are two csrows possible on a physical DIMM.
155These csrows are allocated their csrow assignment based on the slot into 162These csrows are allocated their csrow assignment based on the slot into
156which the memory DIMM is placed. Thus, when 1 DIMM is placed in each 163which the memory DIMM is placed. Thus, when 1 DIMM is placed in each
@@ -166,8 +173,7 @@ csrow3.
166The representation of the above is reflected in the directory 173The representation of the above is reflected in the directory
167tree in EDAC's sysfs interface. Starting in directory 174tree in EDAC's sysfs interface. Starting in directory
168/sys/devices/system/edac/mc each memory controller will be represented 175/sys/devices/system/edac/mc each memory controller will be represented
169by its own 'mcX' directory, where 'X' is the index of the MC. 176by its own ``mcX`` directory, where ``X`` is the index of the MC::
170
171 177
172 ..../edac/mc/ 178 ..../edac/mc/
173 | 179 |
@@ -176,9 +182,8 @@ by its own 'mcX' directory, where 'X' is the index of the MC.
176 |->mc2 182 |->mc2
177 .... 183 ....
178 184
179Under each 'mcX' directory each 'csrowX' is again represented by a 185Under each ``mcX`` directory each ``csrowX`` is again represented by a
180'csrowX', where 'X' is the csrow index: 186``csrowX``, where ``X`` is the csrow index::
181
182 187
183 .../mc/mc0/ 188 .../mc/mc0/
184 | 189 |
@@ -194,17 +199,18 @@ csrow3 are populated, this indicates a dual ranked set of DIMMs for
194channels 0 and 1. 199channels 0 and 1.
195 200
196 201
197Within each of the 'mcX' and 'csrowX' directories are several EDAC 202Within each of the ``mcX`` and ``csrowX`` directories are several EDAC
198control and attribute files. 203control and attribute files.
199 204
200 205
201'mcX' directories 206``mcX`` directories
202----------------- 207-------------------
203 208
204In 'mcX' directories are EDAC control and attribute files for 209In ``mcX`` directories are EDAC control and attribute files for
205this 'X' instance of the memory controllers. 210this ``X`` instance of the memory controllers.
206 211
207For a description of the sysfs API, please see: 212For a description of the sysfs API, please see:
213
208 Documentation/ABI/testing/sysfs-devices-edac 214 Documentation/ABI/testing/sysfs-devices-edac
209 215
210 216
@@ -329,21 +335,19 @@ this ``X`` memory module:
329 symlinks inside the sysfs mapping that are automatically created by 335 symlinks inside the sysfs mapping that are automatically created by
330 the sysfs subsystem. Currently, they serve no purpose. 336 the sysfs subsystem. Currently, they serve no purpose.
331 337
332'csrowX' directories 338``csrowX`` directories
333-------------------- 339----------------------
334 340
335When CONFIG_EDAC_LEGACY_SYSFS is enabled, sysfs will contain the csrowX 341When CONFIG_EDAC_LEGACY_SYSFS is enabled, sysfs will contain the csrowX
336directories. As this API doesn't work properly for Rambus, FB-DIMMs and 342directories. As this API doesn't work properly for Rambus, FB-DIMMs and
337modern Intel Memory Controllers, this is being deprecated in favor of 343modern Intel Memory Controllers, this is being deprecated in favor of
338dimmX directories. 344dimmX directories.
339 345
340In the 'csrowX' directories are EDAC control and attribute files for 346In the ``csrowX`` directories are EDAC control and attribute files for
341this 'X' instance of csrow: 347this ``X`` instance of csrow:
342 348
343 349
344Total Uncorrectable Errors count attribute file: 350- ``ue_count`` - Total Uncorrectable Errors count attribute file
345
346 'ue_count'
347 351
348 This attribute file displays the total count of uncorrectable 352 This attribute file displays the total count of uncorrectable
349 errors that have occurred on this csrow. If panic_on_ue is set 353 errors that have occurred on this csrow. If panic_on_ue is set
@@ -351,9 +355,7 @@ Total Uncorrectable Errors count attribute file:
351 will panic the system. 355 will panic the system.
352 356
353 357
354Total Correctable Errors count attribute file: 358- ``ce_count`` - Total Correctable Errors count attribute file
355
356 'ce_count'
357 359
358 This attribute file displays the total count of correctable 360 This attribute file displays the total count of correctable
359 errors that have occurred on this csrow. This count is very 361 errors that have occurred on this csrow. This count is very
@@ -363,65 +365,54 @@ Total Correctable Errors count attribute file:
363 to the system administrator. 365 to the system administrator.
364 366
365 367
366Total memory managed by this csrow attribute file: 368- ``size_mb`` - Total memory managed by this csrow attribute file
367
368 'size_mb'
369 369
370 This attribute file displays, in count of megabytes, the memory 370 This attribute file displays, in count of megabytes, the memory
371 that this csrow contains. 371 that this csrow contains.
372 372
373 373
374Memory Type attribute file: 374- ``mem_type`` - Memory Type attribute file
375
376 'mem_type'
377 375
378 This attribute file will display what type of memory is currently 376 This attribute file will display what type of memory is currently
379 on this csrow. Normally, either buffered or unbuffered memory. 377 on this csrow. Normally, either buffered or unbuffered memory.
380 Examples: 378 Examples:
381 Registered-DDR
382 Unbuffered-DDR
383 379
380 - Registered-DDR
381 - Unbuffered-DDR
384 382
385EDAC Mode of operation attribute file:
386 383
387 'edac_mode' 384- ``edac_mode`` - EDAC Mode of operation attribute file
388 385
389 This attribute file will display what type of Error detection 386 This attribute file will display what type of Error detection
390 and correction is being utilized. 387 and correction is being utilized.
391 388
392 389
393Device type attribute file: 390- ``dev_type`` - Device type attribute file
394
395 'dev_type'
396 391
397 This attribute file will display what type of DRAM device is 392 This attribute file will display what type of DRAM device is
398 being utilized on this DIMM. 393 being utilized on this DIMM.
399 Examples: 394 Examples:
400 x1
401 x2
402 x4
403 x8
404 395
396 - x1
397 - x2
398 - x4
399 - x8
405 400
406Channel 0 CE Count attribute file:
407 401
408 'ch0_ce_count' 402- ``ch0_ce_count`` - Channel 0 CE Count attribute file
409 403
410 This attribute file will display the count of CEs on this 404 This attribute file will display the count of CEs on this
411 DIMM located in channel 0. 405 DIMM located in channel 0.
412 406
413 407
414Channel 0 UE Count attribute file: 408- ``ch0_ue_count`` - Channel 0 UE Count attribute file
415
416 'ch0_ue_count'
417 409
418 This attribute file will display the count of UEs on this 410 This attribute file will display the count of UEs on this
419 DIMM located in channel 0. 411 DIMM located in channel 0.
420 412
421 413
422Channel 0 DIMM Label control file: 414- ``ch0_dimm_label`` - Channel 0 DIMM Label control file
423 415
424 'ch0_dimm_label'
425 416
426 This control file allows this DIMM to have a label assigned 417 This control file allows this DIMM to have a label assigned
427 to it. With this label in the module, when errors occur 418 to it. With this label in the module, when errors occur
@@ -436,25 +427,21 @@ Channel 0 DIMM Label control file:
436 must occur in userland at this time. 427 must occur in userland at this time.
437 428
438 429
439Channel 1 CE Count attribute file: 430- ``ch1_ce_count`` - Channel 1 CE Count attribute file
440 431
441 'ch1_ce_count'
442 432
443 This attribute file will display the count of CEs on this 433 This attribute file will display the count of CEs on this
444 DIMM located in channel 1. 434 DIMM located in channel 1.
445 435
446 436
447Channel 1 UE Count attribute file: 437- ``ch1_ue_count`` - Channel 1 UE Count attribute file
448 438
449 'ch1_ue_count'
450 439
451 This attribute file will display the count of UEs on this 440 This attribute file will display the count of UEs on this
452 DIMM located in channel 0. 441 DIMM located in channel 0.
453 442
454 443
455Channel 1 DIMM Label control file: 444- ``ch1_dimm_label`` - Channel 1 DIMM Label control file
456
457 'ch1_dimm_label'
458 445
459 This control file allows this DIMM to have a label assigned 446 This control file allows this DIMM to have a label assigned
460 to it. With this label in the module, when errors occur 447 to it. With this label in the module, when errors occur
@@ -469,33 +456,44 @@ Channel 1 DIMM Label control file:
469 must occur in userland at this time. 456 must occur in userland at this time.
470 457
471 458
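Editor's sketch (not part of the patch): the per-csrow attribute files described above are plain text files holding one number each, so a userland tool can collect them with ordinary file reads. The helper below is a hypothetical illustration, assuming the legacy ``csrowX`` directories exist under the given ``mcX`` directory and that ``ce_count`` and ``ue_count`` contain a decimal integer. The function name and the returned dictionary shape are assumptions, not an EDAC API.

```python
from pathlib import Path


def read_csrow_counts(mc_dir):
    """Collect ce_count/ue_count for every csrowX directory under one mcX.

    mc_dir is a path such as /sys/devices/system/edac/mc/mc0 (assumed to
    follow the layout described in the text). Missing attribute files are
    skipped rather than treated as errors.
    """
    counts = {}
    for csrow in sorted(Path(mc_dir).glob("csrow[0-9]*")):
        if not csrow.is_dir():
            continue
        entry = {}
        for name in ("ce_count", "ue_count"):
            attr = csrow / name
            if attr.is_file():
                entry[name] = int(attr.read_text().strip())
        counts[csrow.name] = entry
    return counts
```

Passing the directory as a parameter keeps the sketch usable against a copied or mocked tree, which is convenient on machines without EDAC hardware.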
472 459System Logging
473SYSTEM LOGGING
474-------------- 460--------------
475 461
476If logging for UEs and CEs is enabled, then system logs will contain 462If logging for UEs and CEs is enabled, then system logs will contain
477information indicating that errors have been detected: 463information indicating that errors have been detected::
478 464
479EDAC MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0, 465 EDAC MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0, channel 1 "DIMM_B1": amd76x_edac
480channel 1 "DIMM_B1": amd76x_edac 466 EDAC MC0: CE page 0x1e5, offset 0xfb0, grain 8, syndrome 0xb741, row 0, channel 1 "DIMM_B1": amd76x_edac
481
482EDAC MC0: CE page 0x1e5, offset 0xfb0, grain 8, syndrome 0xb741, row 0,
483channel 1 "DIMM_B1": amd76x_edac
484 467
485 468
486The structure of the message is: 469The structure of the message is:
487 the memory controller (MC0) 470
488 Error type (CE) 471 +---------------------------------------+-------------+
489 memory page (0x283) 472 | Content + Example |
490 offset in the page (0xce0) 473 +=======================================+=============+
491 the byte granularity (grain 8) 474 | The memory controller | MC0 |
492 or resolution of the error 475 +---------------------------------------+-------------+
493 the error syndrome (0xb741) 476 | Error type | CE |
494 memory row (row 0) 477 +---------------------------------------+-------------+
495 memory channel (channel 1) 478 | Memory page | 0x283 |
496 DIMM label, if set prior (DIMM B1 479 +---------------------------------------+-------------+
497 and then an optional, driver-specific message that may 480 | Offset in the page | 0xce0 |
498 have additional information. 481 +---------------------------------------+-------------+
482 | The byte granularity | grain 8 |
483 | or resolution of the error | |
484 +---------------------------------------+-------------+
485 | The error syndrome | 0xb741 |
486 +---------------------------------------+-------------+
487 | Memory row | row 0 +
488 +---------------------------------------+-------------+
489 | Memory channel | channel 1 |
490 +---------------------------------------+-------------+
491 | DIMM label, if set prior | DIMM B1 |
492 +---------------------------------------+-------------+
493 | And then an optional, driver-specific | |
494 | message that may have additional | |
495 | information. | |
496 +---------------------------------------+-------------+
499 497
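Editor's sketch (not part of the patch): the message structure listed above can be picked apart with a regular expression. The pattern below is an assumption derived only from the two sample lines in the text. Other drivers or kernel versions may format messages differently, and "no info" messages are deliberately not matched. The names ``EDAC_MC_LINE`` and ``parse_edac_mc_line`` are hypothetical.

```python
import re

# Pattern built from the sample log lines in the text (an assumption,
# not a format guaranteed by the kernel).
EDAC_MC_LINE = re.compile(
    r'EDAC MC(?P<mc>\d+): (?P<type>CE|UE) '
    r'page (?P<page>0x[0-9a-fA-F]+), offset (?P<offset>0x[0-9a-fA-F]+), '
    r'grain (?P<grain>\d+), syndrome (?P<syndrome>0x[0-9a-fA-F]+), '
    r'row (?P<row>\d+), channel (?P<channel>\d+) '
    r'"(?P<label>[^"]*)": (?P<driver_msg>.*)'
)


def parse_edac_mc_line(line):
    """Return the fields of one EDAC MC message as a dict, or None if the
    line does not follow the sample layout."""
    match = EDAC_MC_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Hexadecimal fields keep their 0x prefix in the log; int(..., 16) accepts it.
    for key in ("page", "offset", "syndrome"):
        fields[key] = int(fields[key], 16)
    for key in ("mc", "grain", "row", "channel"):
        fields[key] = int(fields[key])
    return fields
```

Returning ``None`` for unmatched lines lets a caller filter a whole log without special-casing the shorter "no info" form.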
500Both UEs and CEs with no info will lack all but memory controller, error 498Both UEs and CEs with no info will lack all but memory controller, error
501type, a notice of "no info" and then an optional, driver-specific error 499type, a notice of "no info" and then an optional, driver-specific error
@@ -512,43 +510,38 @@ Type 01 bridges, the secondary status register is also looked at to see
512if parity occurred on the bus on the other side of the bridge. 510if parity occurred on the bus on the other side of the bridge.
513 511
514 512
515SYSFS CONFIGURATION 513Sysfs configuration
516------------------- 514-------------------
517 515
518Under /sys/devices/system/edac/pci are control and attribute files as follows: 516Under ``/sys/devices/system/edac/pci`` are control and attribute files as
517follows:
519 518
520 519
521Enable/Disable PCI Parity checking control file: 520- ``check_pci_parity`` - Enable/Disable PCI Parity checking control file
522
523 'check_pci_parity'
524
525 521
526 This control file enables or disables the PCI Bus Parity scanning 522 This control file enables or disables the PCI Bus Parity scanning
527 operation. Writing a 1 to this file enables the scanning. Writing 523 operation. Writing a 1 to this file enables the scanning. Writing
528 a 0 to this file disables the scanning. 524 a 0 to this file disables the scanning.
529 525
530 Enable: 526 Enable::
531 echo "1" >/sys/devices/system/edac/pci/check_pci_parity 527
528 echo "1" >/sys/devices/system/edac/pci/check_pci_parity
532 529
533 Disable: 530 Disable::
534 echo "0" >/sys/devices/system/edac/pci/check_pci_parity
535 531
532 echo "0" >/sys/devices/system/edac/pci/check_pci_parity
536 533
537Parity Count:
538 534
539 'pci_parity_count' 535- ``pci_parity_count`` - Parity Count
540 536
541 This attribute file will display the number of parity errors that 537 This attribute file will display the number of parity errors that
542 have been detected. 538 have been detected.
543 539
544 540
545 541Module parameters
546MODULE PARAMETERS
547----------------- 542-----------------
548 543
549Panic on UE control file: 544- ``edac_mc_panic_on_ue`` - Panic on UE control file
550
551 'edac_mc_panic_on_ue'
552 545
553 An uncorrectable error will cause a machine panic. This is usually 546 An uncorrectable error will cause a machine panic. This is usually
554 desirable. It is a bad idea to continue when an uncorrectable error 547 desirable. It is a bad idea to continue when an uncorrectable error
@@ -557,40 +550,49 @@ Panic on UE control file:
557 corruption. If the kernel has MCE configured, then EDAC will never 550 corruption. If the kernel has MCE configured, then EDAC will never
558 notice the UE. 551 notice the UE.
559 552
560 LOAD TIME: module/kernel parameter: edac_mc_panic_on_ue=[0|1] 553 LOAD TIME::
554
555 module/kernel parameter: edac_mc_panic_on_ue=[0|1]
556
557 RUN TIME::
561 558
562 RUN TIME: echo "1" > /sys/module/edac_core/parameters/edac_mc_panic_on_ue 559 echo "1" > /sys/module/edac_core/parameters/edac_mc_panic_on_ue
563 560
564 561
565Log UE control file: 562- ``edac_mc_log_ue`` - Log UE control file
566 563
567 'edac_mc_log_ue'
568 564
569 Generate kernel messages describing uncorrectable errors. These errors 565 Generate kernel messages describing uncorrectable errors. These errors
570 are reported through the system message log system. UE statistics 566 are reported through the system message log system. UE statistics
571 will be accumulated even when UE logging is disabled. 567 will be accumulated even when UE logging is disabled.
572 568
573 LOAD TIME: module/kernel parameter: edac_mc_log_ue=[0|1] 569 LOAD TIME::
574 570
575 RUN TIME: echo "1" > /sys/module/edac_core/parameters/edac_mc_log_ue 571 module/kernel parameter: edac_mc_log_ue=[0|1]
576 572
573 RUN TIME::
577 574
578Log CE control file: 575 echo "1" > /sys/module/edac_core/parameters/edac_mc_log_ue
576
577
578- ``edac_mc_log_ce`` - Log CE control file
579 579
580 'edac_mc_log_ce'
581 580
582 Generate kernel messages describing correctable errors. These 581 Generate kernel messages describing correctable errors. These
583 errors are reported through the system message log system. 582 errors are reported through the system message log system.
584 CE statistics will be accumulated even when CE logging is disabled. 583 CE statistics will be accumulated even when CE logging is disabled.
585 584
586 LOAD TIME: module/kernel parameter: edac_mc_log_ce=[0|1] 585 LOAD TIME::
586
587 module/kernel parameter: edac_mc_log_ce=[0|1]
588
589 RUN TIME::
587 590
588 RUN TIME: echo "1" > /sys/module/edac_core/parameters/edac_mc_log_ce 591 echo "1" > /sys/module/edac_core/parameters/edac_mc_log_ce
589 592
590 593
591Polling period control file: 594- ``edac_mc_poll_msec`` - Polling period control file
592 595
593 'edac_mc_poll_msec'
594 596
595 The time period, in milliseconds, for polling for error information. 597 The time period, in milliseconds, for polling for error information.
596 Too small a value wastes resources. Too large a value might delay 598 Too small a value wastes resources. Too large a value might delay
@@ -599,27 +601,33 @@ Polling period control file:
599 default. Systems which require all the bandwidth they can get, may 601 default. Systems which require all the bandwidth they can get, may
600 increase this. 602 increase this.
601 603
602 LOAD TIME: module/kernel parameter: edac_mc_poll_msec=[0|1] 604 LOAD TIME::
603 605
604 RUN TIME: echo "1000" > /sys/module/edac_core/parameters/edac_mc_poll_msec 606 module/kernel parameter: edac_mc_poll_msec=[0|1]
605 607
608 RUN TIME::
606 609
607Panic on PCI PARITY Error: 610 echo "1000" > /sys/module/edac_core/parameters/edac_mc_poll_msec
608 611
609 'panic_on_pci_parity' 612
613- ``panic_on_pci_parity`` - Panic on PCI PARITY Error
610 614
611 615
612 This control file enables or disables panicking when a parity 616 This control file enables or disables panicking when a parity
613 error has been detected. 617 error has been detected.
614 618
615 619
616 module/kernel parameter: edac_panic_on_pci_pe=[0|1] 620 module/kernel parameter::
621
622 edac_panic_on_pci_pe=[0|1]
623
624 Enable::
625
626 echo "1" > /sys/module/edac_core/parameters/edac_panic_on_pci_pe
617 627
618 Enable: 628 Disable::
619 echo "1" > /sys/module/edac_core/parameters/edac_panic_on_pci_pe
620 629
621 Disable: 630 echo "0" > /sys/module/edac_core/parameters/edac_panic_on_pci_pe
622 echo "0" > /sys/module/edac_core/parameters/edac_panic_on_pci_pe
623 631
624 632
625 633
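Editor's sketch (not part of the patch): the run-time ``echo`` commands above all write a value into a file under ``/sys/module/edac_core/parameters``. A small wrapper for that pattern might look like the following. The function name ``set_edac_param`` is hypothetical, the default directory is taken from the paths quoted in the text, and writing to the real sysfs location normally requires root privileges.

```python
from pathlib import Path

# Directory quoted in the text; may differ on other kernel versions.
EDAC_PARAM_DIR = Path("/sys/module/edac_core/parameters")


def set_edac_param(name, value, param_dir=EDAC_PARAM_DIR):
    """Write value to one EDAC module parameter file and return what the
    file reports afterwards.

    Raises FileNotFoundError if the parameter file is absent, so a typo in
    the name does not silently create a stray file in a mocked directory.
    """
    path = Path(param_dir) / name
    if not path.is_file():
        raise FileNotFoundError(f"no such EDAC parameter file: {path}")
    path.write_text(f"{value}\n")
    return path.read_text().strip()
```

For example, ``set_edac_param("edac_mc_log_ce", 1)`` mirrors the ``echo "1"`` command shown for the Log CE control file. The ``param_dir`` argument exists only so the helper can be exercised against a temporary directory.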
@@ -631,28 +639,31 @@ and APIs for the EDAC_DEVICE.
631 639
632User space access to an edac_device is through the sysfs interface. 640User space access to an edac_device is through the sysfs interface.
633 641
634At the location /sys/devices/system/edac (sysfs) new edac_device devices will 642At the location ``/sys/devices/system/edac`` (sysfs) new edac_device devices
635appear. 643will appear.
636 644
637There is a three level tree beneath the above 'edac' directory. For example, 645There is a three level tree beneath the above ``edac`` directory. For example,
638the 'test_device_edac' device (found at the bluesmoke.sourceforget.net website) 646the ``test_device_edac`` device (found at the http://bluesmoke.sourceforget.net
639installs itself as: 647website) installs itself as::
640 648
641 /sys/devices/systm/edac/test-instance 649 /sys/devices/system/edac/test-instance
642 650
643in this directory are various controls, a symlink and one or more 'instance' 651in this directory are various controls, a symlink and one or more ``instance``
644directories. 652directories.
645 653
646The standard default controls are: 654The standard default controls are:
647 655
656 ============== =======================================================
648 log_ce boolean to log CE events 657 log_ce boolean to log CE events
649 log_ue boolean to log UE events 658 log_ue boolean to log UE events
650 panic_on_ue boolean to 'panic' the system if an UE is encountered 659 panic_on_ue boolean to ``panic`` the system if an UE is encountered
651 (default off, can be set true via startup script) 660 (default off, can be set true via startup script)
652 poll_msec time period between POLL cycles for events 661 poll_msec time period between POLL cycles for events
662 ============== =======================================================
653 663
654The test_device_edac device adds at least one of its own custom control: 664The test_device_edac device adds at least one of its own custom control:
655 665
666 ============== ==================================================
656 test_bits which in the current test driver does nothing but 667 test_bits which in the current test driver does nothing but
657 show how it is installed. A ported driver can 668 show how it is installed. A ported driver can
658 add one or more such controls and/or attributes 669 add one or more such controls and/or attributes
@@ -660,42 +671,52 @@ The test_device_edac device adds at least one of its own custom control:
660 One out-of-tree driver uses controls here to allow 671 One out-of-tree driver uses controls here to allow
661 for ERROR INJECTION operations to hardware 672 for ERROR INJECTION operations to hardware
662 injection registers 673 injection registers
674 ============== ==================================================
663 675
664The symlink points to the 'struct dev' that is registered for this edac_device. 676The symlink points to the 'struct dev' that is registered for this edac_device.
665 677
666INSTANCES 678Instances
667--------- 679---------
668 680
669One or more instance directories are present. For the 'test_device_edac' case: 681One or more instance directories are present. For the ``test_device_edac``
682case:
670 683
671 test-instance0 684 +----------------+
685 | test-instance0 |
686 +----------------+
672 687
673 688
674In this directory there are two default counter attributes, which are totals of 689In this directory there are two default counter attributes, which are totals of
675counter in deeper subdirectories. 690counter in deeper subdirectories.
676 691
692 ============== ====================================
677 ce_count total of CE events of subdirectories 693 ce_count total of CE events of subdirectories
678 ue_count total of UE events of subdirectories 694 ue_count total of UE events of subdirectories
695 ============== ====================================
679 696
680BLOCKS 697Blocks
681------ 698------
682 699
683At the lowest directory level is the 'block' directory. There can be 0, 1 700At the lowest directory level is the ``block`` directory. There can be 0, 1
684or more blocks specified in each instance. 701or more blocks specified in each instance:
685
686 test-block0
687 702
703 +-------------+
704 | test-block0 |
705 +-------------+
688 706
689In this directory the default attributes are: 707In this directory the default attributes are:
690 708
691 ce_count which is counter of CE events for this 'block' 709 ============== ================================================
710 ce_count which is counter of CE events for this ``block``
692 of hardware being monitored 711 of hardware being monitored
693 ue_count which is counter of UE events for this 'block' 712 ue_count which is counter of UE events for this ``block``
694 of hardware being monitored 713 of hardware being monitored
714 ============== ================================================
695 715
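The relationship between the per-block counters and the instance totals described above can be sketched in Python. This is an illustrative helper, not part of EDAC: the function names are hypothetical, and the instance directory is passed in by the caller because its full sysfs path is not given in this section.

```python
from pathlib import Path


def read_counter(path: Path) -> int:
    """Read one sysfs-style counter file holding a decimal integer."""
    return int(path.read_text().strip())


def block_totals(instance_dir: Path) -> dict:
    """Sum ce_count and ue_count over the block subdirectories of an instance.

    Assumption: every block directory exposes plain-text ``ce_count`` and
    ``ue_count`` files, as the tables above describe. Subdirectories that
    lack a counter file are skipped.
    """
    totals = {"ce_count": 0, "ue_count": 0}
    for block in sorted(p for p in instance_dir.iterdir() if p.is_dir()):
        for name in totals:
            counter = block / name
            if counter.is_file():
                totals[name] += read_counter(counter)
    return totals
```

If the instance-level counters really are totals of the block counters, ``block_totals(instance_dir)["ce_count"]`` should equal the value read from the instance's own ``ce_count`` file.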
696 716
697The 'test_device_edac' device adds 4 attributes and 1 control: 717The ``test_device_edac`` device adds 4 attributes and 1 control:
698 718
719 ================== ====================================================
699 test-block-bits-0 for every POLL cycle this counter 720 test-block-bits-0 for every POLL cycle this counter
700 is incremented 721 is incremented
701 test-block-bits-1 every 10 cycles, this counter is bumped once, 722 test-block-bits-1 every 10 cycles, this counter is bumped once,
@@ -704,20 +725,23 @@ The 'test_device_edac' device adds 4 attributes and 1 control:
704 and test-block-bits-1 is set to 0 725 and test-block-bits-1 is set to 0
705 test-block-bits-3 every 1000 cycles, this counter is bumped once, 726 test-block-bits-3 every 1000 cycles, this counter is bumped once,
706 and test-block-bits-2 is set to 0 727 and test-block-bits-2 is set to 0
728 ================== ====================================================
707 729
708 730
731 ================== ====================================================
709 reset-counters writing ANY thing to this control will 732 reset-counters writing ANY thing to this control will
710 reset all the above counters. 733 reset all the above counters.
734 ================== ====================================================
711 735
712 736
713Use of the 'test_device_edac' driver should enable any others to create their own 737Use of the ``test_device_edac`` driver should enable any others to create their own
714unique drivers for their hardware systems. 738unique drivers for their hardware systems.
715 739
716The 'test_device_edac' sample driver is located at the 740The ``test_device_edac`` sample driver is located at the
717bluesmoke.sourceforge.net project site for EDAC. 741http://bluesmoke.sourceforge.net project site for EDAC.
718 742
719 743
720NEHALEM USAGE OF EDAC APIs 744Nehalem Usage of EDAC APIs
721-------------------------- 745--------------------------
722 746
723This chapter documents some EXPERIMENTAL mappings for EDAC API to handle 747This chapter documents some EXPERIMENTAL mappings for EDAC API to handle
@@ -739,7 +763,8 @@ were done at i7core_edac driver. This chapter will cover those differences
739 As EDAC API maps the minimum unity is csrows, the driver sequentially 763 As EDAC API maps the minimum unity is csrows, the driver sequentially
740 maps channel/dimm into different csrows. 764 maps channel/dimm into different csrows.
741 765
742 For example, supposing the following layout: 766 For example, supposing the following layout::
767
743 Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs 768 Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
744 dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400 769 dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
745 dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400 770 dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
@@ -747,14 +772,15 @@ were done at i7core_edac driver. This chapter will cover those differences
747 dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400 772 dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
748 Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs 773 Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
749 dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400 774 dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
750 The driver will map it as: 775
776 The driver will map it as::
777
751 csrow0: channel 0, dimm0 778 csrow0: channel 0, dimm0
752 csrow1: channel 0, dimm1 779 csrow1: channel 0, dimm1
753 csrow2: channel 1, dimm0 780 csrow2: channel 1, dimm0
754 csrow3: channel 2, dimm0 781 csrow3: channel 2, dimm0
755 782
756exports one 783 exports one DIMM per csrow.
757 DIMM per csrow.
758 784
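The sequential channel/dimm to csrow mapping can be illustrated with a short Python sketch. It is a model of the rule stated above, not the i7core_edac implementation; ``map_csrows`` and the ``layout`` dictionary are hypothetical names chosen for this example.

```python
def map_csrows(dimms_per_channel):
    """Assign csrows sequentially, one per populated (channel, dimm) pair.

    ``dimms_per_channel`` maps a channel number to the list of dimm numbers
    populated on that channel. The position of each pair in the returned
    list is its csrow index.
    """
    csrows = []
    for channel in sorted(dimms_per_channel):
        for dimm in sorted(dimms_per_channel[channel]):
            csrows.append((channel, dimm))
    return csrows


# Layout from the example: two dimms on channel 0, one each on channels 1 and 2.
layout = {0: [0, 1], 1: [0], 2: [0]}
for index, (channel, dimm) in enumerate(map_csrows(layout)):
    print(f"csrow{index}: channel {channel}, dimm{dimm}")
```

Run against the example layout, the loop prints the same four csrow lines as the mapping listed above.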
759 Each QPI is exported as a different memory controller. 785 Each QPI is exported as a different memory controller.
760 786
@@ -762,47 +788,53 @@ exports one
762 functionality via some error injection nodes: 788 functionality via some error injection nodes:
763 789
764 For injecting a memory error, there are some sysfs nodes, under 790 For injecting a memory error, there are some sysfs nodes, under
765 /sys/devices/system/edac/mc/mc?/: 791 ``/sys/devices/system/edac/mc/mc?/``:
766 792
767 inject_addrmatch/*: 793 - ``inject_addrmatch/*``:
768 Controls the error injection mask register. It is possible to specify 794 Controls the error injection mask register. It is possible to specify
769 several characteristics of the address to match an error code: 795 several characteristics of the address to match an error code::
796
770 dimm = the affected dimm. Numbers are relative to a channel; 797 dimm = the affected dimm. Numbers are relative to a channel;
771 rank = the memory rank; 798 rank = the memory rank;
772 channel = the channel that will generate an error; 799 channel = the channel that will generate an error;
773 bank = the affected bank; 800 bank = the affected bank;
774 page = the page address; 801 page = the page address;
775 column (or col) = the address column. 802 column (or col) = the address column.
803
776 each of the above values can be set to "any" to match any valid value. 804 each of the above values can be set to "any" to match any valid value.
777 805
778 At driver init, all values are set to any. 806 At driver init, all values are set to any.
779 807
780 For example, to generate an error at rank 1 of dimm 2, for any channel, 808 For example, to generate an error at rank 1 of dimm 2, for any channel,
781 any bank, any page, any column: 809 any bank, any page, any column::
810
782 echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm 811 echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm
783 echo 1 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank 812 echo 1 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank
784 813
785 To return to the default behaviour of matching any, you can do: 814 To return to the default behaviour of matching any, you can do::
815
786 echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm 816 echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm
787 echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank 817 echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank
788 818
789 inject_eccmask: 819 - ``inject_eccmask``:
790 specifies what bits will have troubles, 820 specifies what bits will have troubles,
821
822 - ``inject_section``:
823 specifies what ECC cache section will get the error::
791 824
792 inject_section:
793 specifies what ECC cache section will get the error:
794 3 for both 825 3 for both
795 2 for the highest 826 2 for the highest
796 1 for the lowest 827 1 for the lowest
797 828
798 inject_type: 829 - ``inject_type``:
799 specifies the type of error, being a combination of the following bits: 830 specifies the type of error, being a combination of the following bits::
831
800 bit 0 - repeat 832 bit 0 - repeat
801 bit 1 - ecc 833 bit 1 - ecc
802 bit 2 - parity 834 bit 2 - parity
803 835
804 inject_enable starts the error generation when something different 836 - ``inject_enable``:
805 than 0 is written. 837 starts the error generation when something different than 0 is written.
806 838
807 All inject vars can be read. root permission is needed for write. 839 All inject vars can be read. root permission is needed for write.
808 840
@@ -811,21 +843,21 @@ exports one
811 also produce an error. 843 also produce an error.
812 844
813 For example, the following code will generate an error for any write access 845 For example, the following code will generate an error for any write access
814 at socket 0, on any DIMM/address on channel 2: 846 at socket 0, on any DIMM/address on channel 2::
815 847
816 echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/channel 848 echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/channel
817 echo 2 >/sys/devices/system/edac/mc/mc0/inject_type 849 echo 2 >/sys/devices/system/edac/mc/mc0/inject_type
818 echo 64 >/sys/devices/system/edac/mc/mc0/inject_eccmask 850 echo 64 >/sys/devices/system/edac/mc/mc0/inject_eccmask
819 echo 3 >/sys/devices/system/edac/mc/mc0/inject_section 851 echo 3 >/sys/devices/system/edac/mc/mc0/inject_section
820 echo 1 >/sys/devices/system/edac/mc/mc0/inject_enable 852 echo 1 >/sys/devices/system/edac/mc/mc0/inject_enable
821 dd if=/dev/mem of=/dev/null seek=16k bs=4k count=1 >& /dev/null 853 dd if=/dev/mem of=/dev/null seek=16k bs=4k count=1 >& /dev/null
822 854
823 For socket 1, it is needed to replace "mc0" by "mc1" at the above 855 For socket 1, it is needed to replace "mc0" by "mc1" at the above
824 commands. 856 commands.
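The sysfs writes from the shell example can also be sketched in Python. This is a hedged illustration: the helper names are hypothetical, only the node names and values come from the text above, and the ``dd`` read that triggers the access is deliberately left out. The memory controller directory is a parameter, so the socket 1 case only changes the path.

```python
from pathlib import Path

# inject_type is a combination of these bits (bit 0, bit 1, bit 2).
REPEAT, ECC, PARITY = 1 << 0, 1 << 1, 1 << 2


def injection_writes(channel, inject_type, eccmask, section):
    """Return (node, value) pairs in the order used by the shell example."""
    return [
        ("inject_addrmatch/channel", str(channel)),
        ("inject_type", str(inject_type)),
        ("inject_eccmask", str(eccmask)),
        ("inject_section", str(section)),
        ("inject_enable", "1"),  # any value other than 0 starts generation
    ]


def apply_writes(mc_dir, writes):
    """Write each value to its node under mc_dir.

    On a real system mc_dir would be something like
    /sys/devices/system/edac/mc/mc0 and root permission is needed.
    """
    for node, value in writes:
        (Path(mc_dir) / node).write_text(value + "\n")


# Same values as the shell example: channel 2, ecc error, mask 64, both sections.
writes = injection_writes(channel=2, inject_type=ECC, eccmask=64, section=3)
```

Calling ``apply_writes("/sys/devices/system/edac/mc/mc0", writes)`` would then mirror the five ``echo`` commands for socket 0.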
825 857
826 The generated error message will look like: 858 The generated error message will look like::
827 859
828 EDAC MC0: UE row 0, channel-a= 0 channel-b= 0 labels "-": NON_FATAL (addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome=0x00000040, count=1, Err=8c0000400001009f:4000080482 (read error: read ECC error)) 860 EDAC MC0: UE row 0, channel-a= 0 channel-b= 0 labels "-": NON_FATAL (addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome=0x00000040, count=1, Err=8c0000400001009f:4000080482 (read error: read ECC error))
829 861
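For scripted checks, the ``key=value`` fields of such a message can be pulled out with a regular expression. The sketch below is an assumption about how one might parse the sample line shown above; the field list and function name are hypothetical, and the message format is not a stable interface.

```python
import re

MESSAGE = (
    'EDAC MC0: UE row 0, channel-a= 0 channel-b= 0 labels "-": NON_FATAL '
    "(addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome=0x00000040, "
    "count=1, Err=8c0000400001009f:4000080482 (read error: read ECC error))"
)

# Case-sensitive on purpose: "Channel=2" is matched, "channel-a= 0" is not.
FIELD = re.compile(
    r"(addr|socket|Dimm|Channel|syndrome|count)\s*=\s*(0x[0-9a-fA-F]+|\d+)"
)


def parse_fields(message):
    """Return the numeric fields of an EDAC MC message as integers."""
    fields = {}
    for name, value in FIELD.findall(message):
        base = 16 if value.lower().startswith("0x") else 10
        fields[name] = int(value, base)
    return fields


fields = parse_fields(MESSAGE)
```

In the sample message the parsed ``Channel`` is 2 and the parsed ``syndrome`` is 0x40 (64 decimal), the same values written to ``inject_addrmatch/channel`` and ``inject_eccmask`` in the injection example.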
830 3) Nehalem specific Corrected Error memory counters 862 3) Nehalem specific Corrected Error memory counters
831 863
@@ -837,9 +869,9 @@ exports one
837 granularity than the default ones), the driver exposes those registers for 869 granularity than the default ones), the driver exposes those registers for
838 UDIMM memories. 870 UDIMM memories.
839 871
840 They can be read by looking at the contents of all_channel_counts/ 872 They can be read by looking at the contents of ``all_channel_counts/``::
841 873
842 $ for i in /sys/devices/system/edac/mc/mc0/all_channel_counts/*; do echo $i; cat $i; done 874 $ for i in /sys/devices/system/edac/mc/mc0/all_channel_counts/*; do echo $i; cat $i; done
843 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0 875 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0
844 0 876 0
845 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1 877 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1
@@ -849,17 +881,21 @@ exports one
849 881
850 What happens here is that errors on different csrows, but at the same 882 What happens here is that errors on different csrows, but at the same
851 dimm number will increment the same counter. 883 dimm number will increment the same counter.
852 So, in this memory mapping: 884 So, in this memory mapping::
885
853 csrow0: channel 0, dimm0 886 csrow0: channel 0, dimm0
854 csrow1: channel 0, dimm1 887 csrow1: channel 0, dimm1
855 csrow2: channel 1, dimm0 888 csrow2: channel 1, dimm0
856 csrow3: channel 2, dimm0 889 csrow3: channel 2, dimm0
890
857 The hardware will increment udimm0 for an error at the first dimm at either 891 The hardware will increment udimm0 for an error at the first dimm at either
858 csrow0, csrow2 or csrow3; 892 csrow0, csrow2 or csrow3;
893
859 The hardware will increment udimm1 for an error at the second dimm at either 894 The hardware will increment udimm1 for an error at the second dimm at either
860 csrow0, csrow2 or csrow3; 895 csrow0, csrow2 or csrow3;
896
861 The hardware will increment udimm2 for an error at the third dimm at either 897 The hardware will increment udimm2 for an error at the third dimm at either
862 csrow0, csrow2 or csrow3; 898 csrow0, csrow2 or csrow3;
863 899
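The sharing of counters between csrows can be modelled by grouping csrows on their dimm number. The sketch below follows the rule stated above (same dimm number, same counter) and is one reading of it, not driver code; ``CSROW_MAP`` and ``csrows_per_counter`` are hypothetical names.

```python
from collections import defaultdict

# csrow -> (channel, dimm), taken from the memory mapping in the example.
CSROW_MAP = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (2, 0)}


def csrows_per_counter(csrow_map):
    """Group csrows by the udimm counter that their dimm number selects."""
    shared = defaultdict(list)
    for csrow, (_channel, dimm) in sorted(csrow_map.items()):
        shared[f"udimm{dimm}"].append(csrow)
    return dict(shared)


counters = csrows_per_counter(CSROW_MAP)
```

Under this reading, errors on csrow0, csrow2 and csrow3 all land in ``udimm0``, so that counter alone cannot tell which channel produced an error.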
864 4) Standard error counters 900 4) Standard error counters
865 901
@@ -868,65 +904,68 @@ exports one
868 possible that some errors could be lost. With rdimm's, they display the 904 possible that some errors could be lost. With rdimm's, they display the
869 contents of the registers 905 contents of the registers
870 906
871AMD64_EDAC REFERENCE DOCUMENTS USED 907Reference documents used on ``amd64_edac``
872----------------------------------- 908------------------------------------------
873amd64_edac module is based on the following documents 909
910``amd64_edac`` module is based on the following documents
874(available from http://support.amd.com/en-us/search/tech-docs): 911(available from http://support.amd.com/en-us/search/tech-docs):
875 912
876 1. Title: BIOS and Kernel Developer's Guide for AMD Athlon 64 and AMD 913 1. :Title: BIOS and Kernel Developer's Guide for AMD Athlon 64 and AMD
877 Opteron Processors 914 Opteron Processors
878 AMD publication #: 26094 915 :AMD publication #: 26094
879 Revision: 3.26 916 :Revision: 3.26
880 Link: http://support.amd.com/TechDocs/26094.PDF 917 :Link: http://support.amd.com/TechDocs/26094.PDF
881 918
882 2. Title: BIOS and Kernel Developer's Guide for AMD NPT Family 0Fh 919 2. :Title: BIOS and Kernel Developer's Guide for AMD NPT Family 0Fh
883 Processors 920 Processors
884 AMD publication #: 32559 921 :AMD publication #: 32559
885 Revision: 3.00 922 :Revision: 3.00
886 Issue Date: May 2006 923 :Issue Date: May 2006
887 Link: http://support.amd.com/TechDocs/32559.pdf 924 :Link: http://support.amd.com/TechDocs/32559.pdf
888 925
889 3. Title: BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h 926 3. :Title: BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h
890 Processors 927 Processors
891 AMD publication #: 31116 928 :AMD publication #: 31116
892 Revision: 3.00 929 :Revision: 3.00
893 Issue Date: September 07, 2007 930 :Issue Date: September 07, 2007
894 Link: http://support.amd.com/TechDocs/31116.pdf 931 :Link: http://support.amd.com/TechDocs/31116.pdf
895 932
896 4. Title: BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h 933 4. :Title: BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h
897 Models 30h-3Fh Processors 934 Models 30h-3Fh Processors
898 AMD publication #: 49125 935 :AMD publication #: 49125
899 Revision: 3.06 936 :Revision: 3.06
900 Issue Date: 2/12/2015 (latest release) 937 :Issue Date: 2/12/2015 (latest release)
901 Link: http://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf 938 :Link: http://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf
902 939
903 5. Title: BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h 940 5. :Title: BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h
904 Models 60h-6Fh Processors 941 Models 60h-6Fh Processors
905 AMD publication #: 50742 942 :AMD publication #: 50742
906 Revision: 3.01 943 :Revision: 3.01
907 Issue Date: 7/23/2015 (latest release) 944 :Issue Date: 7/23/2015 (latest release)
908 Link: http://support.amd.com/TechDocs/50742_15h_Models_60h-6Fh_BKDG.pdf 945 :Link: http://support.amd.com/TechDocs/50742_15h_Models_60h-6Fh_BKDG.pdf
909 946
910 6. Title: BIOS and Kernel Developer's Guide (BKDG) for AMD Family 16h 947 6. :Title: BIOS and Kernel Developer's Guide (BKDG) for AMD Family 16h
911 Models 00h-0Fh Processors 948 Models 00h-0Fh Processors
912 AMD publication #: 48751 949 :AMD publication #: 48751
913 Revision: 3.03 950 :Revision: 3.03
914 Issue Date: 2/23/2015 (latest release) 951 :Issue Date: 2/23/2015 (latest release)
915 Link: http://support.amd.com/TechDocs/48751_16h_bkdg.pdf 952 :Link: http://support.amd.com/TechDocs/48751_16h_bkdg.pdf
953
954Credits
955=======
956
957* Written by Doug Thompson <dougthompson@xmission.com>
916 958
917CREDITS: 959 - 7 Dec 2005
918======== 960 - 17 Jul 2007 Updated
919 961
920Written by Doug Thompson <dougthompson@xmission.com> 962* |copy| Mauro Carvalho Chehab
921 7 Dec 2005
922 17 Jul 2007 Updated
923 963
924(c) Mauro Carvalho Chehab 964 - 05 Aug 2009 Nehalem interface
925 05 Aug 2009 Nehalem interface
926 965
927EDAC authors/maintainers: 966* EDAC authors/maintainers:
928 967
929 Doug Thompson, Dave Jiang, Dave Peterson et al, 968 - Doug Thompson, Dave Jiang, Dave Peterson et al,
930 Mauro Carvalho Chehab 969 - Mauro Carvalho Chehab
931 Borislav Petkov 970 - Borislav Petkov
932 original author: Thayne Harbaugh 971 - original author: Thayne Harbaugh