Diffstat (limited to 'Documentation/drivers/edac/edac.txt')
-rw-r--r-- | Documentation/drivers/edac/edac.txt | 34 |
1 files changed, 17 insertions, 17 deletions
diff --git a/Documentation/drivers/edac/edac.txt b/Documentation/drivers/edac/edac.txt
index d37191fe5681..70d96a62e5e1 100644
--- a/Documentation/drivers/edac/edac.txt
+++ b/Documentation/drivers/edac/edac.txt
@@ -21,7 +21,7 @@ within the computer system. In the initial release, memory Correctable Errors | |||
21 | 21 | ||
22 | Detecting CE events, then harvesting those events and reporting them, | 22 | Detecting CE events, then harvesting those events and reporting them, |
23 | CAN be a predictor of future UE events. With CE events, the system can | 23 | CAN be a predictor of future UE events. With CE events, the system can |
24 | continue to operate, but with less safety. Preventive maintainence and | 24 | continue to operate, but with less safety. Preventive maintenance and |
25 | proactive part replacement of memory DIMMs exhibiting CEs can reduce | 25 | proactive part replacement of memory DIMMs exhibiting CEs can reduce |
26 | the likelihood of the dreaded UE events and system 'panics'. | 26 | the likelihood of the dreaded UE events and system 'panics'. |
27 | 27 | ||
@@ -29,13 +29,13 @@ the likelihood of the dreaded UE events and system 'panics'. | |||
29 | In addition, PCI Bus Parity and SERR Errors are scanned for on PCI devices | 29 | In addition, PCI Bus Parity and SERR Errors are scanned for on PCI devices |
30 | in order to determine if errors are occurring on data transfers. | 30 | in order to determine if errors are occurring on data transfers. |
31 | The presence of PCI Parity errors must be examined with a grain of salt. | 31 | The presence of PCI Parity errors must be examined with a grain of salt. |
32 | There are several addin adapters that do NOT follow the PCI specification | 32 | There are several add-in adapters that do NOT follow the PCI specification |
33 | with regards to Parity generation and reporting. The specification says | 33 | with regards to Parity generation and reporting. The specification says |
34 | the vendor should tie the parity status bits to 0 if they do not intend | 34 | the vendor should tie the parity status bits to 0 if they do not intend |
35 | to generate parity. Some vendors do not do this, and thus the parity bit | 35 | to generate parity. Some vendors do not do this, and thus the parity bit |
36 | can "float" giving false positives. | 36 | can "float" giving false positives. |
37 | 37 | ||
38 | The PCI Parity EDAC device has the ability to "skip" known flakey | 38 | The PCI Parity EDAC device has the ability to "skip" known flaky |
39 | cards during the parity scan. These are set by the parity "blacklist" | 39 | cards during the parity scan. These are set by the parity "blacklist" |
40 | interface in the sysfs for PCI Parity. (See the PCI section in the sysfs | 40 | interface in the sysfs for PCI Parity. (See the PCI section in the sysfs |
41 | section below.) There is also a parity "whitelist" which is used as | 41 | section below.) There is also a parity "whitelist" which is used as |
@@ -101,7 +101,7 @@ Memory Controller (mc) Model | |||
101 | 101 | ||
102 | First a background on the memory controller's model abstracted in EDAC. | 102 | First a background on the memory controller's model abstracted in EDAC. |
103 | Each mc device controls a set of DIMM memory modules. These modules are | 103 | Each mc device controls a set of DIMM memory modules. These modules are |
104 | layed out in a Chip-Select Row (csrowX) and Channel table (chX). There can | 104 | laid out in a Chip-Select Row (csrowX) and Channel table (chX). There can |
105 | be multiple csrows and two channels. | 105 | be multiple csrows and two channels. |
106 | 106 | ||
107 | Memory controllers allow for several csrows, with 8 csrows being a typical value. | 107 | Memory controllers allow for several csrows, with 8 csrows being a typical value. |
@@ -131,7 +131,7 @@ for memory DIMMs: | |||
131 | DIMM_B1 | 131 | DIMM_B1 |
132 | 132 | ||
133 | Labels for these slots are usually silk screened on the motherboard. Slots | 133 | Labels for these slots are usually silk screened on the motherboard. Slots |
134 | labeled 'A' are channel 0 in this example. Slots labled 'B' | 134 | labeled 'A' are channel 0 in this example. Slots labeled 'B' |
135 | are channel 1. Notice that there are two csrows possible on a | 135 | are channel 1. Notice that there are two csrows possible on a |
136 | physical DIMM. These csrows are allocated their csrow assignment | 136 | physical DIMM. These csrows are allocated their csrow assignment |
137 | based on the slot into which the memory DIMM is placed. Thus, when 1 DIMM | 137 | based on the slot into which the memory DIMM is placed. Thus, when 1 DIMM |
@@ -140,7 +140,7 @@ is placed in each Channel, the csrows cross both DIMMs. | |||
140 | Memory DIMMs come single or dual "ranked". A rank is a populated csrow. | 140 | Memory DIMMs come single or dual "ranked". A rank is a populated csrow. |
141 | Thus, 2 single ranked DIMMs, placed in slots DIMM_A0 and DIMM_B0 above | 141 | Thus, 2 single ranked DIMMs, placed in slots DIMM_A0 and DIMM_B0 above |
142 | will have 1 csrow, csrow0. csrow1 will be empty. On the other hand, | 142 | will have 1 csrow, csrow0. csrow1 will be empty. On the other hand, |
143 | when 2 dual ranked DIMMs are similiaryly placed, then both csrow0 and | 143 | when 2 dual ranked DIMMs are similarly placed, then both csrow0 and |
144 | csrow1 will be populated. The pattern repeats itself for csrow2 and | 144 | csrow1 will be populated. The pattern repeats itself for csrow2 and |
145 | csrow3. | 145 | csrow3. |
146 | 146 | ||
@@ -246,7 +246,7 @@ Module Version read-only attribute file: | |||
246 | 246 | ||
247 | 'mc_version' | 247 | 'mc_version' |
248 | 248 | ||
249 | The EDAC CORE modules's version and compile date are shown here to | 249 | The EDAC CORE module's version and compile date are shown here to |
250 | indicate what EDAC is running. | 250 | indicate what EDAC is running. |
251 | 251 | ||
252 | 252 | ||
@@ -423,7 +423,7 @@ Total memory managed by this csrow attribute file: | |||
423 | 'size_mb' | 423 | 'size_mb' |
424 | 424 | ||
425 | This attribute file displays, in count of megabytes, of memory | 425 | This attribute file displays, in count of megabytes, of memory |
426 | that this csrow contatins. | 426 | that this csrow contains. |
427 | 427 | ||
428 | 428 | ||
429 | Memory Type attribute file: | 429 | Memory Type attribute file: |
@@ -557,7 +557,7 @@ On Header Type 00 devices the primary status is looked at | |||
557 | for any parity error regardless of whether Parity is enabled on the | 557 | for any parity error regardless of whether Parity is enabled on the |
558 | device. (The spec indicates parity is generated in some cases). | 558 | device. (The spec indicates parity is generated in some cases). |
559 | On Header Type 01 bridges, the secondary status register is also | 559 | On Header Type 01 bridges, the secondary status register is also |
560 | looked at to see if parity ocurred on the bus on the other side of | 560 | looked at to see if parity occurred on the bus on the other side of |
561 | the bridge. | 561 | the bridge. |
562 | 562 | ||
563 | 563 | ||
@@ -588,7 +588,7 @@ Panic on PCI PARITY Error: | |||
588 | 'panic_on_pci_parity' | 588 | 'panic_on_pci_parity' |
589 | 589 | ||
590 | 590 | ||
591 | This control files enables or disables panic'ing when a parity | 591 | This control files enables or disables panicking when a parity |
592 | error has been detected. | 592 | error has been detected. |
593 | 593 | ||
594 | 594 | ||
@@ -616,12 +616,12 @@ PCI Device Whitelist: | |||
616 | 616 | ||
617 | This control file allows for an explicit list of PCI devices to be | 617 | This control file allows for an explicit list of PCI devices to be |
618 | scanned for parity errors. Only devices found on this list will | 618 | scanned for parity errors. Only devices found on this list will |
619 | be examined. The list is a line of hexadecimel VENDOR and DEVICE | 619 | be examined. The list is a line of hexadecimal VENDOR and DEVICE |
620 | ID tuples: | 620 | ID tuples: |
621 | 621 | ||
622 | 1022:7450,1434:16a6 | 622 | 1022:7450,1434:16a6 |
623 | 623 | ||
624 | One or more can be inserted, seperated by a comma. | 624 | One or more can be inserted, separated by a comma. |
625 | 625 | ||
626 | To write the above list doing the following as one command line: | 626 | To write the above list doing the following as one command line: |
627 | 627 | ||
@@ -639,11 +639,11 @@ PCI Device Blacklist: | |||
639 | 639 | ||
640 | This control file allows for a list of PCI devices to be | 640 | This control file allows for a list of PCI devices to be |
641 | skipped for scanning. | 641 | skipped for scanning. |
642 | The list is a line of hexadecimel VENDOR and DEVICE ID tuples: | 642 | The list is a line of hexadecimal VENDOR and DEVICE ID tuples: |
643 | 643 | ||
644 | 1022:7450,1434:16a6 | 644 | 1022:7450,1434:16a6 |
645 | 645 | ||
646 | One or more can be inserted, seperated by a comma. | 646 | One or more can be inserted, separated by a comma. |
647 | 647 | ||
648 | To write the above list doing the following as one command line: | 648 | To write the above list doing the following as one command line: |
649 | 649 | ||
@@ -651,14 +651,14 @@ PCI Device Blacklist: | |||
651 | > /sys/devices/system/edac/pci/pci_parity_blacklist | 651 | > /sys/devices/system/edac/pci/pci_parity_blacklist |
652 | 652 | ||
653 | 653 | ||
654 | To display what the whitelist current contatins, | 654 | To display what the whitelist currently contains, |
655 | simply 'cat' the same file. | 655 | simply 'cat' the same file. |
656 | 656 | ||
657 | ======================================================================= | 657 | ======================================================================= |
658 | 658 | ||
659 | PCI Vendor and Devices IDs can be obtained with the lspci command. Using | 659 | PCI Vendor and Devices IDs can be obtained with the lspci command. Using |
660 | the -n option lspci will display the vendor and device IDs. The system | 660 | the -n option lspci will display the vendor and device IDs. The system |
661 | adminstrator will have to determine which devices should be scanned or | 661 | administrator will have to determine which devices should be scanned or |
662 | skipped. | 662 | skipped. |
663 | 663 | ||
664 | 664 | ||
@@ -669,5 +669,5 @@ Turn OFF a whitelist by an empty echo command: | |||
669 | 669 | ||
670 | echo > /sys/devices/system/edac/pci/pci_parity_whitelist | 670 | echo > /sys/devices/system/edac/pci/pci_parity_whitelist |
671 | 671 | ||
672 | and any previous blacklist will be utililzed. | 672 | and any previous blacklist will be utilized. |
673 | 673 | ||
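
Reviewer note (not part of the patch): the whitelist and blacklist text touched above describes a comma-separated line of hexadecimal VENDOR:DEVICE tuples. The following is a minimal sketch of how such a line might be assembled and checked before it is written to sysfs. The helper name build_id_list is hypothetical, and the four-hex-digit check is an assumption based on how lspci -n prints IDs; the documentation does not say what the kernel-side parser accepts.

```shell
#!/bin/sh
# Hypothetical helper (not part of EDAC): join VENDOR:DEVICE tuples into
# the comma-separated line described in the whitelist/blacklist sections.
build_id_list() {
    list=""
    for id in "$@"; do
        # Assumption: each ID is exactly four hex digits, as lspci -n prints.
        case "$id" in
            [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]:[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) ;;
            *) echo "invalid VENDOR:DEVICE tuple: $id" >&2; return 1 ;;
        esac
        # Prepend a comma only when the list already has an entry.
        list="${list:+$list,}$id"
    done
    printf '%s\n' "$list"
}
```

A possible use, with the sysfs path taken from the documentation text above, builds the line first so that a malformed tuple never results in an empty write (which, per the text, would turn the whitelist off):

    ids=$(build_id_list 1022:7450 1434:16a6) &&
        echo "$ids" > /sys/devices/system/edac/pci/pci_parity_whitelist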