author | Tony Luck <tony.luck@intel.com> | 2014-10-29 13:36:50 -0400 |
---|---|---|
committer | Mauro Carvalho Chehab <mchehab@osg.samsung.com> | 2014-12-02 09:06:52 -0500 |
commit | f7cf2a22a2896d3b3595b71d7936b6d7a3316b00 (patch) | |
tree | fbacb747a192c664962731bcd5e8859527ebcfe0 /drivers/edac/sb_edac.c | |
parent | 8c009100295597f23978c224aec5751a365bc965 (diff) |
sb_edac: Fix discovery of top-of-low-memory for Haswell
Haswell moved the TOLM/TOHM registers to a different device and offset.
The sb_edac driver accounted for the change of device, but not for the
new offset. There was also a typo in the constant to fill in the low
26 bits (was 0x1ffffff, should be 0x3ffffff).
This resulted in a bogus value for the top of low memory:
EDAC DEBUG: get_memory_layout: TOLM: 0.032 GB (0x0000000001ffffff)
which would result in EDAC refusing to translate addresses for
errors above the bogus value and below 4GB:
sbridge MC3: HANDLING MCE MEMORY ERROR
sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
sbridge MC3: TSC 0
sbridge MC3: ADDR 2000000
sbridge MC3: MISC 523eac86
sbridge MC3: PROCESSOR 0:306f3 TIME 1414600951 SOCKET 0 APIC 0
MC3: 1 CE Error at TOLM area, on addr 0x02000000 on any memory ( page:0x0 offset:0x0 grain:32 syndrome:0x0)
With the fix we see the correct TOLM value:
DEBUG: get_memory_layout: TOLM: 2.048 GB (0x000000007fffffff)
and we decode address 2000000 correctly:
sbridge MC3: HANDLING MCE MEMORY ERROR
sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
sbridge MC3: TSC 0
sbridge MC3: ADDR 2000000
sbridge MC3: MISC 523e1086
sbridge MC3: PROCESSOR 0:306f3 TIME 1414601319 SOCKET 0 APIC 0
DEBUG: get_memory_error_data: SAD interleave package: 0 = CPU socket 0, HA 0, shiftup: 0
DEBUG: get_memory_error_data: TAD#0: address 0x0000000002000000 < 0x000000007fffffff, socket interleave 1, channel interleave 4 (offset 0x00000000), index 0, base ch: 0, ch mask: 0x01
DEBUG: get_memory_error_data: RIR#0, limit: 4.095 GB (0x00000000ffffffff), way: 1
DEBUG: get_memory_error_data: RIR#0: channel address 0x00200000 < 0xffffffff, RIR interleave 0, index 0
DEBUG: sbridge_mce_output_error: area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0
MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x2000 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Diffstat (limited to 'drivers/edac/sb_edac.c')
-rw-r--r-- | drivers/edac/sb_edac.c | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index f37d01f3bb17..ead0bf9a5d2d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -135,6 +135,7 @@ static inline int sad_pkg(const struct interleave_pkg *table, u32 reg, | |||
135 | 135 | ||
136 | #define TOLM 0x80 | 136 | #define TOLM 0x80 |
137 | #define TOHM 0x84 | 137 | #define TOHM 0x84 |
138 | #define HASWELL_TOLM 0xd0 | ||
138 | #define HASWELL_TOHM_0 0xd4 | 139 | #define HASWELL_TOHM_0 0xd4 |
139 | #define HASWELL_TOHM_1 0xd8 | 140 | #define HASWELL_TOHM_1 0xd8 |
140 | 141 | ||
@@ -706,8 +707,8 @@ static u64 haswell_get_tolm(struct sbridge_pvt *pvt) | |||
706 | { | 707 | { |
707 | u32 reg; | 708 | u32 reg; |
708 | 709 | ||
709 | pci_read_config_dword(pvt->info.pci_vtd, TOLM, &reg); | 710 | pci_read_config_dword(pvt->info.pci_vtd, HASWELL_TOLM, &reg); |
710 | return (GET_BITFIELD(reg, 26, 31) << 26) | 0x1ffffff; | 711 | return (GET_BITFIELD(reg, 26, 31) << 26) | 0x3ffffff; |
711 | } | 712 | } |
712 | 713 | ||
713 | static u64 haswell_get_tohm(struct sbridge_pvt *pvt) | 714 | static u64 haswell_get_tohm(struct sbridge_pvt *pvt) |
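
The arithmetic behind the fix can be checked outside the kernel. The following is a minimal user-space sketch, not driver code: get_bitfield() is a stand-in for the driver's GET_BITFIELD macro, and tolm_old_mask(), tolm_fixed() and in_tolm_hole() are hypothetical helper names introduced here for illustration. The range check in in_tolm_hole() is an assumption based on the commit message's description ("above the bogus value and below 4GB"), not a copy of the driver's logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-in for the driver's GET_BITFIELD(v, lo, hi):
 * extract bits lo..hi (inclusive) of v, shifted down to bit 0. */
static uint64_t get_bitfield(uint64_t v, unsigned lo, unsigned hi)
{
	uint64_t mask = ((1ULL << (hi - lo + 1)) - 1) << lo;

	return (v & mask) >> lo;
}

/* Pre-fix computation: the fill constant covers only bits 0..24. */
static uint64_t tolm_old_mask(uint32_t reg)
{
	return (get_bitfield(reg, 26, 31) << 26) | 0x1ffffff;
}

/* Post-fix computation: the fill constant covers bits 0..25. */
static uint64_t tolm_fixed(uint32_t reg)
{
	return (get_bitfield(reg, 26, 31) << 26) | 0x3ffffff;
}

/* Hypothetical check (assumption): an address above TOLM but below
 * 4GB is treated as lying in the TOLM area and is not translated. */
static bool in_tolm_hole(uint64_t addr, uint64_t tolm)
{
	return addr > tolm && addr < (1ULL << 32);
}
```

With a register whose bits 26..31 are all zero, tolm_old_mask() gives 0x1ffffff, the bogus value in the first log, and in_tolm_hole(0x2000000, 0x1ffffff) is true. With bits 26..31 equal to 0x1f (for example reg = 0x7c000000), tolm_fixed() gives 0x7fffffff, the value in the second log, and address 0x2000000 no longer falls in the hole.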