aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorXunlei Pang <xlpang@redhat.com>2016-12-05 07:09:07 -0500
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2017-01-12 05:39:28 -0500
commitafd7e2b4258aa4b01720a2d65846030b3a06689f (patch)
treebd1f5226f23df8d099832396012342f72c2ac94b
parentef41459ab279edd5fa11da4efac4aa0155c06a35 (diff)
iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped
commit aec0e86172a79eb5e44aff1055bb953fe4d47c59 upstream. We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers under kdump, it can be steadily reproduced on several different machines, the dmesg log is like: HP HPSA Driver (v 3.4.16-0) hpsa 0000:02:00.0: using doorbell to reset controller hpsa 0000:02:00.0: board ready after hard reset. hpsa 0000:02:00.0: Waiting for controller to respond to no-op DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff] DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff] DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set hpsa 0000:02:00.0: controller message 03:00 timed out hpsa 0000:02:00.0: no-op failed; re-trying After some debugging, we found that the fault addr is from DMA initiated at the driver probe stage after reset(not in-flight DMA), and the corresponding pte entry value is correct, the fault is likely due to the old iommu caches of the in-flight DMA before it. Thus we need to flush the old cache after context mapping is setup for the device, where the device is supposed to finish reset at its driver probe stage and no in-flight DMA exists hereafter. I'm not sure if the hardware is responsible for invalidating all the related caches allocated in the iommu hardware before, but seems not the case for hpsa, actually many device drivers have problems in properly resetting the hardware. Anyway flushing (again) by software in kdump kernel when the device gets context mapped which is a quite infrequent operation does little harm. With this patch, the problematic machine can survive the kdump tests. CC: Myron Stowe <myron.stowe@gmail.com> CC: Joseph Szczypek <jszczype@redhat.com> CC: Don Brace <don.brace@microsemi.com> CC: Baoquan He <bhe@redhat.com> CC: Dave Young <dyoung@redhat.com> Fixes: 091d42e43d21 ("iommu/vt-d: Copy translation tables from old kernel") Fixes: dbcd861f252d ("iommu/vt-d: Do not re-use domain-ids from the old kernel") Fixes: cf484d0e6939 ("iommu/vt-d: Mark copied context entries") Signed-off-by: Xunlei Pang <xlpang@redhat.com> Tested-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-rw-r--r--drivers/iommu/intel-iommu.c19
1 files changed, 19 insertions, 0 deletions
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 96566d504d2a..d82637ab09fd 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2037,6 +2037,25 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
2037 if (context_present(context)) 2037 if (context_present(context))
2038 goto out_unlock; 2038 goto out_unlock;
2039 2039
2040 /*
2041 * For kdump cases, old valid entries may be cached due to the
2042 * in-flight DMA and copied pgtable, but there is no unmapping
2043 * behaviour for them, thus we need an explicit cache flush for
2044 * the newly-mapped device. For kdump, at this point, the device
2045 * is supposed to finish reset at its driver probe stage, so no
2046 * in-flight DMA will exist, and we don't need to worry anymore
2047 * hereafter.
2048 */
2049 if (context_copied(context)) {
2050 u16 did_old = context_domain_id(context);
2051
2052 if (did_old >= 0 && did_old < cap_ndoms(iommu->cap))
2053 iommu->flush.flush_context(iommu, did_old,
2054 (((u16)bus) << 8) | devfn,
2055 DMA_CCMD_MASK_NOBIT,
2056 DMA_CCMD_DEVICE_INVL);
2057 }
2058
2040 pgd = domain->pgd; 2059 pgd = domain->pgd;
2041 2060
2042 context_clear_entry(context); 2061 context_clear_entry(context);