aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/platforms/powernv/pci.c
Commit message (Collapse)AuthorAge
* Merge branch 'next-sriov' of ↵Michael Ellerman2015-04-13
|\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc into next Merge Richard's work to support SR-IOV on PowerNV. All generic PCI patches acked by Bjorn. Some minor conflicts with Daniel's pci_controller_ops work. Conflicts: arch/powerpc/include/asm/machdep.h arch/powerpc/platforms/powernv/pci-ioda.c
| * powerpc/powernv: Shift VF resource with an offsetWei Yang2015-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On PowerNV platform, resource position in M64 BAR implies the PE# the resource belongs to. In some cases, adjustment of a resource is necessary to locate it to a correct position in M64 BAR . This patch adds pnv_pci_vf_resource_shift() to shift the 'real' PF IOV BAR address according to an offset. Note: After doing so, there would be a "hole" in the /proc/iomem when offset is a positive value. It looks like the device return some mmio back to the system, which actually no one could use it. [bhelgaas: rework loops, rework overlap check, index resource[] conventionally, remove pci_regs.h include, squashed with next patch] Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Move controller ops from ppc_md to controller_opsDaniel Axtens2015-04-11
| | | | | | | | | | | | | | | | This moves the PowerNV platform to use the pci_controller_ops structure rather than ppc_md for PCI controller operations. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/powernv: Remove powernv RTAS supportMichael Ellerman2015-04-07
|/ | | | | | | | | | | | | | | The powernv code has some conditional support for running on bare metal machines that have no OPAL firmware, but provide RTAS. No released machines ever supported that, and even in the lab it was just a transitional hack in the days when OPAL was still being developed. So remove the code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* powerpc/powernv: Use pci_dn, not device_node, in PCI config accessorGavin Shan2015-03-23
| | | | | | | | | | The PCI config accessors previously relied on device_node. Unfortunately, VFs don't have a corresponding device_node, so change the accessors to use pci_dn instead. [bhelgaas: changelog] Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/iommu: Remove IOMMU device references via bus notifierNishanth Aravamudan2015-03-03
| | | | | | | | | | | | | | | | | | | | | | After d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier"), the refcnt on the kobject backing the IOMMU group for a PCI device is elevated by each call to pci_dma_dev_setup_pSeriesLP() (via set_iommu_table_base_and_group). When we go to dlpar a multi-function PCI device out: iommu_reconfig_notifier -> iommu_free_table -> iommu_group_put BUG_ON(tbl->it_group) We trip this BUG_ON, because there are still references on the table, so it is not freed. Fix this by moving the powernv bus notifier to common code and calling it for both powernv and pseries. Fixes: d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier") Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Remove pnv_pci_probe_mode()Gavin Shan2015-01-22
| | | | | | | | | | | | | | The callback (ppc_md.pci_probe_mode()) is used to determine if the child PCI devices of the indicated PCI bus should be probed from device-tree or hardware. On PowerNV platform, we always expect probing PCI devices from hardware, which is PowerPC PCI core's default behaviour. Also, the callback had some delay implemented based on PHB's device node property "reset-clear-timestamp", which wasn't exported from skiboot. So we don't need this function and it's safe to remove it. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge tag 'powerpc-3.19-1' of ↵Linus Torvalds2014-12-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: "Some nice cleanups like removing bootmem, and removal of __get_cpu_var(). There is one patch to mm/gup.c. This is the generic GUP implementation, but is only used by us and arm(64). We have an ack from Steve Capper, and although we didn't get an ack from Andrew he told us to take the patch through the powerpc tree. There's one cxl patch. This is in drivers/misc, but Greg said he was happy for us to manage fixes for it. There is an infrastructure patch to support an IPMI driver for OPAL. There is also an RTC driver for OPAL. We weren't able to get any response from the RTC maintainer, Alessandro Zummo, so in the end we just merged the driver. The usual batch of Freescale updates from Scott" * tag 'powerpc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (101 commits) powerpc/powernv: Return to cpu offline loop when finished in KVM guest powerpc/book3s: Fix partial invalidation of TLBs in MCE code. powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault powerpc/xmon: Cleanup the breakpoint flags powerpc/xmon: Enable HW instruction breakpoint on POWER8 powerpc/mm/thp: Use tlbiel if possible powerpc/mm/thp: Remove code duplication powerpc/mm/hugetlb: Sanity check gigantic hugepage count powerpc/oprofile: Disable pagefaults during user stack read powerpc/mm: Check for matching hpte without taking hpte lock powerpc: Drop useless warning in eeh_init() powerpc/powernv: Cleanup unused MCE definitions/declarations. powerpc/eeh: Dump PHB diag-data early powerpc/eeh: Recover EEH error on ownership change for BCM5719 powerpc/eeh: Set EEH_PE_RESET on PE reset powerpc/eeh: Refactor eeh_reset_pe() powerpc: Remove more traces of bootmem powerpc/pseries: Initialise nvram_pstore_info's buf_lock cxl: Name interrupts in /proc/interrupt cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warning ...
| * powerpc: Remove superfluous bootmem includesAnton Blanchard2014-11-09
| | | | | | | | | | | | | | | | Lots of places included bootmem.h even when not using bootmem. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Merge branch 'irq-irqdomain-for-linus' of ↵Linus Torvalds2014-12-10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq domain updates from Thomas Gleixner: "The real interesting irq updates: - Support for hierarchical irq domains: For complex interrupt routing scenarios where more than one interrupt related chip is involved we had no proper representation in the generic interrupt infrastructure so far. That made people implement rather ugly constructs in their nested irq chip implementations. The main offenders are x86 and arm/gic. To distangle that mess we have now hierarchical irqdomains which seperate the various interrupt chips and connect them via the hierarchical domains. That keeps the domain specific details internal to the particular hierarchy level and removes the criss/cross referencing of chip internals. The resulting hierarchy for a complex x86 system will look like this: vector mapped: 74 msi-0 mapped: 2 dmar-ir-1 mapped: 69 ioapic-1 mapped: 4 ioapic-0 mapped: 20 pci-msi-2 mapped: 45 dmar-ir-0 mapped: 3 ioapic-2 mapped: 1 pci-msi-1 mapped: 2 htirq mapped: 0 Neither ioapic nor pci-msi know about the dmar interrupt remapping between themself and the vector domain. If interrupt remapping is disabled ioapic and pci-msi become direct childs of the vector domain. In hindsight we should have done that years ago, but in hindsight we always know better :) - Support for generic MSI interrupt domain handling We have more and more non PCI related MSI interrupts, so providing a generic infrastructure for this is better than having all affected architectures implementing their own private hacks. - Support for PCI-MSI interrupt domain handling, based on the generic MSI support. This part carries the pci/msi branch from Bjorn Helgaas pci tree to avoid a massive conflict. The PCI/MSI parts are acked by Bjorn. I have two more branches on top of this. The full conversion of x86 to hierarchical domains and a partial conversion of arm/gic" * 'irq-irqdomain-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) genirq: Move irq_chip_write_msi_msg() helper to core PCI/MSI: Allow an msi_controller to be associated to an irq domain PCI/MSI: Provide mechanism to alloc/free MSI/MSIX interrupt from irqdomain PCI/MSI: Enhance core to support hierarchy irqdomain PCI/MSI: Move cached entry functions to irq core genirq: Provide default callbacks for msi_domain_ops genirq: Introduce msi_domain_alloc/free_irqs() asm-generic: Add msi.h genirq: Add generic msi irq domain support genirq: Introduce callback irq_chip.irq_write_msi_msg genirq: Work around __irq_set_handler vs stacked domains ordering issues irqdomain: Introduce helper function irq_domain_add_hierarchy() irqdomain: Implement a method to automatically call parent domains alloc/free genirq: Introduce helper irq_domain_set_info() to reduce duplicated code genirq: Split out flow handler typedefs into seperate header file genirq: Add IRQ_SET_MASK_OK_DONE to support stacked irqchip genirq: Introduce irq_chip.irq_compose_msi_msg() to support stacked irqchip genirq: Add more helper functions to support stacked irq_chip genirq: Introduce helper functions to support stacked irq_chip irqdomain: Do irq_find_mapping and set_type for hierarchy irqdomain in case OF ...
| * | PCI/MSI: Rename write_msi_msg() to pci_write_msi_msg()Jiang Liu2014-11-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI specific. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | powerpc/powernv: Honor the generic "no_64bit_msi" flagBenjamin Herrenschmidt2014-11-23
| |/ |/| | | | | | | | | | | | | Instead of the arch specific quirk which we are deprecating and that drivers don't understand. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org>
* | Merge branch 'for-linus' of ↵Linus Torvalds2014-10-21
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull more powerpc updates from Michael Ellerman: "Here's some more updates for powerpc for 3.18. They are a bit late I know, though must are actually bug fixes. In my defence I nearly cut the top of my finger off last weekend in a gruesome bike maintenance accident, so I spent a good part of the week waiting around for doctors. True story, I can send photos if you like :) Probably the most interesting fix is the sys_call_table one, which enables syscall tracing for powerpc. There's a fix for HMI handling for old firmware, more endian fixes for firmware interfaces, more EEH fixes, Anton fixed our routine that gets the current stack pointer, and a few other misc bits" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (22 commits) powerpc: Only do dynamic DMA zone limits on platforms that need it powerpc: sync pseries_le_defconfig with pseries_defconfig powerpc: Add printk levels to setup_system output powerpc/vphn: NUMA node code expects big-endian powerpc/msi: Use WARN_ON() in msi bitmap selftests powerpc/msi: Fix the msi bitmap alignment tests powerpc/eeh: Block CFG upon frozen Shiner adapter powerpc/eeh: Don't collect logs on PE with blocked config space powerpc/eeh: Block PCI config access upon frozen PE powerpc/pseries: Drop config requests in EEH accessors powerpc/powernv: Drop config requests in EEH accessors powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED powerpc/eeh: Fix condition for isolated state powerpc/pseries: Make CPU hotplug path endian safe powerpc/pseries: Use dump_stack instead of show_stack powerpc: Rename __get_SP() to current_stack_pointer() powerpc: Reimplement __get_SP() as a function not a define powerpc/numa: Add ability to disable and debug topology updates powerpc/numa: check error return from proc_create powerpc/powernv: Fallback to old HMI handling behavior for old firmware ...
| * powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKEDGavin Shan2014-10-14
| | | | | | | | | | | | | | | | | | | | | | | | The flag EEH_PE_RESET indicates blocking config space of the PE during reset time. We potentially need block PE's config space other than reset time. So it's reasonable to replace it with EEH_PE_CFG_BLOCKED to indicate its usage. There are no substantial code or logic changes in this patch. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Merge branch 'for-linus' of ↵Linus Torvalds2014-10-11
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: "Here's a first pull request for powerpc updates for 3.18. The bulk of the additions are for the "cxl" driver, for IBM's Coherent Accelerator Processor Interface (CAPI). Most of it's in drivers/misc, which Greg & Arnd maintain, Greg said he was happy for us to take it through our tree. There's the usual minor cleanups and fixes, including a bit of noise in drivers from some of those. A bunch of updates to our EEH code, which has been getting more testing. Several nice speedups from Anton, including 20% in clear_page(). And a bunch of updates for freescale from Scott" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (130 commits) cxl: Fix afu_read() not doing finish_wait() on signal or non-blocking cxl: Add documentation for userspace APIs cxl: Add driver to Kbuild and Makefiles cxl: Add userspace header file cxl: Driver code for powernv PCIe based cards for userspace access cxl: Add base builtin support powerpc/mm: Add hooks for cxl powerpc/opal: Add PHB to cxl mode call powerpc/mm: Add new hash_page_mm() powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts cxl: Add new header for call backs and structs powerpc/powernv: Split out set MSI IRQ chip code powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize powerpc/msi: Improve IRQ bitmap allocator powerpc/cell: Make spu_flush_all_slbs() generic powerpc/cell: Move data segment faulting code out of cell platform powerpc/cell: Move spu_handle_mm_fault() out of cell platform powerpc/pseries: Use new defines when calling H_SET_MODE powerpc: Update contact info in Documentation files powerpc/perf/hv-24x7: Simplify catalog_read() ...
| * powerpc/powernv: Override dma_get_required_mask()Gavin Shan2014-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dma_get_required_mask() function is used by some drivers to query the platform about what DMA mask is needed to cover all of memory. This is a bit of a strange semantic when we have to choose between IOMMU translation or bypass, but essentially what it means is "what DMA mask will give best performances". Currently, our IOMMU backend always returns a 32-bit mask here, we don't do anything special to it when we have bypass available. This causes some drivers to choose a 32-bit mask, thus losing the ability to use the bypass window, thinking this is more efficient. The problem was reported from the driver of following device: 0004:03:00.0 0107: 1000:0087 (rev 05) 0004:03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios \ Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05) This patch adds an override of that function in order to, instead, return a 64-bit mask whenever a bypass window is available in order for drivers to prefer this configuration. Reported-by: Murali N. Iyer <mniyer@us.ibm.com> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | PCI/MSI/PPC: Remove arch_msi_check_device()Alexander Gordeev2014-10-01
|/ | | | | | | | | Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs(). This makes the code more compact and allows removing arch_msi_check_device() from generic MSI code. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Handle compound PE in config accessorsGavin Shan2014-08-05
| | | | | | | | | | The PCI config accessors check for PE frozen state and clear it if EEH isn't functional. The patch handles compound PE in config accessors if PHB supports it. For consistency, all PEs will be put into frozen state if any one in compound group gets frozen by hardware. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/eeh: Make diag-data not endian dependentGavin Shan2014-08-05
| | | | | | | | | | It's followup of commit ddf0322a ("powerpc/powernv: Fix endianness problems in EEH"). The patch helps to get non-endian-dependent diag-data. Cc: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Switch powernv drivers to use machine_xxx_initcall()Michael Ellerman2014-07-28
| | | | | | | | | | | | | | | | | A lot of the code in platforms/powernv is using non-machine initcalls. That means if a kernel built with powernv support runs on another platform, for example pseries, the initcalls will still run. That is usually OK, because the initcalls will check for something in the device tree or elsewhere before doing anything, so on other platforms they will usually just return. But it's fishy for powernv code to be running on other platforms, so switch them all to be machine initcalls. If we want any of them to run on other platforms in future they should move to sysdev. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Add a page size parameter to pnv_pci_setup_iommu_table()Alexey Kardashevskiy2014-07-11
| | | | | | | | Since a TCE page size can be other than 4K, make it configurable for P5IOC2 and IODA PHBs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Use it_page_shift in TCE buildAlexey Kardashevskiy2014-07-11
| | | | | | | | This makes use of iommu_table::it_page_shift instead of TCE_SHIFT and TCE_RPN_SHIFT hardcoded values. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Fix endianness problems in EEHGuo Chao2014-06-11
| | | | | | | | | | | | EEH information fetched from OPAL need fix before using in LE environment. To be included in sparse's endian check, declare them as __beXX and access them by accessors. Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/eeh: No hotplug on permanently removed devGavin Shan2014-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The issue was detected in a bit complicated test case where we have multiple hierarchical PEs shown as following figure: +-----------------+ | PE#3 p2p#0 | | p2p#1 | +-----------------+ | +-----------------+ | PE#4 pdev#0 | | pdev#1 | +-----------------+ PE#4 (have 2 PCI devices) is the child of PE#3, which has 2 p2p bridges. We accidentally had less-known scenario: PE#4 was removed permanently from the system because of permanent failure (e.g. exceeding the max allowd failure times in last hour), then we detects EEH errors on PE#3 and tried to recover it. However, eeh_dev instances for pdev#0/1 were not detached from PE#4, which was still connected to PE#3. All of that was because of the fact that we rely on count-based pcibios_release_device(), which isn't reliable enough. When doing recovery for PE#3, we still apply hotplug on PE#4 and pdev#0/1, which are not valid any more. Eventually, we run into kernel crash. The patch fixes above issue from two aspects. For unplug, we simply skip those permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED && !EEH_PE_STATE_RECOVERING) and its frozen count should be greater than EEH_MAX_ALLOWED_FREEZES. For plug, we marked all permanently removed EEH devices with EEH_DEV_REMOVED and return 0xFF's on read its PCI config so that PCI core will omit them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/eeh: Block PCI-CFG access during PE resetGavin Shan2014-04-28
| | | | | | | | | | | | | | | | | | | We've observed multiple PE reset failures because of PCI-CFG access during that period. Potentially, some device drivers can't support EEH very well and they can't put the device to motionless state before PE reset. So those device drivers might produce PCI-CFG accesses during PE reset. Also, we could have PCI-CFG access from user space (e.g. "lspci"). Since access to frozen PE should return 0xFF's, we can block PCI-CFG access during the period of PE reset so that we won't get recrusive EEH errors. The patch adds flag EEH_PE_RESET, which is kept during PE reset. The PowerNV/pSeries PCI-CFG accessors reuse the flag to block PCI-CFG accordingly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Remove fields in PHB diag-data dumpGavin Shan2014-04-28
| | | | | | | | | | | | | | For some fields (e.g. LEM, MMIO, DMA) in PHB diag-data dump, it's meaningless to print them if they have non-zero value in the corresponding mask registers because we always have non-zero values in the mask registers. The patch only prints those fieds if we have non-zero values in the primary registers (e.g. LEM, MMIO, DMA status) so that we can save couple of lines. The patch also removes unnecessary spare line before "brdgCtl:" and two leading spaces as prefix in each line as Ben suggested. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Move PNV_EEH_STATE_ENABLED aroundGavin Shan2014-04-28
| | | | | | | | | | | The flag PNV_EEH_STATE_ENABLED is put into pnv_phb::eeh_state, which is protected by CONFIG_EEH. We needn't that. Instead, we can have pnv_phb::flags and maintain all flags there, which is the purpose of the patch. The patch also renames PNV_EEH_STATE_ENABLED to PNV_PHB_FLAG_EEH. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Refactor PHB diag-data dumpGavin Shan2014-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | As Ben suggested, the patch prints PHB diag-data with multiple fields in one line and omits the line if the fields of that line are all zero. With the patch applied, the PHB3 diag-data dump looks like: PHB3 PHB#3 Diag-data (Version: 1) brdgCtl: 00000002 RootSts: 0000000f 00400000 b0830008 00100147 00002000 nFir: 0000000000000000 0030006e00000000 0000000000000000 PhbSts: 0000001c00000000 0000000000000000 Lem: 0000000000100000 42498e327f502eae 0000000000000000 InAErr: 8000000000000000 8000000000000000 0402030000000000 0000000000000000 PE[ 8] A/B: 8480002b00000000 8000000000000000 [ The current diag data is so big that it overflows the printk buffer pretty quickly in cases when we get a handful of errors at once which can happen. --BenH ] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Add iommu DMA bypass support for IODA2Benjamin Herrenschmidt2014-02-11
| | | | | | | | | | | | | | | This patch adds the support for to create a direct iommu "bypass" window on IODA2 bridges (such as Power8) allowing to bypass iommu page translation completely for 64-bit DMA capable devices, thus significantly improving DMA performances. Additionally, this adds a hook to the struct iommu_table so that the IOMMU API / VFIO can disable the bypass when external ownership is requested, since in that case, the device will be used by an environment such as userspace or a KVM guest which must not be allowed to bypass translations. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/iommu: Update the generic code to use dynamic iommu page sizesAlistair Popple2013-12-29
| | | | | | | | | This patch updates the generic iommu backend code to use the it_page_shift field to determine the iommu page size instead of using hardcoded values. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/iommu: Add it_page_shift field to determine iommu page sizeAlistair Popple2013-12-29
| | | | | | | | This patch adds a it_page_shift field to struct iommu_table and initiliases it to 4K for all platforms. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/iommu: Update constant names to reflect their hardcoded page sizeAlistair Popple2013-12-29
| | | | | | | | | | | The powerpc iommu uses a hardcoded page size of 4K. This patch changes the name of the IOMMU_PAGE_* macros to reflect the hardcoded values. A future patch will use the existing names to support dynamic page sizes. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Move PHB-diag dump functions aroundGavin Shan2013-12-05
| | | | | | | | | | | | | | | Prior to the completion of PCI enumeration, we actively detects EEH errors on PCI config cycles and dump PHB diag-data if necessary. The EEH backend also dumps PHB diag-data in case of frozen PE or fenced PHB. However, we are using different functions to dump the PHB diag-data for those 2 cases. The patch merges the functions for dumping PHB diag-data to one so that we can avoid duplicate code. Also, we never dump PHB3 diag-data during PCI config cycles with frozen PE. The patch fixes it as well. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* PPC: POWERNV: move iommu_add_device earlierAlexey Kardashevskiy2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation of IOMMU on sPAPR does not use iommu_ops and therefore does not call IOMMU API's bus_set_iommu() which 1) sets iommu_ops for a bus 2) registers a bus notifier Instead, PCI devices are added to IOMMU groups from subsys_initcall_sync(tce_iommu_init) which does basically the same thing without using iommu_ops callbacks. However Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158) implements iommu_ops and when tce_iommu_init is called, every PCI device is already added to some group so there is a conflict. This patch does 2 things: 1. removes the loop in which PCI devices were added to groups and adds explicit iommu_add_device() calls to add devices as soon as they get the iommu_table pointer assigned to them. 2. moves a bus notifier to powernv code in order to avoid conflict with the notifier from Freescale driver. iommu_add_device() and iommu_del_device() are public now. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Reserve the correct PE numberGavin Shan2013-11-05
| | | | | | | | | | | | | | | | | | We're assigning PE numbers after the completion of PCI probe. During the PCI probe, we had PE#0 as the super container to encompass all PCI devices. However, that's inappropriate since PELTM has ascending order of priority on search on P7IOC. So we need PE#127 takes the role that PE#0 has previously. For PHB3, we still have PE#0 as the reserved PE. The patch supposes that the underly firmware has built the RID to PE# mapping after resetting IODA tables: all PELTM entries except last one has invalid mapping on P7IOC, but all RTEs have binding to PE#0. The reserved PE# is being exported by firmware by device tree. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'for-kvm' into nextBenjamin Herrenschmidt2013-10-11
|\ | | | | | | | | | | | | | | | | Topic branch for commits that the KVM tree might want to pull in separately. Hand merged a few files due to conflicts with the LE stuff Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: add real mode support for dma operations on powernvAlexey Kardashevskiy2013-10-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing TCE machine calls (tce_build and tce_free) only support virtual mode as they call __raw_writeq for TCE invalidation what fails in real mode. This introduces tce_build_rm and tce_free_rm real mode versions which do mostly the same but use "Store Doubleword Caching Inhibited Indexed" instruction for TCE invalidation. This new feature is going to be utilized by real mode support of VFIO. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix some PCI sparse errors and one LE bugAnton Blanchard2013-10-11
| | | | | | | | | | | | | | | | pnv_pci_setup_bml_iommu was missing a byteswap of a device tree property. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix endian issues in powernv PCI codeBenjamin Herrenschmidt2013-10-11
| | | | | | | | Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Little endian fix for arch/powerpc/platforms/powernv/pci.cAlistair Popple2013-10-11
|/ | | | | Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Use dev-node in PCI config accessorsGavin Shan2013-06-30
| | | | | | | | | | | | | | | Currently, we're using the combo (PCI bus + devfn) in the PCI config accessors and PCI config accessors in EEH depends on them. However, it's not safe to refer the PCI bus which might have been removed during hotplug. So we're using device node in the PCI config accessors and the corresponding backends just reuse them. The patch also fix one potential risk: We possiblly have frozen PE during the early PCI probe time, but we haven't setup the PE mapping yet. So the errors should be counted to PE#0. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Replace variables with flagsGavin Shan2013-06-30
| | | | | | | | | We have 2 fields in "struct pnv_phb" to trace the states. The patch replace the fields with one and introduces flags for that. The patch doesn't impact the logic. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/eeh: Enable EEH check for config accessGavin Shan2013-06-20
| | | | | | | | | | | The patch enables EEH check and let EEH core to process the EEH errors for PowerNV platform while accessing config space. Originally, the implementation already had mechanism to check EEH errors and tried to recover from them. However, we never let EEH core to handle the EEH errors. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/eeh: Sync OPAL API with firmwareGavin Shan2013-06-20
| | | | | | | | | | The patch synchronizes OPAL APIs between kernel and firmware. Also, we starts to replace opal_pci_get_phb_diag_data() with the similar opal_pci_get_phb_diag_data2() and the former OPAL API would return OPAL_UNSUPPORTED from now on. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/vfio: Enable on PowerNV platformAlexey Kardashevskiy2013-06-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This initializes IOMMU groups based on the IOMMU configuration discovered during the PCI scan on POWERNV (POWER non virtualized) platform. The IOMMU groups are to be used later by the VFIO driver, which is used for PCI pass through. It also implements an API for mapping/unmapping pages for guest PCI drivers and providing DMA window properties. This API is going to be used later by QEMU-VFIO to handle h_put_tce hypercalls from the KVM guest. The iommu_put_tce_user_mode() does only a single page mapping as an API for adding many mappings at once is going to be added later. Although this driver has been tested only on the POWERNV platform, it should work on any platform which supports TCE tables. As h_put_tce hypercall is received by the host kernel and processed by the QEMU (what involves calling the host kernel again), performance is not the best - circa 220MB/s on 10Gb ethernet network. To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config option and configure VFIO as required. Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Make radeon 32-bit MSI quirk work on powernvBenjamin Herrenschmidt2013-05-24
| | | | | | | | | This moves the quirk itself to pci_64.c as to get built on all ppc64 platforms (the only ones with a pci_dn), factors the two implementations of get_pdn() into a single pci_get_dn() and use the quirk to do 32-bit MSIs on IODA based powernv platforms. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Fix condition for when to invalidate the TCE cacheBenjamin Herrenschmidt2013-05-24
| | | | | | | | | We use two flags, one to indicate an invalidation is needed after creating a new entry and one to indicate an invalidation is needed after removing an entry. However we were testing the wrong flag in the remove case. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Improve kexec reliabilityBenjamin Herrenschmidt2013-05-10
| | | | | | | | | | We add a machine_shutdown hook that frees the OPAL interrupts (so they get masked at the source and don't fire while kexec'ing) and which triggers an IODA reset on all the PCIe host bridges which will have the effect of blocking all DMAs and subsequent PCIs interrupts. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: TCE invalidation for PHB3Gavin Shan2013-04-26
| | | | | | | | | | | | The TCE should be invalidated while it's created or free'd. The approach to do that for IODA1 and IODA2 compliant PHBs are different. So the patch differentiate them with different functions called to do that for IODA1 and IODA2 compliant PHBs. It's notable that the PCI address is used to invalidate the corresponding TCE on IODA2 compliant PHB3. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Patch MSI EOI handler on P8Gavin Shan2013-04-26
| | | | | | | | | | | | | The EOI handler of MSI/MSI-X interrupts for P8 (PHB3) need additional steps to handle the P/Q bits in IVE before EOIing the corresponding interrupt. The patch changes the EOI handler to cover that. we have individual IRQ chip in each PHB instance. During the MSI IRQ setup time, the IRQ chip is copied over from the original one for that IRQ, and the EOI handler is patched with the one that will handle the P/Q bits (As Ben suggested). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>