Commit message (author, date)
* [SCSI] lpfc 8.3.35: Expand I/O channel support for large systemsJames Smart2012-10-08
| | | | | Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] lpfc 8.3.35: Correct missing queue destroy on function resetJames Smart2012-10-08
| | | | | Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] lpfc 8.3.35: Fix incorrect comment in T10 DIF attributesJames Smart2012-10-08
| | | | | Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] lpfc 8.3.35: Added checking BMBX register for RDY bit before writing the first address in (James Smart, 2012-10-08)
| Signed-off-by: James Smart <james.smart@emulex.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] lpfc 8.3.35: Fix interrupt delay multipler conversion for eq_createJames Smart2012-10-08
| | | | | Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] Documentation: Add lesb/ to path for LESB attributes in FCoE bus documentation (Robert Love, 2012-10-07)
| The Link Error Status Block attributes are incorrectly named: they do not have the lesb_ prefix, but are instead grouped in the lesb/ attribute group.
|
| Signed-off-by: Robert Love <robert.w.love@intel.com>
| Tested-by: Ross Brattain <ross.b.brattain@intel.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] libfc: fix lun reset failure bugs in fc_fcp_resp handling of FCP_RSP_INFO (Yi Zou, 2012-10-07)
| In LUN RESET testing involving NetApp targets, LUN RESET was observed to fail. fc_fcp_resp() does not complete the LUN RESET task because it assumes that FCP_RSP_INFO is 8 bytes long, including the 4-byte reserved field, whereas the FCP_RSP that NetApp targets send for LUN RESET carries only 4 bytes of FCP_RSP_INFO. fc_fcp_resp() therefore errors out without completing the task, and the LUN RESET is eventually escalated to a host reset, which is not very nice.
|
| Per FCP-3 r04, clause 9.5.15 and Table 23, the FCP_RSP_INFO field can be either 4 bytes or 8 bytes, with the last 4 bytes as "Reserved (if any)". A 4-byte FCP_RSP_INFO, as sent by some NetApp targets, is therefore valid.
|
| Fix this by validating FCP_RSP_INFO against both spec-allowed lengths.
|
| Reported-by: Frank Zhang <frank_1.zhang@intel.com>
| Signed-off-by: Yi Zou <yi.zou@intel.com>
| Tested-by: Ross Brattain <ross.b.brattain@intel.com>
| Signed-off-by: Robert Love <robert.w.love@intel.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
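The length check described in the libfc commit above can be sketched roughly as follows. This is an illustrative Python model, not the libfc code: `parse_rsp_info` and `VALID_RSP_INFO_LENGTHS` are hypothetical names, and the layout (three reserved bytes followed by the response code) is assumed from the FCP-3 table the commit cites.

```python
# Hypothetical sketch; names are illustrative and do not come from libfc.
VALID_RSP_INFO_LENGTHS = (4, 8)  # 4 bytes, or 8 with the trailing reserved field


def parse_rsp_info(rsp_info: bytes) -> int:
    """Return the response code carried in an FCP_RSP_INFO field.

    Assumed layout: bytes 0-2 reserved, byte 3 response code,
    bytes 4-7 reserved (present only in the 8-byte form).
    """
    if len(rsp_info) not in VALID_RSP_INFO_LENGTHS:
        raise ValueError(f"invalid FCP_RSP_INFO length: {len(rsp_info)}")
    return rsp_info[3]
```

Accepting both lengths lets a 4-byte response complete the task instead of being treated as malformed.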
* [SCSI] fcoe: Fix write errors on NPIV ports (Neerav Parikh, 2012-10-07)
| SCSI errors were generated while writing to LUNs connected via NPIV ports. Debugging showed that the FCoE packets transmitted via the NPIV ports were not tagged with the correct user priority as negotiated with the peer by the DCB agent. As a result, FCoE traffic went out with priority zero (0), which did not have priority flow control (PFC) enabled.
|
| After transferring data to the target, the initiator never saw any reply indicating that the transfer was complete. This led to error recovery (ABTS) and SCSI command retries by the SCSI mid-layer, and eventually to I/O errors.
|
| This patch fixes the issue by keeping the FCoE user priority information in the fcoe_interface instance, which is common to the physical port and the NPIV ports connected to it, instead of storing it in the fcoe_port structure, which has one instance per port.
|
| Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
| Acked-by: Yi Zou <yi.zou@intel.com>
| Acked-by: John Fastabend <john.r.fastabend@intel.com>
| Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
| Signed-off-by: Robert Love <robert.w.love@intel.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
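The design change in the fcoe commit above (one shared priority instead of a per-port copy) can be modelled in a few lines. This is a simplified Python sketch under stated assumptions: `FcoeInterface` and `FcoePort` are hypothetical stand-ins for the kernel's fcoe_interface and fcoe_port structures, and the field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FcoeInterface:
    """Hypothetical stand-in for fcoe_interface: one per physical interface."""
    priority: int = 0  # user priority negotiated by the DCB agent


@dataclass
class FcoePort:
    """Hypothetical stand-in for fcoe_port: one per physical or NPIV port."""
    name: str
    interface: FcoeInterface

    def tx_priority(self) -> int:
        # Every port reads the shared value, so an update made once on the
        # interface is seen by the physical port and all NPIV ports alike.
        return self.interface.priority
```

With a per-port copy, an update applied only to the physical port would leave the NPIV ports at the default priority, which is the failure mode the commit describes.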
* [SCSI] mvumi: Add support for Marvell SAS/SATA RAID-on-Chip(ROC) 88RC9580Shun Fu2012-10-07
| | | | | | [jejb: fix up for spelling correction patch] Signed-off-by: Shun Fu <fushun@marvell.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Update the driver version to 3.1.2.1Krishna Gudipati2012-10-07
| | | | | Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Firmware image naming convention update (Krishna Gudipati, 2012-10-07)
| - Modified the firmware naming convention to contain the firmware image version (3.1.0.0).
| - The new convention is <firmware-image>-<firmware-version>.bin
| - The change enforces loading only firmware that is compatible with this driver. It also avoids overwriting the old firmware image in order to load a new driver version, which was necessary while the firmware names stayed the same.
|
| Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
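The naming convention above is simple enough to state as a one-line helper. This is only an illustration in Python: `firmware_filename` is a hypothetical function and `example_fw` is a made-up image name, not one of the driver's real firmware images.

```python
def firmware_filename(image: str, version: str) -> str:
    """Build a name of the form <firmware-image>-<firmware-version>.bin."""
    return f"{image}-{version}.bin"


# Hypothetical image name; two versions now map to two distinct files.
old_name = firmware_filename("example_fw", "3.0.0.0")
new_name = firmware_filename("example_fw", "3.1.0.0")
```

Because the version is part of the file name, images for different driver versions can be installed side by side.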
* [SCSI] bfa: Add support to read/update the FRU data.Krishna Gudipati2012-10-07
| | | | | | | | | - Add FRU sub-module to support FRU read/write/update. - Add support to read/write from the temp FRU module. [jejb: fix checkpatch issues] Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Support Power on Hours display and diag temp sensor fixesKrishna Gudipati2012-10-07
| | | | | | | | - Add Power On Hours display support during sfpshow - Fix to properly set the diag temperature sensor status variable. Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Add support to configure min/max bandwidth for a pcifn (Krishna Gudipati, 2012-10-07)
| - Added support to configure minimum bandwidth for a pcifn.
| - Minimum bandwidth is guaranteed at per queue level.
| - Added support to update pcifn bandwidth dynamically without a server reboot.
|
| Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Add support for IO throttling at port levelKrishna Gudipati2012-10-07
| | | | | | | | | Add capability to limit the number of exchanges on a port to avoid queue-full conditions from the target side. Signed-off-by: Sudarsana Reddy Kalluru <skalluru@brocade.com> Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Add support for user to configure bandwidth on QoS prioritiesKrishna Gudipati2012-10-07
| | | | | | | | | Made changes to provide an option for user to configure the bandwidth percentage for High/Medium/Low QoS priorities. Signed-off-by: Sudarsana Reddy Kalluru <skalluru@brocade.com> Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Fabric Assigned Address implementation fix (Krishna Gudipati, 2012-10-07)
| - Made changes such that once the PWWN is acquired from the fabric through FAA, and the FAPWWN configuration is then modified on the switch side, the driver shows the relevant information to the user.
| - Added logic to cache the reason code when the given port is disabled implicitly due to an FAA error condition.
| - If the port is disabled, update the reason code appropriately while sending the SCN to the upper layer. With this, the BFA FC port state machine enters the faa_err_config state, which is shown to the user.
|
| Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Add diagnostic port (D-Port) supportKrishna Gudipati2012-10-07
| | | | | | | | | | | | | | - Introduced support for D-Port which is a new port mode during which link level diagnostics can be run. - Provided mechanism to dynamically configure D-Port and initiate diagnostic tests to isolate any link level issues. - In D-Port mode, the HBA port does not participate in fabric or login to the remote device or run data traffic. - Diagnostic tests include running various loopback tests in conjunction with the attached device. Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Fix to handle firmware tskim abort request responseKrishna Gudipati2012-10-07
| | | | | | | | | | - Enhance tracing to include both tskim tag and event. - Handle the tskim abort response from firmware in the tskim state machine cleanup state and proceed with the tskim cleanup. Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Fix few attributes in the RHBA CT passthru command (Krishna Gudipati, 2012-10-07)
| - Made changes to set the RHBA command max payload based on the port configured frame size.
| - Made changes to fix the driver/fw version size in the FDMI structure.
| - Fix to pass the fw version for FDMI attribute type FDMI_HBA_ATTRIB_FW_VERSION rather than the driver version.
|
| Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Add support to have mfg date as part of adapter attributesKrishna Gudipati2012-10-07
| | | | | | | | | Made changes to expose mfg day/month/year as part of the adapter attributes for user space applications. Signed-off-by: Vijaya Mohan Guvva <vmohan@brocade.com> Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Flash Controller PLL initialization fixesKrishna Gudipati2012-10-07
| | | | | | | | | | - Made changes to check the flash controller status before IOC initialization. - Made changes to poll on the FLASH_STS_REG bit to check if the flash controller initialization is completed during the PLL init. Signed-off-by: Vijaya Mohan Guvva <vmohan@brocade.com> Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: IOCFC state machine enhancementsKrishna Gudipati2012-10-07
| | | | | | | | | | - Add support to handle STOP/DISABLE events in the IOCFC state machine. - Made changes to bring the IOC down on a flash driver config read failure. - Added logic to clean the use count and fail sync registers during IOCFC init. Signed-off-by: Vijaya Mohan Guvva <vmohan@brocade.com> Signed-off-by: Krishna Gudipati <kgudipat@brocade.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* [SCSI] bfa: Add support for FC Arbitrated Loop topology. (Krishna Gudipati, 2012-10-07)
| - Add private loop topology support at 2G/4G/8G speeds with the following limitations:
|   1. No support for multiple initiators in the loop.
|   2. No public loop support. If attached to a loop with an FL_Port, the device continues to work as a private NL_Port in the loop.
|   3. No auto topology detection. The user has to manually set the configured topology to loop if attaching to a loop.
| - When loop topology is configured, enabling the FC port features QoS/Trunk/TRL is not allowed, and vice versa.
|
| Signed-off-by: Vijaya Mohan Guvva <vmohan@brocade.com>
| Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
| Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* Merge tag 'stable/for-linus-3.7-x86-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen (Linus Torvalds, 2012-10-03)
|\
| | Pull Xen update from Konrad Rzeszutek Wilk:
| |  "Features:
| |    - When hotplugging PCI devices in a PV guest we can allocate Xen-SWIOTLB later.
| |    - Cleanup Xen SWIOTLB.
| |    - Support paged-out grants from HVM domains in the backends.
| |    - Support wild cards in xen-pciback.hide=(BDF) arguments.
| |    - Update grant status updates with upstream hypervisor.
| |    - Boot PV guests with more than 128GB.
| |    - Cleanup Xen MMU code/add comments.
| |    - Obtain XENVERS using a preferred method.
| |    - Lay out generic changes to support Xen ARM.
| |    - Allow privcmd ioctl for HVM (used to do only PV).
| |    - Do v2 of mmap_batch for privcmd ioctls.
| |    - If hypervisor saves the LED keyboard light - we will now instruct the kernel about its state.
| |   Fixes:
| |    - More fixes to Xen PCI backend for various calls/FLR/etc.
| |    - With more than 4GB in a 64-bit PV guest disable native SWIOTLB.
| |    - Fix up smatch warnings.
| |    - Fix up various return values in privcmd and mm."
| |
| | * tag 'stable/for-linus-3.7-x86-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (48 commits)
| |   xen/pciback: Restore the PCI config space after an FLR.
| |   xen-pciback: properly clean up after calling pcistub_device_find()
| |   xen/vga: add the xen EFI video mode support
| |   xen/x86: retrieve keyboard shift status flags from hypervisor.
| |   xen/gndev: Xen backend support for paged out grant targets V4.
| |   xen-pciback: support wild cards in slot specifications
| |   xen/swiotlb: Fix compile warnings when using plain integer instead of NULL pointer.
| |   xen/swiotlb: Remove functions not needed anymore.
| |   xen/pcifront: Use Xen-SWIOTLB when initting if required.
| |   xen/swiotlb: For early initialization, return zero on success.
| |   xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used.
| |   xen/swiotlb: Move the error strings to its own function.
| |   xen/swiotlb: Move the nr_tbl determination in its own function.
| |   xen/arm: compile and run xenbus
| |   xen: resynchronise grant table status codes with upstream
| |   xen/privcmd: return -EFAULT on error
| |   xen/privcmd: Fix mmap batch ioctl error status copy back.
| |   xen/privcmd: add PRIVCMD_MMAPBATCH_V2 ioctl
| |   xen/mm: return more precise error from xen_remap_domain_range()
| |   xen/mmu: If the revector fails, don't attempt to revector anything else.
| |   ...
| * xen/pciback: Restore the PCI config space after an FLR. (Konrad Rzeszutek Wilk, 2012-09-25)
| | When we do an FLR, or D0->D3_hot, we may lose the BARs as the device has turned itself off (and on). This means the device cannot function unless pci_restore_state is called, which happens when the PCI device is unbound from the Xen PCI backend driver. For PV guests it ends up calling pci_enable_device / pci_enable_msi[x], which does the proper steps.
| |
| | That, however, does not happen if an HVM guest is run, as QEMU deals with the PCI configuration space. QEMU also requires that the device be "parked" under the ownership of a pci-stub driver to guarantee that the PCI device is not being used. Hence we follow the same incantation as pci_reset_function does: do an FLR, then restore the PCI configuration space.
| |
| | The result of this patch is that when you run lspci, you now get this:
| |
| | - Region 0: [virtual] Memory at fe8c0000 (32-bit, non-prefetchable) [size=128K]
| | - Region 1: [virtual] Memory at fe800000 (32-bit, non-prefetchable) [size=512K]
| | + Region 0: Memory at fe8c0000 (32-bit, non-prefetchable) [size=128K]
| | + Region 1: Memory at fe800000 (32-bit, non-prefetchable) [size=512K]
| |   Region 2: I/O ports at c000 [size=32]
| | - Region 3: [virtual] Memory at fe8e0000 (32-bit, non-prefetchable) [size=16K]
| | + Region 3: Memory at fe8e0000 (32-bit, non-prefetchable) [size=16K]
| |
| | The [virtual] means that lspci read those entries from SysFS, but when it read them from the device it got a different value (0xfffffff).
| |
| | CC: stable@vger.kernel.org #only for 3.5, 3.6
| | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen-pciback: properly clean up after calling pcistub_device_find() (Jan Beulich, 2012-09-25)
| | As the function calls pcistub_device_get() before returning non-NULL, its callers need to take care of calling pcistub_device_put() on (mostly, but not exclusively) error paths.
| |
| | On the other hand, the function already guarantees that the 'dev' member is non-NULL upon successful return, so callers do not need to check for this a second time.
| |
| | Signed-off-by: Jan Beulich <jbeulich@suse.com>
| | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
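The get/put discipline described in the xen-pciback commit above can be illustrated with a small reference-counting model. This is a Python sketch, not the driver code: `StubDevice`, `find` and `use_device` are hypothetical names, and for simplicity the caller here always drops its reference, whereas the commit notes that real callers do so mostly (not only) on error paths.

```python
from typing import Dict, Optional


class StubDevice:
    """Hypothetical refcounted object standing in for a pcistub device."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.refs = 1  # reference held by the registry

    def get(self) -> "StubDevice":
        self.refs += 1
        return self

    def put(self) -> None:
        assert self.refs > 0, "put() without matching get()"
        self.refs -= 1


def find(registry: Dict[str, StubDevice], name: str) -> Optional[StubDevice]:
    """Look up a device; a non-None result carries an extra reference."""
    dev = registry.get(name)
    return dev.get() if dev is not None else None


def use_device(registry: Dict[str, StubDevice], name: str) -> bool:
    """Caller side: every successful find() is balanced by a put()."""
    dev = find(registry, name)
    if dev is None:
        return False
    try:
        return True  # real work with dev would go here
    finally:
        dev.put()  # drop the reference that find() took
```

If the `put()` were missing, each lookup would leak one reference and the device could never be released.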
| * xen/vga: add the xen EFI video mode support (Jan Beulich, 2012-09-24)
| | To add Xen EFI framebuffer video support, the new xen-efi video type (XEN_VGATYPE_EFI_LFB) must be handled as a case in the function xen_init_vga, which sets the video type to VIDEO_TYPE_EFI to enable EFI video mode.
| |
| | The original patch from which this was broken out: http://marc.info/?i=4E099AA6020000780004A4C6@nat28.tlf.novell.com
| |
| | Signed-off-by: Jan Beulich <JBeulich@novell.com>
| | Signed-off-by: Tang Liang <liang.tang@oracle.com>
| | [v2: The original author is Jan Beulich and Liang Tang ported it to upstream]
| | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/x86: retrieve keyboard shift status flags from hypervisor.Konrad Rzeszutek Wilk2012-09-24
| | | | | | | | | | | | | | | | | | The xen c/s 25873 allows the hypervisor to retrieve the NUMLOCK flag. With this patch, the Linux kernel can get the state according to the data in the BIOS. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * Merge branch 'stable/late-swiotlb.v3.3' into stable/for-linus-3.7Konrad Rzeszutek Wilk2012-09-22
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stable/late-swiotlb.v3.3: xen/swiotlb: Fix compile warnings when using plain integer instead of NULL pointer. xen/swiotlb: Remove functions not needed anymore. xen/pcifront: Use Xen-SWIOTLB when initting if required. xen/swiotlb: For early initialization, return zero on success. xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used. xen/swiotlb: Move the error strings to its own function. xen/swiotlb: Move the nr_tbl determination in its own function. swiotlb: add the late swiotlb initialization function with iotlb memory xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB. xen/swiotlb: Simplify the logic. Conflicts: arch/x86/xen/pci-swiotlb-xen.c Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/swiotlb: Fix compile warnings when using plain integer instead of NULL pointer. (Konrad Rzeszutek Wilk, 2012-09-17)
| | | arch/x86/xen/pci-swiotlb-xen.c:96:1: warning: Using plain integer as NULL pointer
| | | arch/x86/xen/pci-swiotlb-xen.c:96:1: warning: Using plain integer as NULL pointer
| | |
| | | Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/swiotlb: Remove functions not needed anymore. (Konrad Rzeszutek Wilk, 2012-09-17)
| | | Sparse warns us of:
| | |
| | | drivers/xen/swiotlb-xen.c:506:1: warning: symbol 'xen_swiotlb_map_sg' was not declared. Should it be static?
| | | drivers/xen/swiotlb-xen.c:534:1: warning: symbol 'xen_swiotlb_unmap_sg' was not declared. Should it be static?
| | |
| | | and it looks like we do not need these functions at all.
| | |
| | | Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/pcifront: Use Xen-SWIOTLB when initting if required. (Konrad Rzeszutek Wilk, 2012-09-17)
| | | We piggyback on the "xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used." functionality to start up the Xen-SWIOTLB if we are hot-plugged. This allows us to bypass the need to supply 'iommu=soft' on the Linux command line (mostly). With this patch, if a user forgets 'iommu=soft' on the command line and hotplugs a PCI device, they will get:
| | |
| | | pcifront pci-0: Installing PCI frontend
| | | Warning: only able to allocate 4 MB for software IO TLB
| | | software IO TLB [mem 0x2a000000-0x2a3fffff] (4MB) mapped at [ffff88002a000000-ffff88002a3fffff]
| | | pcifront pci-0: Creating PCI Frontend Bus 0000:00
| | | pcifront pci-0: PCI host bridge to bus 0000:00
| | | pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
| | | pci_bus 0000:00: root bus resource [mem 0x00000000-0xfffffffff]
| | | pci 0000:00:00.0: [8086:10d3] type 00 class 0x020000
| | | pci 0000:00:00.0: reg 10: [mem 0xfe5c0000-0xfe5dffff]
| | | pci 0000:00:00.0: reg 14: [mem 0xfe500000-0xfe57ffff]
| | | pci 0000:00:00.0: reg 18: [io 0xe000-0xe01f]
| | | pci 0000:00:00.0: reg 1c: [mem 0xfe5e0000-0xfe5e3fff]
| | | pcifront pci-0: claiming resource 0000:00:00.0/0
| | | pcifront pci-0: claiming resource 0000:00:00.0/1
| | | pcifront pci-0: claiming resource 0000:00:00.0/2
| | | pcifront pci-0: claiming resource 0000:00:00.0/3
| | | e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k
| | | e1000e: Copyright(c) 1999 - 2012 Intel Corporation.
| | | e1000e 0000:00:00.0: Disabling ASPM L0s L1
| | | e1000e 0000:00:00.0: enabling device (0000 -> 0002)
| | | e1000e 0000:00:00.0: Xen PCI mapped GSI16 to IRQ34
| | | e1000e 0000:00:00.0: (unregistered net_device): Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
| | | e1000e 0000:00:00.0: eth0: (PCI Express:2.5GT/s:Width x1) 00:1b:21:ab:c6:13
| | | e1000e 0000:00:00.0: eth0: Intel(R) PRO/1000 Network Connection
| | | e1000e 0000:00:00.0: eth0: MAC: 3, PHY: 8, PBA No: E46981-005
| | |
| | | The "Warning: only able to allocate" message will go away if one supplies 'iommu=soft' instead, as we then have a higher chance of being able to allocate large swaths of memory.
| | |
| | | Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/swiotlb: For early initialization, return zero on success. (Konrad Rzeszutek Wilk, 2012-09-17)
| | | Even if everything is set up properly we would return -ENOMEM, since rc is set to that value by default. Let's not do that and return a proper return code.
| | |
| | | Note: The reason the early code needs this special treatment is that the SWIOTLB library call does not return anything (had it failed, it would have called panic()), but our function does.
| | |
| | | Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used. (Konrad Rzeszutek Wilk, 2012-09-17)
| | | With this patch we provide the functionality to initialize the Xen-SWIOTLB late in the bootup cycle - specifically for Xen PCI-frontend. We still will work if the user had supplied 'iommu=soft' on the Linux command line.
| | |
| | | Note: We cannot depend on after_bootmem to automatically determine whether this is early or not. This is because when PCI IOMMUs are initialized it is after after_bootmem but before a lot of "other" subsystems are initialized.
| | |
| | | CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
| | | [v1: Fix smatch warnings]
| | | [v2: Added check for xen_swiotlb]
| | | [v3: Rebased with new xen-swiotlb changes]
| | | [v4: squashed xen/swiotlb: Depending on after_bootmem is not correct in]
| | | Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/swiotlb: Move the error strings to its own function.Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | That way we can more easily reuse those errors when using the late SWIOTLB init. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/swiotlb: Move the nr_tbl determination in its own function.Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | Moving the function out of the way to prepare for the late SWIOTLB init. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * swiotlb: add the late swiotlb initialization function with iotlb memoryKonrad Rzeszutek Wilk2012-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables the caller to initialize swiotlb with its own iotlb memory late in the bootup. See git commit eb605a5754d050a25a9f00d718fb173f24c486ef "swiotlb: add swiotlb_tbl_map_single library function" which will explain the full details of what it can be used for. CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [v1: Fold in smatch warning] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.Konrad Rzeszutek Wilk2012-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a PV guest is booted the native SWIOTLB should not be turned on. It does not help us (we don't have any PCI devices) and it eats 64MB of good memory. In the case of PV guests with PCI devices we need the Xen-SWIOTLB one. [v1: Rewrite comment per Stefano's suggestion] Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * xen/swiotlb: Simplify the logic. (Konrad Rzeszutek Wilk, 2012-08-21)
| | | It's pretty easy:
| | |  1) We only check to see if we need Xen SWIOTLB for PV guests.
| | |  2) If swiotlb=force or iommu=soft is set, then Xen SWIOTLB will be enabled.
| | |  3) If it is an initial domain, then Xen SWIOTLB will be enabled.
| | |  4) Native SWIOTLB must be disabled for PV guests.
| | |
| | | Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
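The four rules in the commit above can be written out as a small decision function. This is a Python sketch of the rules as stated in the message, not a transcription of the kernel's detection routine; `swiotlb_config` and its parameter names are hypothetical, and treating non-PV guests as "native SWIOTLB left alone" is an assumption of this sketch.

```python
from typing import Tuple


def swiotlb_config(pv_guest: bool, initial_domain: bool,
                   swiotlb_force: bool, iommu_soft: bool) -> Tuple[bool, bool]:
    """Return (use_xen_swiotlb, native_swiotlb_allowed)."""
    if not pv_guest:
        # Rule 1: the Xen SWIOTLB check only applies to PV guests.
        return (False, True)
    # Rules 2 and 3: forced by command-line option, or initial domain.
    use_xen = swiotlb_force or iommu_soft or initial_domain
    # Rule 4: native SWIOTLB is always off for PV guests.
    return (use_xen, False)
```

A plain PV guest with no options therefore ends up with neither SWIOTLB, which is the memory saving the related "more than 4GB" commit is after.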
| * | xen/gndev: Xen backend support for paged out grant targets V4. (Andres Lagar-Cavilla, 2012-09-21)
| | | Since Xen-4.2, hvm domains may have portions of their memory paged out. When a foreign domain (such as dom0) attempts to map these frames, the map will initially fail. The hypervisor returns a suitable errno, and kicks an asynchronous page-in operation carried out by a helper. The foreign domain is expected to retry the mapping operation until it eventually succeeds. The foreign domain is not put to sleep because it could itself be the one running the pager assist (typical scenario for dom0).
| | |
| | | This patch adds support for this mechanism for backend drivers using grant mapping and copying operations. Specifically, this covers the blkback and gntdev drivers (which map foreign grants), and the netback driver (which copies foreign grants).
| | |
| | | * Add a retry method for grants that fail with GNTST_eagain (i.e. because the target foreign frame is paged out).
| | | * Insert hooks with appropriate wrappers in the aforementioned drivers.
| | |
| | | The retry loop is only invoked if the grant operation status is GNTST_eagain. It guarantees to leave a new status code different from GNTST_eagain. Any other status code results in identical code execution as before.
| | |
| | | The retry loop performs 256 attempts with increasing time intervals through a 32 second period. It uses msleep to yield while waiting for the next retry.
| | |
| | | V2 after feedback from David Vrabel:
| | | * Explicit MAX_DELAY instead of wrap-around delay into zero
| | | * Abstract GNTST_eagain check into core grant table code for netback module.
| | |
| | | V3 after feedback from Ian Campbell:
| | | * Add placeholder in array of grant table error descriptions for unrelated error code we jump over.
| | | * Eliminate single map and retry macro in favor of a generic batch flavor.
| | | * Some renaming.
| | | * Bury most implementation in grant_table.c, cleaner interface.
| | |
| | | V4 rebased on top of sync of Xen grant table interface headers.
| | |
| | | Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
| | | Acked-by: Ian Campbell <ian.campbell@citrix.com>
| | | [v5: Fixed whitespace issues]
| | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
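The retry behaviour described in the commit above (retry only on "eagain", back off with growing sleeps, never hand "eagain" back to the caller) can be sketched as follows. This is a Python model under stated assumptions, not the kernel implementation: the `Status` enum, `retry_eagain`, the 1 ms step, and the choice of a generic `ERROR` as the final status are all hypothetical.

```python
from enum import Enum, auto
from typing import Callable


class Status(Enum):
    """Placeholder statuses; the real GNTST_* values are not modelled."""
    OKAY = auto()
    EAGAIN = auto()
    ERROR = auto()


MAX_ATTEMPTS = 256  # total attempts, as stated in the commit message


def retry_eagain(op: Callable[[], Status],
                 sleep_ms: Callable[[int], None]) -> Status:
    """Run op(); while it reports EAGAIN, sleep a bit longer and retry."""
    status = op()
    attempts = 1
    delay_ms = 1  # assumed starting delay and step
    while status is Status.EAGAIN and attempts < MAX_ATTEMPTS:
        sleep_ms(delay_ms)  # yield while the page-in proceeds
        delay_ms += 1       # increasing interval between attempts
        status = op()
        attempts += 1
    if status is Status.EAGAIN:
        status = Status.ERROR  # guarantee: never return EAGAIN
    return status
```

With these assumed parameters the worst case sleeps 1 + 2 + ... + 255 = 32,640 ms, which is consistent with the roughly 32-second window mentioned in the message. In real use `sleep_ms` would wrap something like `time.sleep(ms / 1000)`; passing it in keeps the sketch testable.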
| * | xen-pciback: support wild cards in slot specifications (Jan Beulich, 2012-09-18)
| | | Particularly for hiding sets of SR-IOV devices, specifying them all individually is rather cumbersome. Therefore, allow function and slot numbers to be replaced by a wildcard character ('*').
| | |
| | | Unfortunately this gets complicated by the in-kernel sscanf() implementation not being really standard conformant - matching of plain text tails cannot be checked by the caller (a patch to overcome this will be sent shortly, and a follow-up patch for simplifying the code is planned to be sent once that fix has gone upstream).
| | |
| | | Signed-off-by: Jan Beulich <jbeulich@suse.com>
| | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
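The wildcard idea in the commit above can be illustrated with a small parser and matcher. This is a Python sketch only: the `dddd:bb:ss.f` spelling, the `parse_slot` and `slot_matches` helpers, and the decision to accept any combination of wildcards are assumptions made for illustration and may differ from what the driver actually accepts.

```python
import re
from typing import Optional, Tuple

# A slot or function of '*' is a wildcard; it is represented as None.
Slot = Tuple[int, int, Optional[int], Optional[int]]

_SLOT_RE = re.compile(
    r"^([0-9a-fA-F]{1,4}):([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2}|\*)\.([0-7]|\*)$"
)


def parse_slot(spec: str) -> Slot:
    """Parse 'domain:bus:slot.func', where slot and func may be '*'."""
    m = _SLOT_RE.match(spec)
    if m is None:
        raise ValueError(f"bad slot specification: {spec!r}")
    dom, bus, slot, func = m.groups()
    slot_val = None if slot == "*" else int(slot, 16)
    if slot_val is not None and slot_val > 0x1F:
        raise ValueError(f"slot out of range: {spec!r}")
    func_val = None if func == "*" else int(func, 16)
    return (int(dom, 16), int(bus, 16), slot_val, func_val)


def slot_matches(pattern: Slot, device: Tuple[int, int, int, int]) -> bool:
    """True if every non-wildcard field of pattern equals the device's."""
    return all(p is None or p == d for p, d in zip(pattern, device))
```

Anchoring the regular expression at both ends sidesteps the trailing-text problem the commit mentions for the in-kernel sscanf().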
| * | xen/arm: compile and run xenbusStefano Stabellini2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bind_evtchn_to_irqhandler can legitimately return 0 (irq 0): it is not an error. If Linux is running as an HVM domain and is running as Dom0, use xenstored_local_init to initialize the xenstore page and event channel. Changes in v4: - do not xs_reset_watches on dom0. Changes in v2: - refactor xenbus_init. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> [v5: Fixed case switch indentations] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen: resynchronise grant table status codes with upstreamIan Campbell2012-09-14
| | | | | | | | | | | | | | | | | | | | | Adds GNTST_address_too_big and GNTST_eagain. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | Merge branch 'stable/128gb.v5.1' into stable/for-linus-3.7Konrad Rzeszutek Wilk2012-09-12
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stable/128gb.v5.1: xen/mmu: If the revector fails, don't attempt to revector anything else. xen/p2m: When revectoring deal with holes in the P2M array. xen/mmu: Release just the MFN list, not MFN list and part of pagetables. xen/mmu: Remove from __ka space PMD entries for pagetables. xen/mmu: Copy and revector the P2M tree. xen/p2m: Add logic to revector a P2M tree to use __va leafs. xen/mmu: Recycle the Xen provided L4, L3, and L2 pages xen/mmu: For 64-bit do not call xen_map_identity_early xen/mmu: use copy_page instead of memcpy. xen/mmu: Provide comments describing the _ka and _va aliasing issue xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything. Revert "xen/x86: Workaround 64-bit hypervisor and 32-bit initial domain." and "xen/x86: Use memblock_reserve for sensitive areas." xen/x86: Workaround 64-bit hypervisor and 32-bit initial domain. xen/x86: Use memblock_reserve for sensitive areas. xen/p2m: Fix the comment describing the P2M tree. Conflicts: arch/x86/xen/mmu.c The pagetable_init is the old xen_pagetable_setup_done and xen_pagetable_setup_start rolled in one. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * | xen/mmu: If the revector fails, don't attempt to revector anything else.Konrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the P2M revectoring would fail, we would try to continue on by cleaning the PMD for L1 (PTE) page-tables. The xen_cleanhighmap is greedy and erases the PMD on both boundaries. Since the P2M array can share the PMD, we would wipe out part of the __ka that is still used in the P2M tree to point to P2M leafs. This fixes it by bypassing the revectoring and continuing on. If the revector fails, a nice WARN is printed so we can still troubleshoot this. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * | xen/p2m: When revectoring deal with holes in the P2M array.Konrad Rzeszutek Wilk2012-08-23
| | | | When we free the PFNs and then subsequently populate them back during bootup:
| | | |
| | | |   Freeing 20000-20200 pfn range: 512 pages freed
| | | |   1-1 mapping on 20000->20200
| | | |   Freeing 40000-40200 pfn range: 512 pages freed
| | | |   1-1 mapping on 40000->40200
| | | |   Freeing bad80-badf4 pfn range: 116 pages freed
| | | |   1-1 mapping on bad80->badf4
| | | |   Freeing badf6-bae7f pfn range: 137 pages freed
| | | |   1-1 mapping on badf6->bae7f
| | | |   Freeing bb000-100000 pfn range: 282624 pages freed
| | | |   1-1 mapping on bb000->100000
| | | |   Released 283999 pages of unused memory
| | | |   Set 283999 page(s) to 1-1 mapping
| | | |   Populating 1acb8a-1f20e9 pfn range: 283999 pages added
| | | |
| | | | We end up having the P2M array (that is the one that was grafted on the P2M tree) filled with (IDENTITY_FRAME or INVALID_P2M_ENTRY) entries.
| | | | The patch titled "xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID." recycles said slots: it replaces the P2M tree leafs that point to &mfn_list[xx] with p2m_identity or p2m_missing, and re-uses the P2M array sections for other P2M tree leafs.
| | | |
| | | | For the above mentioned bootup excerpt, the PFNs at 0x20000->0x20200 are going to be IDENTITY based:
| | | |
| | | |   P2M[0][256][0] -> P2M[0][257][0] get turned into IDENTITY_FRAME.
| | | |
| | | | We can re-use that and replace P2M[0][256] to point to p2m_identity.
| | | | The "old" page (the grafted P2M array provided by Xen) that was at P2M[0][256] gets put somewhere else.
| | | | Specifically at P2M[6][358], b/c when we populate back:
| | | |
| | | |   Populating 1acb8a-1f20e9 pfn range: 283999 pages added
| | | |
| | | | we fill P2M[6][358][0] (and P2M[6][358], P2M[6][359], ...) with the new MFNs.
| | | |
| | | | That is all OK, except when we revector we assume that the PFN count would be the same in the grafted P2M array and in the newly allocated one.
| | | | Since that is no longer the case, as we have holes in the P2M that point to p2m_missing or p2m_identity, we have to take that into account.
| | | |
| | | | [v2: Check for overflow]
| | | | [v3: Move within the __va check]
| | | | [v4: Fix the computation]
| | | |
| | | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
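The P2M[top][mid][leaf] positions quoted in the message above follow from plain index arithmetic on the PFN. The sketch below models that arithmetic in Python; it is not kernel code. It assumes x86-64 with 4 KiB pages and 8-byte entries, so each level of the three-level tree holds 512 slots, and the names `ENTRIES_PER_PAGE` and `p2m_indices` are illustrative only.

```python
# Assumption: 4 KiB pages and 8-byte entries, giving 512 slots per level.
# ENTRIES_PER_PAGE and p2m_indices are illustrative names, not kernel symbols.
ENTRIES_PER_PAGE = 4096 // 8


def p2m_indices(pfn):
    """Split a PFN into (top, mid, leaf) indices of a three-level P2M tree."""
    leaf = pfn % ENTRIES_PER_PAGE
    mid = (pfn // ENTRIES_PER_PAGE) % ENTRIES_PER_PAGE
    top = pfn // (ENTRIES_PER_PAGE * ENTRIES_PER_PAGE)
    return top, mid, leaf


if __name__ == "__main__":
    # PFNs taken from the bootup excerpt in the commit message.
    for pfn in (0x20000, 0x20200, 0x1ACB8A):
        print(hex(pfn), p2m_indices(pfn))
```

Under these assumptions 0x20000 and 0x20200 land on P2M[0][256][0] and P2M[0][257][0], matching the message. The populated range starts partway into leaf [6][357], so [6][358] is the first leaf filled from slot 0.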
| | * | xen/mmu: Release just the MFN list, not MFN list and part of pagetables.Konrad Rzeszutek Wilk2012-08-23
| | | | We call memblock_reserve for [start of mfn list] -> [PMD aligned end of mfn list] instead of [start of mfn list] -> [page aligned end of mfn list].
| | | |
| | | | This has the disastrous effect that if at bootup the end of mfn_list is not PMD aligned we end up returning to memblock parts of the region past the mfn_list array.
| | | | Those parts are the PTE tables, and the result is seeing this at bootup:
| | | |
| | | |   Write protecting the kernel read-only data: 10240k
| | | |   Freeing unused kernel memory: 1860k freed
| | | |   Freeing unused kernel memory: 200k freed
| | | |   (XEN) mm.c:2429:d0 Bad type (saw 1400000000000002 != exp 7000000000000000) for mfn 116a80 (pfn 14e26)
| | | |   ...
| | | |   (XEN) mm.c:908:d0 Error getting mfn 116a83 (pfn 14e2a) from L1 entry 8000000116a83067 for l1e_owner=0, pg_owner=0
| | | |   (XEN) mm.c:908:d0 Error getting mfn 4040 (pfn 5555555555555555) from L1 entry 0000000004040601 for l1e_owner=0, pg_owner=0
| | | |   .. and so on.
| | | |
| | | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
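To see how far a PMD-aligned end can overshoot a page-aligned one, the following Python sketch compares the two roundings for an MFN list. It is a model of the arithmetic only. The page size (4 KiB), PMD span (2 MiB), entry size (8 bytes), and the start address and entry count in the usage lines are all assumptions chosen for illustration, and `align_up` and `overshoot` are hypothetical helper names.

```python
PAGE_SIZE = 4096             # assumption: 4 KiB pages
PMD_SIZE = 2 * 1024 * 1024   # assumption: one PMD entry maps 2 MiB


def align_up(addr, boundary):
    """Round addr up to the next multiple of boundary."""
    return (addr + boundary - 1) // boundary * boundary


def overshoot(start, nr_entries, entry_size=8):
    """Return (page_aligned_end, pmd_aligned_end, extra_bytes) for a list
    of nr_entries entries beginning at start."""
    end = start + nr_entries * entry_size
    page_end = align_up(end, PAGE_SIZE)
    pmd_end = align_up(end, PMD_SIZE)
    return page_end, pmd_end, pmd_end - page_end


if __name__ == "__main__":
    # Hypothetical values: a list of 300000 entries starting at 16 MiB.
    page_end, pmd_end, extra = overshoot(0x1000000, 300000)
    print(hex(page_end), hex(pmd_end), extra // PAGE_SIZE, "extra pages")
```

Whatever sits in the extra pages between the two ends is covered by the PMD-aligned range even though it is not part of the list, which is the situation the message describes for the PTE tables.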
| | * | xen/mmu: Remove from __ka space PMD entries for pagetables.Konrad Rzeszutek Wilk2012-08-23
| | | | Please first read the description in "xen/mmu: Copy and revector the P2M tree."
| | | |
| | | | At this stage, the __ka address space (which is what the old P2M tree was using) is partially disassembled.
| | | | The cleanup_highmap has removed the PMD entries from 0-16MB and anything past _brk_end up to the max_pfn_mapped (which is the end of the ramdisk).
| | | |
| | | | The xen_remove_p2m_tree and code around has ripped out the __ka for the old P2M array.
| | | |
| | | | Here we continue on doing it to where the Xen page-tables were.
| | | | It is safe to do it, as the page-tables are addressed using __va.
| | | | For good measure we delete anything that is within MODULES_VADDR and up to the end of the PMD.
| | | |
| | | | At this point the __ka only contains PMD entries for the start of the kernel up to __brk.
| | | |
| | | | [v1: Per Stefano's suggestion wrapped the MODULES_VADDR in debug]
| | | |
| | | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| | * | xen/mmu: Copy and revector the P2M tree.Konrad Rzeszutek Wilk2012-08-23
| | | | Please first read the description in "xen/p2m: Add logic to revector a P2M tree to use __va leafs" patch.
| | | |
| | | | The 'xen_revector_p2m_tree()' function allocates a new P2M tree, copies the contents of the old one into it, and returns the new one.
| | | |
| | | | At this stage, the __ka address space (which is what the old P2M tree was using) is partially disassembled.
| | | | The cleanup_highmap has removed the PMD entries from 0-16MB and anything past _brk_end up to the max_pfn_mapped (which is the end of the ramdisk).
| | | |
| | | | We have revectored the P2M tree (and the one for save/restore as well) to use new shiny __va address to new MFNs.
| | | | The xen_start_info has been taken care of already in 'xen_setup_kernel_pagetable()' and xen_start_info->shared_info in 'xen_setup_shared_info()', so we are free to roam and delete PMD entries - which is exactly what we are going to do.
| | | | We rip out the __ka for the old P2M array.
| | | |
| | | | [v1: Fix smatch warnings]
| | | | [v2: memset was doing 0 instead of 0xff]
| | | |
| | | | Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
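The v2 note above mentions a memset that used 0 where 0xff was intended. The message does not say why the byte value matters; one plausible reading, stated here as an assumption, is that an all-ones word serves as the "invalid" marker (INVALID_P2M_ENTRY as ~0UL), whereas a zero word would read as frame number 0. The Python sketch below only models what each fill byte produces when the buffer is reinterpreted as 64-bit entries; `filled_entries` is a hypothetical helper.

```python
import struct

WORD = 8                        # assumption: 64-bit unsigned long
INVALID_P2M_ENTRY = 2**64 - 1   # assumption: all-ones marker, i.e. ~0UL


def filled_entries(fill_byte, count):
    """Model memset(buf, fill_byte, count * WORD) followed by reading the
    buffer back as `count` little-endian 64-bit entries."""
    raw = bytes([fill_byte]) * (count * WORD)
    return list(struct.unpack(f"<{count}Q", raw))


if __name__ == "__main__":
    print([hex(e) for e in filled_entries(0xFF, 2)])  # every entry all-ones
    print([hex(e) for e in filled_entries(0x00, 2)])  # every entry zero
```

Under that assumption, filling with 0xff leaves every slot marked invalid until it is overwritten, while filling with 0 leaves slots that look like valid references to frame 0.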