aboutsummaryrefslogtreecommitdiffstats
path: root/arch/s390/kvm
Commit message (Collapse)AuthorAge
...
| | * KVM: s390: gaccess: function for preparing translation exceptionsDavid Hildenbrand2016-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Let's provide a function trans_exc() that can be used for handling preparation of translation exceptions on a central basis. We will use that function to replace existing code in gaccess. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * KVM: s390: gaccess: store guest address on ALC prot exceptionsDavid Hildenbrand2016-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Let's pass the effective guest address to get_vcpu_asce(), so we can properly set the guest address in case we inject an ALC protection exception. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * KVM: s390: forward ESOP if availableDavid Hildenbrand2016-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ESOP guarantees that during a protection exception, bit 61 of real location 168-175 will only be set to 1 if it was because of ALCP or DATP. If the exception is due to LAP or KCP, the bit will always be set to 0. The old SOP definition allowed bit 61 to be unpredictable in case of LAP or KCP in some conditions. So ESOP replaces this unpredictability by a guarantee. Therefore, we can directly forward ESOP if it is available on our machine. We don't have to do anything when ESOP is disabled - the guest will simply expect unpredictable values. Our guest access functions are already handling ESOP properly. Please note that future functionality in KVM will require knowledge about ESOP being enabled for a guest or not. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * KVM: s390: interface to query and configure cpu featuresDavid Hildenbrand2016-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For now, we only have an interface to query and configure facilities indicated via STFL(E). However, we also have features indicated via SCLP, that have to be indicated to the guest by user space and usually require KVM support. This patch allows user space to query and configure available cpu features for the guest. Please note that disabling a feature doesn't necessarily mean that it is completely disabled (e.g. ESOP is mostly handled by the SIE). We will try our best to disable it. Most features (e.g. SCLP) can't directly be forwarded, as most of them need in addition to hardware support, support in KVM. As we later on want to turn these features in KVM explicitly on/off (to simulate different behavior), we have to filter all features provided by the hardware and make them configurable. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * KVM: s390: Add mnemonic print to kvm_s390_intercept_progAlexander Yarygin2016-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a table of mnemonic names for intercepted program interruptions, let's print readable name of the interruption in the kvm_s390_intercept_prog trace event. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * KVM: s390: Limit sthyi executionJanosch Frank2016-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Store hypervisor information is a valid instruction not only in supervisor state but also in problem state, i.e. the guest's userspace. Its execution is not only computational and memory intensive, but also has to get hold of the ipte lock to write to the guest's memory. This lock is not intended to be held often and long, especially not from the untrusted guest userspace. Therefore we apply rate limiting of sthyi executions per VM. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Acked-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * KVM: s390: Add sthyi emulationJanosch Frank2016-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Store Hypervisor Information is an emulated z/VM instruction that provides a guest with basic information about the layers it is running on. This includes information about the cpu configuration of both the machine and the lpar, as well as their names, machine model and machine type. This information enables an application to determine the maximum capacity of CPs and IFLs available to software. The instruction is available whenever the facility bit 74 is set, otherwise executing it results in an operation exception. It is important to check the validity flags in the sections before using data from any structure member. It is not guaranteed that all members will be valid on all machines / machine configurations. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * KVM: s390: Add operation exception interception handlerJanosch Frank2016-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit introduces code that handles operation exception interceptions. With this handler we can emulate instructions by using illegal opcodes. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: S390: Fix typoAndrea Gelmini2016-06-14
| |/ | | | | | | | | Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2016-07-26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "There are a couple of new things for s390 with this merge request: - a new scheduling domain "drawer" is added to reflect the unusual topology found on z13 machines. Performance tests showed up to 8 percent gain with the additional domain. - the new crc-32 checksum crypto module uses the vector-galois-field multiply and sum SIMD instruction to speed up crc-32 and crc-32c. - proper __ro_after_init support, this requires RO_AFTER_INIT_DATA in the generic vmlinux.lds linker script definitions. - kcov instrumentation support. A prerequisite for that is the inline assembly basic block cleanup, which is the reason for the net/iucv/iucv.c change. - support for 2GB pages is added to the hugetlbfs backend. Then there are two removals: - the oprofile hardware sampling support is dead code and is removed. The oprofile user space uses the perf interface nowadays. - the ETR clock synchronization is removed, this has been superseeded be the STP clock synchronization. And it always has been "interesting" code.. And the usual bug fixes and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (82 commits) s390/pci: Delete an unnecessary check before the function call "pci_dev_put" s390/smp: clean up a condition s390/cio/chp : Remove deprecated create_singlethread_workqueue s390/chsc: improve channel path descriptor determination s390/chsc: sanitize fmt check for chp_desc determination s390/cio: make fmt1 channel path descriptor optional s390/chsc: fix ioctl CHSC_INFO_CU command s390/cio/device_ops: fix kernel doc s390/cio: allow to reset channel measurement block s390/console: Make preferred console handling more consistent s390/mm: fix gmap tlb flush issues s390/mm: add support for 2GB hugepages s390: have unique symbol for __switch_to address s390/cpuinfo: show maximum thread id s390/ptrace: clarify bits in the per_struct s390: stack address vs thread_info s390: remove pointless load within __switch_to s390: enable kcov support s390/cpumf: use basic block for ecctr inline assembly s390/hypfs: use basic block for diag inline assembly ...
| * | s390/time: remove ETR supportMartin Schwidefsky2016-06-13
| |/ | | | | | | | | | | | | | | The External-Time-Reference (ETR) clock synchronization interface has been superseded by Server-Time-Protocol (STP). Remove the outdated ETR interface. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | KVM: s390: Add stats for PEI eventsAlexander Yarygin2016-06-10
| | | | | | | | | | | | | | | | | | Add partial execution intercepted events in kvm_stats_debugfs. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* | KVM: s390: ignore IBC if zeroDavid Hildenbrand2016-06-10
|/ | | | | | | | | | | Looks like we forgot about the special IBC value of 0 meaning "no IBC". Let's fix that, otherwise it gets rounded up and suddenly an IBC is active with the lowest possible machine. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Fixes: commit 053dd2308d81 ("KVM: s390: force ibc into valid range") Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: halt_polling: provide a way to qualify wakeups during pollChristian Borntraeger2016-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some wakeups should not be considered a sucessful poll. For example on s390 I/O interrupts are usually floating, which means that _ALL_ CPUs would be considered runnable - letting all vCPUs poll all the time for transactional like workload, even if one vCPU would be enough. This can result in huge CPU usage for large guests. This patch lets architectures provide a way to qualify wakeups if they should be considered a good/bad wakeups in regard to polls. For s390 the implementation will fence of halt polling for anything but known good, single vCPU events. The s390 implementation for floating interrupts does a wakeup for one vCPU, but the interrupt will be delivered by whatever CPU checks first for a pending interrupt. We prefer the woken up CPU by marking the poll of this CPU as "good" poll. This code will also mark several other wakeup reasons like IPI or expired timers as "good". This will of course also mark some events as not sucessful. As KVM on z runs always as a 2nd level hypervisor, we prefer to not poll, unless we are really sure, though. This patch successfully limits the CPU usage for cases like uperf 1byte transactional ping pong workload or wakeup heavy workload like OLTP while still providing a proper speedup. This also introduced a new vcpu stat "halt_poll_no_tuning" that marks wakeups that are considered not good for polling. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version) Cc: David Matlack <dmatlack@google.com> Cc: Wanpeng Li <kernellwp@gmail.com> [Rename config symbol. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* KVM: s390: Populate mask of non-hypervisor managed facility bitsAlexander Yarygin2016-05-09
| | | | | | | | | | | | | | | | | When a guest is initializing, KVM provides facility bits that can be successfully used by the guest. It's done by applying kvm_s390_fac_list_mask mask on host facility bits stored by the STFLE instruction. Facility bits can be one of two kinds: it's either a hypervisor managed bit or non-hypervisor managed. The hardware provides information which bits need special handling. Let's automatically passthrough to guests new facility bits, that don't require hypervisor support. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: Enable all facility bits that are known good for passthroughAlexander Yarygin2016-05-09
| | | | | | | | | Some facility bits are in a range that is defined to be "ok for guests without any necessary hypervisor changes". Enable those bits. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: force ibc into valid rangeDavid Hildenbrand2016-05-09
| | | | | | | | | | | | Some hardware variants will round the ibc value up/down themselves, others will report a validity intercept. Let's always round it up/down. This patch will also make sure that the ibc is set to 0 in case we don't have ibc support (lowest_ibc == 0). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: cleanup cpuid handlingDavid Hildenbrand2016-05-09
| | | | | | | | | We only have one cpuid for all VCPUs, so let's directly use the one in the cpu model. Also always store it directly as u64, no need for struct cpuid. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: enable SRS only if enabled for the guestDavid Hildenbrand2016-05-09
| | | | | | | | | If we don't have SIGP SENSE RUNNING STATUS enabled for the guest, let's not enable interpretation so we can correctly report an invalid order. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: enable PFMFI only if guest has EDAT1David Hildenbrand2016-05-09
| | | | | | | | | | Only enable PFMF interpretation if the necessary facility (EDAT1) is available, otherwise the pfmf handler in priv.c will inject an exception Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: support NQ only if the facility is enabled for the guestDavid Hildenbrand2016-05-04
| | | | | | | | | While we can not fully fence of the Nonquiescing Key-Setting facility, we should as try our best to hide it. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: cmma: don't check entry contentDavid Hildenbrand2016-05-04
| | | | | | | | | | | | | | | | | | We should never inject an exception after we manually rewound the PSW (to retry the ESSA instruction in this case). This will mess up the PSW. So this never worked and therefore never really triggered. Looking at the details, we don't even have to perform any validity checks. 1. Bits 52-63 of an entry are stored as 0 by the hardware. 2. We are dealing with absolute addresses but only check for the prefix starting at address 0. This isn't correct and doesn't make much sense, cpus could still zap the prefix of other cpus. But as prefix pages cannot be swapped out without a notifier being called for the affected VCPU, a zap can never remove a protected prefix. Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* KVM: s390: add clear I/O irq operation for FLICHalil Pasic2016-04-20
| | | | | | | | | | | | | | | | | | | Introduce a FLIC operation for clearing I/O interrupts for a subchannel. Rationale: According to the platform specification, pending I/O interruption requests have to be revoked in certain situations. For instance, according to the Principles of Operation (page 17-27), a subchannel put into the installed parameters initialized state is in the same state as after an I/O system reset (just parameters possibly changed). This implies that any I/O interrupts for that subchannel are no longer pending (as I/O system resets clear I/O interrupts). Therefore, we need an interface to clear pending I/O interrupts. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* KVM: s390: implement has_attr for FLICHalil Pasic2016-04-20
| | | | | | | | | | | HAS_ATTR is useful for determining the supported attributes; let's implement it. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2016-03-16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - Add the CPU id for the new z13s machine - Add a s390 specific XOR template for RAID-5 checksumming based on the XC instruction. Remove all other alternatives, XC is always faster - The merge of our four different stack tracers into a single one - Tidy up the code related to page tables, several large inline functions are now out-of-line. Bloat-o-meter reports ~11K text size reduction - A binary interface for the priviledged CLP instruction to retrieve the hardware view of the installed PCI functions - Improvements for the dasd format code - Bug fixes and cleanups * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (31 commits) s390/pci: enforce fmb page boundary rule s390: fix floating pointer register corruption (again) s390/cpumf: add missing lpp magic initialization s390: Fix misspellings in comments s390/mm: split arch/s390/mm/pgtable.c s390/mm: uninline pmdp_xxx functions from pgtable.h s390/mm: uninline ptep_xxx functions from pgtable.h s390/pci: add ioctl interface for CLP s390: Use pr_warn instead of pr_warning s390/dasd: remove casts to dasd_*_private s390/dasd: Refactor dasd format functions s390/dasd: Simplify code in format logic s390/dasd: Improve dasd format code s390/percpu: remove this_cpu_cmpxchg_double_4 s390/cpumf: Improve guest detection heuristics s390/fault: merge report_user_fault implementations s390/dis: use correct escape sequence for '%' character s390/kvm: simplify set_guest_storage_key s390/oprofile: add z13/z13s model numbers s390: add z13s model number to z13 elf platform ...
| * s390: Fix misspellings in commentsAdam Buchbinder2016-03-08
| | | | | | | | | | | | Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * s390/mm: split arch/s390/mm/pgtable.cMartin Schwidefsky2016-03-08
| | | | | | | | | | | | | | | | | | The pgtable.c file is quite big, before it grows any larger split it into pgtable.c, pgalloc.c and gmap.c. In addition move the gmap related header definitions into the new gmap.h header and all of the pgste helpers from pgtable.h to pgtable.c. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * s390/mm: uninline ptep_xxx functions from pgtable.hMartin Schwidefsky2016-03-08
| | | | | | | | | | | | | | | | | | | | | | The code in the various ptep_xxx functions has grown quite large, consolidate them to four out-of-line functions: ptep_xchg_direct to exchange a pte with another with immediate flushing ptep_xchg_lazy to exchange a pte with another in a batched update ptep_modify_prot_start to begin a protection flags update ptep_modify_prot_commit to commit a protection flags update Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2016-03-16
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "One of the largest releases for KVM... Hardly any generic changes, but lots of architecture-specific updates. ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory - currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits) KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch KVM: x86: disable MPX if host did not enable MPX XSAVE features arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit arm64: KVM: vgic-v3: Reset LRs at boot time arm64: KVM: vgic-v3: Do not save an LR known to be empty arm64: KVM: vgic-v3: Save maintenance interrupt state only if required arm64: KVM: vgic-v3: Avoid accessing ICH registers KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit KVM: arm/arm64: vgic-v2: Reset LRs at boot time KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers KVM: s390: allocate only one DMA page per VM KVM: s390: enable STFLE interpretation only if enabled for the guest KVM: s390: wake up when the VCPU cpu timer expires KVM: s390: step the VCPU timer while in enabled wait KVM: s390: protect VCPU cpu timer with a seqcount KVM: s390: step VCPU cpu timer during kvm_run ioctl ...
| * | KVM: s390: allocate only one DMA page per VMDavid Hildenbrand2016-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can fit the 2k for the STFLE interpretation and the crypto control block into one DMA page. As we now only have to allocate one DMA page, we can clean up the code a bit. As a nice side effect, this also fixes a problem with crycbd alignment in case special allocation debug options are enabled, debugged by Sascha Silbe. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: enable STFLE interpretation only if enabled for the guestDavid Hildenbrand2016-03-08
| | | | | | | | | | | | | | | | | | | | | | | | Not setting the facility list designation disables STFLE interpretation, this is what we want if the guest was told to not have it. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: wake up when the VCPU cpu timer expiresDavid Hildenbrand2016-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the VCPU cpu timer expires, we have to wake up just like when the ckc triggers. For now, setting up a cpu timer in the guest and going into enabled wait will never lead to a wakeup. This patch fixes this problem. Just as for the ckc, we have to take care of waking up too early. We have to recalculate the sleep time and go back to sleep. Please note that the timer callback calls kvm_s390_get_cpu_timer() from interrupt context. As the timer is canceled when leaving handle_wait(), and we don't do any VCPU cpu timer writes/updates in that function, we can be sure that we will never try to read the VCPU cpu timer from the same cpu that is currentyl updating the timer (deadlock). Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Tested-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: step the VCPU timer while in enabled waitDavid Hildenbrand2016-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cpu timer is a mean to measure task execution time. We want to account everything for a VCPU for which it is responsible. Therefore, if the VCPU wants to sleep, it shall be accounted for it. We can easily get this done by not disabling cpu timer accounting when scheduled out while sleeping because of enabled wait. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: protect VCPU cpu timer with a seqcountDavid Hildenbrand2016-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For now, only the owning VCPU thread (that has loaded the VCPU) can get a consistent cpu timer value when calculating the delta. However, other threads might also be interested in a more recent, consistent value. Of special interest will be the timer callback of a VCPU that executes without having the VCPU loaded and could run in parallel with the VCPU thread. The cpu timer has a nice property: it is only updated by the owning VCPU thread. And speaking about accounting, a consistent value can only be calculated by looking at cputm_start and the cpu timer itself in one shot, otherwise the result might be wrong. As we only have one writing thread at a time (owning VCPU thread), we can use a seqcount instead of a seqlock and retry if the VCPU refreshed its cpu timer. This avoids any heavy locking and only introduces a counter update/check plus a handful of smp_wmb(). The owning VCPU thread should never have to retry on reads, and also for other threads this might be a very rare scenario. Please note that we have to use the raw_* variants for locking the seqcount as lockdep will produce false warnings otherwise. The rq->lock held during vcpu_load/put is also acquired from hardirq context. Lockdep cannot know that we avoid potential deadlocks by disabling preemption and thereby disable concurrent write locking attempts (via vcpu_put/load). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: step VCPU cpu timer during kvm_run ioctlDavid Hildenbrand2016-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Architecturally we should only provide steal time if we are scheduled away, and not if the host interprets a guest exit. We have to step the guest CPU timer in these cases. In the first shot, we will step the VCPU timer only during the kvm_run ioctl. Therefore all time spent e.g. in interception handlers or on irq delivery will be accounted for that VCPU. We have to take care of a few special cases: - Other VCPUs can test for pending irqs. We can only report a consistent value for the VCPU thread itself when adding the delta. - We have to take care of STP sync, therefore we have to extend kvm_clock_sync() and disable preemption accordingly - During any call to disable/enable/start/stop we could get premeempted and therefore get start/stop calls. Therefore we have to make sure we don't get into an inconsistent state. Whenever a VCPU is scheduled out, sleeping, in user space or just about to enter the SIE, the guest cpu timer isn't stepped. Please note that all primitives are prepared to be called from both environments (cpu timer accounting enabled or not), although not completely used in this patch yet (e.g. kvm_s390_set_cpu_timer() will never be called while cpu timer accounting is enabled). Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: abstract access to the VCPU cpu timerDavid Hildenbrand2016-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | We want to manually step the cpu timer in certain scenarios in the future. Let's abstract any access to the cpu timer, so we can hide the complexity internally. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: store cpu id in vcpu->cpu when scheduled inDavid Hildenbrand2016-03-08
| | | | | | | | | | | | | | | | | | | | | | | | By storing the cpu id, we have a way to verify if the current cpu is owning a VCPU. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: bail out early on fatal signal in dirty loggingChristian Borntraeger2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A KVM_GET_DIRTY_LOG ioctl might take a long time. This can result in fatal signals seemingly being ignored. Lets bail out during the dirty bit sync, if a fatal signal is pending. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: do not block CPU on dirty loggingChristian Borntraeger2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When doing dirty logging on huge guests (e.g.600GB) we sometimes get rcu stall timeouts with backtraces like [ 2753.194083] ([<0000000000112fb2>] show_trace+0x12a/0x130) [ 2753.194092] [<0000000000113024>] show_stack+0x6c/0xe8 [ 2753.194094] [<00000000001ee6a8>] rcu_pending+0x358/0xa48 [ 2753.194099] [<00000000001f20cc>] rcu_check_callbacks+0x84/0x168 [ 2753.194102] [<0000000000167654>] update_process_times+0x54/0x80 [ 2753.194107] [<00000000001bdb5c>] tick_sched_handle.isra.16+0x4c/0x60 [ 2753.194113] [<00000000001bdbd8>] tick_sched_timer+0x68/0x90 [ 2753.194115] [<0000000000182a88>] __run_hrtimer+0x88/0x1f8 [ 2753.194119] [<00000000001838ba>] hrtimer_interrupt+0x122/0x2b0 [ 2753.194121] [<000000000010d034>] do_extint+0x16c/0x170 [ 2753.194123] [<00000000005e206e>] ext_skip+0x38/0x3e [ 2753.194129] [<000000000012157c>] gmap_test_and_clear_dirty+0xcc/0x118 [ 2753.194134] ([<00000000001214ea>] gmap_test_and_clear_dirty+0x3a/0x118) [ 2753.194137] [<0000000000132da4>] kvm_vm_ioctl_get_dirty_log+0xd4/0x1b0 [ 2753.194143] [<000000000012ac12>] kvm_vm_ioctl+0x21a/0x548 [ 2753.194146] [<00000000002b57f6>] do_vfs_ioctl+0x30e/0x518 [ 2753.194149] [<00000000002b5a9c>] SyS_ioctl+0x9c/0xb0 [ 2753.194151] [<00000000005e1ae6>] sysc_tracego+0x14/0x1a [ 2753.194153] [<000003ffb75f3972>] 0x3ffb75f3972 We should do a cond_resched in here. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: do not take mmap_sem on dirty log queryChristian Borntraeger2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dirty log query can take a long time for huge guests. Holding the mmap_sem for very long times can cause some unwanted latencies. Turns out that we do not need to hold the mmap semaphore. We hold the slots_lock for gfn->hva translation and walk the page tables with that address, so no need to look at the VMAs. KVM also holds a reference to the mm, which should prevent other things going away. During the walk we take the necessary ptl locks. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: instruction-fetching exceptions on SIE faultsDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On instruction-fetch exceptions, we have to forward the PSW by any valid ilc and correctly use that ilc when injecting the irq. Injection will already take care of rewinding the PSW if we injected a nullifying program irq, so we don't need special handling prior to injection. Until now, autodetection would have guessed an ilc of 0. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: provide prog irq ilc on SIE faultsDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On SIE faults, the ilc cannot be detected automatically, as the icptcode is 0. The ilc indicated in the program irq will always be 0. Therefore we have to manually specify the ilc in order to tell the guest which ilen was used when forwarding the PSW. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: irq delivery should not rely on icptcodeDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Program irq injection during program irq intercepts is the last candidates that injects nullifying irqs and relies on delivery to do the right thing. As we should not rely on the icptcode during any delivery (because that value will not be migrated), let's add a flag, telling prog IRQ delivery to not rewind the PSW in case of nullifying prog IRQs. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: clean up prog irq injection on prog irq icptsDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __extract_prog_irq() is used only once for getting the program check data in one place. Let's combine it with an injection function to avoid a memset and to prevent misuse on injection by simplifying the interface to only have the VCPU as parameter. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: read the correct opcode on SIE faultsDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Let's use our fresh new function read_guest_instr() to access guest storage via the correct addressing schema. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: gaccess: implement instruction fetching modeDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an instruction is to be fetched, special handling applies to secondary-space mode and access-register mode. The instruction is to be fetched from primary space. We can easily support this by selecting the right asce for translation. Access registers will never be used during translation, so don't include them in the interface. As we only want to read from the current PSW address for now, let's also hide that detail. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: gaccess: introduce access modesDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We will need special handling when fetching instructions, so let's introduce new guest access modes GACC_FETCH and GACC_STORE instead of a write flag. An additional patch will then introduce GACC_IFETCH. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: migration / injection of prog irq ilcDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have to migrate the program irq ilc and someday we will have to specify the ilc without KVM trying to autodetect the value. Let's reuse one of the spare fields in our program irq that should always be set to 0 by user space. Because we also want to make use of 0 ilcs ("not available"), we need a validity indicator. If no valid ilc is given, we try to autodetect the ilc via the current icptcode and icptstatus + parameter and store the valid ilc in the irq structure. This has a nice effect: QEMU's making use of KVM_S390_IRQ / KVM_S390_SET_IRQ_STATE / KVM_S390_GET_IRQ_STATE for migration will directly migrate the ilc without any changes. Please note that we use bit 0 as validity and bit 1,2 for the ilc, so by applying the ilc mask we directly get the ilen which is usually what we work with. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: PSW forwarding / rewinding / ilc reworkDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have some confusion about ilc vs. ilen in our current code. So let's correctly use the term ilen when dealing with (ilc << 1). Program irq injection didn't take care of the correct ilc in case of irqs triggered by EXECUTE functions, let's provide one function kvm_s390_get_ilen() to take care of all that. Also, manually specifying in intercept handlers the size of the instruction (and sometimes overwriting that value for EXECUTE internally) doesn't make too much sense. So also provide the functions: - kvm_s390_retry_instr to retry the currently intercepted instruction - kvm_s390_rewind_psw to rewind the PSW without internal overwrites - kvm_s390_forward_psw to forward the PSW Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | KVM: s390: sync of fp registers via kvm_runDavid Hildenbrand2016-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we already store the floating point registers in the vector save area in floating point register format when we don't have MACHINE_HAS_VX, we can directly expose them to user space using a new sync flag. The floating point registers will be valid when KVM_SYNC_FPRS is set. The fpc will also be valid when KVM_SYNC_FPRS is set. Either KVM_SYNC_FPRS or KVM_SYNC_VRS will be enabled, never both. Let's also change two positions where we access vrs, making the code easier to read and one comment superfluous. Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>