| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
| |
This patch implements the emulator intercept checks for the
RDTSCP, MONITOR, and MWAIT instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This patch adds the necessary code changes in the
instruction emulator and the extensions to svm.c to
implement intercept checks for the svm instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This patch add intercept checks into the KVM instruction
emulator to check for the 8 instructions that access the
descriptor table addresses.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
| |
This patch adds the intercept checks for instruction
accessing the debug registers.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
| |
This patch adds all necessary intercept checks for
instructions that access the crX registers.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
| |
This patch adds a callback into kvm_x86_ops so that svm and
vmx code can do intercept checks on emulated instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This patch adds a flag for the opcoded to tag instruction
which are only recognized in protected mode. The necessary
check is added too.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a check_perm callback for each opcode into
the instruction emulator. This will be used to do all
necessary permission checks on instructions before checking
whether they are intercepted or not.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This patch prevents the changed CPU state to be written back
when the emulator detected that the instruction was
intercepted by the guest.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Add intercept codes for instructions defined by SVM as
interceptable.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running in guest mode, certain instructions can be intercepted by
hardware. This also holds for nested guests running on emulated
virtualization hardware, in particular instructions emulated by kvm
itself.
This patch adds a framework for intercepting instructions. If an
instruction is marked for interception, and if we're running in guest
mode, a callback is called to check whether an intercept is needed or
not. The callback is called at three points in time: immediately after
beginning execution, after checking privilge exceptions, and after
checking memory exception. This suits the different interception points
defined for different instructions and for the various virtualization
instruction sets.
In addition, a new X86EMUL_INTERCEPT is defined, which any callback or
memory access may define, allowing the more complicated intercepts to be
implemented in existing callbacks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
| |
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
| |
Add support for marking an instruction as SSE, switching registers used
to the SSE register file.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most SIMD instructions use the 66/f2/f3 prefixes to distinguish between
different variants of the same instruction. Usually the encoding is quite
regular, but in some cases (including non-SIMD instructions) the prefixes
generate very different instructions. Examples include XCHG/PAUSE,
MOVQ/MOVDQA/MOVDQU, and MOVBE/CRC32.
Allow the emulator to handle these special cases by splitting such opcodes
into groups, with different decode flags and execution functions for different
prefixes.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
| |
Needed for emulating fpu instructions.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
| |
Currently we store a rep prefix as 1 or 2 depending on whether it is a REPE or
REPNE. Since sse instructions depend on the prefix value, store it as the
original opcode to simplify things further on.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Since sse instructions can issue 16-byte mmios, we need to support them. We
can't increase the kvm_run mmio buffer size to 16 bytes without breaking
compatibility, so instead we break the large mmios into two smaller 8-byte
ones. Since the bus is 64-bit we aren't breaking any atomicity guarantees.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
| |
Make room for sse mmio completions.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
| |
Needed for coalesced mmio using sse.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
| |
Fix race between nmi injection and enabling nmi window in a simpler way.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
|
|
|
|
|
|
| |
This reverts commit f86368493ec038218e8663cc1b6e5393cd8e008a.
Simpler fix to follow.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As Avi recently mentioned, the new standard mechanism for exposing features
is KVM_GET_SUPPORTED_CPUID, not spamming CAPs. For some reason async pf
missed that.
So expose async_pf here.
Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Gleb Natapov <gleb@redhat.com>
CC: Avi Kivity <avi@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use vmx_set_nmi_mask() instead of open-coding management of
the hardware bit and the software hint (nmi_known_unmasked).
There's a slight change of behaviour when running without
hardware virtual NMI support - we now clear the NMI mask if
NMI delivery faulted in that case as well. This improves
emulation accuracy.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
| |
We use boot_cpu_has now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
| |
vmx_complete_atomic_exit() cached it for us, so we can use it here.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
| |
Only read it if we're going to use it later.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
| |
Move the exit reason checks to the front of the function, for early
exit in the common case.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
| |
Check for the exit reason first; this allows us, later,
to avoid a VMREAD for VM_EXIT_INTR_INFO_FIELD.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
| |
When we haven't injected an interrupt, we don't need to recover
the nmi blocking state (since the guest can't set it by itself).
This allows us to avoid a VMREAD later on.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
| |
We may read the cpl quite often in the same vmexit (instruction privilege
check, memory access checks for instruction and operands), so we gain
a bit if we cache the value.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
| |
In long mode, vm86 mode is disallowed, so we need not check for
it. Reading rflags.vm may require a VMREAD, so it is expensive.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
| |
If called several times within the same exit, return cached results.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
|
| |
Some rflags bits are owned by the host, not guest, so we need to use
kvm_get_rflags() to strip those bits away or kvm_set_rflags() to add them
back.
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
|
|
|
|
|
| |
We can get memslot id from memslot->id directly
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
slcan: fix ldisc->open retval
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
xfrm: Don't allow esn with disabled anti replay detection
xfrm: Assign the inner mode output function to the dst entry
net: dev_close() should check IFF_UP
vlan: fix GVRP at dismantle time
netfilter: revert a2361c8735e07322023aedc36e4938b35af31eb0
netfilter: IPv6: fix DSCP mangle code
netfilter: IPv6: initialize TOS field in REJECT target module
IPVS: init and cleanup restructuring
IPVS: Change of socket usage to enable name space exit.
netfilter: ebtables: only call xt_compat_add_offset once per rule
netfilter: fix ebtables compat support
netfilter: ctnetlink: fix timestamp support for new conntracks
pch_gbe: support ML7223 IOH
PCH_GbE : Fixed the issue of checksum judgment
PCH_GbE : Fixed the issue of collision detection
NET: slip, fix ldisc->open retval
be2net: Fixed bugs related to PVID.
ehea: fix wrongly reported speed and port
...
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch reverts a2361c8735e07322023aedc36e4938b35af31eb0:
"[PATCH] netfilter: xt_conntrack: warn about use in raw table"
Florian Wesphal says:
"... when the packet was sent from the local machine the skb
already has ->nfct attached, and -m conntrack seems to do
the right thing."
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Reported-by: Florian Wesphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The IPv6 header is not zeroed out in alloc_skb so we must initialize
it properly unless we want to see IPv6 packets with random TOS fields
floating around. The current implementation resets the flow label
but this could be changed if deemed necessary.
We stumbled upon this issue when trying to apply a mangle rule to
the RST packet generated by the REJECT target module.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.
IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.
This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.
The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If the sync daemons run in a name space while it crashes
or get killed, there is no way to stop them except for a reboot.
When all patches are there, ip_vs_core will handle register_pernet_(),
i.e. ip_vs_sync_init() and ip_vs_sync_cleanup() will be removed.
Kernel threads should not increment the use count of a socket.
By calling sk_change_net() after creating a socket this is avoided.
sock_release cant be used intead sk_release_kernel() should be used.
Thanks Eric W Biederman for your advices.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The optimizations in commit 255d0dc34068a976
(netfilter: x_table: speedup compat operations) assume that
xt_compat_add_offset is called once per rule.
ebtables however called it for each match/target found in a rule.
The match/watcher/target parser already returns the needed delta, so it
is sufficient to move the xt_compat_add_offset call to a more reasonable
location.
While at it, also get rid of the unused COMPAT iterator macros.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
commit 255d0dc34068a976 (netfilter: x_table: speedup compat operations)
made ebtables not working anymore.
1) xt_compat_calc_jump() is not an exact match lookup
2) compat_table_info() has a typo in xt_compat_init_offsets() call
3) compat_do_replace() misses a xt_compat_init_offsets() call
Reported-by: dann frazier <dannf@dannf.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch fixes the missing initialization of the start time if
the timestamp support is enabled.
libnetfilter_conntrack/utils# conntrack -E &
libnetfilter_conntrack/utils# ./conntrack_create
tcp 6 109 ESTABLISHED src=1.1.1.1 dst=2.2.2.2 sport=1025 dport=21 packets=0 bytes=0 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=21 dport=1025 packets=0 bytes=0 mark=0 delta-time=1303296401 use=2
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
TTY layer expects 0 if the ldisc->open operation succeeded.
Reported-by: Matvejchikov Ilya <matvejchikov@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Like other mobile broadband device ethernet interfaces, mark the LG
VL600 with the 'wwan' devtype so userspace knows it needs additional
configuration via the AT port before the interface can be used.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Unlike the standard case, disabled anti replay detection needs some
nontrivial extra treatment on ESN. RFC 4303 states:
Note: If a receiver chooses to not enable anti-replay for an SA, then
the receiver SHOULD NOT negotiate ESN in an SA management protocol.
Use of ESN creates a need for the receiver to manage the anti-replay
window (in order to determine the correct value for the high-order
bits of the ESN, which are employed in the ICV computation), which is
generally contrary to the notion of disabling anti-replay for an SA.
So return an error if an ESN state with disabled anti replay detection
is inserted for now and add the extra treatment later if we need it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
As it is, we assign the outer modes output function to the dst entry
when we create the xfrm bundle. This leads to two problems on interfamily
scenarios. We might insert ipv4 packets into ip6_fragment when called
from xfrm6_output. The system crashes if we try to fragment an ipv4
packet with ip6_fragment. This issue was introduced with git commit
ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
as needed). The second issue is, that we might insert ipv4 packets in
netfilter6 and vice versa on interfamily scenarios.
With this patch we assign the inner mode output function to the dst entry
when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
mode is used and the right fragmentation and netfilter functions are called.
We switch then to outer mode with the output_finish functions.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit 443457242beb (factorize sync-rcu call in
unregister_netdevice_many) mistakenly removed one test from dev_close()
Following actions trigger a BUG :
modprobe bonding
modprobe dummy
ifconfig bond0 up
ifenslave bond0 dummy0
rmmod dummy
dev_close() must not close a non IFF_UP device.
With help from Frank Blaschka and Einar EL Lueck
Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Einar EL Lueck <ELELUECK@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
ip link set eth2.103 up
rmmod tg3 # driver providing eth2
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
PGD 11d251067 PUD 11b9e0067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
CPU 0
Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]
Pid: 11494, comm: rmmod Tainted: G W 2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
RIP: 0010:[<ffffffffa0030c9e>] [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
RSP: 0018:ffff88007a19bae8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
Stack:
ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
Call Trace:
[<ffffffffa003a5f6>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
[<ffffffffa00397e7>] vlan_dev_stop+0xb7/0xc0 [8021q]
[<ffffffff8137e427>] __dev_close_many+0x87/0xe0
[<ffffffff8137e507>] dev_close_many+0x87/0x110
[<ffffffff8137e630>] rollback_registered_many+0xa0/0x240
[<ffffffff8137e7e9>] unregister_netdevice_many+0x19/0x60
[<ffffffffa00389eb>] vlan_device_event+0x53b/0x550 [8021q]
[<ffffffff8143f448>] ? ip6mr_device_event+0xa8/0xd0
[<ffffffff81479d03>] notifier_call_chain+0x53/0x80
[<ffffffff81062539>] __raw_notifier_call_chain+0x9/0x10
[<ffffffff81062551>] raw_notifier_call_chain+0x11/0x20
[<ffffffff8137df82>] call_netdevice_notifiers+0x32/0x60
[<ffffffff8137e69f>] rollback_registered_many+0x10f/0x240
[<ffffffff8137e85f>] rollback_registered+0x2f/0x40
[<ffffffff8137e8c8>] unregister_netdevice_queue+0x58/0x90
[<ffffffff8137e9eb>] unregister_netdev+0x1b/0x30
[<ffffffffa005d73f>] tg3_remove_one+0x6f/0x10b [tg3]
We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant()
is called right after unregister_netdevice_queue(). In batch mode,
unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|