litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	First draft of C-FL-splitwip-pgm-split	Namhoon Kim	2013-11-25
\|
*	Integrate preemption state machine with Linux scheduler	Bjoern Brandenburg	2013-08-07
\| \| \| \|	Track when a processor is going to schedule "soon".
*	Add LITMUS^RT core implementation	Bjoern Brandenburg	2013-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds the core of LITMUS^RT: - library functionality (heaps, rt_domain, prioritization, etc.) - budget enforcement logic - job management - system call backends - virtual devices (control page, etc.) - scheduler plugin API (and dummy plugin) This code compiles, but is not yet integrated with the rest of Linux.
*	Add object descriptor table to Linux's task_struct	Bjoern Brandenburg	2013-08-07
\| \| \| \| \|	This table is similar to a file descriptor table. It keeps track of which "objects" (locks) a real-time task holds a handle to.
*	Add tracepoint support	Felipe Cerqueira	2013-08-07
\| \| \| \| \| \| \|	This patch integrates LITMUS^RT's sched_trace_XXX() macros with Linux's notion of tracepoints. This is useful to visualize schedules in kernel shark and similar tools. Historically, LITMUS^RT's sched_trace predates Linux's tracepoint infrastructure.
*	Add schedule tracing support	Felipe Cerqueira	2013-08-07
\| \| \| \| \| \|	This patch introduces the sched_trace infrastructure, which in principle allows tracing the generated schedule. However, this patch does not yet integrate the callbacks with the kernel.
*	Feather-Trace: add support for latency tracing/	Bjoern Brandenburg	2013-08-07
\| \| \| \|	This patch adds support for tracing release latency in LITMUS^RT.
*	Introduce main LITMUS^RT header	Bjoern Brandenburg	2013-08-07
\| \| \| \| \|	This patch adds a basic litmus/litmus.h, which is required for basic LITMUS^RT infrastructure to compile.
*	Extend task_struct with rt_param	Bjoern Brandenburg	2013-08-07
\| \| \| \|	This patch adds the PCB extensions required for LITMUS^RT.
*	Add hrtimer_start_on() support	Felipe Cerqueira	2013-08-07
\| \| \| \| \| \|	This patch adds hrtimer_start_on(), which allows arming timers on remote CPUs. This is needed to avoided timer interrupts on "shielded" CPUs and is also useful for implementing semi-partitioned schedulers.
*	Add TRACE() debug tracing support	Bjoern Brandenburg	2013-08-07
\| \| \| \|	This patch adds the infrastructure for the TRACE() debug macro.
*	Add quantum aligment code	Felipe Cerqueira	2013-08-07
\| \| \| \|	Enable staggered ticks for PD^2.
*	Add object list to inodes	Bjoern Brandenburg	2013-08-07
\| \| \| \| \| \| \|	This patch adds a list of arbitrary objects to inodes. This is used by Linux's locking API to attach lock objects to inodes (which represent namespaces in Linux's locking API).
*	Integrate ft_irq_fired() with Linux	Bjoern Brandenburg	2013-08-07
\| \| \| \| \|	This patch hooks up Feather-Trace's ft_irq_fired() handler with Linux's interrupt handling infrastructure.
*	Feather-Trace: add LITMUS^RT overhead tracing infrastructure	Bjoern Brandenburg	2013-08-07
\| \| \| \| \|	This patch adds the main infrastructure for tracing overheads in LITMUS^RT. It does not yet introduce any tracepoints into the kernel.
*	Feather-Trace: add generic ftdev device driver	Bjoern Brandenburg	2013-08-07
\| \| \| \| \|	This patch adds the ftdev device driver, which is used to export samples collected with Feather-Trace to userspace.
*	Feather-Trace: add platform independent implementation	Bjoern Brandenburg	2013-08-05
\| \| \| \| \|	This patch adds the simple fallback implementation and creates dummy hooks in the x86 and ARM Kconfig files.
*	iscsi-target: Fix iscsit_sequence_cmd reject handling for iser	Nicholas Bellinger	2013-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 561bf15892375597ee59d473a704a3e634c4f311 upstream This patch moves ISCSI_OP_REJECT failures into iscsit_sequence_cmd() in order to avoid external iscsit_reject_cmd() reject usage for all PDU types. It also updates PDU specific handlers for traditional iscsi-target code to not reset the session after posting a ISCSI_OP_REJECT during setup. (v2: Fix CMDSN_LOWER_THAN_EXP for ISCSI_OP_SCSI to call target_put_sess_cmd() after iscsit_sequence_cmd() failure) Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	iscsi-target: Fix iscsit_add_reject* usage for iser	Nicholas Bellinger	2013-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit ba159914086f06532079fc15141f46ffe7e04a41 upstream This patch changes iscsit_add_reject() + iscsit_add_reject_from_cmd() usage to not sleep on iscsi_cmd->reject_comp to address a free-after-use usage bug in v3.10 with iser-target code. It saves ->reject_reason for use within iscsit_build_reject() so the correct value for both transport cases. It also drops the legacy fail_conn parameter usage throughput iscsi-target code and adds two iscsit_add_reject_cmd() and iscsit_reject_cmd helper functions, along with various small cleanups. (v2: Re-enable target_put_sess_cmd() to be called from iscsit_add_reject_from_cmd() for rejects invoked after target_get_sess_cmd() has been called) Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	firewire: fix libdc1394/FlyCap2 iso event regression	Clemens Ladisch	2013-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 0699a73af3811b66b1ab5650575acee5eea841ab upstream. Commit 18d627113b83 (firewire: prevent dropping of completed iso packet header data) was intended to be an obvious bug fix, but libdc1394 and FlyCap2 depend on the old behaviour by ignoring all returned information and thus not noticing that not all packets have been received yet. The result was that the video frame buffers would be saved before they contained the correct data. Reintroduce the old behaviour for old clients. Tested-by: Stepan Salenikovich <stepan.salenikovich@gmail.com> Tested-by: Josep Bosch <jep250@gmail.com> Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	iser-target: Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED	Nicholas Bellinger	2013-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit b2cb96494d83b894a43ba8b9023eead8ff50684b upstream. This patch addresses a bug where RDMA_CM_EVENT_DISCONNECTED may occur before the connection shutdown has been completed by rx/tx threads, that causes isert_free_conn() to wait indefinately on ->conn_wait. This patch allows isert_disconnect_work code to invoke rdma_disconnect when isert_disconnect_work() process context is started by client session reset before isert_free_conn() code has been reached. It also adds isert_conn->conn_mutex protection for ->state within isert_disconnect_work(), isert_cq_comp_err() and isert_free_conn() code, along with isert_check_state() for wait_event usage. (v2: Add explicit iscsit_cause_connection_reinstatement call during isert_disconnect_work() to force conn reset) Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	EDAC: Fix lockdep splat	Borislav Petkov	2013-07-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 88d84ac97378c2f1d5fec9af1e8b7d9a662d6b00 upstream. Fix the following: BUG: key ffff88043bdd0330 not in .data! ------------[ cut here ]------------ WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0() DEBUG_LOCKS_WARN_ON(1) Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor microcode CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1 Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013 0000000000000009 ffff880439a1d920 ffffffff8160a9a9 ffff880439a1d958 ffffffff8103d9e0 ffff88043af4a510 ffffffff81a16e11 0000000000000000 ffff88043bdd0330 0000000000000000 ffff880439a1d9b8 ffffffff8103dacc Call Trace: dump_stack warn_slowpath_common warn_slowpath_fmt lockdep_init_map ? trace_hardirqs_on_caller ? trace_hardirqs_on debug_mutex_init __mutex_init bus_register edac_create_sysfs_mci_device edac_mc_add_mc sbridge_probe pci_device_probe driver_probe_device __driver_attach ? driver_probe_device bus_for_each_dev driver_attach bus_add_driver driver_register __pci_register_driver ? 0xffffffffa0010fff sbridge_init ? 0xffffffffa0010fff do_one_initcall load_module ? unset_module_init_ro_nx SyS_init_module tracesys ---[ end trace d24a70b0d3ddf733 ]--- EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:3f:0e.0 EDAC sbridge: Driver loaded. What happens is that bus_register needs a statically allocated lock_key because the last is handed in to lockdep. However, struct mem_ctl_info embeds struct bus_type (the whole struct, not a pointer to it) and the whole thing gets dynamically allocated. Fix this by using a statically allocated struct bus_type for the MC bus. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	vlan: mask vlan prio bits	Eric Dumazet	2013-07-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit d4b812dea4a236f729526facf97df1a9d18e191c ] In commit 48cc32d38a52d0b68f91a171a8d00531edc6a46e ("vlan: don't deliver frames for unknown vlans to protocols") Florian made sure we set pkt_type to PACKET_OTHERHOST if the vlan id is set and we could find a vlan device for this particular id. But we also have a problem if prio bits are set. Steinar reported an issue on a router receiving IPv6 frames with a vlan tag of 4000 (id 0, prio 2), and tunneled into a sit device, because skb->vlan_tci is set. Forwarded frame is completely corrupted : We can see (8100:4000) being inserted in the middle of IPv6 source address : 16:48:00.780413 IP6 2001:16d8:8100:4000:ee1c:0:9d9:bc87 > 9f94:4d95:2001:67c:29f4::: ICMP6, unknown icmp6 type (0), length 64 0x0000: 0000 0029 8000 c7c3 7103 0001 a0ae e651 0x0010: 0000 0000 ccce 0b00 0000 0000 1011 1213 0x0020: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223 0x0030: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233 It seems we are not really ready to properly cope with this right now. We can probably do better in future kernels : vlan_get_ingress_priority() should be a netdev property instead of a per vlan_dev one. For stable kernels, lets clear vlan_tci to fix the bugs. Reported-by: Steinar H. Gunderson <sesse@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	virtio: support unlocked queue poll	Michael S. Tsirkin	2013-07-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit cc229884d3f77ec3b1240e467e0236c3e0647c0c ] This adds a way to check ring empty state after enable_cb outside any locks. Will be used by virtio_net. Note: there's room for more optimization: caller is likely to have a memory barrier already, which means we might be able to get rid of a barrier here. Deferring this optimization until we do some benchmarking. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET ↵	Hannes Frederic Sowa	2013-07-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pending data [ Upstream commit 8822b64a0fa64a5dd1dfcf837c5b0be83f8c05d1 ] We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Dave Jones <davej@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ipv6,mcast: always hold idev->lock before mca_lock	Amerigo Wang	2013-07-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 8965779d2c0e6ab246c82a405236b1fb2adae6b2, with some bits from commit b7b1bfce0bb68bd8f6e62a28295922785cc63781 ("ipv6: split duplicate address detection and router solicitation timer") to get the __ipv6_get_lladdr() used by this patch. ] dingtianhong reported the following deadlock detected by lockdep: ====================================================== [ INFO: possible circular locking dependency detected ] 3.4.24.05-0.1-default #1 Not tainted ------------------------------------------------------- ksoftirqd/0/3 is trying to acquire lock: (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120 but task is already holding lock: (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mc->mca_lock){+.+...}: [<ffffffff810a8027>] validate_chain+0x637/0x730 [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500 [<ffffffff810a8734>] lock_acquire+0x114/0x150 [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60 [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120 [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60 [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80 [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0 [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80 [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20 [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60 [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80 [<ffffffff813d9360>] dev_change_flags+0x40/0x70 [<ffffffff813ea627>] do_setlink+0x237/0x8a0 [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600 [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310 [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0 [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40 [<ffffffff81403e20>] netlink_unicast+0x140/0x180 [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380 [<ffffffff813c4252>] sock_sendmsg+0x112/0x130 [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460 [<ffffffff813c5544>] sys_sendmsg+0x44/0x70 [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b -> #0 (&ndev->lock){+.+...}: [<ffffffff810a798e>] check_prev_add+0x3de/0x440 [<ffffffff810a8027>] validate_chain+0x637/0x730 [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500 [<ffffffff810a8734>] lock_acquire+0x114/0x150 [<ffffffff814f6c82>] rt_read_lock+0x42/0x60 [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120 [<ffffffff8149b036>] mld_newpack+0xb6/0x160 [<ffffffff8149b18b>] add_grhead+0xab/0xc0 [<ffffffff8149d03b>] add_grec+0x3ab/0x460 [<ffffffff8149d14a>] mld_send_report+0x5a/0x150 [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0 [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0 [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0 [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0 [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0 [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210 [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0 [<ffffffff8106c7de>] kthread+0xae/0xc0 [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10 actually we can just hold idev->lock before taking pmc->mca_lock, and avoid taking idev->lock again when iterating idev->addr_list, since the upper callers of mld_newpack() already take read_lock_bh(&idev->lock). Reported-by: dingtianhong <dingtianhong@huawei.com> Cc: dingtianhong <dingtianhong@huawei.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David S. Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Tested-by: Ding Tianhong <dingtianhong@huawei.com> Tested-by: Chen Weilong <chenweilong@huawei.com> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	net: Swap ver and type in pppoe_hdr	Changli Gao	2013-07-28
\| \| \| \| \| \| \| \| \| \|	[ Upstream commit b1a5a34bd0b8767ea689e68f8ea513e9710b671e ] Ver and type in pppoe_hdr should be swapped as defined by RFC2516 section-4. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	thermal: cpu_cooling: fix stub function	Arnd Bergmann	2013-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit e8d39240d635ed9bcaddbec898b1c9f063c5dbb2 upstream. The function stub for cpufreq_cooling_get_level introduced in 57df81069 "Thermal: exynos: fix cooling state translation" is not syntactically correct C and needs to be fixed to avoid this error: In file included from drivers/thermal/db8500_thermal.c:20:0: include/linux/cpu_cooling.h: In function 'cpufreq_cooling_get_level': include/linux/cpu_cooling.h:57:1: error: parameter name omitted unsigned long cpufreq_cooling_get_level(unsigned int, unsigned int) ^ include/linux/cpu_cooling.h:57:1: error: parameter name omitted Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Eduardo Valentin <eduardo.valentin@ti.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Amit Daniel kachhap <amit.daniel@samsung.com> Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	iio: Fix iio_channel_has_info	Alexandre Belloni	2013-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 1c297a66654a3295ae87e2b7f3724d214eb2b5ec upstream. Since the info_mask split, iio_channel_has_info() is not working correctly. info_mask_separate and info_mask_shared_by_type, it is not possible to compare them directly with the iio_chan_info_enum enum. Correct that bit using the BIT() macro. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	nbd: correct disconnect behavior	Paul Clements	2013-07-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit c378f70adbc1bbecd9e6db145019f14b2f688c7c upstream. Currently, when a disconnect is requested by the user (via NBD_DISCONNECT ioctl) the return from NBD_DO_IT is undefined (it is usually one of several error codes). This means that nbd-client does not know if a manual disconnect was performed or whether a network error occurred. Because of this, nbd-client's persist mode (which tries to reconnect after error, but not after manual disconnect) does not always work correctly. This change fixes this by causing NBD_DO_IT to always return 0 if a user requests a disconnect. This means that nbd-client can correctly either persist the connection (if an error occurred) or disconnect (if the user requested it). Signed-off-by: Paul Clements <paul.clements@steeleye.com> Acked-by: Rob Landley <rob@landley.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	cgroup: fix RCU accesses to task->cgroups	Tejun Heo	2013-07-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 14611e51a57df10240817d8ada510842faf0ec51 upstream. task->cgroups is a RCU pointer pointing to struct css_set. A task switches to a different css_set on cgroup migration but a css_set doesn't change once created and its pointers to cgroup_subsys_states aren't RCU protected. task_subsys_state[_check]() is the macro to acquire css given a task and subsys_id pair. It RCU-dereferences task->cgroups->subsys[] not task->cgroups, so the RCU pointer task->cgroups ends up being dereferenced without read_barrier_depends() after it. It's broken. Fix it by introducing task_css_set[_check]() which does RCU-dereference on task->cgroups. task_subsys_state[_check]() is reimplemented to directly dereference ->subsys[] of the css_set returned from task_css_set[_check](). This removes some of sparse RCU warnings in cgroup. v2: Fixed unbalanced parenthsis and there's no need to use rcu_dereference_raw() when !CONFIG_PROVE_RCU. Both spotted by Li. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	futex: Take hugepages into account when generating futex_key	Zhang Yi	2013-07-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 13d60f4b6ab5b702dc8d2ee20999f98a93728aec upstream. The futex_keys of process shared futexes are generated from the page offset, the mapping host and the mapping index of the futex user space address. This should result in an unique identifier for each futex. Though this is not true when futexes are located in different subpages of an hugepage. The reason is, that the mapping index for all those futexes evaluates to the index of the base page of the hugetlbfs mapping. So a futex at offset 0 of the hugepage mapping and another one at offset PAGE_SIZE of the same hugepage mapping have identical futex_keys. This happens because the futex code blindly uses page->index. Steps to reproduce the bug: 1. Map a file from hugetlbfs. Initialize pthread_mutex1 at offset 0 and pthread_mutex2 at offset PAGE_SIZE of the hugetlbfs mapping. The mutexes must be initialized as PTHREAD_PROCESS_SHARED because PTHREAD_PROCESS_PRIVATE mutexes are not affected by this issue as their keys solely depend on the user space address. 2. Lock mutex1 and mutex2 3. Create thread1 and in the thread function lock mutex1, which results in thread1 blocking on the locked mutex1. 4. Create thread2 and in the thread function lock mutex2, which results in thread2 blocking on the locked mutex2. 5. Unlock mutex2. Despite the fact that mutex2 got unlocked, thread2 still blocks on mutex2 because the futex_key points to mutex1. To solve this issue we need to take the normal page index of the page which contains the futex into account, if the futex is in an hugetlbfs mapping. In other words, we calculate the normal page mapping index of the subpage in the hugetlbfs mapping. Mappings which are not based on hugetlbfs are not affected and still use page->index. Thanks to Mel Gorman who provided a patch for adding proper evaluation functions to the hugetlbfs code to avoid exposing hugetlbfs specific details to the futex code. [ tglx: Massaged changelog ] Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn> Reviewed-by: 'Mel Gorman' <mgorman@suse.de> Acked-by: 'Darren Hart' <dvhart@linux.intel.com> Cc: 'Peter Zijlstra' <peterz@infradead.org> Link: http://lkml.kernel.org/r/000101ce71a6%24a83c5880%24f8b50980%24@com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	libceph: fix invalid unsigned->signed conversion for timespec encoding	Josh Durgin	2013-07-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 8b8cf8917f9b5d74e04f281272d8719ce335a497 upstream. __kernel_time_t is a long, which cannot hold a U32_MAX on 32-bit architectures. Just drop this check as it has limited value. This fixes a crash like: [ 957.905812] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/include/linux/ceph/decode.h:164! [ 957.914849] Internal error: Oops - BUG: 0 [#1] SMP ARM [ 957.919978] Modules linked in: rbd libceph libcrc32c ipmi_devintf ipmi_si ipmi_msghandler nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc [ 957.932547] CPU: 1 Tainted: G W (3.9.0-ceph-19bb6a83-highbank #1) [ 957.939881] PC is at ceph_osdc_build_request+0x8c/0x4f8 [libceph] [ 957.945967] LR is at 0xec520904 [ 957.949103] pc : [<bf13e76c>] lr : [<ec520904>] psr: 20000153 [ 957.949103] sp : ec753df8 ip : 00000001 fp : ec53e100 [ 957.960571] r10: ebef25c0 r9 : ec5fa400 r8 : ecbcc000 [ 957.965788] r7 : 00000000 r6 : 00000000 r5 : ffffffff r4 : 00000020 [ 957.972307] r3 : 51cc8143 r2 : ec520900 r1 : ec753e58 r0 : ec520908 [ 957.978827] Flags: nzCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment user [ 957.986039] Control: 10c5387d Table: 2c59c04a DAC: 00000015 [ 957.991777] Process rbd (pid: 2138, stack limit = 0xec752238) [ 957.997514] Stack: (0xec753df8 to 0xec754000) [ 958.001864] 3de0: 00000001 00000001 [ 958.010032] 3e00: 00000001 bf139744 ecbcc000 ec55a0a0 00000024 00000000 ebef25c0 fffffffe [ 958.018204] 3e20: ffffffff 00000000 00000000 00000001 ec5fa400 ebef25c0 ec53e100 bf166b68 [ 958.026377] 3e40: 00000000 0000220f fffffffe ffffffff ec753e58 bf13ff24 51cc8143 05b25ed2 [ 958.034548] 3e60: 00000001 00000000 00000000 bf1688d4 00000001 00000000 00000000 00000000 [ 958.042720] 3e80: 00000001 00000060 ec5fa400 ed53d200 ed439600 ed439300 00000001 00000060 [ 958.050888] 3ea0: ec5fa400 ed53d200 00000000 bf16a320 00000000 ec53e100 00000040 ec753eb8 [ 958.059059] 3ec0: ec51df00 ed53d7c0 ed53d200 ed53d7c0 00000000 ed53d7c0 ec5fa400 bf16ed70 [ 958.067230] 3ee0: 00000000 00000060 00000002 ed53d200 00000000 bf16acf4 ed53d7c0 ec752000 [ 958.075402] 3f00: ed980e50 e954f5d8 00000000 00000060 ed53d240 ed53d258 ec753f80 c04f44a8 [ 958.083574] 3f20: edb7910c ec664700 01ade920 c02e4c44 00000060 c016b3dc ec51de40 01adfb84 [ 958.091745] 3f40: 00000060 ec752000 ec753f80 ec752000 00000060 c0108444 00000007 ec51de48 [ 958.099914] 3f60: ed0eb8c0 00000000 00000000 ec51de40 01adfb84 00000001 00000060 c0108858 [ 958.108085] 3f80: 00000000 00000000 51cc8143 00000060 01adfb84 00000007 00000004 c000dd68 [ 958.116257] 3fa0: 00000000 c000dbc0 00000060 01adfb84 00000007 01adfb84 00000060 01adfb80 [ 958.124429] 3fc0: 00000060 01adfb84 00000007 00000004 beded1a8 00000000 01adf2f0 01ade920 [ 958.132599] 3fe0: 00000000 beded180 b6811324 b6811334 800f0010 00000007 2e7f5821 2e7f5c21 [ 958.140815] [<bf13e76c>] (ceph_osdc_build_request+0x8c/0x4f8 [libceph]) from [<bf166b68>] (rbd_osd_req_format_write+0x50/0x7c [rbd]) [ 958.152739] [<bf166b68>] (rbd_osd_req_format_write+0x50/0x7c [rbd]) from [<bf1688d4>] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) [ 958.164486] [<bf1688d4>] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) from [<bf16a320>] (rbd_dev_image_probe+0x23c/0x850 [rbd]) [ 958.175967] [<bf16a320>] (rbd_dev_image_probe+0x23c/0x850 [rbd]) from [<bf16acf4>] (rbd_add+0x3c0/0x918 [rbd]) [ 958.185975] [<bf16acf4>] (rbd_add+0x3c0/0x918 [rbd]) from [<c02e4c44>] (bus_attr_store+0x20/0x2c) [ 958.194850] [<c02e4c44>] (bus_attr_store+0x20/0x2c) from [<c016b3dc>] (sysfs_write_file+0x168/0x198) [ 958.203984] [<c016b3dc>] (sysfs_write_file+0x168/0x198) from [<c0108444>] (vfs_write+0x9c/0x170) [ 958.212768] [<c0108444>] (vfs_write+0x9c/0x170) from [<c0108858>] (sys_write+0x3c/0x70) [ 958.220768] [<c0108858>] (sys_write+0x3c/0x70) from [<c000dbc0>] (ret_fast_syscall+0x0/0x30) [ 958.229199] Code: e59d1058 e5913000 e3530000 ba000114 (e7f001f2) Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	2013-06-27
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull networking fixes from David Miller: 1) Found via trinity: If you connect up an ipv6 socket to an ipv4 mapped address then an ipv6 one, sendmsg() can croak because ip6_sk_dst_check() assumes the route cached in the socket is an ipv6 one. In this case there is an ipv4 route attached, so it gets stomped on. Reported by Dave Jones and Hannes Frederic Sowa, fixed by Eric Dumazet. 2) AF_KEY notifications leak some kernel memory to userspace, fix from Mathias Krause. 3) DLCI calls __dev_get_by_name() without proper locking, and dlci_del doesn't validate that the device being deleted is actually a DLCI one. Fixes from Li Zefan. 4) Length check on bluetooth l2cap information responses is wrong, each response type has a different lenth, so we should make sure it's in a given range rather than enforce one single valid length. From Jaganath Kanakkassery. 5) Receive FIFO overflow is really easy to trigger in stress scenerios in the sh_eth driver, but the event isn't being handled properly at all. Specifically, the mask of error interrupts doesn't include the event so we never clear it, resulting in the driver becomming wedged processing an interrupt that never gets cleared. Fix from Sergei Shtylyov. 6) qlcnic sleeps while holding a spinlock, use mdelay() instead of msleep(). From Shahed Shaikh. 7) Missing curly braces causes SIP netfilter NAT module to always drop packets. Fix from Balazs Peter Odor. 8) ipt_ULOG in netfilter passes the wrong value to timer setup, causing the timer to dereference crap when it fires. Fix from Gao Feng. 9) Missing RCU protection around txq->axq_acq traversal in ath_txq_schedule(). Fix from Felix Fietkau. 10) Idle state transition test in ath9k_htc_config() is reversed, fix from Sujith Manoharan. 11) IPV6 forwarding handles unicast Router Alert packets incorrectly. It tests the wrong option state. Previously opt->ra being non-zero indicated a router alert marking in the SKB, but now it's indicated by a bit in opt->flags. Fix from YOSHIFUJI Hideaki. 12) SKB leak in GRE tunnel GSO handling, from Eric Dumazet. 13) get_user_pages_fast() error handling in TUN and MACVTAP use the same local variable for the base index and the loop iterator for page traversal, oops! Fix from Michael S Tsirkin. 14) ipv6_get_lladdr() can fail, and we must therefore check it's return value in inet6_set_iftoken(). For from Hannes Frederic Sowa. 15) If you change an interface name and meanwhile can sneak in something that looks up the name (like SO_BINDTODEVICE or SIOCGIFNAME) we can deadlock with CONFIG_PREEMPT=n. Fix this by providing a helper function that properly uses raw_seqcount_begin(). From Nicolas Schichan. 16) Chain noise calibration test is inverted in iwlwifi, fix from Nikolay Martynov. 17) Properly set TX iwlwifi descriptor flags for back requests. Fix from Emmanuel Grumbach. 18) We can't assume skb_transport_header() is set in xt_TCPOPTSTRAP module, fix from Pablo Neira Ayuso. 19) Some crummy APs don't provide the proper High Throughput info in association response frames. Add a workaround by assume we'll use whatever is in the beacon/probe. Fix from Johannes Berg. 20) mac80211 call to rate_idx_match_mask() swaps two arguments (mask and channel width). Fix from Simon Wunderlich. 21) xt_TCPMSS (like xt_TCPOPTSTRAP) must not try to handle fragmented frames. Fix from Phil Oester. 22) Fix rate control regression causing iwlwifi/iwlegacy chips to use 1Mbit/s on pre-11n networks. From Moshe Benji and Stanslaw Gruszka. 23) Disable brcmsmac power-save functions, they cause regressions. From Arend van Spriel. 24) Enforce a sane minimum MTU in l2cap_build_cmd() otherwise we can easily crash. Fix from Anderson Lizardo. 25) If a learning packet arrives during vxlan_stop() we crash, easily fixed by checking netif_running(). From Stephen Hemminger. 26) Static vxlan FDB entries should not be migrated, also from Stephen. 27) skb_clone() failures not handled in vxlan_xmit(), oops. Also from Stephen. 28) Add minimal driver for AR816x/AR817x ethernet chips, from Johannes Berg. 29) Fix regression in userspace VLAN acceleration control, added by the 802.1ad support changes. Fix from Fernando Luis Vazquez Cao. 30) Interval selection for MLD queries in the bridging code was reversed. Fix from Linus Lüssing. 31) ipv6's ndisc_send_redirect() erroneously writes to the packet we received not the packet we are building to send out. Fix from Matthias Schiffer. 32) Don't free netdev before unregistering it, in usb_8dev can driver. From Marc Kleine-Budde. 33) Fix nl80211 attribute buffer races, from Johannes Berg. 34) Although netlink_diag.h is under uapi/ it isn't present in Kbuild. From Stephen Hemminger. 35) Wrong address and family passed to MD5 key lookups in TCP, from Aydin Arik. 36) phy_type attribute created by SFC driver should not be writable. From Ben Hutchings. 37) Receive/Transmit queue allocations in pxa168_eth and mv643xx_eth should use kzalloc(). Otherwise if setup fails half-way, we'll dereference garbage when trying to teardown the rings. From Lubomir Rintel. 38) Fix double-allocation of dst (resulting in unfreeable net device) in ipv6's init_loopback(). From Gao Feng. 39) Fix fragmentation handling SKB leak in netfilter conntrack, we were freeing the wrong skb pointer. From Phil Oester. 40) Don't report "-1" (SPEED_UNKNOWN) in bond_miimon_commit(), from Nikolay Aleksandrov. 41) davinci_cpdma doesn't check for DMA mapping errors, letting the device scribble to random addresses. From Sebastian Siewior. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits) dlci: validate the net device in dlci_del() dlci: acquire rtnl_lock before calling __dev_get_by_name() af_key: fix info leaks in notify messages ipv6: ip6_sk_dst_check() must not assume ipv6 dst net: fix kernel deadlock with interface rename and netdev name retrieval. net/tg3: Avoid delay during MMIO access ipv6: check return value of ipv6_get_lladdr macvtap: fix recovery from gup errors tun: fix recovery from gup errors gre: fix a possible skb leak ipv6: Process unicast packet with Router Alert by checking flag in skb. ath9k_htc: Handle IDLE state transition properly ath9k: fix an RCU issue in calling ieee80211_get_tx_rates netfilter: ipt_ULOG: fix incorrect setting of ulog timer netfilter: ctnetlink: send event when conntrack label was modified netfilter: nf_nat_sip: fix mangling qlcnic: Do not sleep while holding spinlock drivers: net: cpsw: fix compilation error with cpsw driver tcp: doc : fix the syncookies default value sh_eth: fix misreporting of transmit abort ...
\| *	net: fix kernel deadlock with interface rename and netdev name retrieval.	Nicolas Schichan	2013-06-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the kernel (compiled with CONFIG_PREEMPT=n) is performing the rename of a network interface, it can end up waiting for a workqueue to complete. If userland is able to invoke a SIOCGIFNAME ioctl or a SO_BINDTODEVICE getsockopt in between, the kernel will deadlock due to the fact that read_secklock_begin() will spin forever waiting for the writer process (the one doing the interface rename) to update the devnet_rename_seq sequence. This patch fixes the problem by adding a helper (netdev_get_name()) and using it in the code handling the SIOCGIFNAME ioctl and SO_BINDTODEVICE setsockopt. The netdev_get_name() helper uses raw_seqcount_begin() to avoid spinning forever, waiting for devnet_rename_seq->sequence to become even. cond_resched() is used in the contended case, before retrying the access to give the writer process a chance to finish. The use of raw_seqcount_begin() will incur some unneeded work in the reader process in the contended case, but this is better than deadlocking the system. Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	gre: fix a possible skb leak	Eric Dumazet	2013-06-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 68c331631143 ("v4 GRE: Add TCP segmentation offload for GRE") added a possible skb leak, because it frees only the head of segment list, in case a skb_linearize() call fails. This patch adds a kfree_skb_list() helper to fix the bug. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin B Shelar <pshelar@nicira.com> Cc: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: vlan: fix comment for vlan_ethhdr->h_vlan_proto	Olaf Hering	2013-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After addition of 8021AD h_vlan_proto can be either ETH_P_8021Q or ETH_P_8021AD. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	netlink: export netlink_diag.h header	stephen hemminger	2013-06-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The netlink_diag.h is in include/uapi/linux but not in the Kbuild necessary to cause it to be exported by make headers_install. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ACPI / dock / PCI: Synchronous handling of dock events for PCI devices	Rafael J. Wysocki	2013-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The interactions between the ACPI dock driver and the ACPI-based PCI hotplug (acpiphp) are currently problematic because of ordering issues during hot-remove operations. First of all, the current ACPI glue code expects that physical devices will always be deleted before deleting the companion ACPI device objects. Otherwise, acpi_unbind_one() will fail with a warning message printed to the kernel log, for example: [ 185.026073] usb usb5: Oops, 'acpi_handle' corrupt [ 185.035150] pci 0000:1b:00.0: Oops, 'acpi_handle' corrupt [ 185.035515] pci 0000:18:02.0: Oops, 'acpi_handle' corrupt [ 180.013656] port1: Oops, 'acpi_handle' corrupt This means, in particular, that struct pci_dev objects have to be deleted before the struct acpi_device objects they are "glued" with. Now, the following happens the during the undocking of an ACPI-based dock station: 1) hotplug_dock_devices() invokes registered hotplug callbacks to destroy physical devices associated with the ACPI device objects depending on the dock station. It calls dd->ops->handler() for each of those device objects. 2) For PCI devices dd->ops->handler() points to handle_hotplug_event_func() that queues up a separate work item to execute _handle_hotplug_event_func() for the given device and returns immediately. That work item will be executed later. 3) hotplug_dock_devices() calls dock_remove_acpi_device() for each device depending on the dock station. This runs acpi_bus_trim() for each of them, which causes the underlying ACPI device object to be destroyed, but the work items queued up by handle_hotplug_event_func() haven't been started yet. 4) _handle_hotplug_event_func() queued up in step 2) are executed and cause the above failure to happen, because the PCI devices they handle do not have the companion ACPI device objects any more (those objects have been deleted in step 3). The possible breakage doesn't end here, though, because hotplug_dock_devices() may return before at least some of the _handle_hotplug_event_func() work items spawned by it have a chance to complete and then undock() will cause _DCK to be evaluated and that will cause the devices handled by the _handle_hotplug_event_func() to go away possibly while they are being accessed. This means that dd->ops->handler() for PCI devices should not point to handle_hotplug_event_func(). Instead, it should point to a function that will do the work of _handle_hotplug_event_func() synchronously. For this reason, introduce such a function, hotplug_event_func(), and modity acpiphp_dock_ops to point to it as the handler. Unfortunately, however, this is not sufficient, because if the dock code were not changed further, hotplug_event_func() would now deadlock with hotplug_dock_devices() that called it, since it would run unregister_hotplug_dock_device() which in turn would attempt to acquire the dock station's hp_lock mutex already acquired by hotplug_dock_devices(). To resolve that deadlock use the observation that unregister_hotplug_dock_device() won't need to acquire hp_lock if PCI bridges the devices on the dock station depend on are prevented from being removed prematurely while the first loop in hotplug_dock_devices() is in progress. To make that possible, introduce a mechanism by which the callers of register_hotplug_dock_device() can provide "init" and "release" routines that will be executed, respectively, during the addition and removal of the physical device object associated with the given ACPI device handle. Make acpiphp use two new functions, acpiphp_dock_init() and acpiphp_dock_release(), that call get_bridge() and put_bridge(), respectively, on the acpiphp bridge holding the given device, for this purpose. In addition to that, remove the dock station's list of "hotplug devices" and make the dock code always walk the whole list of "dependent devices" instead in such a way that the loops in hotplug_dock_devices() and dock_event() (replacing the loops over "hotplug devices") will take references to the list entries that register_hotplug_dock_device() has been called for. That prevents the "release" routines associated with those entries from being called while the given entry is being processed and for PCI devices this means that their bridges won't be removed (by a concurrent thread) while hotplug_event_func() handling them is being executed. This change is based on two earlier patches from Jiang Liu. References: https://bugzilla.kernel.org/show_bug.cgi?id=59501 Reported-and-tested-by: Alexander E. Patrakov <patrakov@gmail.com> Tracked-down-by: Jiang Liu <jiang.liu@huawei.com> Tested-by: Illya Klymov <xanf@xanf.me> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: 3.9+ <stable@vger.kernel.org>
* \|	Merge branch 'for-linus' of ↵	Linus Torvalds	2013-06-22
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Several fixes for bugs caught while looking through f_pos (ab)users" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aout32 coredump compat fix splice: don't pass the address of ->f_pos to methods mconsole: we'd better initialize pos before passing it to vfs_read()...
\| * \|	splice: don't pass the address of ->f_pos to methods	Al Viro	2013-06-20
\| \|/ \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	Merge tag 'acpi-3.10-rc7' of ↵	Linus Torvalds	2013-06-21
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: - Fix for a regression causing a failure to turn on some devices on some systems during initialization introduced by a recent revert of an ACPI PM change that broke something else. Fortunately, we know exactly what devices are affected, so we can add a fix just for them leaving everyone else alone. - ACPI power resources initialization fix preventing a NULL pointer from being dereferenced in the acpi_add_power_resource() error code path. - ACPI dock station driver fix that adds missing locking to write_undock(). - ACPI resources allocation fix changing the scope of an old workaround so that it doesn't affect systems that aren't actually buggy. This was reported a couple of days ago to fix DMA problems on some new platforms so we need it in -stable. From Mika Westerberg. * tag 'acpi-3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / LPSS: Power up LPSS devices during enumeration ACPI / PM: Fix error code path for power resources initialization ACPI / dock: Take ACPI scan lock in write_undock() ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources
\| * \|	ACPI / LPSS: Power up LPSS devices during enumeration	Rafael J. Wysocki	2013-06-19
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 7cd8407 (ACPI / PM: Do not execute _PS0 for devices without _PSC during initialization) introduced a regression on some systems with Intel Lynxpoint Low-Power Subsystem (LPSS) where some devices need to be powered up during initialization, but their device objects in the ACPI namespace have _PS0 and _PS3 only (without _PSC or power resources). To work around this problem, make the ACPI LPSS driver power up devices it knows about by using a new helper function acpi_device_fix_up_power() that does all of the necessary sanity checks and calls acpi_dev_pm_explicit_set() to put the device into D0. Reported-and-tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* \|	Merge branch 'sched-urgent-for-linus' of ↵	Linus Torvalds	2013-06-20
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Two smaller fixes - plus a context tracking tracing fix that is a bit bigger" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing/context-tracking: Add preempt_schedule_context() for tracing sched: Fix clear NOHZ_BALANCE_KICK sched/x86: Construct all sibling maps if smt
\| * \|	tracing/context-tracking: Add preempt_schedule_context() for tracing	Steven Rostedt	2013-06-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Dave Jones hit the following bug report: =============================== [ INFO: suspicious RCU usage. ] 3.10.0-rc2+ #1 Not tainted ------------------------------- include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0 RCU used illegally from extended quiescent state! 2 locks held by cc1/63645: #0: (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0 #1: (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0 CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369] Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010 0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28 ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500 0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645 Call Trace: [<ffffffff816ae383>] dump_stack+0x19/0x1b [<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130 [<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0 [<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0 [<ffffffff8108dffc>] update_curr+0xec/0x240 [<ffffffff8108f528>] put_prev_task_fair+0x228/0x480 [<ffffffff816b3a71>] __schedule+0x161/0x9b0 [<ffffffff816b4721>] preempt_schedule+0x51/0x80 [<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60 [<ffffffff816b6824>] ? retint_careful+0x12/0x2e [<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210 [<ffffffff816be280>] ftrace_call+0x5/0x2f [<ffffffff816b681d>] ? retint_careful+0xb/0x2e [<ffffffff816b4805>] ? schedule_user+0x5/0x70 [<ffffffff816b4805>] ? schedule_user+0x5/0x70 [<ffffffff816b6824>] ? retint_careful+0x12/0x2e ------------[ cut here ]------------ What happened was that the function tracer traced the schedule_user() code that tells RCU that the system is coming back from userspace, and to add the CPU back to the RCU monitoring. Because the function tracer does a preempt_disable/enable_notrace() calls the preempt_enable_notrace() checks the NEED_RESCHED flag. If it is set, then preempt_schedule() is called. But this is called before the user_exit() function can inform the kernel that the CPU is no longer in user mode and needs to be accounted for by RCU. The fix is to create a new preempt_schedule_context() that checks if the kernel is still in user mode and if so to switch it to kernel mode before calling schedule. It also switches back to user mode coming back from schedule in need be. The only user of this currently is the preempt_enable_notrace(), which is only used by the tracing subsystem. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \| \|	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds	2013-06-20
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Four fixes. The mmap ones are unfortunately larger than desired - fuzzing uncovered bugs that needed perf context life time management changes to fix properly" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix broken PEBS-LL support on SNB-EP/IVB-EP perf: Fix mmap() accounting hole perf: Fix perf mmap bugs kprobes: Fix to free gone and unused optprobes
\| * \| \|	perf: Fix perf mmap bugs	Peter Zijlstra	2013-05-28
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Vince reported a problem found by his perf specific trinity fuzzer. Al noticed 2 problems with perf's mmap(): - it has issues against fork() since we use vma->vm_mm for accounting. - it has an rb refcount leak on double mmap(). We fix the issues against fork() by using VM_DONTCOPY; I don't think there's code out there that uses this; we didn't hear about weird accounting problems/crashes. If we do need this to work, the previously proposed VM_PINNED could make this work. Aside from the rb reference leak spotted by Al, Vince's example prog was indeed doing a double mmap() through the use of perf_event_set_output(). This exposes another problem, since we now have 2 events with one buffer, the accounting gets screwy because we account per event. Fix this by making the buffer responsible for its own accounting. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20130528085548.GA12193@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \| \|	Merge branch 'timers-urgent-for-linus' of ↵	Linus Torvalds	2013-06-20
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: - Fix inconstinant clock usage in virtual time accounting - Fix a build error in KVM caused by the NOHZ work - Remove a pointless timekeeping duty assignment which breaks NOHZ - Use a proper notifier return value to avoid random behaviour * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick: Remove useless timekeeping duty attribution to broadcast source nohz: Fix notifier return val that enforce timekeeping kvm: Move guest entry/exit APIs to context_tracking vtime: Use consistent clocks among nohz accounting
\| * \| \|	kvm: Move guest entry/exit APIs to context_tracking	Frederic Weisbecker	2013-05-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kvm_host.h header file doesn't handle well inclusion when archs don't support KVM. This results in build crashes for such archs when they want to implement context tracking because this subsystem includes kvm_host.h in order to implement the guest_enter/exit APIs but it doesn't handle KVM off case. To fix this, move the guest_enter()/guest_exit() declarations and generic implementation to the context tracking headers. These generic APIs actually belong to this subsystem, besides other domains boundary tracking like user_enter() et al. KVM now properly becomes a user of this library, not the other buggy way around. Reported-by: Kevin Hilman <khilman@linaro.org> Reviewed-by: Kevin Hilman <khilman@linaro.org> Tested-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| * \| \|	vtime: Use consistent clocks among nohz accounting	Frederic Weisbecker	2013-05-31
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While computing the cputime delta of dynticks CPUs, we are mixing up clocks of differents natures: * local_clock() which takes care of unstable clock sources and fix these if needed. * sched_clock() which is the weaker version of local_clock(). It doesn't compute any fixup in case of unstable source. If the clock source is stable, those two clocks are the same and we can safely compute the difference against two random points. Otherwise it results in random deltas as sched_clock() can randomly drift away, back or forward, from local_clock(). As a consequence, some strange behaviour with unstable tsc has been observed such as non progressing constant zero cputime. (The 'top' command showing no load). Fix this by only using local_clock(), or its irq safe/remote equivalent, in vtime code. Reported-by: Mike Galbraith <efault@gmx.de> Suggested-by: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>