aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge tag 'fixes-for-linus' of ↵Linus Torvalds2013-08-08
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio fixes from Rusty Russell: "More virtio console fixes than I'm happy with, but all real issues, and all CC:stable.." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: virtio-scsi: Fix virtqueue affinity setup virtio: console: return -ENODEV on all read operations after unplug virtio: console: fix raising SIGIO after port unplug virtio: console: clean up port data immediately at time of unplug virtio: console: fix race in port_fops_open() and port unplug virtio: console: fix race with port unplug and open/close virtio/console: Add pipe_lock/unlock for splice_write virtio/console: Quit from splice_write if pipe->nrbufs is 0
| * virtio-scsi: Fix virtqueue affinity setupAsias He2013-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vscsi->num_queues counts the number of request virtqueue which does not include the control and event virtqueue. It is wrong to subtract VIRTIO_SCSI_VQ_BASE from vscsi->num_queues. This patch fixes the following panic. (qemu) device_del scsi0 BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120 PGD 0 Oops: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 659 Comm: kworker/0:1 Not tainted 3.11.0-rc2+ #1172 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: kacpi_hotplug _handle_hotplug_event_func task: ffff88007bee1cc0 ti: ffff88007bfe4000 task.ti: ffff88007bfe4000 RIP: 0010:[<ffffffff8179b29f>] [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120 RSP: 0018:ffff88007bfe5a38 EFLAGS: 00010202 RAX: 0000000000000010 RBX: ffff880077fd0d28 RCX: 0000000000000050 RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000 RBP: ffff88007bfe5a58 R08: ffff880077f6ff00 R09: 0000000000000001 R10: ffffffff8143e673 R11: 0000000000000001 R12: 0000000000000001 R13: ffff880077fd0800 R14: 0000000000000000 R15: ffff88007bf489b0 FS: 0000000000000000(0000) GS:ffff88007ea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 0000000079f8b000 CR4: 00000000000006f0 Stack: ffff880077fd0d28 0000000000000000 ffff880077fd0800 0000000000000008 ffff88007bfe5a78 ffffffff8179b37d ffff88007bccc800 ffff88007bccc800 ffff88007bfe5a98 ffffffff8179b3b6 ffff88007bccc800 ffff880077fd0d28 Call Trace: [<ffffffff8179b37d>] virtscsi_set_affinity+0x2d/0x40 [<ffffffff8179b3b6>] virtscsi_remove_vqs+0x26/0x50 [<ffffffff8179c7d2>] virtscsi_remove+0x82/0xa0 [<ffffffff814cb6b2>] virtio_dev_remove+0x22/0x70 [<ffffffff8167ca49>] __device_release_driver+0x69/0xd0 [<ffffffff8167cb9d>] device_release_driver+0x2d/0x40 [<ffffffff8167bb96>] bus_remove_device+0x116/0x150 [<ffffffff81679936>] device_del+0x126/0x1e0 [<ffffffff81679a06>] device_unregister+0x16/0x30 [<ffffffff814cb889>] unregister_virtio_device+0x19/0x30 [<ffffffff814cdad6>] virtio_pci_remove+0x36/0x80 [<ffffffff81464ae7>] pci_device_remove+0x37/0x70 [<ffffffff8167ca49>] __device_release_driver+0x69/0xd0 [<ffffffff8167cb9d>] device_release_driver+0x2d/0x40 [<ffffffff8167bb96>] bus_remove_device+0x116/0x150 [<ffffffff81679936>] device_del+0x126/0x1e0 [<ffffffff8145edfc>] pci_stop_bus_device+0x9c/0xb0 [<ffffffff8145f036>] pci_stop_and_remove_bus_device+0x16/0x30 [<ffffffff81474a9e>] acpiphp_disable_slot+0x8e/0x150 [<ffffffff81474f6a>] hotplug_event_func+0xba/0x1a0 [<ffffffff814906c8>] ? acpi_os_release_object+0xe/0x12 [<ffffffff81475911>] _handle_hotplug_event_func+0x31/0x70 [<ffffffff810b5333>] process_one_work+0x183/0x500 [<ffffffff810b66e2>] worker_thread+0x122/0x400 [<ffffffff810b65c0>] ? manage_workers+0x2d0/0x2d0 [<ffffffff810bc5de>] kthread+0xce/0xe0 [<ffffffff810bc510>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff81ca045c>] ret_from_fork+0x7c/0xb0 [<ffffffff810bc510>] ? kthread_freezable_should_stop+0x70/0x70 Code: 01 00 00 00 74 59 45 31 e4 83 bb c8 01 00 00 02 74 46 66 2e 0f 1f 84 00 00 00 00 00 49 63 c4 48 c1 e0 04 48 8b bc 0 3 10 02 00 00 <48> 8b 47 20 48 8b 80 d0 01 00 00 48 8b 40 50 48 85 c0 74 07 be RIP [<ffffffff8179b29f>] __virtscsi_set_affinity+0x6f/0x120 RSP <ffff88007bfe5a38> CR2: 0000000000000020 ---[ end trace 99679331a3775f48 ]--- CC: stable@vger.kernel.org Signed-off-by: Asias He <asias@redhat.com> Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * virtio: console: return -ENODEV on all read operations after unplugAmit Shah2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a port gets unplugged while a user is blocked on read(), -ENODEV is returned. However, subsequent read()s returned 0, indicating there's no host-side connection (but not indicating the device went away). This also happened when a port was unplugged and the user didn't have any blocking operation pending. If the user didn't monitor the SIGIO signal, they won't have a chance to find out if the port went away. Fix by returning -ENODEV on all read()s after the port gets unplugged. write() already behaves this way. CC: <stable@vger.kernel.org> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * virtio: console: fix raising SIGIO after port unplugAmit Shah2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SIGIO should be sent when a port gets unplugged. It should only be sent to prcesses that have the port opened, and have asked for SIGIO to be delivered. We were clearing out guest_connected before calling send_sigio_to_port(), resulting in a sigio not getting sent to processes. Fix by setting guest_connected to false after invoking the sigio function. CC: <stable@vger.kernel.org> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * virtio: console: clean up port data immediately at time of unplugAmit Shah2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to keep the port's char device structs and the /sys entries around till the last reference to the port was dropped. This is actually unnecessary, and resulted in buggy behaviour: 1. Open port in guest 2. Hot-unplug port 3. Hot-plug a port with the same 'name' property as the unplugged one This resulted in hot-plug being unsuccessful, as a port with the same name already exists (even though it was unplugged). This behaviour resulted in a warning message like this one: -------------------8<--------------------------------------- WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted) Hardware name: KVM sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:04.0/virtio0/virtio-ports/vport0p1' Call Trace: [<ffffffff8106b607>] ? warn_slowpath_common+0x87/0xc0 [<ffffffff8106b6f6>] ? warn_slowpath_fmt+0x46/0x50 [<ffffffff811f2319>] ? sysfs_add_one+0xc9/0x130 [<ffffffff811f23e8>] ? create_dir+0x68/0xb0 [<ffffffff811f2469>] ? sysfs_create_dir+0x39/0x50 [<ffffffff81273129>] ? kobject_add_internal+0xb9/0x260 [<ffffffff812733d8>] ? kobject_add_varg+0x38/0x60 [<ffffffff812734b4>] ? kobject_add+0x44/0x70 [<ffffffff81349de4>] ? get_device_parent+0xf4/0x1d0 [<ffffffff8134b389>] ? device_add+0xc9/0x650 -------------------8<--------------------------------------- Instead of relying on guest applications to release all references to the ports, we should go ahead and unregister the port from all the core layers. Any open/read calls on the port will then just return errors, and an unplug/plug operation on the host will succeed as expected. This also caused buggy behaviour in case of the device removal (not just a port): when the device was removed (which means all ports on that device are removed automatically as well), the ports with active users would clean up only when the last references were dropped -- and it would be too late then to be referencing char device pointers, resulting in oopses: -------------------8<--------------------------------------- PID: 6162 TASK: ffff8801147ad500 CPU: 0 COMMAND: "cat" #0 [ffff88011b9d5a90] machine_kexec at ffffffff8103232b #1 [ffff88011b9d5af0] crash_kexec at ffffffff810b9322 #2 [ffff88011b9d5bc0] oops_end at ffffffff814f4a50 #3 [ffff88011b9d5bf0] die at ffffffff8100f26b #4 [ffff88011b9d5c20] do_general_protection at ffffffff814f45e2 #5 [ffff88011b9d5c50] general_protection at ffffffff814f3db5 [exception RIP: strlen+2] RIP: ffffffff81272ae2 RSP: ffff88011b9d5d00 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880118901c18 RCX: 0000000000000000 RDX: ffff88011799982c RSI: 00000000000000d0 RDI: 3a303030302f3030 RBP: ffff88011b9d5d38 R8: 0000000000000006 R9: ffffffffa0134500 R10: 0000000000001000 R11: 0000000000001000 R12: ffff880117a1cc10 R13: 00000000000000d0 R14: 0000000000000017 R15: ffffffff81aff700 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #6 [ffff88011b9d5d00] kobject_get_path at ffffffff8126dc5d #7 [ffff88011b9d5d40] kobject_uevent_env at ffffffff8126e551 #8 [ffff88011b9d5dd0] kobject_uevent at ffffffff8126e9eb #9 [ffff88011b9d5de0] device_del at ffffffff813440c7 -------------------8<--------------------------------------- So clean up when we have all the context, and all that's left to do when the references to the port have dropped is to free up the port struct itself. CC: <stable@vger.kernel.org> Reported-by: chayang <chayang@redhat.com> Reported-by: YOGANANTH SUBRAMANIAN <anantyog@in.ibm.com> Reported-by: FuXiangChun <xfu@redhat.com> Reported-by: Qunfang Zhang <qzhang@redhat.com> Reported-by: Sibiao Luo <sluo@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * virtio: console: fix race in port_fops_open() and port unplugAmit Shah2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Between open() being called and processed, the port can be unplugged. Check if this happened, and bail out. A simple test script to reproduce this is: while true; do for i in $(seq 1 100); do echo $i > /dev/vport0p3; done; done; This opens and closes the port a lot of times; unplugging the port while this is happening triggers the bug. CC: <stable@vger.kernel.org> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * virtio: console: fix race with port unplug and open/closeAmit Shah2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | There's a window between find_port_by_devt() returning a port and us taking a kref on the port, where the port could get unplugged. Fix it by taking the reference in find_port_by_devt() itself. Problem reported and analyzed by Mateusz Guzik. CC: <stable@vger.kernel.org> Reported-by: Mateusz Guzik <mguzik@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * virtio/console: Add pipe_lock/unlock for splice_writeYoshihiro YUNOMAE2013-07-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add pipe_lock/unlock for splice_write to avoid oops by following competition: (1) An application gets fds of a trace buffer, virtio-serial, pipe. (2) The application does fork() (3) The processes execute splice_read(trace buffer) and splice_write(virtio-serial) via same pipe. <parent> <child> get fds of a trace buffer, virtio-serial, pipe | fork()----------create--------+ | | splice(read) | ---+ splice(write) | +-- no competition | splice(read) | | splice(write) ---+ | | splice(read) | splice(write) splice(read) ------ competition | splice(write) Two processes share a pipe_inode_info structure. If the child execute splice(read) when the parent tries to execute splice(write), the structure can be broken. Existing virtio-serial driver does not get lock for the structure in splice_write, so this competition will induce oops. <oops messages> BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130 PGD 7223e067 PUD 72391067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: lockd bnep bluetooth rfkill sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr virtio_net virtio_balloon i2c_piix4 i2c_core microcode uinput floppy CPU: 0 PID: 1072 Comm: compete-test Not tainted 3.10.0ws+ #55 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880071b98000 ti: ffff88007b55e000 task.ti: ffff88007b55e000 RIP: 0010:[<ffffffff811a6b5f>] [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130 RSP: 0018:ffff88007b55fd78 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88007b55fe20 RCX: 0000000000000000 RDX: 0000000000001000 RSI: ffff88007a95ba30 RDI: ffff880036f9e6c0 RBP: ffff88007b55fda8 R08: 00000000000006ec R09: ffff880077626708 R10: 0000000000000003 R11: ffffffff8139ca59 R12: ffff88007a95ba30 R13: 0000000000000000 R14: ffffffff8139dd00 R15: ffff880036f9e6c0 FS: 00007f2e2e3a0740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000018 CR3: 0000000071bd1000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffff8139ca59 ffff88007b55fe20 ffff880036f9e6c0 ffffffff8139dd00 ffff8800776266c0 ffff880077626708 ffff88007b55fde8 ffffffff811a6e8e ffff88007b55fde8 ffffffff8139ca59 ffff880036f9e6c0 ffff88007b55fe20 Call Trace: [<ffffffff8139ca59>] ? alloc_buf.isra.13+0x39/0xb0 [<ffffffff8139dd00>] ? virtcons_restore+0x100/0x100 [<ffffffff811a6e8e>] __splice_from_pipe+0x7e/0x90 [<ffffffff8139ca59>] ? alloc_buf.isra.13+0x39/0xb0 [<ffffffff8139d739>] port_fops_splice_write+0xe9/0x140 [<ffffffff8127a3f4>] ? selinux_file_permission+0xc4/0x120 [<ffffffff8139d650>] ? wait_port_writable+0x1b0/0x1b0 [<ffffffff811a6fe0>] do_splice_from+0xa0/0x110 [<ffffffff811a951f>] SyS_splice+0x5ff/0x6b0 [<ffffffff8161facf>] tracesys+0xdd/0xe2 Code: 49 8b 87 80 00 00 00 4c 8d 24 d0 8b 53 04 41 8b 44 24 0c 4d 8b 6c 24 10 39 d0 89 03 76 02 89 13 49 8b 44 24 10 4c 89 e6 4c 89 ff <ff> 50 18 85 c0 0f 85 aa 00 00 00 48 89 da 4c 89 e6 4c 89 ff 41 RIP [<ffffffff811a6b5f>] splice_from_pipe_feed+0x6f/0x130 RSP <ffff88007b55fd78> CR2: 0000000000000018 ---[ end trace 24572beb7764de59 ]--- V2: Fix a locking problem for error V3: Add Reviewed-by lines and stable@ line in sign-off area Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Amit Shah <amit.shah@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * virtio/console: Quit from splice_write if pipe->nrbufs is 0Yoshihiro YUNOMAE2013-07-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quit from splice_write if pipe->nrbufs is 0 for avoiding oops in virtio-serial. When an application was doing splice from a kernel buffer to virtio-serial on a guest, the application received signal(SIGINT). This situation will normally happen, but the kernel executed a kernel panic by oops as follows: BUG: unable to handle kernel paging request at ffff882071c8ef28 IP: [<ffffffff812de48f>] sg_init_table+0x2f/0x50 PGD 1fac067 PUD 0 Oops: 0000 [#1] SMP Modules linked in: lockd sunrpc bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd microcode virtio_balloon virtio_net pcspkr soundcore i2c_piix4 i2c_core uinput floppy CPU: 1 PID: 908 Comm: trace-cmd Not tainted 3.10.0+ #49 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880071c64650 ti: ffff88007bf24000 task.ti: ffff88007bf24000 RIP: 0010:[<ffffffff812de48f>] [<ffffffff812de48f>] sg_init_table+0x2f/0x50 RSP: 0018:ffff88007bf25dd8 EFLAGS: 00010286 RAX: 0000001fffffffe0 RBX: ffff882071c8ef28 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880071c8ef48 RBP: ffff88007bf25de8 R08: ffff88007fd15d40 R09: ffff880071c8ef48 R10: ffffea0001c71040 R11: ffffffff8139c555 R12: 0000000000000000 R13: ffff88007506a3c0 R14: ffff88007c862500 R15: ffff880071c8ef00 FS: 00007f0a3646c740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff882071c8ef28 CR3: 000000007acbb000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff880071c8ef48 ffff88007bf25e20 ffff88007bf25e88 ffffffff8139d6fa ffff88007bf25e28 ffffffff8127a3f4 0000000000000000 0000000000000000 ffff880071c8ef48 0000100000000000 0000000000000003 ffff88007bf25e08 Call Trace: [<ffffffff8139d6fa>] port_fops_splice_write+0xaa/0x130 [<ffffffff8127a3f4>] ? selinux_file_permission+0xc4/0x120 [<ffffffff8139d650>] ? wait_port_writable+0x1b0/0x1b0 [<ffffffff811a6fe0>] do_splice_from+0xa0/0x110 [<ffffffff811a951f>] SyS_splice+0x5ff/0x6b0 [<ffffffff8161f8c2>] system_call_fastpath+0x16/0x1b Code: c1 e2 05 48 89 e5 48 83 ec 10 4c 89 65 f8 41 89 f4 31 f6 48 89 5d f0 48 89 fb e8 8d ce ff ff 41 8d 44 24 ff 48 c1 e0 05 48 01 c3 <48> 8b 03 48 83 e0 fe 48 83 c8 02 48 89 03 48 8b 5d f0 4c 8b 65 RIP [<ffffffff812de48f>] sg_init_table+0x2f/0x50 RSP <ffff88007bf25dd8> CR2: ffff882071c8ef28 ---[ end trace 86323505eb42ea8f ]--- It seems to induce pagefault in sg_init_tabel() when pipe->nrbufs is equal to zero. This may happen in a following situation: (1) The application normally does splice(read) from a kernel buffer, then does splice(write) to virtio-serial. (2) The application receives SIGINT when is doing splice(read), so splice(read) is failed by EINTR. However, the application does not finish the operation. (3) The application tries to do splice(write) without pipe->nrbufs. (4) The virtio-console driver tries to touch scatterlist structure sgl in sg_init_table(), but the region is out of bound. To avoid the case, a kernel should check whether pipe->nrbufs is empty or not when splice_write is executed in the virtio-console driver. V3: Add Reviewed-by lines and stable@ line in sign-off area. Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Amit Shah <amit.shah@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | Merge tag 'fixes-for-linus' of ↵Linus Torvalds2013-08-08
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Kevin Hilman: - MSM: GPIO fixes (includes old code removal) - OMAP: earlyprintk regression, AM33xx cpgmac PM regression - OMAP5: urgent fix for potentially harmful voltage regulator values - Renesas: gpio-keys fix, fix SD card detection, fix shdma calculation error - STi: critical SMP boot fix - tegra: DTS fix for usb-phy - a couple MAINTAINERS updates (Arnd is on paternity leave, Kevin is stepping up to help arm-soc maintenance) * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: MAINTAINERS: add TI Keystone ARM platform MAINTAINERS: delete Srinidhi from ux500 ARM: tegra: enable ULPI phy on Colibri T20 ARM: STi: remove sti_secondary_start from INIT section. ARM: STi: Fix cpu nodes with correct device_type. ARM: shmobile: lager: do not annotate gpio_buttons as __initdata ARM: shmobile: BOCK-W: fix SDHI0 PFC settings shdma: fixup sh_dmae_get_partial() calculation error ARM: OMAP2+: hwmod: AM335x: fix cpgmac address space ARM: OMAP2+: hwmod: rt address space index for DT ARM: OMAP2+: Sync hwmod state with the pm_runtime and omap_device state ARM: OMAP2+: Avoid idling memory controllers with no drivers ARM: OMAP2+: hwmod: Fix a crash in _setup_reset() with DEBUG_LL ARM: dts: omap5-uevm: update optional/unused regulator configurations ARM: dts: omap5-uevm: fix regulator configurations mandatory for SoC ARM: dts: omap5-uevm: document regulator signals used on the actual board ARM: msm: Consolidate gpiomux for older architectures ARM: shmobile: armadillo800eva: Don't request GPIO 166 in board code ARM: msm: dts: Fix the gpio register address for msm8960
| * | MAINTAINERS: add TI Keystone ARM platformSantosh Shilimkar2013-08-07
| | | | | | | | | | | | | | | | | | | | | Adding maintainer for arch/arm/mach-keystone/ Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>
| * | MAINTAINERS: delete Srinidhi from ux500Linus Walleij2013-08-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | Srinidhi's mail address is now bouncing and he has requested me to delete this entry. Acked-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Kevin Hilman <khilman@linaro.org>
| * | ARM: tegra: enable ULPI phy on Colibri T20Lucas Stach2013-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | This was missed when splitting out the phy from the controller node in commit 9dffe3be3f32 (ARM: tegra: modify ULPI reset GPIO properties). Signed-off-by: Lucas Stach <dev@lynxeye.de> Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Olof Johansson <olof@lixom.net>
| * | ARM: STi: remove sti_secondary_start from INIT section.Srinivas Kandagatla2013-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes sti_secondary_start from _INIT section, there are 2 reason for this removal. 1. discarding such a small code does not save much, given the RAM sizes. 2. Having this code discarded, creates corruption issue when we boot smp-kernel with nrcpus=1 or with single cpu node in DT. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Signed-off-by: Olof Johansson <olof@lixom.net>
| * | ARM: STi: Fix cpu nodes with correct device_type.Srinivas Kandagatla2013-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes cpu nodes with device_type = "cpu". This change was not necessary before 3.10-rc7. Without this patch STi SOCs does not boot as SMP. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Signed-off-by: Olof Johansson <olof@lixom.net>
| * | Merge tag 'renesas-fixes2-for-v3.11' of ↵Olof Johansson2013-08-04
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes From Simon Horman: Second round of Renesas ARM based SoC fixes for v3.11 * Lager board: do not annotate gpio_buttons as __initdata - This avoids accessing uninitialised memory if keys are pressed after kernel initialisation completes. - Bug introduced in gpio-keys were enabled in v3.11-rc1 * Bock-W board: fix SDHI0 PFC settings - Allow detection of SD card - Bug introduced in SDHI support was added in v3.11-rc1 * shdma: fixup sh_dmae_get_partial() calculation error - Bug introduced in 2.6.34-rc1. * armadillo800eva board: Don't request GPIO 166 in board code - Allow use of touchscreen - Bug introduced in v3.11-rc1 * tag 'renesas-fixes2-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: lager: do not annotate gpio_buttons as __initdata ARM: shmobile: BOCK-W: fix SDHI0 PFC settings shdma: fixup sh_dmae_get_partial() calculation error ARM: shmobile: armadillo800eva: Don't request GPIO 166 in board code Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | ARM: shmobile: lager: do not annotate gpio_buttons as __initdataSimon Horman2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the gpio-keys device is registered using platform_device_register_data() the platform data argument, lager_keys_pdata is duplicated and thus should be marked as __initdata to avoid wasting memory. However, this is not true of gpio_buttons, a reference to it rather than its value is duplicated when lager_keys_pdata is duplicated. This avoids accessing freed memory if gpio-key events occur after unused kernel memory is freed late in the kernel's boot. This but was added when support for gpio-keys was added to lager in c3842e4fcbb7664276443b79187b7808c2e80a35 ("ARM: shmobile: lager: support GPIO switches") which was included in v3.11-rc1. Tested-by: Magnus Damm <damm@opensource.se> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
| | * | ARM: shmobile: BOCK-W: fix SDHI0 PFC settingsSergei Shtylyov2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following message is printed on the BOCK-W kernel bootup: sh-pfc pfc-r8a7778: invalid group "sdhi0" for function "sdhi0" In addition, SD card cannot be detected. The reason is apparently that commit ca7bb309485e4ec89a9addd47bea (ARM: shmobile: bockw: add SDHI0 support) matched the previous version of commit 564617d2f92473031d035deb273da5 (sh-pfc: r8a7778: add SDHI support). Add the missing pin groups according to the BOCK-W board schematics. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
| | * | shdma: fixup sh_dmae_get_partial() calculation errorKuninori Morimoto2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sh_desc->hw.tcr is controlling real data size, and, register TCR is controlling data transfer count which was xmit_shifted value of hw.tcr. Current sh_dmae_get_partial() is calculating in different unit. This patch fixes it. This bug has been present since c014906a870ce70e009def0c9d170ccabeb0be63 ("dmaengine: shdma: extend .device_terminate_all() to record partial transfer"), which was added in 2.6.34-rc1. Cc: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Acked-by: Guennadi Liakhovetski <g.liakhovetski+renesas@gmail.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
| | * | ARM: shmobile: armadillo800eva: Don't request GPIO 166 in board codeKuninori Morimoto2013-07-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 89ae7b5bbd3e65bc6ab7a577ca5ec18569589c8c (ARM: shmobile: armadillo800eva: Register pinctrl mapping for INTC) mistakenly requests GPIO 166 in board code, most probably due to a wrong merge conflict resolution. As the GPIO is passed to the st1232 driver through platform data and requested by the driver, there's no need to request it in board code. Fix it. Tested by: Cao Minh Hiep <cm-hiep@jinso.co.jp> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
| * | | Merge tag 'for-v3.11-rc/omap-fixes-b' of ↵Olof Johansson2013-08-04
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes From Paul Walmsley via Tony Lindgren: Some OMAP hwmod fixes for v3.11-rc. Mostly intended to fix an earlyprintk regression and an AM33xx cpgmac power management regression. Basic build, boot, and PM tests are available here: http://www.pwsan.com/omap/testlogs/hwmod_fixes_a_v3.11-rc/20130730042132/ The tests include temporary fixes for the unrelated 2430SDP and OMAP3 boot regressions, which are not part of this signed tag. * tag 'for-v3.11-rc/omap-fixes-b' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending: ARM: OMAP2+: hwmod: AM335x: fix cpgmac address space ARM: OMAP2+: hwmod: rt address space index for DT ARM: OMAP2+: Sync hwmod state with the pm_runtime and omap_device state ARM: OMAP2+: Avoid idling memory controllers with no drivers ARM: OMAP2+: hwmod: Fix a crash in _setup_reset() with DEBUG_LL Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | | ARM: OMAP2+: hwmod: AM335x: fix cpgmac address spaceAfzal Mohammed2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Register target address to be used for cpgmac is the second device address space. By default, hwmod picks first address space (0th index) for register target. With removal of address space from hwmod and using DT instead, cpgmac is getting wrong address space for register target. Fix it by indicating the address space to be used for register target. Signed-off-by: Afzal Mohammed <afzal@ti.com> Tested-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: Paul Walmsley <paul@pwsan.com>
| | * | | ARM: OMAP2+: hwmod: rt address space index for DTAfzal Mohammed2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address space is being removed from hwmod database and DT information in <reg> property is being used. Currently the 0th index of device address space is used to map for register target address. This is not always true, eg. cpgmac has it's sysconfig in second address space. Handle it by specifying index of device address space to be used for register target. As default value of this field would be zero with static initialization, existing behaviour of using first address space for register target while using DT would be kept as such. Signed-off-by: Afzal Mohammed <afzal@ti.com> Tested-by: Mugunthan V N <mugunthanvnm@ti.com> [paul@pwsan.com: use u8 rather than int to save memory] Signed-off-by: Paul Walmsley <paul@pwsan.com>
| | * | | ARM: OMAP2+: Sync hwmod state with the pm_runtime and omap_device stateRajendra Nayak2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some hwmods which are marked with HWMOD_INIT_NO_IDLE are left in enabled state post setup(). When a omap_device gets created for such hwmods make sure the omap_device and pm_runtime states are also in sync for such hwmods by doing a omap_device_enable() and pm_runtime_set_active() for the device. Signed-off-by: Rajendra Nayak <rnayak@ti.com> Tested-by: Mark Jackson <mpfj-list@newflow.co.uk> Signed-off-by: Paul Walmsley <paul@pwsan.com>
| | * | | ARM: OMAP2+: Avoid idling memory controllers with no driversRajendra Nayak2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Memory controllers in OMAP (like GPMC and EMIF) have the hwmods marked with HWMOD_INIT_NO_IDLE and are left in enabled state post initial setup. Even if they have drivers missing, avoid idling them as part of omap_device_late_idle() Signed-off-by: Rajendra Nayak <rnayak@ti.com> Tested-by: Mark Jackson <mpfj-list@newflow.co.uk> Signed-off-by: Paul Walmsley <paul@pwsan.com>
| | * | | ARM: OMAP2+: hwmod: Fix a crash in _setup_reset() with DEBUG_LLRajendra Nayak2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With commit '82702ea11ddfe0e43382e1fa5b66d807d8114916' "ARM: OMAP2+: Fix serial init for device tree based booting" stubbing out omap_serial_early_init() for Device tree based booting, there was a crash observed on AM335x based devices when hwmod does a _setup_reset() early at boot. This was rootcaused to hwmod trying to reset console uart while earlycon was using it. The way to tell hwmod not to do this is to specify the HWMOD_INIT_NO_RESET flag, which were infact set by the omap_serial_early_init() function by parsing the cmdline to identify the console device. Parsing the cmdline to identify the uart used by earlycon itself seems broken as there is nothing preventing earlycon to use a different one. This patch, instead, attempts to populate the requiste flags for hwmod based on the CONFIG_DEBUG_OMAPxUARTy FLAGS. This gets rid of the need for cmdline parsing in the DT as well as non-DT cases to identify the uart used by earlycon. Signed-off-by: Rajendra Nayak <rnayak@ti.com> Reported-by: Mark Jackson <mpfj-list@newflow.co.uk> Reported-by: Vaibhav Bedia <vaibhav.bedia@ti.com> Tested-by: Mark Jackson <mpfj-list@newflow.co.uk> Signed-off-by: Paul Walmsley <paul@pwsan.com>
| * | | | Merge tag 'omap-for-v3.11/fixes-omap5' of ↵Olof Johansson2013-08-04
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes From Tony Lindgren: Fixes for omap5-uevm regulators from Nishanth Menon <nm@ti.com>: Due to wrong older revision of documentation used as reference, we seem to have a bunch of LDOs wrongly configured on OMAP5 uEVM. This series is based power tree on production board 750-2628-XXX platform. Unfortunately, the wrong voltages may be detrimental to OMAP5 as they supply hardware blocks at voltages that are out of specification. There is a chance that without these fixes there can be hardware damage to omap5-uevm boards with the v3.11-rc series. * tag 'omap-for-v3.11/fixes-omap5' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: omap5-uevm: update optional/unused regulator configurations ARM: dts: omap5-uevm: fix regulator configurations mandatory for SoC ARM: dts: omap5-uevm: document regulator signals used on the actual board Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | | | ARM: dts: omap5-uevm: update optional/unused regulator configurationsNishanth Menon2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e00c27ef3b4c23e39d0a77b7c8e5be44c28001c7 (ARM: dts: OMAP5: Add Palmas MFD node and regulator nodes) introduced regulator entries for OMAP5uEVM. However, The regulator information is based on an older temporary pre-production board variant and does not reflect production board 750-2628-XXX boards. The following optional/unused regulators can be updated: - SMPS9 supplies TWL6040 over VDDA_2v1_AUD. This regulator needs to be enabled only when audio is active. Since it does not come active by default, it does not require "always-on" or "boot-on". - LDO2 and LDO8 do not go to any peripheral or connector on the board. Further, these unused regulators should have been 2.8V for LDO2 and 3.0V for LDO8. Mark these LDOs as disabled in the dts until needed. Reported-by: Marc Jüttner <m-juettner@ti.com> Signed-off-by: Nishanth Menon <nm@ti.com> Acked-by: J Keerthy <j-keerthy@ti.com> Acked-by: Benoit Cousson <benoit.cousson@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
| | * | | | ARM: dts: omap5-uevm: fix regulator configurations mandatory for SoCNishanth Menon2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e00c27ef3b4c23e39d0a77b7c8e5be44c28001c7 (ARM: dts: OMAP5: Add Palmas MFD node and regulator nodes) introduced regulator entries for OMAP5uEVM. However, The regulator information is based on an older temporary pre-production board variant and does not reflect production board 750-2628-XXX boards. The following fixes are hence mandatory to ensure right voltage is supplied to key OMAP5 SoC voltage rails: - LDO1 supplies VDDAPHY_CAM which is OMAP5's vdda_csiporta/b/c. This can only be supplied at 1.5V or 1.8V and we currently supply 2.8V. To prevent any potential device damage risk, use the specified 1.5V-1.8V supply. Remove 'always-on' and 'boot-on' settings here as it is a 'on need' supply to SoC IP and is not enabled by PMIC by default at boot. - LDO3 supplies Low Latency Interface(LLI) hardware module which is a special hardware to communicate with Modem. However since uEVM is not setup by default for this communication, this should be disabled by default. Further, vdda_lli is supposed to be 1.5V and not 3V. - LDO4 supplies VDDAPHY_DISP which is vdda_dsiporta/c/vdda_hdmi This can only be supplied at 1.5V or 1.8V and we currently supply 2.2V. To prevent any potential device damage risk, use the specified 1.5V-1.8V supply. Remove 'always-on' and 'boot-on' settings here as it is a 'on need' supply to SoC IP and is not enabled by PMIC by default at boot. - LDO6 supplies the board specified VDDS_1V2_WKUP supply going to ldo_emu_wkup/vdds_hsic. To stay within the SoC specification supply 1.2V instead of 1.5V. - LDO7 supplies VDD_VPP which is vpp1. This is currently configured for 1.5V which as per data manual "A pulse width of 1000 ns and an amplitude of 2V is required to program each eFuse bit. Otherwise, VPP1 must not be supplied". So, fix the voltage to 2V. and disable the supply since we have no plans of programming efuse bits - it can only be done once - in factory. Further it is not enabled by default by PMIC so, 'boot-on' must be removed, and the 'always-on' needs to be removed to achieve pulsing if efuse needs to be programmed. - LDO9 supplies the board specified vdds_sdcard supply going within SoC specification of 1.8V or 3.0V. Further the supply is controlled by switch enabled by REGEN3. So, introduce REGEN3 and map sdcard slot to be powered by LDO9. Remove 'always-on' allowing the LDO to be disabled on need basis. Reported-by: Marc Jüttner <m-juettner@ti.com> Signed-off-by: Nishanth Menon <nm@ti.com> Acked-by: J Keerthy <j-keerthy@ti.com> Acked-by: Benoit Cousson <benoit.cousson@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
| | * | | | ARM: dts: omap5-uevm: document regulator signals used on the actual boardNishanth Menon2013-07-30
| | | |/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e00c27ef3b4c23e39d0a77b7c8e5be44c28001c7 (ARM: dts: OMAP5: Add Palmas MFD node and regulator nodes) introduced regulator entries for OMAP5uEVM. However, currently we use the Palmas regulator names which is used for different purposes on uEVM. Document the same based on 750-2628-XXX boards - which is meant to be supported by this dts. Reported-by: Marc Jüttner <m-juettner@ti.com> Signed-off-by: Nishanth Menon <nm@ti.com> Acked-by: J Keerthy <j-keerthy@ti.com> Acked-by: Benoit Cousson <benoit.cousson@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
| * | | | Merge tag 'msm-3.11-fix1' of ↵Olof Johansson2013-08-04
| |\ \ \ \ | | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm into fixes From David Brown, fixes for MSM for 3.11: Two small fixes for MSM. The first fixes the a gpio controller register address. I didn't see any acks from the devicetree maintainers, so I've copied them on this pull request. The change itself is minor, and just to the register address. The second change removes the gpiomux V1 code from MSM. This was breaking compilation for some of the targets. * tag 'msm-3.11-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm: ARM: msm: Consolidate gpiomux for older architectures ARM: msm: dts: Fix the gpio register address for msm8960 Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | | ARM: msm: Consolidate gpiomux for older architecturesRohit Vaswani2013-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Msm gpiomux can be used only for 7x30 and 8x50. Prevent compilation and fix build issues on 7X00, 8X60 and 8960. Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org> Signed-off-by: David Brown <davidb@codeaurora.org>
| | * | | ARM: msm: dts: Fix the gpio register address for msm8960Rohit Vaswani2013-07-18
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | Fix the the gpio reg address for the device tree entry. Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org> Signed-off-by: David Brown <davidb@codeaurora.org>
* | | | Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"Linus Torvalds2013-08-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 318df36e57c0ca9f2146660d41ff28e8650af423. This commit caused Steven Rostedt's hackbench runs to run out of memory due to a leak. As noted by Joonsoo Kim, it is buggy in the following scenario: "I guess, you may set 0 to all kmem caches's cpu_partial via sysfs, doesn't it? In this case, memory leak is possible in following case. Code flow of possible leak is follwing case. * in __slab_free() 1. (!new.inuse || !prior) && !was_frozen 2. !kmem_cache_debug && !prior 3. new.frozen = 1 4. after cmpxchg_double_slab, run the (!n) case with new.frozen=1 5. with this patch, put_cpu_partial() doesn't do anything, because this cache's cpu_partial is 0 6. return In step 5, leak occur" And Steven does indeed have cpu_partial set to 0 due to RT testing. Joonsoo is cooking up a patch, but everybody agrees that reverting this for now is the right thing to do. Reported-and-bisected-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'trace-fixes-3.11-rc3' of ↵Linus Torvalds2013-08-07
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Oleg Nesterov has been working hard in closing all the holes that can lead to race conditions between deleting an event and accessing an event debugfs file. This included a fix to the debugfs system (acked by Greg Kroah-Hartman). We think that all the holes have been patched and hopefully we don't find more. I haven't marked all of them for stable because I need to examine them more to figure out how far back some of the changes need to go. Along the way, some other fixes have been made. Alexander Z Lam fixed some logic where the wrong buffer was being modifed. Andrew Vagin found a possible corruption for machines that actually allocate cpumask, as a reference to one was being zeroed out by mistake. Dhaval Giani found a bad prototype when tracing is not configured. And I not only had some changes to help Oleg, but also finally fixed a long standing bug that Dave Jones and others have been hitting, where a module unload and reload can cause the function tracing accounting to get screwed up" * tag 'trace-fixes-3.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix reset of time stamps during trace_clock changes tracing: Make TRACE_ITER_STOP_ON_FREE stop the correct buffer tracing: Fix trace_dump_stack() proto when CONFIG_TRACING is not set tracing: Fix fields of struct trace_iterator that are zeroed by mistake tracing/uprobes: Fail to unregister if probe event files are in use tracing/kprobes: Fail to unregister if probe event files are in use tracing: Add comment to describe special break case in probe_remove_event_call() tracing: trace_remove_event_call() should fail if call/file is in use debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs) ftrace: Check module functions being traced on reload ftrace: Consolidate some duplicate code for updating ftrace ops tracing: Change remove_event_file_dir() to clear "d_subdirs"->i_private tracing: Introduce remove_event_file_dir() tracing: Change f_start() to take event_mutex and verify i_private != NULL tracing: Change event_filter_read/write to verify i_private != NULL tracing: Change event_enable/disable_read() to verify i_private != NULL tracing: Turn event/id->i_private into call->event.type
| * | | | tracing: Fix reset of time stamps during trace_clock changesAlexander Z Lam2013-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixed two issues with changing the timestamp clock with trace_clock: - The global buffer was reset on instance clock changes. Change this to pass the correct per-instance buffer - ftrace_now() is used to set buf->time_start in tracing_reset_online_cpus(). This was incorrect because ftrace_now() used the global buffer's clock to return the current time. Change this to use buffer_ftrace_now() which returns the current time for the correct per-instance buffer. Also removed tracing_reset_current() because it is not used anywhere Link: http://lkml.kernel.org/r/1375493777-17261-2-git-send-email-azl@google.com Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Alexander Z Lam <lambchop468@gmail.com> Cc: stable@vger.kernel.org # 3.10 Signed-off-by: Alexander Z Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: Make TRACE_ITER_STOP_ON_FREE stop the correct bufferAlexander Z Lam2013-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Releasing the free_buffer file in an instance causes the global buffer to be stopped when TRACE_ITER_STOP_ON_FREE is enabled. Operate on the correct buffer. Link: http://lkml.kernel.org/r/1375493777-17261-1-git-send-email-azl@google.com Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Alexander Z Lam <lambchop468@gmail.com> Cc: stable@vger.kernel.org # 3.10 Signed-off-by: Alexander Z Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: Fix trace_dump_stack() proto when CONFIG_TRACING is not setDhaval Giani2013-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_TRACING is not enabled, the stub prototype for trace_dump_stack() is incorrect. It has (void) when it should be (int). Link: http://lkml.kernel.org/r/CAPhKKr_H=ukFnBL4WgDOVT5ay2xeF-Ho+CA0DWZX0E2JW-=vSQ@mail.gmail.com Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: Fix fields of struct trace_iterator that are zeroed by mistakeAndrew Vagin2013-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tracing_read_pipe zeros all fields bellow "seq". The declaration contains a comment about that, but it doesn't help. The first field is "snapshot", it's true when current open file is snapshot. Looks obvious, that it should not be zeroed. The second field is "started". It was converted from cpumask_t to cpumask_var_t (v2.6.28-4983-g4462344), in other words it was converted from cpumask to pointer on cpumask. Currently the reference on "started" memory is lost after the first read from tracing_read_pipe and a proper object will never be freed. The "started" is never dereferenced for trace_pipe, because trace_pipe can't have the TRACE_FILE_ANNOTATE options. Link: http://lkml.kernel.org/r/1375463803-3085183-1-git-send-email-avagin@openvz.org Cc: stable@vger.kernel.org # 2.6.30 Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing/uprobes: Fail to unregister if probe event files are in useSteven Rostedt (Red Hat)2013-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Uprobes suffer the same problem that kprobes have. There's a race between writing to the "enable" file and removing the probe. The probe checks for it being in use and if it is not, goes about deleting the probe and the event that represents it. But the problem with that is, after it checks if it is in use it can be enabled, and the deletion of the event (access to the probe) will fail, as it is in use. But the uprobe will still be deleted. This is a problem as the event can reference the uprobe that was deleted. The fix is to remove the event first, and check to make sure the event removal succeeds. Then it is safe to remove the probe. When the event exists, either ftrace or perf can enable the probe and prevent the event from being removed. Link: http://lkml.kernel.org/r/20130704034038.991525256@goodmis.org Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing/kprobes: Fail to unregister if probe event files are in useSteven Rostedt (Red Hat)2013-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a probe is being removed, it cleans up the event files that correspond to the probe. But there is a race between writing to one of these files and deleting the probe. This is especially true for the "enable" file. CPU 0 CPU 1 ----- ----- fd = open("enable",O_WRONLY); probes_open() release_all_trace_probes() unregister_trace_probe() if (trace_probe_is_enabled(tp)) return -EBUSY write(fd, "1", 1) __ftrace_set_clr_event() call->class->reg() (kprobe_register) enable_trace_probe(tp) __unregister_trace_probe(tp); list_del(&tp->list) unregister_probe_event(tp) <-- fails! free_trace_probe(tp) write(fd, "0", 1) __ftrace_set_clr_event() call->class->unreg (kprobe_register) disable_trace_probe(tp) <-- BOOM! A test program was written that used two threads to simulate the above scenario adding a nanosleep() interval to change the timings and after several thousand runs, it was able to trigger this bug and crash: BUG: unable to handle kernel paging request at 00000005000000f9 IP: [<ffffffff810dee70>] probes_open+0x3b/0xa7 PGD 7808a067 PUD 0 Oops: 0000 [#1] PREEMPT SMP Dumping ftrace buffer: --------------------------------- Modules linked in: ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6 CPU: 1 PID: 2070 Comm: test-kprobe-rem Not tainted 3.11.0-rc3-test+ #47 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007 task: ffff880077756440 ti: ffff880076e52000 task.ti: ffff880076e52000 RIP: 0010:[<ffffffff810dee70>] [<ffffffff810dee70>] probes_open+0x3b/0xa7 RSP: 0018:ffff880076e53c38 EFLAGS: 00010203 RAX: 0000000500000001 RBX: ffff88007844f440 RCX: 0000000000000003 RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff880076e52000 RBP: ffff880076e53c58 R08: ffff880076e53bd8 R09: 0000000000000000 R10: ffff880077756440 R11: 0000000000000006 R12: ffffffff810dee35 R13: ffff880079250418 R14: 0000000000000000 R15: ffff88007844f450 FS: 00007f87a276f700(0000) GS:ffff88007d480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000005000000f9 CR3: 0000000077262000 CR4: 00000000000007e0 Stack: ffff880076e53c58 ffffffff81219ea0 ffff88007844f440 ffffffff810dee35 ffff880076e53ca8 ffffffff81130f78 ffff8800772986c0 ffff8800796f93a0 ffffffff81d1b5d8 ffff880076e53e04 0000000000000000 ffff88007844f440 Call Trace: [<ffffffff81219ea0>] ? security_file_open+0x2c/0x30 [<ffffffff810dee35>] ? unregister_trace_probe+0x4b/0x4b [<ffffffff81130f78>] do_dentry_open+0x162/0x226 [<ffffffff81131186>] finish_open+0x46/0x54 [<ffffffff8113f30b>] do_last+0x7f6/0x996 [<ffffffff8113cc6f>] ? inode_permission+0x42/0x44 [<ffffffff8113f6dd>] path_openat+0x232/0x496 [<ffffffff8113fc30>] do_filp_open+0x3a/0x8a [<ffffffff8114ab32>] ? __alloc_fd+0x168/0x17a [<ffffffff81131f4e>] do_sys_open+0x70/0x102 [<ffffffff8108f06e>] ? trace_hardirqs_on_caller+0x160/0x197 [<ffffffff81131ffe>] SyS_open+0x1e/0x20 [<ffffffff81522742>] system_call_fastpath+0x16/0x1b Code: e5 41 54 53 48 89 f3 48 83 ec 10 48 23 56 78 48 39 c2 75 6c 31 f6 48 c7 RIP [<ffffffff810dee70>] probes_open+0x3b/0xa7 RSP <ffff880076e53c38> CR2: 00000005000000f9 ---[ end trace 35f17d68fc569897 ]--- The unregister_trace_probe() must be done first, and if it fails it must fail the removal of the kprobe. Several changes have already been made by Oleg Nesterov and Masami Hiramatsu to allow moving the unregister_probe_event() before the removal of the probe and exit the function if it fails. This prevents the tp structure from being used after it is freed. Link: http://lkml.kernel.org/r/20130704034038.819592356@goodmis.org Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: Add comment to describe special break case in probe_remove_event_call()Steven Rostedt (Red Hat)2013-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "break" used in the do_for_each_event_file() is used as an optimization as the loop is really a double loop. The loop searches all event files for each trace_array. There's only one matching event file per trace_array and after we find the event file for the trace_array, the break is used to jump to the next trace_array and start the search there. As this is not a standard way of using "break" in C code, it requires a comment right before the break to let people know what is going on. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: trace_remove_event_call() should fail if call/file is in useOleg Nesterov2013-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change trace_remove_event_call(call) to return the error if this call is active. This is what the callers assume but can't verify outside of the tracing locks. Both trace_kprobe.c/trace_uprobe.c need the additional changes, unregister_trace_probe() should abort if trace_remove_event_call() fails. The caller is going to free this call/file so we must ensure that nobody can use them after trace_remove_event_call() succeeds. debugfs should be fine after the previous changes and event_remove() does TRACE_REG_UNREGISTER, but still there are 2 reasons why we need the additional checks: - There could be a perf_event(s) attached to this tp_event, so the patch checks ->perf_refcount. - TRACE_REG_UNREGISTER can be suppressed by FTRACE_EVENT_FL_SOFT_MODE, so we simply check FTRACE_EVENT_FL_ENABLED protected by event_mutex. Link: http://lkml.kernel.org/r/20130729175033.GB26284@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)Oleg Nesterov2013-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | debugfs_remove_recursive() is wrong, 1. it wrongly assumes that !list_empty(d_subdirs) means that this dir should be removed. This is not that bad by itself, but: 2. if d_subdirs does not becomes empty after __debugfs_remove() it gives up and silently fails, it doesn't even try to remove other entries. However ->d_subdirs can be non-empty because it still has the already deleted !debugfs_positive() entries. 3. simple_release_fs() is called even if __debugfs_remove() fails. Suppose we have dir1/ dir2/ file2 file1 and someone opens dir1/dir2/file2. Now, debugfs_remove_recursive(dir1/dir2) succeeds, and dir1/dir2 goes away. But debugfs_remove_recursive(dir1) silently fails and doesn't remove this directory. Because it tries to delete (the already deleted) dir1/dir2/file2 again and then fails due to "Avoid infinite loop" logic. Test-case: #!/bin/sh cd /sys/kernel/debug/tracing echo 'p:probe/sigprocmask sigprocmask' >> kprobe_events sleep 1000 < events/probe/sigprocmask/id & echo -n >| kprobe_events [ -d events/probe ] && echo "ERR!! failed to rm probe" And after that it is not possible to create another probe entry. With this patch debugfs_remove_recursive() skips !debugfs_positive() files although this is not strictly needed. The most important change is that it does not try to make ->d_subdirs empty, it simply scans the whole list(s) recursively and removes as much as possible. Link: http://lkml.kernel.org/r/20130726151256.GC19472@redhat.com Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | ftrace: Check module functions being traced on reloadSteven Rostedt (Red Hat)2013-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's been a nasty bug that would show up and not give much info. The bug displayed the following warning: WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230() Pid: 20903, comm: bash Tainted: G O 3.6.11+ #38405.trunk Call Trace: [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20 [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230 [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0 [<ffffffff811401cc>] ? kfree+0x2c/0x110 [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150 [<ffffffff81149f1e>] __fput+0xae/0x220 [<ffffffff8114a09e>] ____fput+0xe/0x10 [<ffffffff8105fa22>] task_work_run+0x72/0x90 [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0 [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c [<ffffffff815c0f88>] int_signal+0x12/0x17 ---[ end trace 793179526ee09b2c ]--- It was finally narrowed down to unloading a module that was being traced. It was actually more than that. When functions are being traced, there's a table of all functions that have a ref count of the number of active tracers attached to that function. When a function trace callback is registered to a function, the function's record ref count is incremented. When it is unregistered, the function's record ref count is decremented. If an inconsistency is detected (ref count goes below zero) the above warning is shown and the function tracing is permanently disabled until reboot. The ftrace callback ops holds a hash of functions that it filters on (and/or filters off). If the hash is empty, the default means to filter all functions (for the filter_hash) or to disable no functions (for the notrace_hash). When a module is unloaded, it frees the function records that represent the module functions. These records exist on their own pages, that is function records for one module will not exist on the same page as function records for other modules or even the core kernel. Now when a module unloads, the records that represents its functions are freed. When the module is loaded again, the records are recreated with a default ref count of zero (unless there's a callback that traces all functions, then they will also be traced, and the ref count will be incremented). The problem is that if an ftrace callback hash includes functions of the module being unloaded, those hash entries will not be removed. If the module is reloaded in the same location, the hash entries still point to the functions of the module but the module's ref counts do not reflect that. With the help of Steve and Joern, we found a reproducer: Using uinput module and uinput_release function. cd /sys/kernel/debug/tracing modprobe uinput echo uinput_release > set_ftrace_filter echo function > current_tracer rmmod uinput modprobe uinput # check /proc/modules to see if loaded in same addr, otherwise try again echo nop > current_tracer [BOOM] The above loads the uinput module, which creates a table of functions that can be traced within the module. We add uinput_release to the filter_hash to trace just that function. Enable function tracincg, which increments the ref count of the record associated to uinput_release. Remove uinput, which frees the records including the one that represents uinput_release. Load the uinput module again (and make sure it's at the same address). This recreates the function records all with a ref count of zero, including uinput_release. Disable function tracing, which will decrement the ref count for uinput_release which is now zero because of the module removal and reload, and we have a mismatch (below zero ref count). The solution is to check all currently tracing ftrace callbacks to see if any are tracing any of the module's functions when a module is loaded (it already does that with callbacks that trace all functions). If a callback happens to have a module function being traced, it increments that records ref count and starts tracing that function. There may be a strange side effect with this, where tracing module functions on unload and then reloading a new module may have that new module's functions being traced. This may be something that confuses the user, but it's not a big deal. Another approach is to disable all callback hashes on module unload, but this leaves some ftrace callbacks that may not be registered, but can still have hashes tracing the module's function where ftrace doesn't know about it. That situation can cause the same bug. This solution solves that case too. Another benefit of this solution, is it is possible to trace a module's function on unload and load. Link: http://lkml.kernel.org/r/20130705142629.GA325@redhat.com Reported-by: Jörn Engel <joern@logfs.org> Reported-by: Dave Jones <davej@redhat.com> Reported-by: Steve Hodgson <steve@purestorage.com> Tested-by: Steve Hodgson <steve@purestorage.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | ftrace: Consolidate some duplicate code for updating ftrace opsSteven Rostedt (Red Hat)2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ftrace ops modifies the functions that it will trace, the update to the function mcount callers may need to be modified. Consolidate the two places that do the checks to see if an update is required with a wrapper function for those checks. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: Change remove_event_file_dir() to clear "d_subdirs"->i_privateOleg Nesterov2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change remove_event_file_dir() to clear ->i_private for every file we are going to remove. We need to check file->dir != NULL because event_create_dir() can fail. debugfs_remove_recursive(NULL) is fine but the patch moves it under the same check anyway for readability. spin_lock(d_lock) and "d_inode != NULL" check are not needed afaics, but I do not understand this code enough. tracing_open_generic_file() and tracing_release_generic_file() can go away, ftrace_enable_fops and ftrace_event_filter_fops() use tracing_open_generic() but only to check tracing_disabled. This fixes all races with event_remove() or instance_delete(). f_op->read/write/whatever can never use the freed file/call, all event/* files were changed to check and use ->i_private under event_mutex. Note: this doesn't not fix other problems, event_remove() can destroy the active ftrace_event_call, we need more changes but those changes are completely orthogonal. Link: http://lkml.kernel.org/r/20130728183527.GB16723@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: Introduce remove_event_file_dir()Oleg Nesterov2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preparation for the next patch. Extract the common code from remove_event_from_tracers() and __trace_remove_event_dirs() into the new helper, remove_event_file_dir(). The patch looks more complicated than it actually is, it also moves remove_subsystem() up to avoid the forward declaration. Link: http://lkml.kernel.org/r/20130726172547.GA3629@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: Change f_start() to take event_mutex and verify i_private != NULLOleg Nesterov2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | trace_format_open() and trace_format_seq_ops are racy, nothing protects ftrace_event_call from trace_remove_event_call(). Change f_start() to take event_mutex and verify i_private != NULL, change f_stop() to drop this lock. This fixes nothing, but now we can change debugfs_remove("format") callers to nullify ->i_private and fix the the problem. Note: the usage of event_mutex is sub-optimal but simple, we can change this later. Link: http://lkml.kernel.org/r/20130726172543.GA3622@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | tracing: Change event_filter_read/write to verify i_private != NULLOleg Nesterov2013-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | event_filter_read/write() are racy, ftrace_event_call can be already freed by trace_remove_event_call() callers. 1. Shift mutex_lock(event_mutex) from print/apply_event_filter to the callers. 2. Change the callers, event_filter_read() and event_filter_write() to read i_private under this mutex and abort if it is NULL. This fixes nothing, but now we can change debugfs_remove("filter") callers to nullify ->i_private and fix the the problem. Link: http://lkml.kernel.org/r/20130726172540.GA3619@redhat.com Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>