aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* mmc: dw_mmc: Fix PIO mode with support of highmemSeungwon Jeon2012-02-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current PIO mode makes a kernel crash with CONFIG_HIGHMEM. Highmem pages have a NULL from sg_virt(sg). This patch fixes the following problem. Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c0004000 [00000000] *pgd=00000000 Internal error: Oops: 817 [#1] PREEMPT SMP Modules linked in: CPU: 0 Not tainted (3.0.15-01423-gdbf465f #589) PC is at dw_mci_pull_data32+0x4c/0x9c LR is at dw_mci_read_data_pio+0x54/0x1f0 pc : [<c0358824>] lr : [<c035988c>] psr: 20000193 sp : c0619d48 ip : c0619d70 fp : c0619d6c r10: 00000000 r9 : 00000002 r8 : 00001000 r7 : 00000200 r6 : 00000000 r5 : e1dd3100 r4 : 00000000 r3 : 65622023 r2 : 0000007f r1 : eeb96000 r0 : e1dd3100 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment xkernel Control: 10c5387d Table: 61e2004a DAC: 00000015 Process swapper (pid: 0, stack limit = 0xc06182f0) Stack: (0xc0619d48 to 0xc061a000) 9d40: e1dd3100 e1a4f000 00000000 e1dd3100 e1a4f000 00000200 9d60: c0619da4 c0619d70 c035988c c03587e4 c0619d9c e18158f4 e1dd3100 e1dd3100 9d80: 00000020 00000000 00000000 00000020 c06e8a84 00000000 c0619e04 c0619da8 9da0: c0359b24 c0359844 e18158f4 e1dd3164 e1dd3168 e1dd3150 3d02fc79 e1dd3154 9dc0: e1dd3178 00000000 00000020 00000000 e1dd3150 00000000 c10dd7e8 e1a84900 9de0: c061e7cc 00000000 00000000 0000008d c06e8a84 c061e780 c0619e4c c0619e08 9e00: c00c4738 c0359a34 3d02fc79 00000000 c0619e4c c05a1698 c05a1670 c05a165c 9e20: c04de8b0 c061e780 c061e7cc e1a84900 ffffed68 0000008d c0618000 00000000 9e40: c0619e6c c0619e50 c00c48b4 c00c46c8 c061e780 c00423ac c061e7cc ffffed68 9e60: c0619e8c c0619e70 c00c7358 c00c487c 0000008d ffffee38 c0618000 ffffed68 9e80: c0619ea4 c0619e90 c00c4258 c00c72b0 c00423ac ffffee38 c0619ecc c0619ea8 9ea0: c004241c c00c4234 ffffffff f8810000 0000006d 00000002 00000001 7fffffff 9ec0: c0619f44 c0619ed0 c0048bc0 c00423c4 220ae7a9 00000000 386f0d30 0005d3a4 9ee0: c00423ac c10dd0b8 c06f2cd8 c0618000 c0594778 c003a674 7fffffff c0619f44 9f00: 386f0d30 c0619f18 c00a6f94 c005be3c 80000013 ffffffff 386f0d30 0005d3a4 9f20: 386f0d30 0005d2d1 c10dd0a8 c10dd0b8 c06f2cd8 c0618000 c0619f74 c0619f48 9f40: c0345858 c005be00 c00a2440 c0618000 c0618000 c00410d8 c06c1944 c00410fc 9f60: c0594778 c003a674 c0619f9c c0619f78 c004a7e8 c03457b4 c0618000 c06c18f8 9f80: 00000000 c0039c70 c06c18d4 c003a674 c0619fb4 c0619fa0 c04ceafc c004a714 9fa0: c06287b4 c06c18f8 c0619ff4 c0619fb8 c0008b68 c04cea68 c0008578 00000000 9fc0: 00000000 c003a674 00000000 10c5387d c0628658 c003aa78 c062f1c4 4000406a 9fe0: 413fc090 00000000 00000000 c0619ff8 40008044 c0008858 00000000 00000000 Backtrace: [<c03587d8>] (dw_mci_pull_data32+0x0/0x9c) from [<c035988c>] (dw_mci_read_data_pio+0x54/0x1f0) r6:00000200 r5:e1a4f000 r4:e1dd3100 [<c0359838>] (dw_mci_read_data_pio+0x0/0x1f0) from [<c0359b24>] (dw_mci_interrupt+0xfc/0x4a4) [<c0359a28>] (dw_mci_interrupt+0x0/0x4a4) from [<c00c4738>] (handle_irq_event_percpu+0x7c/0x1b4) [<c00c46bc>] (handle_irq_event_percpu+0x0/0x1b4) from [<c00c48b4>] (handle_irq_event+0x44/0x64) [<c00c4870>] (handle_irq_event+0x0/0x64) from [<c00c7358>] (handle_fasteoi_irq+0xb4/0x124) r7:ffffed68 r6:c061e7cc r5:c00423ac r4:c061e780 [<c00c72a4>] (handle_fasteoi_irq+0x0/0x124) from [<c00c4258>] (generic_handle_irq+0x30/0x38) r7:ffffed68 r6:c0618000 r5:ffffee38 r4:0000008d [<c00c4228>] (generic_handle_irq+0x0/0x38) from [<c004241c>] (asm_do_IRQ+0x64/0xe0) r5:ffffee38 r4:c00423ac [<c00423b8>] (asm_do_IRQ+0x0/0xe0) from [<c0048bc0>] (__irq_svc+0x80/0x14c) Exception stack(0xc0619ed0 to 0xc0619f18) Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: atmel-mci: save and restore sdioirq when soft reset is performedLudovic Desroches2012-02-13
| | | | | | | | | | | Sometimes a software reset is needed. Then some registers are saved and restored but the interrupt mask register is missing. It causes issues with sdio devices whose interrupts are masked after reset. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: block: Init ro_lock sysfs attr to fix lockdep warningsRabin Vincent2012-02-13
| | | | | | | | | Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com> Signed-off-by: Johan Rudholm <johan.rudholm@stericsson.com> Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: sh_mmcif: fix late delayed work initialisationGuennadi Liakhovetski2012-02-13
| | | | | | | | | | | | | | | | | If the driver is loaded with a card in the slot, mmc_add_host() will schedule an immediate card-detection work, which will start IO and wait for command completion. Usually the kernel first returns to the sh_mmcif probe function, lets it finish and only then schedules the rescan work. But sometimes, expecially under heavy system load, the work will be scheduled immediately before returning to the probe method. In this case it is important for the driver to be fully prepared for IO. For sh_mmcif this means, that also the timeout work has to be initialised before calling mmc_add_host(). It is also better to prepare interrupts beforehand. Besides, since mmc_add_host() does card-detection itself, there is no need to do it again immediately afterwards. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: tmio_mmc: fix card eject during IO with DMAGuennadi Liakhovetski2012-02-13
| | | | | | | | | When DMA is in use and the card is ejected during IO, DMA transfers have to be terminated, otherwise the dmaengine driver fails to operate properly, when the card is re-inserted. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: core: Fix comparison issue in mmc_compare_ext_csdsJurgen Heeks2012-02-13
| | | | | | | | | | Found this issue during code review. Actually, there are two issues which both compensate together in lucky case. In unlucky case the bus width probing might not work as expected. Signed-off-by: Jurgen Heeks <jurgen.heeks@nokia.com> Reviewed-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: core: Fix PowerOff Notify suspend/resumeGirish K S2012-02-13
| | | | | | | | | | | | | | | Modified the mmc_poweroff to resume before sending the poweroff notification command. In sleep mode only AWAKE and RESET commands are allowed, so before sending the poweroff notification command resume from sleep mode and then send the notification command. PowerOff Notify is tested on a Synopsis Designware Host Controller (eMMC 4.5). The suspend to RAM and resume works fine. Signed-off-by: Girish K S <girish.shivananjappa@linaro.org> Tested-by: Girish K S <girish.shivananjappa@linaro.org> Reviewed-by: Saugata Das <saugata.das@linaro.org> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: sdhci-pci: set Medfield SDIO as non-removableAdrian Hunter2012-02-13
| | | | | | | | Set Medfield SDIO as non-removable to avoid un-necessary card detect activity. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: core: add the capability for broken voltageJaehoon Chung2012-02-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is an understood mismatch between the voltage the host controller is set to and the voltage supplied to the card by a fixed voltage regulator. Teaching the driver to accept the mismatch is overly complicated. Instead just accept the regulator's voltage. This patch adds MMC_CAP2_BROKEN_VOLTAGE. If the voltage didn't satisfy between min_uV and max_uV, try to change the voltage in core.c. When changing the voltage, maybe use regulator_set_voltage(). In regulator_set_voltage(), check the below condition. /* sanity check */ if (!rdev->desc->ops->set_voltage && !rdev->desc->ops->set_voltage_sel) { ret = -EINVAL; goto out; } If some board should use the fixed-regulator, always return -EINVAL. Then, eMMC didn't initialize always. So if use a fixed-regulator, we need to add the MMC_CAP2_BROKEN_VOLTAGE. Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: core: Fix low speed mmc card detection failureGirish K S2012-02-13
| | | | | | | This patch fixes the failure of low speed mmc card detection. Signed-off-by: Girish K S <girish.shivananjappa@linaro.org> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: esdhc: set the timeout to the max valueJerry Huang2012-02-13
| | | | | | | | | | | | | | | | | | When accessing the card on some FSL platform boards (e.g p2020, p1010, mpc8536), the following error is reported with the timeout value calculated: mmc0: Got data interrupt 0x00000020 even though no data operation was in progress. mmc0: Got data interrupt 0x00000020 even though no data operation was in progress. So we skip the calculation of timeout and use the max value to fix it. Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com> Signed-off-by: Gao Guanhua <B22826@freescale.com> Signed-off-by: Xie Xiaobo <X.Xie@freescale.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: esdhc: add PIO mode supportJerry Huang2012-02-13
| | | | | | | | | | | For some FSL ESDHC controllers (e.g. P2020E, Rev1.0), the SDHC can not work on DMA mode because of the hardware bug, so we set a broken dma flag and use PIO mode. Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com> Signed-off-by: Gao Guanhua <B22826@freescale.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: core: Ensure clocks are always enabled before host interactionSujit Reddy Thumma2012-02-13
| | | | | | | | | | | | Ensure clocks are always enabled before any interaction with the host controller driver. This makes sure that there is no race between host execution and the core layer turning off clocks in different context with clock gating framework. Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Per Forlin <per.forlin@stericsson.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: of_mmc_spi: fix little endian supportJean-Christophe PLAGNIOL-VILLARD2012-02-13
| | | | | | | | The voltage_ranges is supposed to switch from big endian to little endian. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: core: UHS sdio card that fails should not exceed 50MHzPhilip Rakity2012-02-13
| | | | | | | | | | | | A UHS sdio card that fails initialization at 1.8v signaling is not in UHS mode. We cannot use the speed in the the cis to reflect the bus speed as this is the maxiumum value and will not reflect the fact that the host is operating at a lower (non uhs) bus speed. Signed-off-by: Philip Rakity <prakity@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Reviewed-by: Aaron Lu <aaron.lu@amd.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* mmc: esdhc: fix errors when booting kernel on Freescale eSDHC version 2.3Roy Zang2012-02-12
| | | | | | | | | | | | | | | | | | | | | | | | | When eSDHC module is enabled on P5020/P3041/P2041/P1010 with eSDHC version 2.3, there is following errors: mmc0: Timeout waiting for hardware interrupt. mmc0: error -110 whilst initialising SD card mmc0: Unexpected interrupt 0x02000000. mmc0: Timeout waiting for hardware interrupt. mmc0: error -110 whilst initialising SD card mmc0: Unexpected interrupt 0x02000000. It is because eSDHC controller has different bit setting for PROCTL register at 0x28 comparing SD specification. This patch sets DMAS bits correctly for byte operation and does not change the default value of other field of PROCTL register. For other FSL chips, such as MPC8536/P2020, PROCTL[DMAS] bits are reserved and even if they are set to wrong bits, it will not take effective. Signed-off-by: Roy Zang <tie-fei.zang@freescale.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: Chris Ball <cjb@laptop.org>
* Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds2012-02-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Says Jens: "Time to push off some of the pending items. I really wanted to wait until we had the regression nailed, but alas it's not quite there yet. But I'm very confident that it's "just" a missing expire on exit, so fix from Tejun should be fairly trivial. I'm headed out for a week on the slopes. - Killing the barrier part of mtip32xx. It doesn't really support barriers, and it doesn't need them (writes are fully ordered). - A few fixes from Dan Carpenter, preventing overflows of integer multiplication. - A fixup for loop, fixing a previous commit that didn't quite solve the partial read problem from Dave Young. - A bio integer overflow fix from Kent Overstreet. - Improvement/fix of the door "keep locked" part of the cdrom shared code from Paolo Benzini. - A few cfq fixes from Shaohua Li. - A fix for bsg sysfs warning when removing a file it did not create from Stanislaw Gruszka. - Two fixes for floppy from Vivek, preventing a crash. - A few block core fixes from Tejun. One killing the over-optimized ioc exit path, cleaning that up nicely. Two others fixing an oops on elevator switch, due to calling into the scheduler merge check code without holding the queue lock." * 'for-linus' of git://git.kernel.dk/linux-block: block: fix lockdep warning on io_context release put_io_context() relay: prevent integer overflow in relay_open() loop: zero fill bio instead of return -EIO for partial read bio: don't overflow in bio_get_nr_vecs() floppy: Fix a crash during rmmod floppy: Cleanup disk->queue before caling put_disk() if add_disk() was never called cdrom: move shared static to cdrom_device_info bsg: fix sysfs link remove warning block: don't call elevator callbacks for plug merges block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator functions mtip32xx: removed the irrelevant argument of mtip_hw_submit_io() and the unused member of struct driver_data block: strip out locking optimization in put_io_context() cdrom: use copy_to_user() without the underscores block: fix ioc locking warning block: fix NULL icq_cache reference block,cfq: change code order
| * block: fix lockdep warning on io_context release put_io_context()Tejun Heo2012-02-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11a3122f6c "block: strip out locking optimization in put_io_context()" removed ioc_lock depth lockdep annoation along with locking optimization; however, while recursing from put_io_context() is no longer possible, ioc_release_fn() may still end up putting the last reference of another ioc through elevator, which wlil grab ioc->lock triggering spurious (as the ioc is always different one) A-A deadlock warning. As this can only happen one time from ioc_release_fn(), using non-zero subclass from ioc_release_fn() is enough. Use subclass 1. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * relay: prevent integer overflow in relay_open()Dan Carpenter2012-02-10
| | | | | | | | | | | | | | | | | | "subbuf_size" and "n_subbufs" come from the user and they need to be capped to prevent an integer overflow. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * loop: zero fill bio instead of return -EIO for partial readDave Young2012-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8268f5a741 ("deny partial write for loop dev fd") tried to fix the loop device partial read information leak problem. But it changed the semantics of read behavior. When we read beyond the end of the device we should get 0 bytes, which is normal behavior, we should not just return -EIO Instead of returning -EIO, zero out the bio to avoid information leak in case of partail read. Signed-off-by: Dave Young <dyoung@redhat.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Tested-by: Jeff Moyer <jmoyer@redhat.com> Cc: Dmitry Monakhov <dmonakhov@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * bio: don't overflow in bio_get_nr_vecs()Kent Overstreet2012-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were two places bio_get_nr_vecs() could overflow: First, it did a left shift to convert from sectors to bytes immediately before dividing by PAGE_SIZE. If PAGE_SIZE ever was less than 512 a great many things would break, so dividing by PAGE_SIZE >> 9 is safe and will generate smaller code too. The nastier overflow was in the DIV_ROUND_UP() (that's what the code was effectively doing, anyways). If n + d overflowed, the whole thing would return 0 which breaks things rather effectively. bio_get_nr_vecs() doesn't claim to give an exact value anyways, so the DIV_ROUND_UP() is silly; we could do a straight divide except if a device's queue_max_sectors was less than PAGE_SIZE we'd return 0. So we just add 1; this should always be safe - things will break badly if bio_get_nr_vecs() returns > BIO_MAX_PAGES (bio_alloc() will suddenly start failing) but it's queue_max_segments that must guard against this, if queue_max_sectors is preventing this from happen things are going to explode on architectures with different PAGE_SIZE. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Tejun Heo <tj@kernel.org> Acked-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * floppy: Fix a crash during rmmodVivek Goyal2012-02-08
| | | | | | | | | | | | | | | | | | | | | | floppy driver does not call add_disk() on all the drives hence we don't take gendisk reference on request queue for these drives. Don't call put_disk() with disk->queue set, otherwise we try to put the reference we never took. Reported-and-tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de> Signed-off-by: Vivek Goyal<vgoyal@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * floppy: Cleanup disk->queue before caling put_disk() if add_disk() was never ↵Vivek Goyal2012-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | called add_disk() takes gendisk reference on request queue. If driver failed during initialization and never called add_disk() then that extra reference is not taken. That reference is put in put_disk(). floppy driver allocates the disk, allocates queue, sets disk->queue and then relizes that floppy controller is not present. It tries to tear down everything and tries to put a reference down in put_disk() which was never taken. In such error cases cleanup disk->queue before calling put_disk() so that we never try to put down a reference which was never taken in first place. Reported-and-tested-by: Suresh Jayaraman <sjayaraman@suse.com> Tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * cdrom: move shared static to cdrom_device_infoPaolo Bonzini2012-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | The keeplocked variable in the cdrom driver is shared across multiple drives, but set in per-device ioctls. Move it to the per-device struct, avoiding that the setting on one drive affects the driver's behavior when closing another. [ Impact: limit udev's confusion to one drive when a CD burning program unlocks the CD door at the end of burning. ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * bsg: fix sysfs link remove warningStanislaw Gruszka2012-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We create "bsg" link if q->kobj.sd is not NULL, so remove it only when the same condition is true. Fixes: WARNING: at fs/sysfs/inode.c:323 sysfs_hash_and_remove+0x2b/0x77() sysfs: can not remove 'bsg', no directory Call Trace: [<c0429683>] warn_slowpath_common+0x6a/0x7f [<c0537a68>] ? sysfs_hash_and_remove+0x2b/0x77 [<c042970b>] warn_slowpath_fmt+0x2b/0x2f [<c0537a68>] sysfs_hash_and_remove+0x2b/0x77 [<c053969a>] sysfs_remove_link+0x20/0x23 [<c05d88f1>] bsg_unregister_queue+0x40/0x6d [<c0692263>] __scsi_remove_device+0x31/0x9d [<c069149f>] scsi_forget_host+0x41/0x52 [<c0689fa9>] scsi_remove_host+0x71/0xe0 [<f7de5945>] quiesce_and_remove_host+0x51/0x83 [usb_storage] [<f7de5a1e>] usb_stor_disconnect+0x18/0x22 [usb_storage] [<c06c29de>] usb_unbind_interface+0x4e/0x109 [<c067a80f>] __device_release_driver+0x6b/0xa6 [<c067a861>] device_release_driver+0x17/0x22 [<c067a46a>] bus_remove_device+0xd6/0xe6 [<c06785e2>] device_del+0xf2/0x137 [<c06c101f>] usb_disable_device+0x94/0x1a0 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * block: don't call elevator callbacks for plug mergesTejun Heo2012-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Plug merge calls two elevator callbacks outside queue lock - elevator_allow_merge_fn() and elevator_bio_merged_fn(). Although attempt_plug_merge() suggests that elevator is guaranteed to be there through the existing request on the plug list, nothing prevents plug merge from calling into dying or initializing elevator. For regular merges, bypass ensures elvpriv count to reach zero, which in turn prevents merges as all !ELVPRIV requests get REQ_SOFTBARRIER from forced back insertion. Plug merge doesn't check ELVPRIV, and, as the requests haven't gone through elevator insertion yet, it doesn't have SOFTBARRIER set allowing merges on a bypassed queue. This, for example, leads to the following crash during elevator switch. BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffff813b34e9>] cfq_allow_merge+0x49/0xa0 PGD 112cbc067 PUD 115d5c067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU 1 Modules linked in: deadline_iosched Pid: 819, comm: dd Not tainted 3.3.0-rc2-work+ #76 Bochs Bochs RIP: 0010:[<ffffffff813b34e9>] [<ffffffff813b34e9>] cfq_allow_merge+0x49/0xa0 RSP: 0018:ffff8801143a38f8 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff88011817ce28 RCX: ffff880116eb6cc0 RDX: 0000000000000000 RSI: ffff880118056e20 RDI: ffff8801199512f8 RBP: ffff8801143a3908 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff880118195708 R13: ffff880118052aa0 R14: ffff8801143a3d50 R15: ffff880118195708 FS: 00007f19f82cb700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000008 CR3: 0000000112c6a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dd (pid: 819, threadinfo ffff8801143a2000, task ffff880116eb6cc0) Stack: ffff88011817ce28 ffff880118195708 ffff8801143a3928 ffffffff81391bba ffff88011817ce28 ffff880118195708 ffff8801143a3948 ffffffff81391bf1 ffff88011817ce28 0000000000000000 ffff8801143a39a8 ffffffff81398e3e Call Trace: [<ffffffff81391bba>] elv_rq_merge_ok+0x4a/0x60 [<ffffffff81391bf1>] elv_try_merge+0x21/0x40 [<ffffffff81398e3e>] blk_queue_bio+0x8e/0x390 [<ffffffff81396a5a>] generic_make_request+0xca/0x100 [<ffffffff81396b04>] submit_bio+0x74/0x100 [<ffffffff811d45c2>] __blockdev_direct_IO+0x1ce2/0x3450 [<ffffffff811d0dc7>] blkdev_direct_IO+0x57/0x60 [<ffffffff811460b5>] generic_file_aio_read+0x6d5/0x760 [<ffffffff811986b2>] do_sync_read+0xe2/0x120 [<ffffffff81199345>] vfs_read+0xc5/0x180 [<ffffffff81199501>] sys_read+0x51/0x90 [<ffffffff81aeac12>] system_call_fastpath+0x16/0x1b There are multiple ways to fix this including making plug merge check ELVPRIV; however, * Calling into elevator outside queue lock is confusing and error-prone. * Requests on plug list aren't known to the elevator. They aren't on the elevator yet, so there's no elevator specific state to update. * Given the nature of plug merges - collecting bio's for the same purpose from the same issuer - elevator specific restrictions aren't applicable. So, simply don't call into elevator methods from plug merge by moving elv_bio_merged() from bio_attempt_*_merge() to blk_queue_bio(), and using blk_try_merge() in attempt_plug_merge(). This is based on Jens' patch to skip elevator_allow_merge_fn() from plug merge. Note that this makes per-cgroup merged stats skip plug merging. Signed-off-by: Tejun Heo <tj@kernel.org> LKML-Reference: <4F16F3CA.90904@kernel.dk> Original-patch-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator ↵Tejun Heo2012-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | functions blk_rq_merge_ok() is the elevator-neutral part of merge eligibility test. blk_try_merge() determines merge direction and expects the caller to have tested elv_rq_merge_ok() previously. elv_rq_merge_ok() now wraps blk_rq_merge_ok() and then calls elv_iosched_allow_merge(). elv_try_merge() is removed and the two callers are updated to call elv_rq_merge_ok() explicitly followed by blk_try_merge(). While at it, make rq_merge_ok() functions return bool. This is to prepare for plug merge update and doesn't introduce any behavior change. This is based on Jens' patch to skip elevator_allow_merge_fn() from plug merge. Signed-off-by: Tejun Heo <tj@kernel.org> LKML-Reference: <4F16F3CA.90904@kernel.dk> Original-patch-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * mtip32xx: removed the irrelevant argument of mtip_hw_submit_io() and the ↵Asai Thambi S P2012-02-07
| | | | | | | | | | | | | | | | | | | | | | | | unused member of struct driver_data Removed the following: * irrelevant argument 'barrier' of mtip_hw_submit_io() * unused member 'eh_active' of struct driver_data Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Sam Bradshaw <sbradshaw@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * block: strip out locking optimization in put_io_context()Tejun Heo2012-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | put_io_context() performed a complex trylock dancing to avoid deferring ioc release to workqueue. It was also broken on UP because trylock was always assumed to succeed which resulted in unbalanced preemption count. While there are ways to fix the UP breakage, even the most pathological microbench (forced ioc allocation and tight fork/exit loop) fails to show any appreciable performance benefit of the optimization. Strip it out. If there turns out to be workloads which are affected by this change, simpler optimization from the discussion thread can be applied later. Signed-off-by: Tejun Heo <tj@kernel.org> LKML-Reference: <1328514611.21268.66.camel@sli10-conroe> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * cdrom: use copy_to_user() without the underscoresDan Carpenter2012-02-06
| | | | | | | | | | | | | | | | | | | | | | | | "nframes" comes from the user and "nframes * CD_FRAMESIZE_RAW" can wrap on 32 bit systems. That would have been ok if we used the same wrapped value for the copy, but we use a shifted value. We should just use the checked version of copy_to_user() because it's not going to make a difference to the speed. Cc: stable@vger.kernel.com Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * block: fix ioc locking warningShaohua Li2012-02-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Meelis reported a warning: WARNING: at kernel/timer.c:1122 run_timer_softirq+0x199/0x1ec() Hardware name: 939Dual-SATA2 timer: cfq_idle_slice_timer+0x0/0xaa preempt leak: 00000102 -> 00000103 Modules linked in: sr_mod cdrom videodev media drm_kms_helper ohci_hcd ehci_hcd v4l2_compat_ioctl32 usbcore i2c_ali15x3 snd_seq drm snd_timer snd_seq Pid: 0, comm: swapper Not tainted 3.3.0-rc2-00110-gd125666 #176 Call Trace: <IRQ> [<ffffffff81022aaa>] warn_slowpath_common+0x7e/0x96 [<ffffffff8114c485>] ? cfq_slice_expired+0x1d/0x1d [<ffffffff81022b56>] warn_slowpath_fmt+0x41/0x43 [<ffffffff8114c526>] ? cfq_idle_slice_timer+0xa1/0xaa [<ffffffff8114c485>] ? cfq_slice_expired+0x1d/0x1d [<ffffffff8102c124>] run_timer_softirq+0x199/0x1ec [<ffffffff81047a53>] ? timekeeping_get_ns+0x12/0x31 [<ffffffff810145fd>] ? apic_write+0x11/0x13 [<ffffffff81027475>] __do_softirq+0x74/0xfa [<ffffffff812f337a>] call_softirq+0x1a/0x30 [<ffffffff81002ff9>] do_softirq+0x31/0x68 [<ffffffff810276cf>] irq_exit+0x3d/0xa3 [<ffffffff81014aca>] smp_apic_timer_interrupt+0x6b/0x77 [<ffffffff812f2de9>] apic_timer_interrupt+0x69/0x70 <EOI> [<ffffffff81040136>] ? sched_clock_cpu+0x73/0x7d [<ffffffff81040136>] ? sched_clock_cpu+0x73/0x7d [<ffffffff8100801f>] ? default_idle+0x1e/0x32 [<ffffffff81008019>] ? default_idle+0x18/0x32 [<ffffffff810008b1>] cpu_idle+0x87/0xd1 [<ffffffff812de861>] rest_init+0x85/0x89 [<ffffffff81659a4d>] start_kernel+0x2eb/0x2f8 [<ffffffff8165926e>] x86_64_start_reservations+0x7e/0x82 [<ffffffff81659362>] x86_64_start_kernel+0xf0/0xf7 this_q == locked_q is possible. There are two problems here: 1. In UP case, there is preemption counter issue as spin_trylock always successes. 2. In SMP case, the loop breaks too earlier. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Reported-by: Meelis Roos <mroos@linux.ee> Reported-by: Knut Petersen <Knut_Petersen@t-online.de> Tested-by: Knut Petersen <Knut_Petersen@t-online.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * block: fix NULL icq_cache referenceShaohua Li2012-01-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vivek reported a kernel crash: [ 94.217015] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c [ 94.218004] IP: [<ffffffff81142fae>] kmem_cache_free+0x5e/0x200 [ 94.218004] PGD 13abda067 PUD 137d52067 PMD 0 [ 94.218004] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 94.218004] CPU 0 [ 94.218004] Modules linked in: [last unloaded: scsi_wait_scan] [ 94.218004] [ 94.218004] Pid: 0, comm: swapper/0 Not tainted 3.2.0+ #16 Hewlett-Packard HP xw6600 Workstation/0A9Ch [ 94.218004] RIP: 0010:[<ffffffff81142fae>] [<ffffffff81142fae>] kmem_cache_free+0x5e/0x200 [ 94.218004] RSP: 0018:ffff88013fc03de0 EFLAGS: 00010006 [ 94.218004] RAX: ffffffff81e0d020 RBX: ffff880138b3c680 RCX: 00000001801c001b [ 94.218004] RDX: 00000000003aac1d RSI: ffff880138b3c680 RDI: ffffffff81142fae [ 94.218004] RBP: ffff88013fc03e10 R08: ffff880137830238 R09: 0000000000000001 [ 94.218004] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 94.218004] R13: ffffea0004e2cf00 R14: ffffffff812f6eb6 R15: 0000000000000246 [ 94.218004] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000 [ 94.218004] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 94.218004] CR2: 000000000000001c CR3: 00000001395ab000 CR4: 00000000000006f0 [ 94.218004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 94.218004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 94.218004] Process swapper/0 (pid: 0, threadinfo ffffffff81e00000, task ffffffff81e0d020) [ 94.218004] Stack: [ 94.218004] 0000000000000102 ffff88013fc0db20 ffffffff81e22700 ffff880139500f00 [ 94.218004] 0000000000000001 000000000000000a ffff88013fc03e20 ffffffff812f6eb6 [ 94.218004] ffff88013fc03e90 ffffffff810c8da2 ffffffff81e01fd8 ffff880137830240 [ 94.218004] Call Trace: [ 94.218004] <IRQ> [ 94.218004] [<ffffffff812f6eb6>] icq_free_icq_rcu+0x16/0x20 [ 94.218004] [<ffffffff810c8da2>] __rcu_process_callbacks+0x1c2/0x420 [ 94.218004] [<ffffffff810c9038>] rcu_process_callbacks+0x38/0x250 [ 94.218004] [<ffffffff810405ee>] __do_softirq+0xce/0x3e0 [ 94.218004] [<ffffffff8108ed04>] ? clockevents_program_event+0x74/0x100 [ 94.218004] [<ffffffff81090104>] ? tick_program_event+0x24/0x30 [ 94.218004] [<ffffffff8183ed1c>] call_softirq+0x1c/0x30 [ 94.218004] [<ffffffff8100422d>] do_softirq+0x8d/0xc0 [ 94.218004] [<ffffffff81040c3e>] irq_exit+0xae/0xe0 [ 94.218004] [<ffffffff8183f4be>] smp_apic_timer_interrupt+0x6e/0x99 [ 94.218004] [<ffffffff8183e330>] apic_timer_interrupt+0x70/0x80 Once a queue is quiesced, it's not supposed to have any elvpriv data or icq's, and elevator switching depends on that. Request alloc path followed the rule for elvpriv data but forgot apply it to icq's leading to the following crash during elevator switch. Fix it by not allocating icq's if ELVPRIV is not set for the request. Reported-by: Vivek Goyal <vgoyal@redhat.com> Tested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * block,cfq: change code orderShaohua Li2012-01-19
| | | | | | | | | | | | | | | | | | | | | | cfq_slice_expired will change saved_workload_slice. It should be called first so saved_workload_slice is correctly set to 0 after workload type is changed. This fixes the code order changed by 54b466e44b1c7. Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2012-02-10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quoth David: 1) GRO MAC header comparisons were ethernet specific, breaking other link types. This required a multi-faceted fix to cure the originally noted case (Infiniband), because IPoIB was lying about it's actual hard header length. Thanks to Eric Dumazet, Roland Dreier, and others. 2) Fix build failure when INET_UDP_DIAG is built in and ipv6 is modular. From Anisse Astier. 3) Off by ones and other bug fixes in netprio_cgroup from Neil Horman. 4) ipv4 TCP reset generation needs to respect any network interface binding from the socket, otherwise route lookups might give a different result than all the other segments received. From Shawn Lu. 5) Fix unintended regression in ipv4 proxy ARP responses, from Thomas Graf. 6) Fix SKB under-allocation bug in sh_eth, from Yoshihiro Shimoda. 7) Revert skge PCI mapping changes that are causing crashes for some folks, from Stephen Hemminger. 8) IPV4 route lookups fill in the wildcarded fields of the given flow lookup key passed in, which is fine most of the time as this is exactly what the caller's want. However there are a few cases that want to retain the original flow key values afterwards, so handle those cases properly. Fix from Julian Anastasov. 9) IGB/IXGBE VF lookup bug fixes from Greg Rose. 10) Properly null terminate filename passed to ethtool flash device method, from Ben Hutchings. 11) S3 resume fix in via-velocity from David Lv. 12) Fix double SKB free during xmit failure in CAIF, from Dmitry Tarnyagin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits) net: Don't proxy arp respond if iif == rt->dst.dev if private VLAN is disabled ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr. netprio_cgroup: fix wrong memory access when NETPRIO_CGROUP=m netprio_cgroup: don't allocate prio table when a device is registered netprio_cgroup: fix an off-by-one bug bna: fix error handling of bnad_get_flash_partition_by_offset() isdn: type bug in isdn_net_header() net: Make qdisc_skb_cb upper size bound explicit. ixgbe: ethtool: stats user buffer overrun ixgbe: dcb: up2tc mapping lost on disable/enable CEE DCB state ixgbe: do not update real num queues when netdev is going away ixgbe: Fix broken dependency on MAX_SKB_FRAGS being related to page size ixgbe: Fix case of Tx Hang in PF with 32 VFs ixgbe: fix vf lookup igb: fix vf lookup e1000: add dropped DMA receive enable back in for WoL gro: more generic L2 header check IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses zd1211rw: firmware needs duration_id set to zero for non-pspoll frames net: enable TC35815 for MIPS again ...
| * | net: Don't proxy arp respond if iif == rt->dst.dev if private VLAN is disabledThomas Graf2012-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 653241 (net: RFC3069, private VLAN proxy arp support) changed the behavior of arp proxy to send arp replies back out on the interface the request came in even if the private VLAN feature is disabled. Previously we checked rt->dst.dev != skb->dev for in scenarios, when proxy arp is enabled on for the netdevice and also when individual proxy neighbour entries have been added. This patch adds the check back for the pneigh_lookup() scenario. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr.Li Wei2012-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fix a bug which introduced by commit ac8a4810 (ipv4: Save nexthop address of LSRR/SSRR option to IPCB.).In that patch, we saved the nexthop of SRR in ip_option->nexthop and update iph->daddr until we get to ip_forward_options(), but we need to update it before ip_rt_get_source(), otherwise we may get a wrong src. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netprio_cgroup: fix wrong memory access when NETPRIO_CGROUP=mNeil Horman2012-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the netprio_cgroup module is not loaded, net_prio_subsys_id is -1, and so sock_update_prioidx() accesses cgroup_subsys array with negative index subsys[-1]. Make the code resembles cls_cgroup code, which is bug free. Origionally-authored-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netprio_cgroup: don't allocate prio table when a device is registeredNeil Horman2012-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So we delay the allocation till the priority is set through cgroup, and this makes skb_update_priority() faster when it's not set. This also eliminates an off-by-one bug similar with the one fixed in the previous patch. Origionally-authored-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netprio_cgroup: fix an off-by-one bugNeil Horman2012-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | # mount -t cgroup xxx /mnt # mkdir /mnt/tmp # cat /mnt/tmp/net_prio.ifpriomap lo 0 eth0 0 virbr0 0 # echo 'lo 999' > /mnt/tmp/net_prio.ifpriomap # cat /mnt/tmp/net_prio.ifpriomap lo 999 eth0 0 virbr0 4101267344 We got weired output, because we exceeded the boundary of the array. We may even crash the kernel.. Origionally-authored-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bna: fix error handling of bnad_get_flash_partition_by_offset()Dan Carpenter2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current error handling doesn't work because we flash_part is a u32 so the checks for negative error codes don't work. I considered making things signed but I don't know the hardware enough to say if that's a problem. Really, we don't use the error codes so just returning zero for all problems is fine. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | isdn: type bug in isdn_net_header()Dan Carpenter2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use len to store the return value from eth_header(). eth_header() can return -ETH_HLEN (-14). We want to pass this back instead of truncating it to 65522 and returning that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: Make qdisc_skb_cb upper size bound explicit.David S. Miller2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside of other data structures. This is intended to be used by IPoIB so that it can remember addressing information stored at hard_header_ops->create() time that it can fetch when the packet gets to the transmit routine. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ixgbe: ethtool: stats user buffer overrunJohn Fastabend2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the number of tx/rx queues changes the ethtool ioctl ETHTOOL_GSTATS may overrun the userspace buffer. This occurs because the general practice in user space to query stats is to issue a ETHTOOL_GSSET cmd to learn the buffer size needed, allocate the buffer, then call ETHTOOL_GSTIRNGS and ETHTOOL_GSTATS. If the number of real_num_queues is changed or flow control attributes are changed after ETHTOOL_GSSET but before the ETHTOOL_GSTRINGS/ETHTOOL_GSTATS a user space buffer overrun occurs. To fix the overrun always return the max buffer size needed from get_sset_count() then return all strings and stats from get_strings()/get_ethtool_stats(). This _will_ change the output from the ioctl() call which could break applications and script parsing in theory. I believe these changes should not break existing tools because the only changes will be more {tx|rx}_queues and the {tx|rx}_pb_* stats will always be returned. Existing scripts already need to handle changing number of queues because this occurs today depending on system and current features. The {tx|rx}_pb_* stats are at the end of the output and should be handled by scripts today regardless. Finally get_ethtool_stats and get_strings are free-form outputs tools parsing these outputs should be defensive anyways. In the end these updates are better then having a tool segfault because of a buffer overrun. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbe: dcb: up2tc mapping lost on disable/enable CEE DCB stateJohn Fastabend2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Users expect the up2tc mapping to be maintained across a DCB enable/disable/enable transition. And since we maintain all the other DCB attributes we should do this for up2tc mappings as well just to be consistent. Also without this we break user space applications that expect this to occur that previously worked. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbe: do not update real num queues when netdev is going awayYi Zou2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not update the real num tx queues. netdev_queue_update_kobjects() is already called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when upper layer driver, e.g., FCoE protocol stack is monitoring the netdev event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove extra queues allocated for FCoE, the associated txq sysfs kobjects are already removed, and trying to update the real num queues would cause something like below: ... PID: 25138 TASK: ffff88021e64c440 CPU: 3 COMMAND: "kworker/3:3" #0 [ffff88021f007760] machine_kexec at ffffffff810226d9 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d #2 [ffff88021f0078a0] oops_end at ffffffff813bca78 #3 [ffff88021f0078d0] no_context at ffffffff81029e72 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e #6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e #7 [ffff88021f007b10] page_fault at ffffffff813bc045 [exception RIP: sysfs_find_dirent+17] RIP: ffffffff81178611 RSP: ffff88021f007bc0 RFLAGS: 00010246 RAX: ffff88021e64c440 RBX: ffffffff8156cc63 RCX: 0000000000000004 RDX: ffffffff8156cc63 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88021f007be0 R8: 0000000000000004 R9: 0000000000000008 R10: ffffffff816fed00 R11: 0000000000000004 R12: 0000000000000000 R13: ffffffff8156cc63 R14: 0000000000000000 R15: ffff8802222a0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07 #9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27 #10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9 #11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38 #12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe] #13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe] #14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe] #15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q] #16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe] #17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe] #18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca #19 [ffff88021f007e68] worker_thread at ffffffff81060513 #20 [ffff88021f007ee8] kthread at ffffffff810648b6 #21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4 Signed-off-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbe: Fix broken dependency on MAX_SKB_FRAGS being related to page sizeAlexander Duyck2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes an issue in which RSC will generate corrupted frames when PAGE_SIZE is larger than 8K. Specifically it looks like that in 2.6.39 a change was made so that GRO would always have at least 16 frags available for coalescing, but the ixgbe RSC logic was not updated. As such the RSC feature would generate a frame larger than 64K and then overflow the value in the IP length field. To correct that I am now basing things on the PAGE_SIZE. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbe: Fix case of Tx Hang in PF with 32 VFsGreg Rose2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A check for the number of VFs allocated should have used a greater than equal operator instead of just greater than. This caused allocation of exactly 32 VFs to not enable the PF transmit and receive enables. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Robert E Garrett <robertX.e.garrett@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | ixgbe: fix vf lookupGreg Rose2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent addition of code to find already allocated VFs failed to take account that systems with 2 or more multi-port SR-IOV capable controllers might have already enabled VFs. Make sure that the VFs the function is finding are actually subordinate to the particular instance of the adapter that is looking for them and not subordinate to some device that has previously enabled SR-IOV. This bug exists in 3.2 stable as well as 3.3 release candidates. CC: stable@vger.kernel.org Reported-by: David Ahern <daahern@cisco.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Robert E Garrett <robertX.e.garrett@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | igb: fix vf lookupGreg Rose2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent addition of code to find already allocated VFs failed to take account that systems with 2 or more multi-port SR-IOV capable controllers might have already enabled VFs. Make sure that the VFs the function is finding are actually subordinate to the particular instance of the adapter that is looking for them and not subordinate to some device that has previously enabled SR-IOV. This is applicable to 3.2+ kernels. CC: stable@vger.kernel.org Reported-by: David Ahern <daahern@cisco.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Robert E Garrett <robertX.e.garrett@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | e1000: add dropped DMA receive enable back in for WoLDean Nelson2012-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d5bc77a223b0e9b9dfb002048d2b34a79e7d0b48 broke Wake-on-LAN by inadvertently dropping the enabling of DMA receives. Restore the enabling of DMA receives for WoL. This is applicable to 3.1+ stable trees. CC: stable@vger.stable.org Reported-by: Tobias Klausmann <klausman@schwarzvogel.de> Signed-off-by: Dean Nelson <dnelson@redhat.com> Tested-by: Tobias Klausmann <klausman@schwarzvogel.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>