aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* dw_dmac: disable dma in optimal way in probeAndy Shevchenko2012-06-20
| | | | | | | | | | The dw_dma_off call needs to have the all_chan_mask calculated. So, done this calculations before the call. Moreover, remove duplicate code that masks the DMA interrupts. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dw_dmac: use __func__ constant in the debug printsAndy Shevchenko2012-06-20
| | | | | | Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dw_dmac: print correct number of scanned descriptorsAndy Shevchenko2012-06-20
| | | | | | | | | In case the first descriptor we found is available, the counter still remains 0 value which is wrong. This patch fixes the counter behaviour. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dw_dmac: introduce dwc_dump_chan_regs to dump registersAndy Shevchenko2012-06-20
| | | | | | | | | There is three places where values of the most significant registers were printed. Make such piece of code as separate function. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dw_dmac: use proper casting to print dma_addr_t valuesAndy Shevchenko2012-06-20
| | | | | | | | | dma_addr_t is sometimes 32 bit and sometimes 64. We normally cast them to unsigned long long for printk(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dw_dmac: fix constant in the commentAndy Shevchenko2012-06-20
| | | | | | Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dmaengine: mmp_tdma: add mmp tdma supportZhangfei Gao2012-06-20
| | | | | | | | | | Add support for two-channel dma under dmaengine support: mmp-adma and pxa910-squ Signed-off-by: Zhangfei Gao <zhangfei.gao@marvell.com> Signed-off-by: Leo Yan <leoy@marvell.com> Signed-off-by: Qiao Zhou <zhouqiao@marvell.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dmaengine: Add wrapper for device_tx_status callbackLars-Peter Clausen2012-06-20
| | | | | | | | | | | | | This patch adds a small inline wrapper for the devivce_tx_status callback of a dma device. This makes the source code of users of this function a bit more compact and a bit more legible. E.g.: -status = chan->device->device_tx_status(chan, cookie, &state) +status = dmaengine_tx_status(chan, cookie, &state) Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* Merge branch 'fixes' into nextVinod Koul2012-06-14
|\
| * DMA: PL330: Fix racy mutex unlockJavi Merino2012-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pl330_update() stores a pointer to the thrd->req that finished, which contains a pointer to the corresponding pl330_req. This is done with the pl330_lock held. Then, it iterates through the req_done list, calling the callback for each of the requests that are done. The problem is that the driver releases the lock before calling the callback for each of the callbacks. pl330_submit_req() running in another processor can then acquire the lock and insert another request in one of the thrd->req that hasn't been processed yet, replacing the pointer to pl330_req there. When the callback returns in pl330_update() and the next rqdone is popped from the list, it dereferences the pl330_req pointer to the just scheduled pl330_req, instead of the one that has finished, calling pl330 with the wrong r. This patch fixes this by storing the pointer to pl330_req directly in the list. Signed-off-by: Javi Merino <javi.merino@arm.com> Cc: Jassi Brar <jaswinder.singh@linaro.org> Acked-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* | dma: coh901318: use devm allocationLinus Walleij2012-06-13
| | | | | | | | | | | | | | | | Allocate memory, region, remap and irq for device state using devm_* helpers to simplify memory accounting. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* | dmaengine: at_hdmac: trivial: fix comment in headerNicolas Ferre2012-06-12
| | | | | | | | | | | | | | | | | | | | Not all Atmel SoCs were pointed out in header comment which was bringing confusion. Remove the truncated list of supported devices, replace by the only one that is not supported. Reported-by: Elen Song <elen.song@atmel.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* | dma: tegra: add dmaengine based dma driverLaxman Dewangan2012-06-08
| | | | | | | | | | | | | | | | | | | | | | | | Add dmaengine based NVIDIA's Tegra APB DMA driver. This driver support the slave mode of data transfer from peripheral to memory and vice versa. The driver supports for the cyclic and non-cyclic mode of data transfer. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Acked-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* | dma: dmaengine: add slave req id in slave_configLaxman Dewangan2012-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | The DMA controller like Nvidia's Tegra Dma controller supports the different slave requestor id from different slave. This need to be configure in dma controller to handle the request properly. Adding the slave-id in the slave configuration so that information can be passed from client when configuring for slave. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* | dma: enable mxs-dma for imx6qHuang Shijie2012-06-07
|/ | | | | | | | | enable the mxs-dma for imx6q. Also remove the unused header file. Signed-off-by: Huang Shijie <shijie8@gmail.com> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* DMA: PL330: Add missing static storage class specifierSachin Kamat2012-06-07
| | | | | | | | Fixes the following sparse warning: drivers/dma/pl330.c:2542:5: warning: symbol 'add_desc' was not declared. Should it be static? Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dma: imx-sdma: buf_tail should be initialize in prepare functionRichard Zhao2012-06-07
| | | | | | | | | | | This fix audio underrun issue. When SNDRV_PCM_TRIGGER_STOP and SNDRV_PCM_TRIGGER_START, it calls prepare again. buf_tail should be reset to zero. So move buf_tail initialization into prepare function. Signed-off-by: Richard Zhao <richard.zhao@freescale.com> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* dmaengine: pl330: dont complete descriptor for cyclic dmaTushar Behera2012-06-07
| | | | | | | | | | | | | Commit eab215855803 ("dmaengine: pl330: dont complete descriptor for cyclic dma") wrongly completes descriptor for cyclic dma, hence following BUG_ON is still hit with cyclic DMA operations. kernel BUG at drivers/dma/dmaengine.h:53! Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Acked-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com> Cc: stable <stable@vger.kernel.org>
* Linux 3.5-rc1Linus Torvalds2012-06-02
|
* Merge tag 'dm-3.5-changes-1' of ↵Linus Torvalds2012-06-02
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm Pull device-mapper updates from Alasdair G Kergon: "Improve multipath's retrying mechanism in some defined circumstances and provide a simple reserve/release mechanism for userspace tools to access thin provisioning metadata while the pool is in use." * tag 'dm-3.5-changes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: dm thin: provide userspace access to pool metadata dm thin: use slab mempools dm mpath: allow ioctls to trigger pg init dm mpath: delay retry of bypassed pg dm mpath: reduce size of struct multipath
| * dm thin: provide userspace access to pool metadataJoe Thornber2012-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements two new messages that can be sent to the thin pool target allowing it to take a snapshot of the _metadata_. This, read-only snapshot can be accessed by userland, concurrently with the live target. Only one metadata snapshot can be held at a time. The pool's status line will give the block location for the current msnap. Since version 0.1.5 of the userland thin provisioning tools, the thin_dump program displays the msnap as follows: thin_dump -m <msnap root> <metadata dev> Available here: https://github.com/jthornber/thin-provisioning-tools Now that userland can access the metadata we can do various things that have traditionally been kernel side tasks: i) Incremental backups. By using metadata snapshots we can work out what blocks have changed over time. Combined with data snapshots we can ensure the data doesn't change while we back it up. A short proof of concept script can be found here: https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example.rb ii) Migration of thin devices from one pool to another. iii) Merging snapshots back into an external origin. iv) Asyncronous replication. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * dm thin: use slab mempoolsMike Snitzer2012-06-02
| | | | | | | | | | | | | | | | | | Use dedicated caches prefixed with a "dm_" name rather than relying on kmalloc mempools backed by generic slab caches so the memory usage of thin provisioning (and any leaks) can be accounted for independently. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * dm mpath: allow ioctls to trigger pg initMikulas Patocka2012-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the failure of a group of paths, any alternative paths that need initialising do not become available until further I/O is sent to the device. Until this has happened, ioctls return -EAGAIN. With this patch, new paths are made available in response to an ioctl too. The processing of the ioctl gets delayed until this has happened. Instead of returning an error, we submit a work item to kmultipathd (that will potentially activate the new path) and retry in ten milliseconds. Note that the patch doesn't retry an ioctl if the ioctl itself fails due to a path failure. Such retries should be handled intelligently by the code that generated the ioctl in the first place, noting that some SCSI commands should not be retried because they are not idempotent (XOR write commands). For commands that could be retried, there is a danger that if the device rejected the SCSI command, the path could be errorneously marked as failed, and the request would be retried on another path which might fail too. It can be determined if the failure happens on the device or on the SCSI controller, but there is no guarantee that all SCSI drivers set these flags correctly. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * dm mpath: delay retry of bypassed pgMike Christie2012-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If I/O needs retrying and only bypassed priority groups are available, set the pg_init_delay_retry flag to wait before retrying. If, for example, the reason for the bypass is that the controller is getting reset or there is a firmware upgrade happening, retrying right away would cause a flood of log messages and retries for what could be a few seconds or even several minutes. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * dm mpath: reduce size of struct multipathMike Snitzer2012-06-02
| | | | | | | | | | | | | | | | | | | | | | Move multipath structure's 'lock' and 'queue_size' members to eliminate two 4-byte holes. Also use a bit within a single unsigned int for each existing flag (saves 8-bytes). This allows future flags to be added without each consuming an unsigned int. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2012-06-02
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Make syn floods consume significantly less resources by a) Not pre-COW'ing routing metrics for SYN/ACKs b) Mirroring the device queue mapping of the SYN for the SYN/ACK reply. Both from Eric Dumazet. 2) Fix calculation errors in Byte Queue Limiting, from Hiroaki SHIMODA. 3) Validate the length requested when building a paged SKB for a socket, so we don't overrun the page vector accidently. From Jason Wang. 4) When netlabel is disabled, we abort all IP option processing when we see a CIPSO option. This isn't the right thing to do, we should simply skip over it and continue processing the remaining options (if any). Fix from Paul Moore. 5) SRIOV fixes for the mellanox driver from Jack orgenstein and Marcel Apfelbaum. 6) 8139cp enables the receiver before the ring address is properly programmed, which potentially lets the device crap over random memory. Fix from Jason Wang. 7) e1000/e1000e fixes for i217 RST handling, and an improper buffer address reference in jumbo RX frame processing from Bruce Allan and Sebastian Andrzej Siewior, respectively. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: fec_mpc52xx: fix timestamp filtering mcs7830: Implement link state detection e1000e: fix Rapid Start Technology support for i217 e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats() r8169: call netif_napi_del at errpaths and at driver unload tcp: reflect SYN queue_mapping into SYNACK packets tcp: do not create inetpeer on SYNACK message 8139cp/8139too: terminate the eeprom access with the right opmode 8139cp: set ring address before enabling receiver cipso: handle CIPSO options correctly when NetLabel is disabled net: sock: validate data_len before allocating skb in sock_alloc_send_pskb() bql: Avoid possible inconsistent calculation. bql: Avoid unneeded limit decrement. bql: Fix POSDIFF() to integer overflow aware. net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap net/mlx4_core: Fixes for VF / Guest startup flow net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_event net/mlx4_core: Fix number of EQs used in ICM initialisation net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_int
| * | fec_mpc52xx: fix timestamp filteringStephan Gatzka2012-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | skb_defer_rx_timestamp was called with a freshly allocated skb but must be called with rskb instead. Signed-off-by: Stephan Gatzka <stephan@gatzka.org> Cc: stable <stable@vger.kernel.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | mcs7830: Implement link state detectionOndrej Zary2012-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add .status callback that detects link state changes. Tested with MCS7832CV-AA chip (9710:7830, identified as rev.C by the driver). Fixes https://bugzilla.kernel.org/show_bug.cgi?id=28532 Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | e1000e: fix Rapid Start Technology support for i217Bruce Allan2012-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The definition of I217_PROXY_CTRL must use the BM_PHY_REG() macro instead of the PHY_REG() macro for PHY page 800 register 70 since it is for a PHY register greater than the maximum allowed by the latter macro, and fix a typo setting the I217_MEMPWR register in e1000_suspend_workarounds_ich8lan. Also for clarity, rename a few defines as bit definitions instead of masks. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats()Sebastian Andrzej Siewior2012-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is another fixup where the data is not transfered into buffer addressed by skb->data but into a page. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | r8169: call netif_napi_del at errpaths and at driver unloadDevendra Naga2012-06-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | when register_netdev fails, the init'ed NAPIs by netif_napi_add must be deleted with netif_napi_del, and also when driver unloads, it should delete the NAPI before unregistering netdevice using unregister_netdev. Signed-off-by: Devendra Naga <devendra.aaru@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tcp: reflect SYN queue_mapping into SYNACK packetsEric Dumazet2012-06-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While testing how linux behaves on SYNFLOOD attack on multiqueue device (ixgbe), I found that SYNACK messages were dropped at Qdisc level because we send them all on a single queue. Obvious choice is to reflect incoming SYN packet @queue_mapping to SYNACK packet. Under stress, my machine could only send 25.000 SYNACK per second (for 200.000 incoming SYN per second). NIC : ixgbe with 16 rx/tx queues. After patch, not a single SYNACK is dropped. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tcp: do not create inetpeer on SYNACK messageEric Dumazet2012-06-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting larger and larger, using lots of memory and cpu time. tcp_v4_send_synack() ->inet_csk_route_req() ->ip_route_output_flow() ->rt_set_nexthop() ->rt_init_metrics() ->inet_getpeer( create = true) This is a side effect of commit a4daad6b09230 (net: Pre-COW metrics for TCP) added in 2.6.39 Possible solution : Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS Before patch : # grep peer /proc/slabinfo inet_peer_cache 4175430 4175430 192 42 2 : tunables 0 0 0 : slabdata 99415 99415 0 Samples: 41K of event 'cycles', Event count (approx.): 30716565122 + 20,24% ksoftirqd/0 [kernel.kallsyms] [k] inet_getpeer + 8,19% ksoftirqd/0 [kernel.kallsyms] [k] peer_avl_rebalance.isra.1 + 4,81% ksoftirqd/0 [kernel.kallsyms] [k] sha_transform + 3,64% ksoftirqd/0 [kernel.kallsyms] [k] fib_table_lookup + 2,36% ksoftirqd/0 [ixgbe] [k] ixgbe_poll + 2,16% ksoftirqd/0 [kernel.kallsyms] [k] __ip_route_output_key + 2,11% ksoftirqd/0 [kernel.kallsyms] [k] kernel_map_pages + 2,11% ksoftirqd/0 [kernel.kallsyms] [k] ip_route_input_common + 2,01% ksoftirqd/0 [kernel.kallsyms] [k] __inet_lookup_established + 1,83% ksoftirqd/0 [kernel.kallsyms] [k] md5_transform + 1,75% ksoftirqd/0 [kernel.kallsyms] [k] check_leaf.isra.9 + 1,49% ksoftirqd/0 [kernel.kallsyms] [k] ipt_do_table + 1,46% ksoftirqd/0 [kernel.kallsyms] [k] hrtimer_interrupt + 1,45% ksoftirqd/0 [kernel.kallsyms] [k] kmem_cache_alloc + 1,29% ksoftirqd/0 [kernel.kallsyms] [k] inet_csk_search_req + 1,29% ksoftirqd/0 [kernel.kallsyms] [k] __netif_receive_skb + 1,16% ksoftirqd/0 [kernel.kallsyms] [k] copy_user_generic_string + 1,15% ksoftirqd/0 [kernel.kallsyms] [k] kmem_cache_free + 1,02% ksoftirqd/0 [kernel.kallsyms] [k] tcp_make_synack + 0,93% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_lock_bh + 0,87% ksoftirqd/0 [kernel.kallsyms] [k] __call_rcu + 0,84% ksoftirqd/0 [kernel.kallsyms] [k] rt_garbage_collect + 0,84% ksoftirqd/0 [kernel.kallsyms] [k] fib_rules_lookup Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | 8139cp/8139too: terminate the eeprom access with the right opmodeJason Wang2012-06-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we terminate the eeprom access through clearing the CS by: RTL_W8 (Cfg9346, ~EE_CS); or writeb (~EE_CS, ee_addr); This would left the eeprom into "Config. Register Write Enable:" state which is not expcted as the highest two bits were set to 0x11 ( expected is the "Normal" mode (0x00)). Solving this by write 0x0 instead of ~EE_CS when terminating the eeprom access. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | 8139cp: set ring address before enabling receiverJason Wang2012-06-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we enable the receiver before setting the ring address which could lead the card DMA into unexpected areas. Solving this by set the ring address before enabling the receiver. btw. I find and test this in qemu as I didn't have a 8139cp card in hand. please review it carefully. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | cipso: handle CIPSO options correctly when NetLabel is disabledPaul Moore2012-06-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When NetLabel is not enabled, e.g. CONFIG_NETLABEL=n, and the system receives a CIPSO tagged packet it is dropped (cipso_v4_validate() returns non-zero). In most cases this is the correct and desired behavior, however, in the case where we are simply forwarding the traffic, e.g. acting as a network bridge, this becomes a problem. This patch fixes the forwarding problem by providing the basic CIPSO validation code directly in ip_options_compile() without the need for the NetLabel or CIPSO code. The new validation code can not perform any of the CIPSO option label/value verification that cipso_v4_validate() does, but it can verify the basic CIPSO option format. The behavior when NetLabel is enabled is unchanged. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: sock: validate data_len before allocating skb in sock_alloc_send_pskb()Jason Wang2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | We need to validate the number of pages consumed by data_len, otherwise frags array could be overflowed by userspace. So this patch validate data_len and return -EMSGSIZE when data_len may occupies more frags than MAX_SKB_FRAGS. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bql: Avoid possible inconsistent calculation.Hiroaki SHIMODA2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dql->num_queued could change while processing dql_completed(). To provide consistent calculation, added an on stack variable. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bql: Avoid unneeded limit decrement.Hiroaki SHIMODA2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When below pattern is observed, TIME dql_queued() dql_completed() | a) initial state | | b) X bytes queued V c) Y bytes queued d) X bytes completed e) Z bytes queued f) Y bytes completed a) dql->limit has already some value and there is no in-flight packet. b) X bytes queued. c) Y bytes queued and excess limit. d) X bytes completed and dql->prev_ovlimit is set and also dql->prev_num_queued is set Y. e) Z bytes queued. f) Y bytes completed. inprogress and prev_inprogress are true. At f), according to the comment, all_prev_completed becomes true and limit should be increased. But POSDIFF() ignores (completed == dql->prev_num_queued) case, so limit is decreased. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bql: Fix POSDIFF() to integer overflow aware.Hiroaki SHIMODA2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSDIFF() fails to take into account integer overflow case. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAPJack Morgenstein2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | The "!mlx4_is_slave" is totally confusing. Fix with constant MLX4_CMD_NATIVE, which is the intended behavior. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx4_core: Check port out-of-range before using in mlx4_slave_capJack Morgenstein2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The range check was performed after using the port number. Reverse this to prevent a potential array overflow. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx4_core: Fixes for VF / Guest startup flowJack Morgenstein2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - pass the following parameters: - firmware version (added QUERY_FW paravirtualization for that) - disable Blueflame on slaves. KVM disables write combining on guests, and we get better performance without BF in this case. (This requires QUERY_DEV_CAP paravirtualization, also in this commit) - max qp rdma as destination - get rid of a chunk of "if (0)" dead code Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_eventJack Morgenstein2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Port is used as an array index before we know if that is proper. For example, in the catas event case, port is zero; however, the port index should lie in the range (1..2). Fix this by using 'port' only in the events where it is of interest. Test for port out of range in the default (unhandled event) case, and do not output a message if it is not an ethernet port. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx4_core: Fix number of EQs used in ICM initialisationMarcel Apfelbaum2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In SRIOV mode, the number of EQs used when computing the total ICM size was incorrect. To fix this, we do the following: 1. We add a new structure to mlx4_dev, mlx4_phys_caps, to contain physical HCA capabilities. The PPF uses the phys capabilities when it computes things like ICM size. The dev_caps structure will then contain the paravirtualized values, making bookkeeping much easier in SRIOV mode. We add a structure rather than a single parameter because there will be other fields in the phys_caps. The first field we add to the mlx4_phys_caps structure is num_phys_eqs. 2. In INIT_HCA, when running in SRIOV mode, the "log_num_eqs" parameter passed to the FW is the number of EQs per VF/PF; each function (PF or VF) has this number of EQs available. However, the total number of EQs which must be allowed for in the ICM is (1 << log_num_eqs) * (#VFs + #PFs). Rather than compute this quantity, we allocate ICM space for 1024 EQs (which is the device maximum number of EQs, and which is the value we place in the mlx4_phys_caps structure). For INIT_HCA, however, we use the per-function number of EQs as described above. Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_intJack Morgenstein2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | Ths fixes the comparison in the FLR (Function Level Reset) event case. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2012-06-02
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull straggler x86 fixes from Peter Anvin: "Three groups of patches: - EFI boot stub documentation and the ability to print error messages; - Removal for PTRACE_ARCH_PRCTL for x32 (obsolete interface which should never have been ported, and the port is broken and potentially dangerous.) - ftrace stack corruption fixes. I'm not super-happy about the technical implementation, but it is probably the least invasive in the short term. In the future I would like a single method for nesting the debug stack, however." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32 x86, efi: Add EFI boot stub documentation x86, efi; Add EFI boot stub console support x86, efi: Only close open files in error path ftrace/x86: Do not change stacks in DEBUG when calling lockdep x86: Allow nesting of the debug stack IDT setting x86: Reset the debug_stack update counter ftrace: Use breakpoint method to update ftrace caller ftrace: Synchronize variable setting with breakpoints
| * \ \ Merge remote-tracking branch 'rostedt/tip/perf/urgent-2' into ↵H. Peter Anvin2012-06-01
| |\ \ \ | | | | | | | | | | | | | | | x86-urgent-for-linus
| | * | | ftrace/x86: Do not change stacks in DEBUG when calling lockdepSteven Rostedt2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When both DYNAMIC_FTRACE and LOCKDEP are set, the TRACE_IRQS_ON/OFF will call into the lockdep code. The lockdep code can call lots of functions that may be traced by ftrace. When ftrace is updating its code and hits a breakpoint, the breakpoint handler will call into lockdep. If lockdep happens to call a function that also has a breakpoint attached, it will jump back into the breakpoint handler resetting the stack to the debug stack and corrupt the contents currently on that stack. The 'do_sym' call that calls do_int3() is protected by modifying the IST table to point to a different location if another breakpoint is hit. But the TRACE_IRQS_OFF/ON are outside that protection, and if a breakpoint is hit from those, the stack will get corrupted, and the kernel will crash: [ 1013.243754] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002 [ 1013.272665] IP: [<ffff880145cc0000>] 0xffff880145cbffff [ 1013.285186] PGD 1401b2067 PUD 14324c067 PMD 0 [ 1013.298832] Oops: 0010 [#1] PREEMPT SMP [ 1013.310600] CPU 2 [ 1013.317904] Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr iTCO_wdt i2c_i801 iTCO_vendor_support e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan] [ 1013.401848] [ 1013.407399] Pid: 112, comm: kworker/2:1 Not tainted 3.4.0+ #30 [ 1013.437943] RIP: 8eb8:[<ffff88014630a000>] [<ffff88014630a000>] 0xffff880146309fff [ 1013.459871] RSP: ffffffff8165e919:ffff88014780f408 EFLAGS: 00010046 [ 1013.477909] RAX: 0000000000000001 RBX: ffffffff81104020 RCX: 0000000000000000 [ 1013.499458] RDX: ffff880148008ea8 RSI: ffffffff8131ef40 RDI: ffffffff82203b20 [ 1013.521612] RBP: ffffffff81005751 R08: 0000000000000000 R09: 0000000000000000 [ 1013.543121] R10: ffffffff82cdc318 R11: 0000000000000000 R12: ffff880145cc0000 [ 1013.564614] R13: ffff880148008eb8 R14: 0000000000000002 R15: ffff88014780cb40 [ 1013.586108] FS: 0000000000000000(0000) GS:ffff880148000000(0000) knlGS:0000000000000000 [ 1013.609458] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1013.627420] CR2: 0000000000000002 CR3: 0000000141f10000 CR4: 00000000001407e0 [ 1013.649051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1013.670724] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1013.692376] Process kworker/2:1 (pid: 112, threadinfo ffff88013fe0e000, task ffff88014020a6a0) [ 1013.717028] Stack: [ 1013.724131] ffff88014780f570 ffff880145cc0000 0000400000004000 0000000000000000 [ 1013.745918] cccccccccccccccc ffff88014780cca8 ffffffff811072bb ffffffff81651627 [ 1013.767870] ffffffff8118f8a7 ffffffff811072bb ffffffff81f2b6c5 ffffffff81f11bdb [ 1013.790021] Call Trace: [ 1013.800701] Code: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a <e7> d7 64 81 ff ff ff ff 01 00 00 00 00 00 00 00 65 d9 64 81 ff [ 1013.861443] RIP [<ffff88014630a000>] 0xffff880146309fff [ 1013.884466] RSP <ffff88014780f408> [ 1013.901507] CR2: 0000000000000002 The solution was to reuse the NMI functions that change the IDT table to make the debug stack keep its current stack (in kernel mode) when hitting a breakpoint: call debug_stack_set_zero TRACE_IRQS_ON call debug_stack_reset If the TRACE_IRQS_ON happens to hit a breakpoint then it will keep the current stack and not crash the box. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | x86: Allow nesting of the debug stack IDT settingSteven Rostedt2012-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the NMI handler runs, it checks if it preempted a debug handler and if that handler is using the debug stack. If it is, it changes the IDT table not to update the stack, otherwise it will reset the debug stack and corrupt the debug handler it preempted. Now that ftrace uses breakpoints to change functions from nops to callers, many more places may hit a breakpoint. Unfortunately this includes some of the calls that lockdep performs. Which causes issues with the debug stack. It too needs to change the debug stack before tracing (if called from the debug handler). Allow the debug_stack_set_zero() and debug_stack_reset() to be nested so that the debug handlers can take advantage of them too. [ Used this_cpu_*() over __get_cpu_var() as suggested by H. Peter Anvin ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>