aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* bnx2x: Spread rx buffers between allocated queuesDmitry Kravkov2010-09-13
| | | | | | | | | | | | | | Default number of rx buffers will be divided equally between allocated queues. This will decrease amount of pre-allocated buffers on systems with multiple CPUs. User can override this behavior with ethtool -G. Minimum amount of rx buffers per queue set to 128. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cx82310_eth: allow empty URBsOndrej Zary2010-09-13
| | | | | | | | Empty received URBs are currently counted as errors but the device sends them sometimes as part of regular traffic - so remove this check. Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* cx82310_eth: check usb_string() return value for errorOndrej Zary2010-09-13
| | | | | | | | Fix that usb_string() return value is not checked for error (negative value). Also change the ignore message a bit and lower its level to info. Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers/net/skfp: Remove pr_<level> uses of KERN_<level>Joe Perches2010-09-13
| | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/cxgb3: remove undefined operationsAndreas Schwab2010-09-13
| | | | | | | | Modifying an object twice without an intervening sequence point is undefined. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/de4x5: remove undefined operationsAndreas Schwab2010-09-13
| | | | | | | | Modifying an object twice without an intervening sequence point is undefined. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* sundance: Add power management hooksDenis Kirjanov2010-09-13
| | | | | | | This patch to adds support for PM hooks into sundance driver Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* flow: better memory managementEric Dumazet2010-09-13
| | | | | | | | | | | | | | | Allocate hash tables for every online cpus, not every possible ones. NUMA aware allocations. Dont use a full page on arches where PAGE_SIZE > 1024*sizeof(void *) misc: __percpu , __read_mostly, __cpuinit annotations flow_compare_t is just an "unsigned long" Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Fix order of channel_name array dimensionsBen Hutchings2010-09-13
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bna: Check for NULL before deref in bnad_cb_tx_cleanupDavid S. Miller2010-09-12
| | | | | Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: remov unnecessary bh_disablestephen hemminger2010-09-10
| | | | | | | | Now that est_tree_lock is acquired with BH protection, the other call is unnecessary. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* fib: cleanupsEric Dumazet2010-09-10
| | | | | | | | | | | Use rcu_dereference_rtnl() helper Change hard coded constants in fib_flag_trans() 7 -> RTN_UNREACHABLE 8 -> RTN_PROHIBIT Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Allow changing the DMA ring sizes dynamically via ethtoolBen Hutchings2010-09-10
| | | | | | | | | | This requires some reorganisation of channel setup and teardown to ensure that we can always roll-back a failed change. Based on work by Steve Hodgson <shodgson@solarflare.com> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Make the dmaq size a run-time setting (rather than compile-time)Steve Hodgson2010-09-10
| | | | | | | | | | - Allow the ring size to be specified in non power-of-two sizes (for instance to limit the amount of receive buffers). - Automatically size the event queue. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Allocate each channel separately, along with its RX and TX queuesBen Hutchings2010-09-10
| | | | | | | | | | This will allow for reallocation of channel structures and rings. Change module parameter separate_tx_channels to be read-only, since we now require its value to be constant. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Refactor channel and queue lookup and iterationBen Hutchings2010-09-10
| | | | | | | | | | | | | | | | | | In preparation for changes to the way channels and queue structures are allocated, revise the macros and functions used to look up and iterator over them. - Replace efx_for_each_tx_queue() with iteration over channels then TX queues - Replace efx_for_each_rx_queue() with iteration over channels then RX queues (with one exception, shortly to be removed) - Introduce efx_get_{channel,rx_queue,tx_queue}() functions to look up channels and queues by index - Introduce efx_channel_get_{rx,tx}_queue() functions to look up a channel's queues Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Abstract channel and index lookup for RX queuesBen Hutchings2010-09-10
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Allocate DMA and event rings using GFP_KERNELBen Hutchings2010-09-10
| | | | | | | | | | | | | Currently we allocate DMA descriptor rings and event rings using pci_alloc_consistent() which selects non-blocking behaviour from the page allocator (GFP_ATOMIC). This is unnecessary, and since we currently allocate a single contiguous block for each ring (up to 32 pages!) these allocations are likely to fail if there is any significant memory pressure. Use dma_alloc_coherent() and GFP_KERNEL instead. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Fix failure paths in efx_probe_port()Ben Hutchings2010-09-10
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Remove declarations of functions that no longer existBen Hutchings2010-09-10
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Accumulate RX_NODESC_DROP count in rx_dropped, not rx_over_errorsBen Hutchings2010-09-10
| | | | | | | | | | | | rx_over_errors appears to be intended as a count of packets that overflow a packet buffer in the NIC. Given that we implement a cut-through receive path, this should always be 0. rx_dropped appears to be the correct counter for packets dropped due to lack of host buffers. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Use MCDI RX_BAD_FCS_PKTS count as MAC rx_bad countBen Hutchings2010-09-10
| | | | | | | | | Calculating rx_bad as rx_packets - rx_good is unnecessary and incorrect, since rx_good does not include control frames (e.g. pause frames) and rx_packets does. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-09-10
|\ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/mac80211/main.c
| * Merge branch 'vhost-net' of ↵David S. Miller2010-09-10
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
| | * vhost: error handling fixMichael S. Tsirkin2010-09-06
| | | | | | | | | | | | | | | | | | | | | vhost should set worker to NULL on cgroups attach failure, so that we won't try to destroy the worker again on close. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| | * vhost: fix attach to cgroups regressionMichael S. Tsirkin2010-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 2.6.36-rc1, non-root users of vhost-net fail to attach if they are in any cgroups. The reason is that when qemu uses vhost, vhost wants to attach its thread to all cgroups that qemu has. But we got the API backwards, so a non-priveledged process (Qemu) tried to control the priveledged one (vhost), which fails. Fix this by switching to the new cgroup_attach_task_all, and running it from the vhost thread. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| | * cgroups: fix API thinkoMichael S. Tsirkin2010-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cgroup_attach_task_current_cg API that have upstream is backwards: we really need an API to attach to the cgroups from another process A to the current one. In our case (vhost), a priveledged user wants to attach it's task to cgroups from a less priveledged one, the API makes us run it in the other task's context, and this fails. So let's make the API generic and just pass in 'from' and 'to' tasks. Add an inline wrapper for cgroup_attach_task_current_cg to avoid breaking bisect. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com>
| * | ipheth: remove incorrect devtype to WWANDan Williams2010-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'wwan' devtype is meant for devices that require preconfiguration and *every* time setup before the ethernet interface can be used, like cellular modems which require a series of setup commands on serial ports or other mechanisms before the ethernet interface will handle packets. As ipheth only requires one-per-hotplug pairing setup with no preconfiguration (like APN, phone #, etc) and the network interface is usable at any time after that initial setup, remove the incorrect devtype wwan. Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | MAINTAINERS: Add CAIFJoe Perches2010-09-10
| | | | | | | | | | | | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sctp: fix test for end of loopJoe Perches2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a list_has_sctp_addr function to simplify loop Based on a patches by Dan Carpenter and David Miller Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'master' of ↵David S. Miller2010-09-09
| |\ \ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
| | * \ Merge branch 'for-linus' of ↵Linus Torvalds2010-09-07
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: bus speed strings should be const PCI hotplug: Fix build with CONFIG_ACPI unset PCI: PCIe: Remove the port driver module exit routine PCI: PCIe: Move PCIe PME code to the pcie directory PCI: PCIe: Disable PCIe port services during port initialization PCI: PCIe: Ask BIOS for control of all native services at once ACPI/PCI: Negotiate _OSC control bits before requesting them ACPI/PCI: Do not preserve _OSC control bits returned by a query ACPI/PCI: Make acpi_pci_query_osc() return control bits ACPI/PCI: Reorder checks in acpi_pci_osc_control_set() PCI: PCIe: Introduce commad line switch for disabling port services PCI: PCIe AER: Introduce pci_aer_available() x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs
| | | * | PCI: bus speed strings should be constStephen Hemminger2010-08-31
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | PCI hotplug: Fix build with CONFIG_ACPI unsetRafael J. Wysocki2010-08-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the recent changes caused complilation of drivers/pci/hotplug/pciehp_core.c to fail. Fix this issue. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | PCI: PCIe: Remove the port driver module exit routineKenji Kaneshige2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PCIe port driver's module exit routine is never used, so drop it. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | PCI: PCIe: Move PCIe PME code to the pcie directoryRafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PCIe PME code only consists of one file, so it doesn't need to occupy its own directory. Move it to drivers/pci/pcie/pme.c and remove the contents of drivers/pci/pcie/pme . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | PCI: PCIe: Disable PCIe port services during port initializationRafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In principle PCIe port services may be enabled by the BIOS, so it's better to disable them during port initialization to avoid spurious events from being generated. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | PCI: PCIe: Ask BIOS for control of all native services at onceRafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit 852972acff8f10f3a15679be2059bb94916cba5d (ACPI: Disable ASPM if the platform won't provide _OSC control for PCIe) control of the PCIe Capability Structure is unconditionally requested by acpi_pci_root_add(), which in principle may cause problems to happen in two ways. First, the BIOS may refuse to give control of the PCIe Capability Structure if it is not asked for any of the _OSC features depending on it at the same time. Second, the BIOS may assume that control of the _OSC features depending on the PCIe Capability Structure will be requested in the future and may behave incorrectly if that doesn't happen. For this reason, control of the PCIe Capability Structure should always be requested along with control of any other _OSC features that may depend on it (ie. PCIe native PME, PCIe native hot-plug, PCIe AER). Rework the PCIe port driver so that (1) it checks which native PCIe port services can be enabled, according to the BIOS, and (2) it requests control of all these services simultaneously. In particular, this causes pcie_portdrv_probe() to fail if the BIOS refuses to grant control of the PCIe Capability Structure, which means that no native PCIe port services can be enabled for the PCIe Root Complex the given port belongs to. If that happens, ASPM is disabled to avoid problems with mishandling it by the part of the PCIe hierarchy for which control of the PCIe Capability Structure has not been received. Make it possible to override this behavior using 'pcie_ports=native' (use the PCIe native services regardless of the BIOS response to the control request), or 'pcie_ports=compat' (do not use the PCIe native services at all). Accordingly, rework the existing PCIe port service drivers so that they don't request control of the services directly. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | ACPI/PCI: Negotiate _OSC control bits before requesting them Rafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible that the BIOS will not grant control of all _OSC features requested via acpi_pci_osc_control_set(), so it is recommended to negotiate the final set of _OSC features with the query flag set before calling _OSC to request control of these features. To implement it, rework acpi_pci_osc_control_set() so that the caller can specify the mask of _OSC control bits to negotiate and the mask of _OSC control bits that are absolutely necessary to it. Then, acpi_pci_osc_control_set() will run _OSC queries in a loop until the mask of _OSC control bits returned by the BIOS is equal to the mask passed to it. Also, before running the _OSC request acpi_pci_osc_control_set() will check if the caller's required control bits are present in the final mask. Using this mechanism we will be able to avoid situations in which the BIOS doesn't grant control of certain _OSC features, because they depend on some other _OSC features that have not been requested. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | ACPI/PCI: Do not preserve _OSC control bits returned by a query Rafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is the assumption in acpi_pci_osc_control_set() that it is always sufficient to compare the mask of _OSC control bits to be requested with the result of an _OSC query where all of the known control bits have been checked. However, in general, that need not be the case. For example, if an _OSC feature A depends on an _OSC feature B and control of A, B plus another _OSC feature C is requested simultaneously, the BIOS may return A, B, C, while it would only return C if A and C were requested without B. That may result in passing a wrong mask of _OSC control bits to an _OSC control request, in which case the BIOS may only grant control of a subset of the requested features. Moreover, acpi_pci_run_osc() will return error code if that happens and the caller of acpi_pci_osc_control_set() will not know that it's been granted control of some _OSC features. Consequently, the system will generally not work as expected. Apart from this acpi_pci_osc_control_set() always uses the mask of _OSC control bits returned by the very first invocation of acpi_pci_query_osc(), but that is done with the second argument equal to OSC_PCI_SEGMENT_GROUPS_SUPPORT which generally happens to affect the returned _OSC control bits. For these reasons, make acpi_pci_osc_control_set() always check if control of the requested _OSC features will be granted before making the final control request. As a result, the osc_control_qry and osc_queried members of struct acpi_pci_root are not necessary any more, so drop them and remove the remaining code referring to them. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | ACPI/PCI: Make acpi_pci_query_osc() return control bitsRafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make acpi_pci_query_osc() use an additional pointer argument to return the mask of control bits obtained from the BIOS to the caller. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()Rafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make acpi_pci_osc_control_set() attempt to find the handle of the _OSC object under the given PCI root bridge object after verifying that its second argument is correct and that there is a struct acpi_pci_root object for the given root bridge handle, which is more logical than the old code. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | PCI: PCIe: Introduce commad line switch for disabling port servicesRafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce kernel command line switch pcie_ports= allowing one to disable all of the native PCIe port services, so that PCIe ports are treated like PCI-to-PCI bridges. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | PCI: PCIe AER: Introduce pci_aer_available()Rafael J. Wysocki2010-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a function allowing the caller to check whether to try to enable PCIe AER. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are setJesse Barnes2010-08-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we'll duplicate definitions with the pci.h stubs. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | | * | PCI: provide stub pci_domain_nr function for !CONFIG_PCI configsDave Airlie2010-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allows the new PCI domain aware DRM code to compile on m68k. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Dave Airlie <airlied@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| | * | | Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds2010-09-07
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: Make fiemap work with sparse files xfs: prevent 32bit overflow in space reservation xfs: Disallow 32bit project quota id xfs: improve buffer cache hash scalability
| | | * \ \ Merge branch '2.6.36-xfs-misc' of ↵Alex Elder2010-09-03
| | | |\ \ \ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev
| | | | * | | xfs: prevent 32bit overflow in space reservationDave Chinner2010-09-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we attempt to preallocate more than 2^32 blocks of space in a single syscall, the transaction block reservation will overflow leading to a hangs in the superblock block accounting code. This is trivially reproduced with xfs_io. Fix the problem by capping the allocation reservation to the maximum number of blocks a single xfs_bmapi() call can allocate (2^21 blocks). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | * | | xfs: improve buffer cache hash scalabilityDave Chinner2010-09-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When doing large parallel file creates on a 16p machines, large amounts of time is being spent in _xfs_buf_find(). A system wide profile with perf top shows this: 1134740.00 19.3% _xfs_buf_find 733142.00 12.5% __ticket_spin_lock The problem is that the hash contains 45,000 buffers, and the hash table width is only 256 buffers. That means we've got around 200 buffers per chain, and searching it is quite expensive. The hash table size needs to increase. Secondly, every time we do a lookup, we promote the buffer we find to the head of the hash chain. This is causing cachelines to be dirtied and causes invalidation of cachelines across all CPUs that may have walked the hash chain recently. hence every walk of the hash chain is effectively a cold cache walk. Remove the promotion to avoid this invalidation. The results are: 1045043.00 21.2% __ticket_spin_lock 326184.00 6.6% _xfs_buf_find A 70% drop in the CPU usage when looking up buffers. Unfortunately that does not result in an increase in performance underthis workload as contention on the inode_lock soaks up most of the reduction in CPU usage. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>