path: root/include
Commit message (Author, Age)
* CAPI: Rework locking of controller data structures (Jan Kiszka, 2010-02-16)
| This patch applies the mutex that so far only protected the controller list to (almost) all accesses of controller data structures. It also reworks waiting on state changes in old_capi_manufacturer so that it no longer polls and so that it holds a module reference to the controller owner while waiting (the latter was partly done already).
|
| Modification and checking of the blocked state remains racy by design; the caller is responsible for dealing with this.
|
| Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* CAPI: Rework controller state notifier (Jan Kiszka, 2010-02-16)
| Another step towards proper locking: Rework the callback provided to capidrv for controller state changes. This is so far attached to an application, which would require us to hold the corresponding lock across notification calls.
|
| But there is no direct relation between a controller up/down event and an application, so let's decouple them and provide a notifier call chain for those events instead.
|
| This notifier chain is first of all used internally. Here we request the highest priority to ensure that housekeeping work is done before any other notifications. The chain is exported via [un]register_capictr_notifier to our only user, capidrv, to replace the racy and unfixable capi20_set_callback.
|
| Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
| Signed-off-by: David S. Miller <davem@davemloft.net>
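The notifier call chain described above can be illustrated with a small user-space model. This is only a sketch of the general pattern (callbacks sorted by priority, highest first), not the kernel's implementation; the class name `NotifierChain`, the event name `"CTR_UP"`, and the priority values are assumptions made for the example.

```python
class NotifierChain:
    """Toy model of a priority-ordered notifier call chain (hypothetical)."""

    def __init__(self):
        self._entries = []  # list of (priority, sequence, callback)
        self._seq = 0

    def register(self, callback, priority=0):
        self._entries.append((priority, self._seq, callback))
        self._seq += 1
        # Higher priority runs first; registration order breaks ties.
        self._entries.sort(key=lambda entry: (-entry[0], entry[1]))

    def unregister(self, callback):
        self._entries = [e for e in self._entries if e[2] is not callback]

    def call(self, event, data=None):
        # Iterate over a copy so callbacks may unregister themselves.
        for _, _, callback in list(self._entries):
            callback(event, data)
```

Registering the internal housekeeping callback with the highest priority guarantees it runs before any external listener, which is the ordering property the commit message relies on.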
* CAPI: Call a controller 'controller', not 'card' (Jan Kiszka, 2010-02-16)
| At least for our internal use, fix the misnomers that refer to a CAPI controller as 'card'. No functional changes.
|
| Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net neigh: Decouple per interface neighbour table controls from binary sysctls (Eric W. Biederman, 2010-02-16)
| Stop computing the number of neighbour table settings we have by counting the number of binary sysctls. This behaviour was silly and meant that we could not add another neighbour table setting without also adding another binary sysctl.
|
| Don't pass the binary sysctl path for neighbour table entries into neigh_sysctl_register. These parameters are no longer used and so are just dead code.
|
| Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net ipv4: Decouple ipv4 interface parameters from binary sysctl numbers (Eric W. Biederman, 2010-02-16)
| Stop using the binary sysctl enumeration in sysctl.h as an index into a per interface array. This leads to unnecessary binary sysctl number allocation, and a fragility in data structure and implementation because of unnecessary coupling.
|
| Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 (David S. Miller, 2010-02-16)
|\
| * netfilter: CONFIG_COMPAT: allow delta to exceed 32767 (Florian Westphal, 2010-02-15)
| | With 32 bit userland and 64 bit kernels, it is unlikely but possible that insertion of new rules fails even though there are only about 2000 iptables rules.
| |
| | This happens because the compat delta is using a short int. Easily reproducible via "iptables -m limit"; after about 2050 rules, inserting new ones fails with -ELOOP.
| |
| | Note that compat_delta included 2 bytes of padding on x86_64, so structure size remains the same.
| |
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
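The arithmetic behind the failure can be sketched as follows. The per-rule delta of 16 bytes is purely an illustrative assumption (the commit message does not state it); it is chosen only because it lands near the "about 2050 rules" figure quoted above. The function name is hypothetical.

```python
SHRT_MAX = 32767  # largest value a signed 16-bit short can hold


def first_rule_exceeding_short(delta_per_rule):
    """Return the 1-based rule count at which the accumulated
    compat delta no longer fits in a signed 16-bit integer."""
    if delta_per_rule <= 0:
        raise ValueError("delta_per_rule must be positive")
    total = 0
    rule = 0
    while total <= SHRT_MAX:
        rule += 1
        total += delta_per_rule
    return rule
```

Under that assumption the accumulated delta overflows a short after roughly two thousand rules, which is why widening the field removes the limit.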
| * netfilter: ctnetlink: add zone support (Patrick McHardy, 2010-02-15)
| | Parse and dump the conntrack zone in ctnetlink.
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * netfilter: nf_conntrack: add support for "conntrack zones" (Patrick McHardy, 2010-02-15)
| | Normally, each connection needs a unique identity. Conntrack zones allow specifying a numerical zone using the CT target; connections in different zones can use the same identity.
| |
| | Example:
| |
| | iptables -t raw -A PREROUTING -i veth0 -j CT --zone 1
| | iptables -t raw -A OUTPUT -o veth1 -j CT --zone 1
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
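The effect of zones on connection identity can be modelled by making the zone part of the lookup key. The sketch below is a toy user-space model, not the conntrack data structure; the class `ConntrackTable` and the tuple layout are assumptions for illustration.

```python
class ConntrackTable:
    """Toy connection table keyed by (zone, tuple) (hypothetical)."""

    def __init__(self):
        self._entries = {}

    def add(self, zone, conn_tuple):
        """Insert a connection; return False if the identity is taken."""
        key = (zone, conn_tuple)
        if key in self._entries:
            return False
        self._entries[key] = {"zone": zone, "tuple": conn_tuple}
        return True

    def lookup(self, zone, conn_tuple):
        return self._entries.get((zone, conn_tuple))
```

Two connections with identical addresses and ports collide inside one zone but coexist when they are placed in different zones.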
| * netfilter: nf_conntrack: pass template to l4proto ->error() handler (Patrick McHardy, 2010-02-15)
| | The error handlers might need the template to get the conntrack zone introduced in the next patches to perform a conntrack lookup.
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * netfilter: xtables: constify args in compat copying functions (Jan Engelhardt, 2010-02-15)
| | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * netfilter: get rid of the grossness in netfilter.h (Jan Engelhardt, 2010-02-15)
| | GCC is now smart enough to follow the inline trail correctly. vmlinux size remains the same.
| |
| | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * netfilter: reduce NF_HOOK by one argument (Jan Engelhardt, 2010-02-15)
| | No changes in vmlinux filesize.
| |
| | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * netfilter: nf_conntrack: elegantly simplify nf_ct_exp_net() (Alexey Dobriyan, 2010-02-12)
| | Remove #ifdef at nf_ct_exp_net() by using nf_ct_net().
| |
| | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * netfilter: nf_conntrack_sip: add T.38 FAX support (Patrick McHardy, 2010-02-11)
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * netfilter: nf_nat_sip: add TCP support (Patrick McHardy, 2010-02-11)
| | Add support for mangling TCP SIP packets.
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * netfilter: nf_nat: support mangling a single TCP packet multiple times (Patrick McHardy, 2010-02-11)
| | nf_nat_mangle_tcp_packet() can currently only handle a single mangling per window because it only maintains two sequence adjustment positions: the one before the last adjustment and the one after.
| |
| | This patch makes sequence number adjustment tracking in nf_nat_mangle_tcp_packet() optional and allows a helper to manually update the offsets after the packet has been fully handled.
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * netfilter: nf_conntrack_sip: add TCP support (Patrick McHardy, 2010-02-11)
| | Add TCP support, which is mandated by RFC3261 for all SIP elements.
| |
| | SIP over TCP is similar to UDP, except that messages are delimited by Content-Length: headers and multiple messages may appear in one packet.
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
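The Content-Length framing mentioned above can be sketched in user space as follows. This is a simplified illustration, not the kernel helper's parser: the function name is hypothetical, the compact header form is not handled, and a missing Content-Length header is assumed to mean an empty body.

```python
def split_sip_messages(data):
    """Split a byte buffer holding back-to-back SIP messages.

    Returns only complete messages; a trailing partial message
    (incomplete headers or body) is left out.
    """
    messages = []
    pos = 0
    while pos < len(data):
        head_end = data.find(b"\r\n\r\n", pos)
        if head_end < 0:
            break  # headers not complete yet
        body_start = head_end + 4
        length = 0  # assumption: no Content-Length means no body
        # Skip the start line, then scan the header lines.
        for line in data[pos:head_end].split(b"\r\n")[1:]:
            name, sep, value = line.partition(b":")
            if sep and name.strip().lower() == b"content-length":
                length = int(value.strip())
        end = body_start + length
        if end > len(data):
            break  # body not complete yet
        messages.append(data[pos:end])
        pos = end
    return messages
```

Each message boundary is derived from the previous message's header block plus its declared body length, which is why several messages can share one TCP segment.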
| * netfilter: nf_conntrack_sip: pass data offset to NAT functions (Patrick McHardy, 2010-02-11)
| | When using TCP, multiple SIP messages might be present in a single packet. A following patch will parse them by setting the dptr to the beginning of each message. The NAT helper, however, needs to reload the dptr value after mangling the packet, so it needs to know the offset of the message from the beginning of the packet.
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * netfilter: nf_conntrack: show helper and class in /proc/net/nf_conntrack_expect (Patrick McHardy, 2010-02-11)
| | Make the output a bit more informative by showing the helper an expectation belongs to and the expectation class.
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * Merge branch 'master' of git://dev.medozas.de/linux (Patrick McHardy, 2010-02-10)
| |\
| | * netfilter: xtables: generate initial table on-demand (Jan Engelhardt, 2010-02-10)
| | | The static initial tables are pretty large, and after the net namespace has been instantiated, they just hang around for nothing. This commit removes them and creates tables on-demand at runtime when needed.
| | |
| | | Size shrinks by 7735 bytes (x86_64).
| | |
| | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | * netfilter: xtables: use xt_table for hook instantiation (Jan Engelhardt, 2010-02-10)
| | | The respective xt_table structures already have most of the metadata needed for hook setup. Add a 'priority' field to struct xt_table so that xt_hook_link() can be called with a reduced number of arguments.
| | |
| | | Should we have more tables in the future, they come at no static cost (only runtime, as before). Space saved: 6807373->6806555.
| | |
| | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| * | netfilter: xtables: symmetric COMPAT_XT_ALIGN definition (Alexey Dobriyan, 2010-02-10)
| | | Rewrite COMPAT_XT_ALIGN in terms of the dummy structure hack. Compat counters logically have nothing to do with it. Use the ALIGN() macro while I'm at it for same types.
| | |
| | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xtables: consistent struct compat_xt_counters definition (Alexey Dobriyan, 2010-02-10)
| |/
| | There is a compat_u64 type which deals with different u64 type alignment on different compat-capable platforms, so use it and remove some hardcoded assumptions.
| |
| | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * Merge branch 'master' of /repos/git/net-next-2.6 (Patrick McHardy, 2010-02-10)
| |\
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xtables: add CT target (Patrick McHardy, 2010-02-03)
| | | Add a new target for the raw table, which can be used to specify conntrack parameters for specific connections, for instance the conntrack helper.
| | |
| | | The target attaches a "template" connection tracking entry to the skb, which is used by the conntrack core when initializing a new conntrack.
| | |
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_conntrack: support conntrack templates (Patrick McHardy, 2010-02-03)
| | | Support initializing selected parameters of new conntrack entries from a "conntrack template", which is a specially marked conntrack entry attached to the skb.
| | |
| | | Currently the helper and the event delivery masks can be initialized this way.
| | |
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: ctnetlink: support selective event delivery (Patrick McHardy, 2010-02-03)
| | | Add two masks for conntrack and expectation events to struct nf_conntrack_ecache and use them to filter events. Their default value is "all events" when the event sysctl is on and "no events" when it is off. A following patch will add specific initializations. Expectation events depend on the ecache struct of their master conntrack.
| | |
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nf_conntrack: split up IPCT_STATUS event (Patrick McHardy, 2010-02-03)
| | | Split up the IPCT_STATUS event into an IPCT_REPLY event, which is generated when the IPS_SEEN_REPLY bit is set, and an IPCT_ASSURED event, which is generated when the IPS_ASSURED bit is set.
| | |
| | | In combination with a following patch to support selective event delivery, this can be used for "sparse" conntrack replication: start replicating the conntrack entry after it reached the ASSURED state and that way it's SYN-flood resistant.
| | |
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
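How an event mask enables "sparse" replication can be sketched with a bit-flag filter. This is an illustrative model only: the `Event` members and their bit positions are assumptions and do not reproduce the kernel's IPCT_* enumeration.

```python
from enum import IntFlag


class Event(IntFlag):
    """Hypothetical event bits; not the kernel's actual values."""
    NEW = 1 << 0
    REPLY = 1 << 1
    ASSURED = 1 << 2
    DESTROY = 1 << 3


def deliverable(pending, mask):
    """Return the subset of pending events that the mask lets through."""
    return pending & mask


# Replicate a connection only once it is assured, and when it goes away.
SPARSE_MASK = Event.ASSURED | Event.DESTROY
```

With such a mask, a flood of half-open connections that only ever raise the "new" event produces no replication traffic at all.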
| * | netfilter: add struct net * to target parameters (Patrick McHardy, 2010-02-03)
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: ctnetlink: only assign helpers for matching protocols (Patrick McHardy, 2010-02-03)
| | | Make sure not to assign a helper for a different network or transport layer protocol to a connection.
| | |
| | | Additionally change expectation deletion by helper to compare the name directly: there might be multiple helper registrations using the same name; currently one of them is chosen in an unpredictable manner and only those expectations are removed.
| | |
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xtables: CONFIG_COMPAT redux (Alexey Dobriyan, 2010-02-02)
| | | Ifdef out
| | |     struct nf_sockopt_ops::compat_set
| | |     struct nf_sockopt_ops::compat_get
| | |     struct xt_match::compat_from_user
| | |     struct xt_match::compat_to_user
| | |     struct xt_match::compatsize
| | | to make structures smaller on COMPAT=n kernels.
| | |
| | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | IPv6: reassembly: replace magic number with macro definitions (Shan Wei, 2010-01-20)
| | | Use macros to define the high/low thresh values; refer to IPV6_FRAG_TIMEOUT.
| | |
| | | Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
| | | Acked-by: David S. Miller <davem@davemloft.net>
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xtables: add struct xt_mtdtor_param::net (Alexey Dobriyan, 2010-01-18)
| | | Add ->net to match destructor list like ->net in constructor list.
| | |
| | | Make sure it's set in ebtables/iptables/ip6tables; this requires propagating netns up to *_unregister_table().
| | |
| | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xtables: add struct xt_mtchk_param::net (Alexey Dobriyan, 2010-01-18)
| | | Some complex match modules (like xt_hashlimit/xt_recent) want netns information at constructor and destructor time. We probably can play games at match destruction time, because netns can be passed in object, but I think it's cleaner to explicitly pass netns.
| | |
| | | Add ->net, make sure it's set from ebtables/iptables/ip6tables code.
| | |
| | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: netns: #ifdef ->iptable_security, ->ip6table_security (Alexey Dobriyan, 2010-01-18)
| | | 'security' tables depend on SECURITY, so ifdef them.
| | |
| | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: nfnetlink: netns support (Alexey Dobriyan, 2010-01-13)
| | | Make nfnl socket per-netns.
| | |
| | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | IPVS: Allow boot time change of hash size (Catalin(ux) M. BOIE, 2010-01-04)
| | | I was very frustrated about the fact that I have to recompile the kernel to change the hash size. So, I created this patch.
| | |
| | | If IPVS is built-in you can append ip_vs.conn_tab_bits=?? to kernel command line, or, if you built IPVS as modules, you can add options ip_vs conn_tab_bits=??.
| | |
| | | To keep everything backward compatible, you still can select the size at compile time, and that will be used as default.
| | |
| | | It has been about a year since this patch was originally posted and subsequently dropped on the basis of insufficient test data.
| | |
| | | Mark Bergsma has provided the following test results which seem to strongly support the need for larger hash table sizes:
| | |
| | | We do however run into the same problem with the default setting (2^12 = 4096 entries), as most of our LVS balancers handle around a million connections/SLAB entries at any point in time (around 100-150 kpps load). With only 4096 hash table entries this implies that each entry consists of a linked list of 256 connections *on average*.
| | |
| | | To provide some statistics, I did an oprofile run on a 2.6.31 kernel, with both the default 4096 table size, and the same kernel recompiled with IP_VS_CONN_TAB_BITS set to 18 (2^18 = 262144 entries). I built a quick test setup with a part of Wikimedia/Wikipedia's live traffic mirrored by the switch to the test host.
| | |
| | | With the default setting, at ~ 120 kpps packet load we saw a typical %si CPU usage of around 30-35%, and oprofile reported a hot spot in ip_vs_conn_in_get:
| | |
| | | samples  %        image name  app name  symbol name
| | | 1719761  42.3741  ip_vs.ko    ip_vs.ko  ip_vs_conn_in_get
| | | 302577   7.4554   bnx2        bnx2      /bnx2
| | | 181984   4.4840   vmlinux     vmlinux   __ticket_spin_lock
| | | 128636   3.1695   vmlinux     vmlinux   ip_route_input
| | | 74345    1.8318   ip_vs.ko    ip_vs.ko  ip_vs_conn_out_get
| | | 68482    1.6874   vmlinux     vmlinux   mwait_idle
| | |
| | | After loading the recompiled kernel with 2^18 entries, %si CPU usage dropped in half to around 12-18%, and oprofile looks much healthier, with only 7% spent in ip_vs_conn_in_get:
| | |
| | | samples  %        image name  app name  symbol name
| | | 265641   14.4616  bnx2        bnx2      /bnx2
| | | 143251   7.7986   vmlinux     vmlinux   __ticket_spin_lock
| | | 140661   7.6576   ip_vs.ko    ip_vs.ko  ip_vs_conn_in_get
| | | 94364    5.1372   vmlinux     vmlinux   mwait_idle
| | | 86267    4.6964   vmlinux     vmlinux   ip_route_input
| | |
| | | [ horms@verge.net.au: trivial up-port and minor style fixes ]
| | | Signed-off-by: Catalin(ux) M. BOIE <catab@embedromix.ro>
| | | Cc: Mark Bergsma <mark@wikimedia.org>
| | | Signed-off-by: Simon Horman <horms@verge.net.au>
| | | Signed-off-by: Patrick McHardy <kaber@trash.net>
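The chain-length figures quoted in the test report follow directly from dividing the number of tracked connections by the number of hash buckets. The sketch below assumes an evenly distributing hash and uses 2^20 connections as a stand-in for "around a million"; the function name is hypothetical.

```python
def average_chain_length(connections, tab_bits):
    """Average entries per bucket for a table with 2**tab_bits buckets,
    assuming the hash spreads connections evenly."""
    return connections / (1 << tab_bits)


# 2**20 connections stands in for "around a million" (an assumption).
default_chain = average_chain_length(1 << 20, 12)   # 4096 buckets
enlarged_chain = average_chain_length(1 << 20, 18)  # 262144 buckets
```

Going from 12 to 18 bits shortens the average list walked per lookup by a factor of 64, which is consistent with the drop in time spent in ip_vs_conn_in_get reported above.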
* | | ethtool: Fix includes build break (David S. Miller, 2010-02-15)
| | | Based upon a patch by Oliver Hartkopp <oliver@hartkopp.net>.
| | |
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: Fix first line of kernel-doc for a few functions (Ben Hutchings, 2010-02-15)
| | | The function name must be followed by a space, hyphen, space, and a short description.
| | |
| | | Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
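The rule stated above (name, space, hyphen, space, short description) can be checked with a small regular expression. This is a simplified sketch, not the kernel-doc script's own parser; the function name and the optional `()` after the identifier are assumptions of the example.

```python
import re

# " * name - description" or " * name() - description"
KERNEL_DOC_FIRST_LINE = re.compile(r"^\s*\*\s+\w+(?:\(\))? - \S")


def is_valid_first_line(line):
    """Return True if a kernel-doc first line follows the
    'name - short description' convention."""
    return KERNEL_DOC_FIRST_LINE.match(line) is not None
```

A line such as ` * netif_rx - post buffer to the network code` passes, while variants that use a colon or a double hyphen after the name are rejected.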
* | | Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 (David S. Miller, 2010-02-14)
|\ \ \
| * | | mac80211: Retry null data frame for power save. (Vivek Natarajan, 2010-02-09)
| | | | Even if the null data frame is not acked by the AP, mac80211 goes into power save. This might lead to loss of frames from the AP. Prevent this by restarting dynamic_ps_timer when no ack is received for null data frames.
| | | |
| | | | Cc: Johannes Berg <johannes@sipsolutions.net>
| | | | Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com>
| | | | Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | | mac80211: remove get_tx_stats() driver op (Kalle Valo, 2010-02-08)
| | | | get_tx_stats() driver operation is not currently used anywhere in mac80211 and there are no plans to use it in the not-so-near future. So it can go without anyone missing it.
| | | |
| | | | Signed-off-by: Kalle Valo <kalle.valo@iki.fi>
| | | | Acked-by: Johannes Berg <johannes@sipsolutions.net>
| | | | Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | | mac80211: allow station add/remove to sleep (Johannes Berg, 2010-02-08)
| | | | Many drivers would like to sleep during station addition and removal, and currently have a high complexity there from not being able to.
| | | |
| | | | This introduces two new callbacks sta_add() and sta_remove() that drivers can implement instead of using sta_notify() and that can sleep, and the new sta_add() callback is also allowed to fail.
| | | |
| | | | The reason we didn't do this previously is that the IBSS code wants to insert stations from the RX path, which is a tasklet, so cannot sleep. This patch will keep the station allocation in that path, but moves adding the station to the driver out of line. Since the addition can now fail, we can have IBSS peer structs the driver rejected -- in that case we still talk to the station but never tell the driver about it in the control.sta pointer. If there will ever be a driver that has a low limit on the number of stations and that cannot talk to any stations that are not known to it, we need to come up with a new strategy of handling larger IBSSs, maybe quicker expiry or rejecting peers.
| | | |
| | | | Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
| | | | Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | | wireless: update radiotap parser (Johannes Berg, 2010-02-08)
| | | | Upstream radiotap has adopted the namespace proposal David Young made and I then took care of, for which I had adapted the radiotap parser as a library outside the kernel. This brings the in-kernel parser up to speed.
| | | |
| | | | Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
| | | | Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | | | net: Add netdev ops for SR-IOV configuration (Williams, Mitch A, 2010-02-12)
| | | | Add netdev ops for configuring SR-IOV VF devices through the PF driver.
| | | |
| | | | Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
| | | | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | if_link: Add SR-IOV configuration methods (Williams, Mitch A, 2010-02-12)
| | | | Add SR-IOV VF management methods to IFLA_LINKINFO. This allows userspace to use rtnetlink to configure VF network devices.
| | | |
| | | | Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
| | | | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | pci: Add SR-IOV convenience functions and macros (Williams, Mitch A, 2010-02-12)
| | | | Add and export pci_num_vf to allow other subsystems to determine how many virtual function devices are associated with an SR-IOV physical function device.
| | | |
| | | | Add macros dev_is_pci, dev_is_pf, and dev_num_vf to make it easier for non-PCI specific code to determine SR-IOV capabilities.
| | | |
| | | | Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
| | | | Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | xfrm: use proper kernel types (jamal, 2010-02-12)
| | | | The kernel side should use uxx instead of __uxx types.
| | | |
| | | | Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>