aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* netfilter: fix Kconfig dependenciesPatrick McHardy2011-01-14
| | | | | | | | | | | | | Fix dependencies of netfilter realm match: it depends on NET_CLS_ROUTE, which itself depends on NET_SCHED; this dependency is missing from netfilter. Since matching on realms is also useful without having NET_SCHED enabled and the option really only controls whether the tclassid member is included in route and dst entries, rename the config option to IP_ROUTE_CLASSID and move it outside of traffic scheduling context to get rid of the NET_SCHED dependeny. Reported-by: Vladis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: fix compilation when conntrack is disabled but tproxy is enabledKOVACS Krisztian2010-12-15
| | | | | | | | | | | | | | | | | The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but failed to update the #ifdef stanzas guarding the defragmentation related fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c. This patch adds the required #ifdefs so that IPv6 tproxy can truly be used without connection tracking. Original report: http://marc.info/?l=linux-netdev&m=129010118516341&w=2 Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: xtables: use guarded typesJan Engelhardt2010-12-15
| | | | | | | We are supposed to use the kernel's own types in userspace exports. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* IPVS: Backup, adding version 0 sending capabilitiesHans Schillstrom2010-11-24
| | | | | | | | | | | | | This patch adds a sysclt net.ipv4.vs.sync_version that can be used to send sync msg in version 0 or 1 format. sync_version value is logical, Value 1 (default) New version 0 Plain old version Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* IPVS: Backup, Change sending to Version 1 formatHans Schillstrom2010-11-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable sending and removal of version 0 sending Affected functions, ip_vs_sync_buff_create() ip_vs_sync_conn() ip_vs_core.c removal of IPv4 check. *v5 Just check cp->pe_data_len in ip_vs_sync_conn Check if padding needed before adding a new sync_conn to the buffer, i.e. avoid sending padding at the end. *v4 moved sanity check and pe_name_len after sloop. use cp->pe instead of cp->dest->svc->pe real length in each sync_conn, not padded length however total size of a sync_msg includes padding. *v3 Sending ip_vs_sync_conn_options in network order. Sending Templates for ONE_PACKET conn. Renaming of ip_vs_sync_mesg to ip_vs_sync_mesg_v0 Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* IPVS: Backup, Adding Version 1 receive capabilityHans Schillstrom2010-11-24
| | | | | | | | | | | | | | | | | | | | | | | Functionality improvements * flags changed from 16 to 32 bits * fwmark added (32 bits) * timeout in sec. added (32 bits) * pe data added (Variable length) * IPv6 capabilities (3x16 bytes for addr.) * Version and type in every conn msg. ip_vs_process_message() now handles Version 1 messages and will call ip_vs_process_message_v0() for version 0 messages. ip_vs_proc_conn() is common for both version, and handles the update of connection hash. ip_vs_conn_fill_param_sync() - Version 1 messages only ip_vs_conn_fill_param_sync_v0() - Version 0 messages only Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* IPVS: Backup, Adding structs for new sync formatHans Schillstrom2010-11-24
| | | | | | | | | | | New structs defined for version 1 of sync. * ip_vs_sync_v4 Ipv4 base format struct * ip_vs_sync_v6 Ipv6 base format struct Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* IPVS: Handle Scheduling errors.Hans Schillstrom2010-11-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If ip_vs_conn_fill_param_persist return an error to ip_vs_sched_persist, this error must propagate as ignored=-1 to ip_vs_schedule(). Errors from ip_vs_conn_new() in ip_vs_sched_persist() and ip_vs_schedule() should also return *ignored=-1; This patch just relies on the fact that ignored is 1 before calling ip_vs_sched_persist(). Sent from Julian: "The new case when ip_vs_conn_fill_param_persist fails should set *ignored = -1, so that we can use NF_DROP, see below. *ignored = -1 should be also used for ip_vs_conn_new failure in ip_vs_sched_persist() and ip_vs_schedule(). The new negative value should be handled in tcp,udp,sctp" "To summarize: - *ignored = 1: protocol tried to schedule (eg. on SYN), found svc but the svc/scheduler decides that this packet should be accepted with NF_ACCEPT because it must not be scheduled. - *ignored = 0: scheduler can not find destination, so try bypass or return ICMP and then NF_DROP (ip_vs_leave). - *ignored = -1: scheduler tried to schedule but fatal error occurred, eg. ip_vs_conn_new failure (ENOMEM) or ip_vs_sip_fill_param failure such as missing Call-ID, ENOMEM on skb_linearize or pe_data. In this case we should return NF_DROP without any attempts to send ICMP with ip_vs_leave." More or less all ideas and input to this patch is work from Julian Anastasov Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* IPVS: skb defrag in L7 helpersHans Schillstrom2010-11-24
| | | | | | | | | | | L7 helpers like sip needs skb defrag since L7 data can be fragmented. This patch requires "IPVS Break ports-2 into src_port and dst_port" patch Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* IPVS: Split ports[2] into src_port and dst_portHans Schillstrom2010-11-24
| | | | | | | | | | | | Avoid sending invalid pointer due to skb_linearize() call. This patch prepares for next patch where skb_linearize is a part. In ip_vs_sched_persist() params the ports ptr will be replaced by src and dst port. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* IPVS: Backup, Prepare for transferring firewall marks (fwmark) to the backup ↵Hans Schillstrom2010-11-24
| | | | | | | | | | | | | | | daemon. One struct will have fwmark added: * ip_vs_conn ip_vs_conn_new() and ip_vs_find_dest() will have an extra param - fwmark The effects of that, is in this patch. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
* Merge branch 'for-patrick' of ↵Patrick McHardy2010-11-16
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6
| * ipvs: allow transmit of GRO aggregated skbsSimon Horman2010-11-15
| | | | | | | | | | | | | | | | | | Attempt at allowing LVS to transmit skbs of greater than MTU length that have been aggregated by GRO and can thus be deaggregated by GSO. Cc: Julian Anastasov <ja@ssi.bg> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Simon Horman <horms@verge.net.au>
| * ipvs: remove shadow rt variableEric Dumazet2010-11-15
| | | | | | | | | | | | | | Remove a sparse warning about rt variable. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * ipvs: add static and read_mostly attributesEric Dumazet2010-11-15
| | | | | | | | | | | | | | | | | | | | | | ip_vs_conn_tab_bits & ip_vs_conn_tab_mask are static to ipvs/ip_vs_conn.c ip_vs_conn_tab_size, ip_vs_conn_tab_mask, ip_vs_conn_tab [the pointer], ip_vs_conn_rnd are mostly read. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * IPVS: buffer argument to ip_vs_process_message() should not be constSimon Horman2010-11-15
| | | | | | | | | | | | | | It is assigned to a non-const variable and its contents are modified. Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * IPVS: Remove useless { } block from ip_vs_process_message()Simon Horman2010-11-15
| | | | | | | | | | Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * IPVS: Make the cp argument to ip_vs_sync_conn() staticSimon Horman2010-11-15
| | | | | | | | | | Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * IPVS: Only match pe_data created by the same peSimon Horman2010-11-15
| | | | | | | | | | | | | | | | Only match persistence engine data if it was created by the same persistence engine. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| * IPVS: Add persistence engine to connection entrySimon Horman2010-11-15
| | | | | | | | | | | | | | | | | | | | The dest of a connection may not exist if it has been created as the result of connection synchronisation. But in order for connection entries for templates with persistence engine data created through connection synchronisation to be valid access to the persistence engine pointer is required. So add the persistence engine to the connection itself. Signed-off-by: Simon Horman <horms@verge.net.au>
* | netfilter: nf_conntrack: one less atomic op in nf_ct_expect_insert()Eric Dumazet2010-11-16
|/ | | | | | | | Instead of doing atomic_inc(&exp->use) twice, call atomic_add(2, &exp->use); Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: rcu sparse cleanupsEric Dumazet2010-11-15
| | | | | | | | Use RCU helpers to reduce number of sparse warnings (CONFIG_SPARSE_RCU_POINTER=y), and adds lockdep checks. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: nf_nat_amanda: rename a variableEric Dumazet2010-11-15
| | | | | | | Avoid a sparse warning about 'ret' variable shadowing Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: add __rcu annotationsEric Dumazet2010-11-15
| | | | | | | | Use helpers to reduce number of sparse warnings (CONFIG_SPARSE_RCU_POINTER=y) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: nf_ct_frag6_sysctl_table is staticEric Dumazet2010-11-15
| | | | | Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: add __rcu annotationsEric Dumazet2010-11-15
| | | | | | | | Add some __rcu annotations and use helpers to reduce number of sparse warnings (CONFIG_SPARSE_RCU_POINTER=y) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: xt_CLASSIFY: add ARP support, allow CLASSIFY target on any tableFrédéric Leroy2010-11-15
| | | | | Signed-off-by: Frédéric Leroy <fredo@starox.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: nf_nat: define nat_pptp_info as neededChangli Gao2010-11-15
| | | | | Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: ct_extend: define NF_CT_EXT_* as neededChangli Gao2010-11-15
| | | | | | | Less IDs make nf_ct_ext smaller. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: nf_nat: don't use atomic bit operationChangli Gao2010-11-15
| | | | | | | | As we own the conntrack and the others can't see it until we confirm it, we don't need to use atomic bit operation on ct->status. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: nf_conntrack: define ct_*_info as neededChangli Gao2010-11-15
| | | | | Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: ct_extend: fix the wrong alloc_sizeChangli Gao2010-11-15
| | | | | | | | In function update_alloc_size(), sizeof(struct nf_ct_ext) is added twice wrongly. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: xt_LOG: do print MAC header on FORWARDJan Engelhardt2010-11-15
| | | | | | | | | | | | I am observing consistent behavior even with bridges, so let's unlock this. xt_mac is already usable in FORWARD, too. Section 9 of http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html#section9 says the MAC source address is changed, but my observation does not match that claim -- the MAC header is retained. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> [Patrick; code inspection seems to confirm this] Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: xt_NFQUEUE: remove modulo operationsChangli Gao2010-11-12
| | | | | | Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: nf_conntrack: don't always initialize ct->protoChangli Gao2010-11-12
| | | | | | | | | | ct->proto is big(60 bytes) due to structure ip_ct_tcp, and we don't need to initialize the whole for all the other protocols. This patch moves proto to the end of structure nf_conn, and pushes the initialization down to the individual protocols. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
* ipv4: Make rt->fl.iif tests lest obscure.David S. Miller2010-11-11
| | | | | | | | | When we test rt->fl.iif against zero, we're seeing if it's an output or an input route. Make that explicit with some helper functions. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6David S. Miller2010-11-11
|\
| * dccp ccid-2: Implementation of circular Ack Vector buffer with overflow handlingGerrit Renker2010-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This completes the implementation of a circular buffer for Ack Vectors, by extending the current (linear array-based) implementation. The changes are: (a) An `overflow' flag to deal with the case of overflow. As before, dynamic growth of the buffer will not be supported; but code will be added to deal robustly with overflowing Ack Vector buffers. (b) A `tail_seqno' field. When naively implementing the algorithm of Appendix A in RFC 4340, problems arise whenever subsequent Ack Vector records overlap, which can bring the entire run length calculation completely out of synch. (This is documented on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/\ ack_vectors/tracking_tail_ackno/ .) (c) The buffer length is now computed dynamically (i.e. current fill level), as the span between head to tail. As a result, dccp_ackvec_pending() is now simpler - the #ifdef is no longer necessary since buf_empty is always true when IP_DCCP_ACKVEC is not configured. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
| * dccp ccid-2: Separate internals of Ack Vectors from option-parsing codeGerrit Renker2010-11-10
| | | | | | | | | | | | | | | | | | | | | | This patch * separates Ack Vector housekeeping code from option-insertion code; * shifts option-specific code from ackvec.c into options.c; * introduces a dedicated routine to take care of the Ack Vector records; * simplifies the dccp_ackvec_insert_avr() routine: the BUG_ON was redundant, since the list is automatically arranged in descending order of ack_seqno. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
| * dccp ccid-2: Ack Vector interface clean-upGerrit Renker2010-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch brings the Ack Vector interface up to date. Its main purpose is to lay the basis for the subsequent patches of this set, which will use the new data structure fields and routines. There are no real algorithmic changes, rather an adaptation: (1) Replaced the static Ack Vector size (2) with a #define so that it can be adapted (with low loss / Ack Ratio, a value of 1 works, so 2 seems to be sufficient for the moment) and added a solution so that computing the ECN nonce will continue to work - even with larger Ack Vectors. (2) Replaced the #defines for Ack Vector states with a complete enum. (3) Replaced #defines to compute Ack Vector length and state with general purpose routines (inlines), and updated code to use these. (4) Added a `tail' field (conversion to circular buffer in subsequent patch). (5) Updated the (outdated) documentation for Ack Vector struct. (6) All sequence number containers now trimmed to 48 bits. (7) Removal of unused bits: * removed dccpav_ack_nonce from struct dccp_ackvec, since this is already redundantly stored in the `dccpavr_ack_nonce' (of Ack Vector record); * removed Elapsed Time for Ack Vectors (it was nowhere used); * replaced semantics of dccpavr_sent_len with dccpavr_ack_runlen, since the code needs to be able to remember the old run length; * reduced the de-/allocation routines (redundant / duplicate tests). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* | net: get rid of rtable->idevEric Dumazet2010-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems idev field in struct rtable has no special purpose, but adding extra atomic ops. We hold refcounts on the device itself (using percpu data, so pretty cheap in current kernel). infiniband case is solved using dst.dev instead of idev->dev Removal of this field means routing without route cache is now using shared data, percpu data, and only potential contention is a pair of atomic ops on struct neighbour per forwarded packet. About 5% speedup on routing test. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | neigh: reorder struct neighbourEric Dumazet2010-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | It is important to move nud_state outside of the often modified cache line (because of refcnt), to reduce false sharing in neigh_event_send() This is a followup of commit 0ed8ddf4045f (neigh: Protect neigh->ha[] with a seqlock) This gives a 7% speedup on routing test with IP route cache disabled. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vxge: update driver versionJon Mason2010-11-11
| | | | | | | | | | | | | | | | Update vxge driver version Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vxge: sparse and other clean-upsJon Mason2010-11-11
| | | | | | | | | | | | | | | | Correct issues found by running sparse on the vxge driver, as well as other miscellaneous cleanups. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vxge: update KconfigJon Mason2010-11-11
| | | | | | | | | | | | | | | | Update Kconfig to reflect Exar's purchase of Neterion (formerly S2IO). Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vxge: correct multi-function detectionJon Mason2010-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | The values used to determined if the adapter is running in single or multi-function mode were previously modified to the values necessary when making the VXGE_HW_FW_API_GET_FUNC_MODE firmware call. However, the firmware call was not modified. This had the driver printing out on probe that the adapter was in multi-function mode when in single function mode and vice versa. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vxge: Titan1A detectionJon Mason2010-11-11
| | | | | | | | | | | | | | | | | | Detect if the adapter is Titan or Titan1A, and tune the driver for this hardware. Also, remove unnecessary function __vxge_hw_device_id_get. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vxge: Handle errors in vxge_hw_vpath_fw_apiJon Mason2010-11-11
| | | | | | | | | | | | | | | | | | | | | | Propagate the return code of the call to vxge_hw_vpath_fw_api and __vxge_hw_vpath_pci_func_mode_get. This enables the proper handling of error conditions when querying the function mode of the device during probe. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vxge: add receive hardware timestampingJon Mason2010-11-11
| | | | | | | | | | | | | | | | | | | | Add support for enable/disabling hardware timestamping on receive packets via ioctl call. When enabled, the hardware timestamp replaces the FCS in the payload. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vxge: add support for ethtool firmware flashingJon Mason2010-11-11
| | | | | | | | | | | | | | | | | | | | Add the ability in the vxge driver to flash firmware via ethtool. Updated to include comments from Ben Hutchings. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>