aboutsummaryrefslogtreecommitdiffstats
path: root/include/net
Commit message (Collapse)AuthorAge
...
* | | | | | | | ipv4: introduce address lifetimeJiri Pirko2013-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some usecase when lifetime of ipv4 addresses might be helpful. For example: 1) initramfs networkmanager uses a DHCP daemon to learn network configuration parameters 2) initramfs networkmanager addresses, routes and DNS configuration 3) initramfs networkmanager is requested to stop 4) initramfs networkmanager stops all daemons including dhclient 5) there are addresses and routes configured but no daemon running. If the system doesn't start networkmanager for some reason, addresses and routes will be used forever, which violates RFC 2131. This patch is essentially a backport of ivp6 address lifetime mechanism for ipv4 addresses. Current "ip" tool supports this without any patch (since it does not distinguish between ipv4 and ipv6 addresses in this perspective. Also, this should be back-compatible with all current netlink users. Reported-by: Pavel Šimerda <psimerda@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net: frag, move LRU list maintenance outside of rwlockJesper Dangaard Brouer2013-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updating the fragmentation queues LRU (Least-Recently-Used) list, required taking the hash writer lock. However, the LRU list isn't tied to the hash at all, so we can use a separate lock for it. Original-idea-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net: use lib/percpu_counter API for fragmentation mem accountingJesper Dangaard Brouer2013-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the per network namespace shared atomic "mem" accounting variable, in the fragmentation code, with a lib/percpu_counter. Getting percpu_counter to scale to the fragmentation code usage requires some tweaks. At first view, percpu_counter looks superfast, but it does not scale on multi-CPU/NUMA machines, because the default batch size is too small, for frag code usage. Thus, I have adjusted the batch size by using __percpu_counter_add() directly, instead of percpu_counter_sub() and percpu_counter_add(). The batch size is increased to 130.000, based on the largest 64K fragment memory usage. This does introduce some imprecise memory accounting, but its does not need to be strict for this use-case. It is also essential, that the percpu_counter, does not share cacheline with other writers, to make this scale. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net: frag helper functions for mem limit trackingJesper Dangaard Brouer2013-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change is primarily a preparation to ease the extension of memory limit tracking. The change does reduce the number atomic operation, during freeing of a frag queue. This does introduce a some performance improvement, as these atomic operations are at the core of the performance problems seen on NUMA systems. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net: cacheline adjust struct inet_frag_queueJesper Dangaard Brouer2013-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fragmentation code cacheline adjusting of struct inet_frag_queue. Take advantage of the size of struct timer_list, and move all but spinlock_t lock, below the timer struct. On 64-bit 'lru_list', 'list' and 'refcnt', fits exactly into the next cacheline, and a new cacheline starts at 'fragments'. The netns_frags *net pointer is moved to the end of the struct, because its used in a compare, with "next/close-by" elements of which this struct is embedded into. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net: cacheline adjust struct inet_frags for better frag performanceJesper Dangaard Brouer2013-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The globally shared rwlock, of struct inet_frags, shares cacheline with the 'rnd' number, which is used by the hash calculations. Fix this, as this obviously is a bad idea, as unnecessary cache-misses will occur when accessing the 'rnd' number. Also small note that, moving function ptr (*match) up in struct, is to avoid it lands on the next cacheline (on 64-bit). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net: cacheline adjust struct netns_frags for better frag performanceJesper Dangaard Brouer2013-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This small cacheline adjustment of struct netns_frags improves performance significantly for the fragmentation code. Struct members 'lru_list' and 'mem' are both hot elements, and it hurts performance, due to cacheline bouncing at every call point, when they share a cacheline. Also notice, how mem is placed together with 'high_thresh' and 'low_thresh', as they are used in the compare operations together. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net neigh: Optimize neighbor entry size calculation.YOSHIFUJI Hideaki / 吉藤英明2013-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When allocating memory for neighbour cache entry, if tbl->entry_size is not set, we always calculate sizeof(struct neighbour) + tbl->key_len, which is common in the same table. With this change, set tbl->entry_size during the table initialization phase, if it was not set, and use it in neigh_alloc() and neighbour_priv(). This change also allow us to have both of protocol private data and device priate data at tha same time. Note that the only user of prototcol private is DECnet and the only user of device private is ATM CLIP. Since those are exclusive, we have not been facing issues here. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | Merge branch 'master' of ↵John W. Linville2013-01-28
|\ \ \ \ \ \ \ \ | | |/ / / / / / | |/| | | / / / | |_|_|_|/ / / |/| | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
| * | | | | | Merge branch 'for-john' of ↵John W. Linville2013-01-22
| |\ \ \ \ \ \ | | | |_|/ / / | | |/| | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
| | * | | | | cfg80211: check radar interface combinationsSimon Wunderlich2013-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To ease further DFS development regarding interface combinations, use the interface combinations structure to test for radar capabilities. Drivers can specify which channel widths they support, and in which modes. Right now only a single AP interface is allowed, but as the DFS code evolves other combinations can be enabled. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | * | | | | cfg80211: Allow use_mfp to be specified with the connect commandJouni Malinen2013-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NL80211_ATTR_USE_MFP attribute was originally added for NL80211_CMD_ASSOCIATE, but it is actually as useful (if not even more useful) with NL80211_CMD_CONNECT, so process that attribute with the connect command, too. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | * | | | | nl80211: allow user-space to set address for P2P_DEVICEArend van Spriel2013-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As per email discussion Jouni Malinen pointed out that: "P2P message exchanges can be executed on the current operating channel of any operation (both P2P and non-P2P station). These can be on 5 GHz and even on 60 GHz (so yes, you _can_ do GO Negotiation on 60 GHz). As an example, it would be possible to receive a GO Negotiation Request frame on a 5 GHz only radio and then to complete GO Negotiation on that band. This can happen both when connected to a P2P group (through client discoverability mechanism) and when connected to a legacy AP (assuming the station receive Probe Request frame from full scan in the beginning of P2P device discovery)." This means that P2P messages can be sent over different radio devices. However, these should use the same P2P device address so it should be able to provision this from user-space. This patch adds a parameter for this to struct vif_params which should only be used during creation of the P2P device interface. Cc: Jouni Malinen <j@w1.fi> Cc: Greg Goldman <ggoldman@broadcom.com> Cc: Jithu Jance <jithu@broadcom.com> Signed-off-by: Arend van Spriel <arend@broadcom.com> [add error checking] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | * | | | | {cfg,nl}80211: mesh power mode primitives and userspace accessMarco Porsch2013-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the nl80211_mesh_power_mode enumeration which holds possible values for the mesh power mode. These modes are unknown, active, light sleep and deep sleep. Add power_mode entry to the mesh config structure to hold the user-configured default mesh power mode. This value will be used for new peer links. Add the dot11MeshAwakeWindowDuration value to the mesh config. The awake window is a duration in TU describing how long the STA will stay awake after transmitting its beacon in PS mode. Add access routines to: - get/set local link-specific power mode (STA) - get remote STA's link-specific power mode (STA) - get remote STA's non-peer power mode (STA) - get/set default mesh power mode (mesh config) - get/set mesh awake window duration (mesh config) All config changes may be done at mesh runtime and take effect immediately. Signed-off-by: Marco Porsch <marco@cozybit.com> Signed-off-by: Ivan Bezyazychnyy <ivan.bezyazychnyy@gmail.com> Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com> [fix commit message line length, error handling in set station] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | * | | | | {cfg,nl,mac}80211: set beacon interval and DTIM period on mesh joinMarco Porsch2013-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the default mesh beacon interval and DTIM period to cfg80211 and make them accessible to nl80211. This enables setting both values when joining an MBSS. Previously the DTIM parameter was not set by mac80211 so the driver's default value was used. Signed-off-by: Marco Porsch <marco@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | * | | | | mac80211: call restart complete at wowlan resume timeJohannes Berg2013-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the driver's resume function can't completely restore the configuration in the device, it returns 1 from the callback which will be treated like a HW restart request, but done directly. In this case, also call the driver's restart_complete() function so it can finish the reconfiguration there. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | * | | | | {cfg,mac}80211.h: fix some kernel-doc warningsYacine Belkadi2013-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building the 80211 DocBook, scripts/kernel-doc reports the following type of warnings: Warning(include/net/cfg80211.h:334): No description found for return value of 'cfg80211_get_chandef_type' These warnings are only reported when scripts/kernel-doc runs in verbose mode. To fix these use "Return:" to describe function return values. Signed-off-by: Yacine Belkadi <yacine.belkadi.1@gmail.com> [adjust for freq_reg_info() change] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | * | | | | wireless: make the reg_notifier() voidLuis R. Rodriguez2013-01-14
| | | |/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reg_notifier()'s return value need not be checked as it is only supposed to do post regulatory work and that should never fail. Any behaviour to regulatory that needs to be considered before cfg80211 does work to a driver should be specified by using the already existing flags, the reg_notifier() just does post processing should it find it needs to. Also make lbs_reg_notifier static. Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> [move lbs_reg_notifier to not break compile] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | NFC: Initial Secure Element APISamuel Ortiz2013-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each NFC adapter can have several links to different secure elements and that property needs to be exported by the drivers. A secure element link can be enabled and disabled, and card emulation will be handled by the currently active one. Otherwise card emulation will be host implemented. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
| * | | | | NFC: Add HCI quirks to support driver (non)standard implementationsEric Lapuyade2013-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some chips diverge from the HCI spec in their implementation of standard features. This adds a new quirks parameter to nfc_hci_allocate_device() to let the driver indicate its divergence. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
| * | | | | NFC: Added error handling in event_received hci opsEric Lapuyade2013-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no use to return an error if the caller doesn't get it. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
| * | | | | NFC: Fixed nfc core and hci unregistration and cleanupEric Lapuyade2013-01-09
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an adapter is removed, it will unregister itself from hci and/or nfc core. In order to do that safely, work tasks must first be canceled and prevented to be scheduled again, before the hci or nfc device can be destroyed. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
| * | | | wireless: use __alignedJohannes Berg2013-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use __aligned(...) instead of __attribute__((aligned(...))) in mac80211 and cfg80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | mac80211: split TX aggregation stop actionJohannes Berg2013-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When TX aggregation is stopped, there are a few different cases: - connection with the peer was dropped - session stop was requested locally - session stop was requested by the peer - connection was dropped while a session is stopping The behaviour in these cases should be different, if the connection is dropped then the driver should drop all frames, otherwise the frames may continue to be transmitted, aggregated in the case of a locally requested session stop or unaggregated in the case of the peer requesting session stop. Split these different cases so that the driver can act accordingly; however, treat local and remote stop the same way and ask the driver to not send frames as aggregated packets any more. In the case of connection drop, the stop callback the driver is otherwise supposed to call is no longer required. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | mac80211: fix channel context iterationJohannes Berg2013-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During suspend/resume channel contexts might be iterated even if they haven't been re-added to the driver, keep track of this and skip them in iteration. Also use the new status for sanity checks. Also clarify the fact that during HW restart all contexts are iterated over (thanks Eliad.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | regulatory: use IS_ERR macro family for freq_reg_infoJohannes Berg2013-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of returning an error and filling a pointer return the pointer and an ERR_PTR value in error cases. Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | regulatory: use RCU to protect last_requestJohannes Berg2013-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will allow making freq_reg_info() lock-free. Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | regulatory: use RCU to protect global and wiphy regdomainsJohannes Berg2013-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To simplify the locking and not require cfg80211_mutex (which nl80211 uses to access the global regdomain) and also to make it possible for drivers to access their wiphy->regd safely, use RCU to protect these pointers. Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | regulatory: remove handling of channel bandwidthJohannes Berg2013-01-03
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The channel bandwidth handling isn't really quite right, it assumes that a 40 MHz channel is really two 20 MHz channels, which isn't strictly true. This is the way the regulatory database handling is defined right now though so remove the logic to handle other channel widths. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* | | | net: add RCU annotation to sk_dst_cache fieldCong Wang2013-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sock->sk_dst_cache is protected by RCU. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | decnet: use correct RCU API to deref sk_dst_cache fieldCong Wang2013-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sock->sk_dst_cache is protected by RCU, therefore we should use __sk_dst_get() to deref it once we lock the sock. This fixes several sparse warnings. Cc: linux-decnet-user@lists.sourceforge.net Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | gro: Fix kcalloc argument orderJoe Perches2013-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | First number, then size. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'master' of git://1984.lsi.us.es/nf-nextDavid S. Miller2013-01-27
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== This batch contains netfilter updates for you net-next tree, they are: * The new connlabel extension for x_tables, that allows us to attach labels to each conntrack flow. The kernel implementation uses a bitmask and there's a file in user-space that maps the bits with the corresponding string for each existing label. By now, you can attach up to 128 overlapping labels. From Florian Westphal. * A new round of improvements for the netns support for conntrack. Gao feng has moved many of the initialization code of each module of the netns init path. He also made several code refactoring, that code looks cleaner to me now. * Added documentation for all possible tweaks for nf_conntrack via sysctl, from Jiri Pirko. * Cisco 7941/7945 IP phone support for our SIP conntrack helper, from Kevin Cernekee. * Missing header file in the snmp helper, from Stephen Hemminger. * Finally, a couple of fixes to resolve minor issues with these changes, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netfilter: nf_conntrack: refactor l4proto support for netnsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the code that register/unregister l4proto to the module_init/exit context. Given that we have to modify some interfaces to accomodate these changes, it is a good time to use shorter function names for this using the nf_ct_* prefix instead of nf_conntrack_*, that is: nf_ct_l4proto_register nf_ct_l4proto_pernet_register nf_ct_l4proto_unregister nf_ct_l4proto_pernet_unregister We same many line breaks with it. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_conntrack: refactor l3proto support for netnsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the code that register/unregister l3proto to the module_init/exit context. Given that we have to modify some interfaces to accomodate these changes, it is a good time to use shorter function names for this using the nf_ct_* prefix instead of nf_conntrack_*, that is: nf_ct_l3proto_register nf_ct_l3proto_pernet_register nf_ct_l3proto_unregister nf_ct_l3proto_pernet_unregister We same many line breaks with it. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_ct_proto: move initialization out of pernet_operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the global initial codes to the module_init/exit context. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_ct_labels: move initialization out of pernet_operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the global initial codes to the module_init/exit context. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_ct_helper: move initialization out of pernet_operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the global initial codes to the module_init/exit context. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_ct_timeout: move initialization out of pernet_operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the global initial codes to the module_init/exit context. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_ct_ecache: move initialization out of pernet_operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the global initial codes to the module_init/exit context. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_ct_tstamp: move initialization out of pernet_operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the global initial codes to the module_init/exit context. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_ct_acct: move initialization out of pernet_operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the global initial codes to the module_init/exit context. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_ct_expect: move initialization out of pernet_operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the global initial codes to the module_init/exit context. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_conntrack: move initialization out of pernet operationsGao feng2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nf_conntrack initialization and cleanup codes happens in pernet operations function. This task should be done in module_init/exit. We can't use init_net to identify if it's the right time to initialize or cleanup since we cannot make assumption on the order netns are created/destroyed. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: ctnetlink: allow userspace to modify labelsFlorian Westphal2013-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the ability to set/clear labels assigned to a conntrack via ctnetlink. To allow userspace to only alter specific bits, Pablo suggested to add a new CTA_LABELS_MASK attribute: The new set of active labels is then determined via active = (active & ~mask) ^ changeset i.e., the mask selects those bits in the existing set that should be changed. This follows the same method already used by MARK and CONNMARK targets. Omitting CTA_LABELS_MASK is the same as setting all bits in CTA_LABELS_MASK to 1: The existing set is replaced by the one from userspace. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: add connlabel conntrack extensionFlorian Westphal2013-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | similar to connmarks, except labels are bit-based; i.e. all labels may be attached to a flow at the same time. Up to 128 labels are supported. Supporting more labels is possible, but requires increasing the ct offset delta from u8 to u16 type due to increased extension sizes. Mapping of bit-identifier to label name is done in userspace. The extension is enabled at run-time once "-m connlabel" netfilter rules are added. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | | | | Merge branch 'testing' of ↵David S. Miller2013-01-23
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== 1) Add a statistic counter for invalid output states and remove a superfluous state valid check, from Li RongQing. 2) Probe for asynchronous block ciphers instead of synchronous block ciphers to make the asynchronous variants available even if no synchronous block ciphers are found, from Jussi Kivilinna. 3) Make rfc3686 asynchronous block cipher and make use of the new asynchronous variant, from Jussi Kivilinna. 4) Replace some rwlocks by rcu, from Cong Wang. 5) Remove some unused defines. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | xfrm: Remove unused definesSteffen Klassert2013-01-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XFRM_REPLAY_SEQ, XFRM_REPLAY_OSEQ and XFRM_REPLAY_SEQ_MASK were introduced years ago but actually never used. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | | | | soreuseport: TCP/IPv6 implementationTom Herbert2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Motivation for soreuseport would be something like a web server binding to port 80 running with multiple threads, where each thread might have it's own listener socket. This could be done as an alternative to other models: 1) have one listener thread which dispatches completed connections to workers. 2) accept on a single listener socket from multiple threads. In case #1 the listener thread can easily become the bottleneck with high connection turn-over rate. In case #2, the proportion of connections accepted per thread tends to be uneven under high connection load (assuming simple event loop: while (1) { accept(); process() }, wakeup does not promote fairness among the sockets. We have seen the disproportion to be as high as 3:1 ratio between thread accepting most connections and the one accepting the fewest. With so_reusport the distribution is uniform. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | soreuseport: TCP/IPv4 implementationTom Herbert2013-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow multiple listener sockets to bind to the same port. Motivation for soresuseport would be something like a web server binding to port 80 running with multiple threads, where each thread might have it's own listener socket. This could be done as an alternative to other models: 1) have one listener thread which dispatches completed connections to workers. 2) accept on a single listener socket from multiple threads. In case #1 the listener thread can easily become the bottleneck with high connection turn-over rate. In case #2, the proportion of connections accepted per thread tends to be uneven under high connection load (assuming simple event loop: while (1) { accept(); process() }, wakeup does not promote fairness among the sockets. We have seen the disproportion to be as high as 3:1 ratio between thread accepting most connections and the one accepting the fewest. With so_reusport the distribution is uniform. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>