aboutsummaryrefslogtreecommitdiffstats
path: root/net/ipv4
Commit message (Collapse)AuthorAge
...
| * | | | | | | | net: drop capability from protocol definitionsEric Paris2009-11-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct can_proto had a capability field which wasn't ever used. It is dropped entirely. struct inet_protosw had a capability field which can be more clearly expressed in the code by just checking if sock->type = SOCK_RAW. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | tcp: Use defaults when no route options are availableGilad Ben-Yossef2009-11-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trying to parse the option of a SYN packet that we have no route entry for should just use global wide defaults for route entry options. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Tested-by: Valdis.Kletnieks@vt.edu Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | tcp: Do not call IPv4 specific func in tcp_check_reqGilad Ben-Yossef2009-11-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calling IPv4 specific inet_csk_route_req in tcp_check_req is a bad idea and crashes machine on IPv6 connections, as reported by Valdis Kletnieks Also, all we are really interested in is the timestamp option in the header, so calling tcp_parse_options() with the "estab" set to false flag is an overkill as it tries to parse half a dozen other TCP options. We know whether timestamp should be enabled or not using data from request_sock. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Tested-by: Valdis.Kletnieks@vt.edu Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: net/ipv4/devinet.c cleanupsEric Dumazet2009-11-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As pointed by Stephen Rothwell, commit c6d14c84 added a warning : net/ipv4/devinet.c: In function 'inet_select_addr': net/ipv4/devinet.c:902: warning: label 'out' defined but not used delete unused 'out' label and do some cleanups as well Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: Introduce for_each_netdev_rcu() iteratorEric Dumazet2009-11-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds RCU management to the list of netdevices. Convert some for_each_netdev() users to RCU version, if it can avoid read_lock-ing dev_base_lock Ie: read_lock(&dev_base_loack); for_each_netdev(net, dev) some_action(); read_unlock(&dev_base_lock); becomes : rcu_read_lock(); for_each_netdev_rcu(net, dev) some_action(); rcu_read_unlock(); Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | icmp: icmp_send() can avoid a dev_put()Eric Dumazet2009-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can avoid touching device refcount in icmp_send(), using dev_get_by_index_rcu() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | ipv4: inetdev_by_index() switch to RCUEric Dumazet2009-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use dev_get_by_index_rcu() instead of __dev_get_by_index() and dev_base_lock rwlock Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Merge branch 'master' of ↵David S. Miller2009-10-30
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * | | | | | | | | net,socket: introduce DECLARE_SOCKADDR helper to catch overflow at build timeCyrill Gorcunov2009-10-29
| | |_|_|_|_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | proto_ops->getname implies copying protocol specific data into storage unit (particulary to __kernel_sockaddr_storage). So when we implement new protocol support we should keep such a detail in mind (which is easy to forget about). Lets introduce DECLARE_SOCKADDR helper which check if storage unit is not overfowed at build time. Eventually inet_getname is switched to use DECLARE_SOCKADDR (to show example of usage). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: Cleanup redundant tests on unsignedroel kluin2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optlen is unsigned so the `< 0' test is never true. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Allow disabling of DSACK TCP option per routeGilad Ben-Yossef2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add and use no DSCAK bit in the features field. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Allow to turn off TCP window scale opt per routeGilad Ben-Yossef2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add and use no window scale bit in the features field. Note that this is not the same as setting a window scale of 0 as would happen with window limit on route. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Allow disabling TCP timestamp options per routeGilad Ben-Yossef2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement querying and acting upon the no timestamp bit in the feature field. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Add the no SACK route option featureGilad Ben-Yossef2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement querying and acting upon the no sack bit in the features field. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Allow tcp_parse_options to consult dst entryGilad Ben-Yossef2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need tcp_parse_options to be aware of dst_entry to take into account per dst_entry TCP options settings Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Only parse time stamp TCP option in time wait sockGilad Ben-Yossef2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we only use tcp_parse_options here to check for the exietence of TCP timestamp option in the header, it is better to call with the "established" flag on. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Signed-off-by: Ori Finkelman <ori@comsleep.com> Signed-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | ipmr: Optimize multiple unregistrationEric Dumazet2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Speedup module unloading by factorizing synchronize_rcu() calls Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: Corrected spelling error heurestics->heuristicsAndreas Petlund2009-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Corrected a spelling error in a function name. Signed-off-by: Andreas Petlund <apetlund@simula.no> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | gre: Optimize multiple unregistrationEric Dumazet2009-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Speedup module unloading by factorizing synchronize_rcu() calls Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | ipip: Optimize multiple unregistrationEric Dumazet2009-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Speedup module unloading by factorizing synchronize_rcu() calls Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Merge branch 'master' of ↵David S. Miller2009-10-27
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/sh_eth.c
| * | | | | | | | | gre: convert hash tables locking to RCUEric Dumazet2009-10-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GRE tunnels use one rwlock to protect their hash tables. This locking scheme can be converted to RCU for free, since netdevice already must wait for a RCU grace period at dismantle time. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | ipip: convert hash tables locking to RCUEric Dumazet2009-10-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IPIP tunnels use one rwlock to protect their hash tables. This locking scheme can be converted to RCU for free, since netdevice already must wait for a RCU grace period at dismantle time. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | net: Fix for dst_negative_adviceKrishna Kumar2009-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dst_negative_advice() should check for changed dst and reset sk_tx_queue_mapping accordingly. Pass sock to the callers of dst_negative_advice. (sk_reset_txq is defined just for use by dst_negative_advice. The only way I could find to get around this is to move dst_negative_() from dst.h to dst.c, include sock.h in dst.c, etc) Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | IP: CleanupsJohn Dykstra2009-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use symbols instead of magic constants while checking PMTU discovery setsockopt. Remove redundant test in ip_rt_frag_needed() (done by caller). Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | ah4: convert to ahashSteffen Klassert2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts ah4 to the new ahash interface. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | net: sk_drops consolidation part 2Eric Dumazet2009-10-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - skb_kill_datagram() can increment sk->sk_drops itself, not callers. - UDP on IPV4 & IPV6 dropped frames (because of bad checksum or policy checks) increment sk_drops Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | inet: rename some inet_sock fieldsEric Dumazet2009-10-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to have better cache layouts of struct sock (separate zones for rx/tx paths), we need this preliminary patch. Goal is to transfert fields used at lookup time in the first read-mostly cache line (inside struct sock_common) and move sk_refcnt to a separate cache line (only written by rx path) This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr, sport and id fields. This allows a future patch to define these fields as macros, like sk_refcnt, without name clashes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | net: sk_drops consolidationEric Dumazet2009-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sock_queue_rcv_skb() can update sk_drops itself, removing need for callers to take care of it. This is more consistent since sock_queue_rcv_skb() also reads sk_drops when queueing a skb. This adds sk_drops managment to many protocols that not cared yet. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | Merge branch 'master' of ↵David S. Miller2009-10-13
| |\ \ \ \ \ \ \ \ \ | | | |_|_|_|_|_|_|/ | | |/| | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * | | | | | | | | tcp: replace ehash_size by ehash_maskEric Dumazet2009-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Storing the mask (size - 1) instead of the size allows fast path to be a bit faster. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | net: Generalize socket rx gap / receive queue overflow cmsgNeil Horman2009-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a new socket level option to report number of queue overflows Recently I augmented the AF_PACKET protocol to report the number of frames lost on the socket receive queue between any two enqueued frames. This value was exported via a SOL_PACKET level cmsg. AFter I completed that work it was requested that this feature be generalized so that any datagram oriented socket could make use of this option. As such I've created this patch, It creates a new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue overflowed between any two given frames. It also augments the AF_PACKET protocol to take advantage of this new feature (as it previously did not touch sk->sk_drops, which this patch uses to record the overflow count). Tested successfully by me. Notes: 1) Unlike my previous patch, this patch simply records the sk_drops value, which is not a number of drops between packets, but rather a total number of drops. Deltas must be computed in user space. 2) While this patch currently works with datagram oriented protocols, it will also be accepted by non-datagram oriented protocols. I'm not sure if thats agreeable to everyone, but my argument in favor of doing so is that, for those protocols which aren't applicable to this option, sk_drops will always be zero, and reporting no drops on a receive queue that isn't used for those non-participating protocols seems reasonable to me. This also saves us having to code in a per-protocol opt in mechanism. 3) This applies cleanly to net-next assuming that commit 977750076d98c7ff6cbda51858bb5a5894a9d9ab (my af packet cmsg patch) is reverted Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | Merge branch 'master' of ↵David S. Miller2009-10-12
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * | | | | | | | | | udp: dynamically size hash tables at boot timeEric Dumazet2009-10-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | UDP_HTABLE_SIZE was initialy defined to 128, which is a bit small for several setups. 4000 active UDP sockets -> 32 sockets per chain in average. An incoming frame has to lookup all sockets to find best match, so long chains hurt latency. Instead of a fixed size hash table that cant be perfect for every needs, let UDP stack choose its table size at boot time like tcp/ip route, using alloc_large_system_hash() helper Add an optional boot parameter, uhash_entries=x so that an admin can force a size between 256 and 65536 if needed, like thash_entries and rhash_entries. dmesg logs two new lines : [ 0.647039] UDP hash table entries: 512 (order: 0, 4096 bytes) [ 0.647099] UDP Lite hash table entries: 512 (order: 0, 4096 bytes) Maximal size on 64bit arches would be 65536 slots, ie 1 MBytes for non debugging spinlocks. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | ipv4: Define cipso_v4_delopt staticHagen Paul Pfeifer2009-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no reason that cipso_v4_delopt() is not defined as a static function. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | net: Add sk_mark route lookup support for IPv4 listening socketsAtis Elsts2009-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for route lookup using sk_mark on IPv4 listening sockets. Signed-off-by: Atis Elsts <atis@mikrotik.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | net: mark net_proto_ops as constStephen Hemminger2009-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All usages of structure net_proto_ops should be declared const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | add vif using local interface index instead of IPIlia K2009-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When routing daemon wants to enable forwarding of multicast traffic it performs something like: struct vifctl vc = { .vifc_vifi = 1, .vifc_flags = 0, .vifc_threshold = 1, .vifc_rate_limit = 0, .vifc_lcl_addr = ip, /* <--- ip address of physical interface, e.g. eth0 */ .vifc_rmt_addr.s_addr = htonl(INADDR_ANY), }; setsockopt(fd, IPPROTO_IP, MRT_ADD_VIF, &vc, sizeof(vc)); This leads (in the kernel) to calling vif_add() function call which search the (physical) device using assigned IP address: dev = ip_dev_find(net, vifc->vifc_lcl_addr.s_addr); The current API (struct vifctl) does not allow to specify an interface other way than using it's IP, and if there are more than a single interface with specified IP only the first one will be found. The attached patch (against 2.6.30.4) allows to specify an interface by its index, instead of IP address: struct vifctl vc = { .vifc_vifi = 1, .vifc_flags = VIFF_USE_IFINDEX, /* NEW */ .vifc_threshold = 1, .vifc_rate_limit = 0, .vifc_lcl_ifindex = if_nametoindex("eth0"), /* NEW */ .vifc_rmt_addr.s_addr = htonl(INADDR_ANY), }; setsockopt(fd, IPPROTO_IP, MRT_ADD_VIF, &vc, sizeof(vc)); Signed-off-by: Ilia K. <mail4ilia@gmail.com> === modified file 'include/linux/mroute.h' Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | tunnels: Optimize tx pathEric Dumazet2009-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently dirty a cache line to update tunnel device stats (tx_packets/tx_bytes). We better use the txq->tx_bytes/tx_packets counters that already are present in cpu cache, in the cache line shared with txq->_xmit_lock This patch extends IPTUNNEL_XMIT() macro to use txq pointer provided by the caller. Also &tunnel->dev->stats can be replaced by &dev->stats Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | ipv4: fib table algorithm performance improvementStephen Hemminger2009-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The FIB algorithim for IPV4 is set at compile time, but kernel goes through the overhead of function call indirection at runtime. Save some cycles by turning the indirect calls to direct calls to either hash or trie code. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | icmp: No need to call sk_write_space()Eric Dumazet2009-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can make icmp messages tx completion callback a litle bit faster. Setting SOCK_USE_WRITE_QUEUE sk flag tells sock_wfree() to not call sk_write_space() on a socket we know no thread is posssibly waiting for write space. (on per cpu kernel internal icmp sockets only) This avoids the sock_def_write_space() call and read_lock(&sk->sk_callback_lock)/read_unlock(&sk->sk_callback_lock) calls as well. We avoid three atomic ops. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6Linus Torvalds2009-12-08
|\ \ \ \ \ \ \ \ \ \ \ | |_|_|_|_|_|_|_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits) security/tomoyo: Remove now unnecessary handling of security_sysctl. security/tomoyo: Add a special case to handle accesses through the internal proc mount. sysctl: Drop & in front of every proc_handler. sysctl: Remove CTL_NONE and CTL_UNNUMBERED sysctl: kill dead ctl_handler definitions. sysctl: Remove the last of the generic binary sysctl support sysctl net: Remove unused binary sysctl code sysctl security/tomoyo: Don't look at ctl_name sysctl arm: Remove binary sysctl support sysctl x86: Remove dead binary sysctl support sysctl sh: Remove dead binary sysctl support sysctl powerpc: Remove dead binary sysctl support sysctl ia64: Remove dead binary sysctl support sysctl s390: Remove dead sysctl binary support sysctl frv: Remove dead binary sysctl support sysctl mips/lasat: Remove dead binary sysctl support sysctl drivers: Remove dead binary sysctl support sysctl crypto: Remove dead binary sysctl support sysctl security/keys: Remove dead binary sysctl support sysctl kernel: Remove binary sysctl logic ...
| * | | | | | | | | | Merge commit 'v2.6.32-rc7'Eric W. Biederman2009-11-17
| |\ \ \ \ \ \ \ \ \ \ | | | |_|_|_|_|_|/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resolve the conflict between v2.6.32-rc7 where dn_def_dev_handler gets a small bug fix and the sysctl tree where I am removing all sysctl strategy routines.
| * | | | | | | | | | sysctl net: Remove unused binary sysctl codeEric W. Biederman2009-11-12
| | |_|_|_|_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that sys_sysctl is a compatiblity wrapper around /proc/sys all sysctl strategy routines, and all ctl_name and strategy entries in the sysctl tables are unused, and can be revmoed. In addition neigh_sysctl_register has been modified to no longer take a strategy argument and it's callers have been modified not to pass one. Cc: "David Miller" <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: netdev@vger.kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* | | | | | | | | | ipv4: additional update of dev_net(dev) to struct *net in ip_fragment.c, ↵David Ford2009-11-30
| |_|_|_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NULL ptr OOPS ipv4 ip_frag_reasm(), fully replace 'dev_net(dev)' with 'net', defined previously patched into 2.6.29. Between 2.6.28.10 and 2.6.29, net/ipv4/ip_fragment.c was patched, changing from dev_net(dev) to container_of(...). Unfortunately the goto section (out_fail) on oversized packets inside ip_frag_reasm() didn't get touched up as well. Oversized IP packets cause a NULL pointer dereference and immediate hang. I discovered this running openvasd and my previous email on this is titled: NULL pointer dereference at 2.6.32-rc8:net/ipv4/ip_fragment.c:566 Signed-off-by: David Ford <david@blue-labs.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | | ipmr: missing dev_put() on error path in vif_add()Dan Carpenter2009-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The other error paths in front of this one have a dev_put() but this one got missed. Found by smatch static checker. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Wang Chen <ellre923@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | | tcp: provide more information on the tcp receive_queue bugsIlpo Järvinen2009-11-13
| |/ / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The addition of rcv_nxt allows to discern whether the skb was out of place or tp->copied. Also catch fancy combination of flags if necessary (sadly we might miss the actual causer flags as it might have already returned). Btw, we perhaps would want to forward copied_seq in somewhere or otherwise we might have some nice loop with WARN stuff within but where to do that safely I don't know at this stage until more is known (but it is not made significantly worse by this patch). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | ipip: Fix handling of DF packets when pmtudisc is OFFHerbert Xu2009-11-06
| |_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RFC 2003 requires the outer header to have DF set if DF is set on the inner header, even when PMTU discovery is off for the tunnel. Our implementation does exactly that. For this to work properly the IPIP gateway also needs to engate in PMTU when the inner DF bit is set. As otherwise the original host would not be able to carry out its PMTU successfully since part of the path is only visible to the gateway. Unfortunately when the tunnel PMTU discovery setting is off, we do not collect the necessary soft state, resulting in blackholes when the original host tries to perform PMTU discovery. This problem is not reproducible on the IPIP gateway itself as the inner packet usually has skb->local_df set. This is not correctly cleared (an unrelated bug) when the packet passes through the tunnel, which allows fragmentation to occur. For hosts behind the IPIP gateway it is readily visible with a simple ping. This patch fixes the problem by performing PMTU discovery for all packets with the inner DF bit set, regardless of the PMTU discovery setting on the tunnel itself. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | netfilter: nf_nat: fix NAT issue in 2.6.30.4+Jozsef Kadlecsik2009-11-06
|/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vitezslav Samel discovered that since 2.6.30.4+ active FTP can not work over NAT. The "cause" of the problem was a fix of unacknowledged data detection with NAT (commit a3a9f79e361e864f0e9d75ebe2a0cb43d17c4272). However, actually, that fix uncovered a long standing bug in TCP conntrack: when NAT was enabled, we simply updated the max of the right edge of the segments we have seen (td_end), by the offset NAT produced with changing IP/port in the data. However, we did not update the other parameter (td_maxend) which is affected by the NAT offset. Thus that could drift away from the correct value and thus resulted breaking active FTP. The patch below fixes the issue by *not* updating the conntrack parameters from NAT, but instead taking into account the NAT offsets in conntrack in a consistent way. (Updating from NAT would be more harder and expensive because it'd need to re-calculate parameters we already calculated in conntrack.) Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | gre: Fix dev_addr clobbering for gretapHerbert Xu2009-10-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nathan Neulinger noticed that gretap devices get their MAC address from the local IP address, which results in invalid MAC addresses half of the time. This is because gretap is still using the tunnel netdev ops rather than the correct tap netdev ops struct. This patch also fixes changelink to not clobber the MAC address for the gretap case. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Nathan Neulinger <nneul@mst.edu> Signed-off-by: David S. Miller <davem@davemloft.net>