litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	net/*: use linux/kernel.h swap()	Ilpo Järvinen	2009-03-21
\| \| \| \| \| \| \| \| \|	tcp_sack_swap seems unnecessary so I pushed swap to the caller. Also removed comment that seemed then pointless, and added include when not already there. Compile tested. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'master' of ↵	David S. Miller	2009-03-20
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/virtio_net.c
\| *	netns: oops in ip[6]_frag_reasm incrementing stats	Jorge Boncompte [DTI2]	2009-03-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dev can be NULL in ip[6]_frag_reasm for skb's coming from RAW sockets. Quagga's OSPFD sends fragmented packets on a RAW socket, when netfilter conntrack reassembles them on the OUTPUT path you hit this code path. You can test it with something like "hping2 -0 -d 2000 -f AA.BB.CC.DD" With help from Jarek Poplawski. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tcp: Like icmp use register_pernet_subsys	Eric W. Biederman	2009-03-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To remove the possibility of packets flying around when network devices are being cleaned up use reisger_pernet_subsys instead of register_pernet_device. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	netns: Fix icmp shutdown.	Eric W. Biederman	2009-03-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recently I had a kernel panic in icmp_send during a network namespace cleanup. There were packets in the arp queue that failed to be sent and we attempted to generate an ICMP host unreachable message, but failed because icmp_sk_exit had already been called. The network devices are removed from a network namespace and their arp queues are flushed before we do attempt to shutdown subsystems so this error should have been impossible. It turns out icmp_init is using register_pernet_device instead of register_pernet_subsys. Which resulted in icmp being shut down while we still had the possibility of packets in flight, making a nasty NULL pointer deference in interrupt context possible. Changing this to register_pernet_subsys fixes the problem in my testing. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: remove parameter from tcp_recv_urg().	Rami Rosen	2009-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes an unused parameter (addr_len) from tcp_recv_urg() method in net/ipv4/tcp.c. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: make sure xmit goal size never becomes zero	Ilpo Järvinen	2009-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's not too likely to happen, would basically require crafted packets (must hit the max guard in tcp_bound_to_half_wnd()). It seems that nothing that bad would happen as there's tcp_mems and congestion window that prevent runaway at some point from hurting all too much (I'm not that sure what all those zero sized segments we would generate do though in write queue). Preventing it regardless is certainly the best way to go. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: cache result of earlier divides when mss-aligning things	Ilpo Järvinen	2009-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The results is very unlikely change every so often so we hardly need to divide again after doing that once for a connection. Yet, if divide still becomes necessary we detect that and do the right thing and again settle for non-divide state. Takes the u16 space which was previously taken by the plain xmit_size_goal. This should take care part of the tso vs non-tso difference we found earlier. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: simplify tcp_current_mss	Ilpo Järvinen	2009-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's very little need for most of the callsites to get tp->xmit_goal_size updated. That will cost us divide as is, so slice the function in two. Also, the only users of the tp->xmit_goal_size are directly behind tcp_current_mss(), so there's no need to store that variable into tcp_sock at all! The drop of xmit_goal_size currently leaves 16-bit hole and some reorganization would again be necessary to change that (but I'm aiming to fill that hole with u16 xmit_goal_size_segs to cache the results of the remaining divide to get that tso on regression). Bring xmit_goal_size parts into tcp.c Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: don't check mtu probe completion in the loop	Ilpo Järvinen	2009-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It seems that no variables clash such that we couldn't do the check just once later on. Therefore move it. Also kill dead obvious comment, dead argument and add unlikely since this mtu probe does not happen too often. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: consolidate paws check	Ilpo Järvinen	2009-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wow, it was quite tricky to merge that stream of negations but I think I finally got it right: check & replace_ts_recent: (s32)(rcv_tsval - ts_recent) >= 0 => 0 (s32)(ts_recent - rcv_tsval) <= 0 => 0 discard: (s32)(ts_recent - rcv_tsval) > TCP_PAWS_WINDOW => 1 (s32)(ts_recent - rcv_tsval) <= TCP_PAWS_WINDOW => 0 I toggled the return values of tcp_paws_check around since the old encoding added yet-another negation making tracking of truth-values really complicated. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: kill dead end_seq variable in clean_rtx_queue	Ilpo Järvinen	2009-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I've already forgotten what for this was necessary, anyway it's no longer used (if it ever was). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: remove pointless .dsack/.num_sacks code	Ilpo Järvinen	2009-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the pure assignment case, the earlier zeroing is still in effect. David S. Miller raised concerns if the ifs are there to avoid dirtying cachelines. I came to these conclusions: > We'll be dirty it anyway (now that I check), the first "real" statement > in tcp_rcv_established is: > > tp->rx_opt.saw_tstamp = 0; > > ...that'll land on the same dword. :-/ > > I suppose the blocks are there just because they had more complexity > inside when they had to calculate the eff_sacks too (maybe it would > have been better to just remove them in that drop-patch so you would > have had less head-ache :-)). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: '< 0' test on unsigned	Roel Kluin	2009-03-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	promote 'cnt' to size_t, to match 'len'. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: arp announce, arp_proxy and windows ip conflict verification	Denys Fedoryshchenko	2009-03-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Windows (XP at least) hosts on boot, with configured static ip, performing address conflict detection, which is defined in RFC3927. Here is quote of important information: " An ARP announcement is identical to the ARP Probe described above, except that now the sender and target IP addresses are both set to the host's newly selected IPv4 address. " But it same time this goes wrong with RFC5227. " The 'sender IP address' field MUST be set to all zeroes; this is to avoid polluting ARP caches in other hosts on the same link in the case where the address turns out to be already in use by another host. " When ARP proxy configured, it must not answer to both cases, because it is address conflict verification in any case. For Windows it is just causing to detect false "ip conflict". Already there is code for RFC5227, so just trivially we just check also if source ip == target ip. Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying ↵	Neil Horman	2009-03-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	end-of-line points for skbs Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/skbuff.h \| 4 +++- net/core/datagram.c \| 2 +- net/core/skbuff.c \| 22 ++++++++++++++++++++++ net/ipv4/arp.c \| 2 +- net/ipv4/udp.c \| 2 +- net/packet/af_packet.c \| 2 +- 6 files changed, 29 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: allow timestamps even if SYN packet has tsval=0	Eric Dumazet	2009-03-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some systems send SYN packets with apparently wrong RFC1323 timestamp option values [timestamp tsval=0 tsecr=0]. It might be for security reasons (http://www.secuobs.com/plugs/25220.shtml ) Linux TCP stack ignores this option and sends back a SYN+ACK packet without timestamp option, thus many TCP flows cannot use timestamps and lose some benefit of RFC1323. Other operating systems seem to not care about initial tsval value, and let tcp flows to negotiate timestamp option. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	net: convert usage of packet_type to read_mostly	Stephen Hemminger	2009-03-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Protocols that use packet_type can be __read_mostly section for better locality. Elminate any unnecessary initializations of NULL. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: tcp_init_wl / tcp_update_wl argument cleanup	Hantzis Fotis	2009-03-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The above functions from include/net/tcp.h have been defined with an argument that they never use. The argument is 'u32 ack' which is never used inside the function body, and thus it can be removed. The rest of the patch involves the necessary changes to the function callers of the above two functions. Signed-off-by: Hantzis Fotis <xantzis@ceid.upatras.gr> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: get rid of two unnecessary u16s in TCP skb flags copying	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I guess these fields were one day 16-bit in the struct but nowadays they're just using 8 bits anyway. This is just a precaution, didn't result any change in my case but who knows what all those varying gcc versions & options do. I've been told that 16-bit is not so nice with some cpus. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: in sendmsg/pages open code the real goto target	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	copied was assigned zero right before the goto, so if (copied) cannot ever be true. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: kill eff_sacks "cache", the sole user can calculate itself	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also fixes insignificant bug that would cause sending of stale SACK block (would occur in some corner cases). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: add helper for AI algorithm	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It seems that implementation in yeah was inconsistent to what other did as it would increase cwnd one ack earlier than the others do. Size benefits: bictcp_cong_avoid \| -36 tcp_cong_avoid_ai \| +52 bictcp_cong_avoid \| -34 tcp_scalable_cong_avoid \| -36 tcp_veno_cong_avoid \| -12 tcp_yeah_cong_avoid \| -38 = -104 bytes total Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	htcp: merge icsk_ca_state compare	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to what is done elsewhere in TCP code when double state checks are being done. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: drop unnecessary local var in collapse	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: cleanup ca_state mess in tcp_timer	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Redundant checks made indentation impossible to follow. However, it might be useful to make this ca_state+is_sack indexed array. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: separate timeout marking loop to it's own function	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some comment about its current state added. So far I have seen very few cases where the thing is actually useful, usually just marginally (though admittedly I don't usually see top of window losses where it seems possible that there could be some gain), instead, more often the cases suffer from L-marking spike which is certainly not desirable (I'll bury improving it to my todo list, but on a low prio position). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: remove redundant code from tcp_mark_lost_retrans	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Arnd Hannemann <hannemann@nets.rwth-aachen.de> noticed and was puzzled by the fact that !tcp_is_fack(tp) leads to early return near the beginning and the later on tcp_is_fack(tp) was still used in an if condition. The later check was a left-over from RFC3517 SACK stuff (== !tcp_is_fack(tp) behavior nowadays) as there wasn't clear way how to handle this particular check cheaply in the spirit of RFC3517 (using only SACK blocks, not holes + SACK blocks as with FACK). I sort of left it there as a reminder but since it's confusing other people just remove it and comment the missing-feature stuff instead. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Arnd Hannemann <hannemann@nets.rwth-aachen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: fix corner case issue in segmentation during rexmitting	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If cur_mss grew very recently so that the previously G/TSOed skb now fits well into a single segment it would get send up in parts unless we calculate # of segments again. This corner-case could happen eg. after mtu probe completes or less than previously sack blocks are required for the opposite direction. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: Don't clear hints when tcp_fragmenting	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1) We didn't remove any skbs, so no need to handle stale refs. 2) scoreboard_skb_hint is trivial, no timestamps were changed so no need to clear that one 3) lost_skb_hint needs tweaking similar to that of tcp_sacktag_one(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: deferring in middle of queue makes very little sense	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If skb can be sent right away, we certainly should do that if it's in the middle of the queue because it won't get more data into it. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: fix lost_cnt_hint miscounts	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is possible that lost_cnt_hint gets underflow in tcp_clean_rtx_queue because the cumulative ACK can cover the segment where lost_skb_hint points to only partially, which means that the hint is not cleared, opposite to what my (earlier) comment claimed. Also I don't agree what I ended up writing about non-trivial case there to be what I intented to say. It was not supposed to happen that the hint won't get cleared and we underflow in any scenario. In general, this is quite hard to trigger in practice. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: don't backtrack to sacked skbs	Ilpo Järvinen	2009-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backtracking to sacked skbs is a horrible performance killer since the hint cannot be advanced successfully past them... ...And it's totally unnecessary too. In theory this is 2.6.27..28 regression but I doubt anybody can make .28 to have worse performance because of other TCP improvements. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'master' of ↵	David S. Miller	2009-03-02
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-tx.c net/8021q/vlan_core.c net/core/dev.c
\| *	tcp: fix retrans_out leaks	Ilpo Järvinen	2009-03-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's conflicting assumptions in shifting, the caller assumes that dupsack results in S'ed skbs (or a part of it) for sure but never gave a hint to tcp_sacktag_one when dsack is actually in use. Thus DSACK retrans_out -= pcount was not taken and the counter became out of sync. Remove obstacle from that information flow to get DSACKs accounted in tcp_sacktag_one as expected. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	inet fragments: fix sparse warning: context imbalance	Hannes Eder	2009-02-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Impact: Attribute function with __releases(...) Fix this sparse warning: net/ipv4/inet_fragment.c:276:35: warning: context imbalance in 'inet_frag_find' - unexpected unlock Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'master' of ↵	David S. Miller	2009-02-25
\|\\| \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/orinoco/orinoco.c
\| *	tcp_scalable: Update malformed & dead url	Joe Perches	2009-02-24
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipip: used time_before for comparing jiffies	Wei Yongjun	2009-02-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions time_before is more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	gre: used time_before for comparing jiffies	Wei Yongjun	2009-02-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions time_before is more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	netlink: change nlmsg_notify() return value logic	Pablo Neira Ayuso	2009-02-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch changes the return value of nlmsg_notify() as follows: If NETLINK_BROADCAST_ERROR is set by any of the listeners and an error in the delivery happened, return the broadcast error; else if there are no listeners apart from the socket that requested a change with the echo flag, return the result of the unicast notification. Thus, with this patch, the unicast notification is handled in the same way of a broadcast listener that has set the NETLINK_BROADCAST_ERROR socket flag. This patch is useful in case that the caller of nlmsg_notify() wants to know the result of the delivery of a netlink notification (including the broadcast delivery) and take any action in case that the delivery failed. For example, ctnetlink can drop packets if the event delivery failed to provide reliable logging and state-synchronization at the cost of dropping packets. This patch also modifies the rtnetlink code to ignore the return value of rtnl_notify() in all callers. The function rtnl_notify() (before this patch) returned the error of the unicast notification which makes rtnl_set_sk_err() reports errors to all listeners. This is not of any help since the origin of the change (the socket that requested the echoing) notices the ENOBUFS error if the notification fails and should resync itself. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	David S. Miller	2009-02-24
\|\\|
\| *	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	2009-02-23
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netns: fix double free at netns creation veth : add the set_mac_address capability sunlance: Beyond ARRAY_SIZE of ib->btx_ring sungem: another error printed one too early ISDN: fix sc/shmem printk format warning SMSC: timeout reaches -1 smsc9420: handle magic field of ethtool_eeprom sundance: missing parentheses? smsc9420: fix another postfixed timeout wimax/i2400m: driver loads firmware v1.4 instead of v1.3 vlan: Update skb->mac_header in __vlan_put_tag(). cxgb3: Add support for PCI ID 0x35. tcp: remove obsoleted comment about different passes TG3: &&/\|\| confusion ATM: misplaced parentheses? net/mv643xx: don't disable the mib timer too early and lock properly net/mv643xx: use GFP_ATOMIC while atomic atl1c: Atheros L1C Gigabit Ethernet driver net: Kill skb_truesize_check(), it only catches false-positives. net: forcedeth: Fix wake-on-lan regression
\| \| *	tcp: remove obsoleted comment about different passes	Ilpo Järvinen	2009-02-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is obsolete since the passes got combined. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	cipso: Fix documentation comment	Paul Moore	2009-02-22
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The CIPSO protocol engine incorrectly stated that the FIPS-188 specification could be found in the kernel's Documentation directory. This patch corrects that by removing the comment and directing users to the FIPS-188 documented hosted online. For the sake of completeness I've also included a link to the CIPSO draft specification on the NetLabel website. Thanks to Randy Dunlap for spotting the error and letting me know. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
* \|	Doc: Refer to ip-sysctl.txt for strict vs. loose rp_filter mode	Jesper Dangaard Brouer	2009-02-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The IP_ADVANCED_ROUTER Kconfig describes the rp_filter proc option. Recent changes added a loose mode. Instead of documenting this change too places, refer to the document describing it: Documentation/networking/ip-sysctl.txt I'm considering moving the rp_filter description away from the Kconfig file into ip-sysctl.txt. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: Like icmp use register_pernet_subsys	Eric W. Biederman	2009-02-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To remove the possibility of packets flying around when network devices are being cleaned up use reisger_pernet_subsys instead of register_pernet_device. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	netns: Fix icmp shutdown.	Eric W. Biederman	2009-02-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recently I had a kernel panic in icmp_send during a network namespace cleanup. There were packets in the arp queue that failed to be sent and we attempted to generate an ICMP host unreachable message, but failed because icmp_sk_exit had already been called. The network devices are removed from a network namespace and their arp queues are flushed before we do attempt to shutdown subsystems so this error should have been impossible. It turns out icmp_init is using register_pernet_device instead of register_pernet_subsys. Which resulted in icmp being shut down while we still had the possibility of packets in flight, making a nasty NULL pointer deference in interrupt context possible. Changing this to register_pernet_subsys fixes the problem in my testing. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Clean whitespaces in net/ipv4/Kconfig.	Jesper Dangaard Brouer	2009-02-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	While going through net/ipv4/Kconfig cleanup whitespaces. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Fix rp_filter description in net/ipv4/Kconfig.	Jesper Dangaard Brouer	2009-02-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The reverse path filter (rp_filter) will NOT get enabled when enabling forwarding. Read the code and tested in in practice. Most distributions do enable it in startup scripts. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>