aboutsummaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAge
* xfrm: xfrm_algo: correct usage of RIPEMD-160Adrian-Ken Rueegsegger2008-06-04
| | | | | | | | | | This patch fixes the usage of RIPEMD-160 in xfrm_algo which in turn allows hmac(rmd160) to be used as authentication mechanism in IPsec ESP and AH (see RFC 2857). Signed-off-by: Adrian-Ken Rueegsegger <rueegsegger@swiss-it.ch> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))Ilpo Järvinen2008-06-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible that this skip path causes TCP to end up into an invalid state where ca_state was left to CA_Open while some segments already came into sacked_out. If next valid ACK doesn't contain new SACK information TCP fails to enter into tcp_fastretrans_alert(). Thus at least high_seq is set incorrectly to a too high seqno because some new data segments could be sent in between (and also, limited transmit is not being correctly invoked there). Reordering in both directions can easily cause this situation to occur. I guess we would want to use tcp_moderate_cwnd(tp) there as well as it may be possible to use this to trigger oversized burst to network by sending an old ACK with huge amount of SACK info, but I'm a bit unsure about its effects (mainly to FlightSize), so to be on the safe side I just currently fixed it minimally to keep TCP's state consistent (obviously, such nasty ACKs have been possible this far). Though it seems that FlightSize is already underestimated by some amount, so probably on the long term we might want to trigger recovery there too, if appropriate, to make FlightSize calculation to resemble reality at the time when the losses where discovered (but such change scares me too much now and requires some more thinking anyway how to do that as it likely involves some code shuffling). This bug was found by Brian Vowell while running my TCP debug patch to find cause of another TCP issue (fackets_out miscount). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: nf_conntrack_ipv6: fix inconsistent lock state in ↵Jarek Poplawski2008-06-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | nf_ct_frag6_gather() [ 63.531438] ================================= [ 63.531520] [ INFO: inconsistent lock state ] [ 63.531520] 2.6.26-rc4 #7 [ 63.531520] --------------------------------- [ 63.531520] inconsistent {softirq-on-W} -> {in-softirq-W} usage. [ 63.531520] tcpsic6/3864 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 63.531520] (&q->lock#2){-+..}, at: [<c07175b0>] ipv6_frag_rcv+0xd0/0xbd0 [ 63.531520] {softirq-on-W} state was registered at: [ 63.531520] [<c0143bba>] __lock_acquire+0x3aa/0x1080 [ 63.531520] [<c0144906>] lock_acquire+0x76/0xa0 [ 63.531520] [<c07a8f0b>] _spin_lock+0x2b/0x40 [ 63.531520] [<c0727636>] nf_ct_frag6_gather+0x3f6/0x910 ... According to this and another similar lockdep report inet_fragment locks are taken from nf_ct_frag6_gather() with softirqs enabled, but these locks are mainly used in softirq context, so disabling BHs is necessary. Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: xt_connlimit: fix accouning when receive RST packet in ↵Dong Wei2008-06-04
| | | | | | | | | | | | | | | | ESTABLISHED state In xt_connlimit match module, the counter of an IP is decreased when the TCP packet is go through the chain with ip_conntrack state TW. Well, it's very natural that the server and client close the socket with FIN packet. But when the client/server close the socket with RST packet(using so_linger), the counter for this connection still exsit. The following patch can fix it which is based on linux-2.6.25.4 Signed-off-by: Dong Wei <dwei.zh@gmail.com> Acked-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* route: Remove unused ifa_anycast fieldThomas Graf2008-06-03
| | | | | | | | | The field was supposed to allow the creation of an anycast route by assigning an anycast address to an address prefix. It was never implemented so this field is unused and serves no purpose. Remove it. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* netlink: Improve returned error codesThomas Graf2008-06-03
| | | | | | | | | | | Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and nla_nest_cancel() void functions. Return -EMSGSIZE instead of -1 if the provided message buffer is not big enough. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* route: Mark unused routing attributes as suchThomas Graf2008-06-03
| | | | | | | | Also removes an unused policy entry for an attribute which is only used in kernel->user direction. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* route: Mark unused route cache flags as such.Thomas Graf2008-06-03
| | | | | | | Also removes an obsolete check for the unused flag RTCF_MASQ. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_dma: remove duplicate assignment in dma_skb_copy_datagram_iovecBrice Goglin2008-06-03
| | | | | | | | | | | No need to compute copy twice in the frags loop in dma_skb_copy_datagram_iovec(). Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: neighbour table ABI problemStephen Hemminger2008-06-03
| | | | | | | | | | | | | | | | | | | The neighbor table time of last use information is returned in the incorrect unit. Kernel to user space ABI's need to use USER_HZ (or milliseconds), otherwise the application has to try and discover the real system HZ value which is problematic. Linux has standardized on keeping USER_HZ consistent (100hz) even when kernel is running internally at some other value. This change is small, but it breaks the ABI for older version of iproute2 utilities. But these utilities are already broken since they are looking at the psched_hz values which are completely different. So let's just go ahead and fix both kernel and user space. Older utilities will just print wrong values. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* irda: Sock leak on error path in irda_create.Pavel Emelyanov2008-06-03
| | | | | | | | | | | Bad type/protocol specified result in sk leak. Fix is simple - release the sk if bad values are given, but to make it possible just to call sk_free(), I move some sk initialization a bit lower. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ax25: Fix NULL pointer dereference and lockup.Jarek Poplawski2008-06-03
| | | | | | | | | | | | | | From: Jarek Poplawski <jarkao2@gmail.com> There is only one function in AX25 calling skb_append(), and it really looks suspicious: appends skb after previously enqueued one, but in the meantime this previous skb could be removed from the queue. This patch Fixes it the simple way, so this is not fully compatible with the current method, but testing hasn't shown any problems. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* bluetooth: rfcomm_dev_state_change deadlock fixDave Young2008-06-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's logic in __rfcomm_dlc_close: rfcomm_dlc_lock(d); d->state = BT_CLOSED; d->state_changed(d, err); rfcomm_dlc_unlock(d); In rfcomm_dev_state_change, it's possible that rfcomm_dev_put try to take the dlc lock, then we will deadlock. Here fixed it by unlock dlc before rfcomm_dev_get in rfcomm_dev_state_change. why not unlock just before rfcomm_dev_put? it's because there's another problem. rfcomm_dev_get/rfcomm_dev_del will take rfcomm_dev_lock, but in rfcomm_dev_add the lock order is : rfcomm_dev_lock --> dlc lock so I unlock dlc before the taken of rfcomm_dev_lock. Actually it's a regression caused by commit 1905f6c736cb618e07eca0c96e60e3c024023428 ("bluetooth : __rfcomm_dlc_close lock fix"), the dlc state_change could be two callbacks : rfcomm_sk_state_change and rfcomm_dev_state_change. I missed the rfcomm_sk_state_change that time. Thanks Arjan van de Ven <arjan@linux.intel.com> for the effort in commit 4c8411f8c115def968820a4df6658ccfd55d7f1a ("bluetooth: fix locking bug in the rfcomm socket cleanup handling") but he missed the rfcomm_dev_state_change lock issue. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-05-30
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits) llc: Fix double accounting of received packets netfilter: nf_conntrack_expect: fix error path unwind in nf_conntrack_expect_init() bluetooth: fix locking bug in the rfcomm socket cleanup handling mac80211: fix alignment issue with compare_ether_addr() mac80211: Fix for NULL pointer dereference in sta_info_get() mac80211: fix a typo in ieee80211_handle_filtered_frame comment rndis_wlan: add missing range check for power_output modparam iwlwifi: fix rate scale TLC column selection bug iwlwifi: fix exit from stay_in_table state rndis_wlan: Make connections to TKIP PSK networks work mac80211 : Fixes the status message for iwconfig rt2x00: Use atomic interface iteration in irq context rt2x00: Reset antenna RSSI after switch rt2x00: Don't count retries as failure rt2x00: Fix memleak in tx() path mac80211: reorder channel and freq reporting in wext scan report b43: Fix controller restart crash mac80211: fix ieee80211_rx_bss_put/get imbalance net/mac80211: always true conditionals b43: Upload both beacon templates on initial load ...
| * llc: Fix double accounting of received packetsArnaldo Carvalho de Melo2008-05-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | llc_sap_rcv was being preceded by skb_set_owner_r, then calling llc_state_process that calls sock_queue_rcv_skb, that in turn calls skb_set_owner_r again making the space allowed to be used by the socket to be leaked, making the socket to get stuck. Fix it by setting skb->sk at llc_sap_rcv and leave the accounting to be done only at sock_queue_rcv_skb. Reported-by: Dmitry Petukhov <dmgenp@gmail.com> Tested-by: Dmitry Petukhov <dmgenp@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: nf_conntrack_expect: fix error path unwind in ↵Alexey Dobriyan2008-05-29
| | | | | | | | | | | | | | | | nf_conntrack_expect_init() Signed-off-by: Alexey Dobriyan <adobriyan@parallels.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵David S. Miller2008-05-29
| |\ | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6
| | * mac80211: fix alignment issue with compare_ether_addr()Senthil Balasubramanian2008-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This addresses an alignment issue with compare_ether_addr(). The addresses passed to compare_ether_addr should be two bytes aligned. It may function properly in x86 platform. However may not work properly on IA-64 or ARM processor. This also fixes a typo in mlme.c where the sk_buff struct name is incorect. Though sizeof() works for any incorrect structure pointer name as its just a pointer length that we want, lets just fix it. Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211: Fix for NULL pointer dereference in sta_info_get()Senthil Balasubramanian2008-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This addresses a NULL pointer dereference in sta_info_get(). TID and sta_info are extracted in ADDBA Timer expiry function through the timer handler's argument. The problem is extracging the TID (which was stored in timer_to_tid[] array of type "u8") through "int *" typecast which may also yield unwanted bytes for the MSB of TID that results in incorrect sta_info and ieee80211_local pointers. ieee80211_local pointer is NULL as illustrated below, it crashes in sta_info_get(). The problem started when extracting ieee80211_local pointer out of sta_info iteself and eventually crashed in stat_info_get(). The proper way to fix is to change the data type of TID to u8 instead of u16. However changing all the occurences requires some prototype changes as well. We should fix this in upcoming patches. Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: Luis Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211: fix a typo in ieee80211_handle_filtered_frame commentYi Zhu2008-05-28
| | | | | | | | | | | | | | | | | | | | | fix a typo in ieee80211_handle_filtered_frame comment Signed-off-by: Yi Zhu <yi.zhu@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211 : Fixes the status message for iwconfigAbhijeet Kolekar2008-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iwconfig was showing incorrect status messages when disassociated. Patch fixes this by always checking for association status in ioctl calls for getting ap address. Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211: reorder channel and freq reporting in wext scan reportTomas Winkler2008-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch switch order of channel and freq (SIOCGIWFREQ) reports in scan results in order to overcome wpa_supplicant inability to handle channel numbers in 5.2Ghz band. Wext reporting channel number is ambiguous as channels 7-12 (802.11j) exist on both bands. Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211: fix ieee80211_rx_bss_put/get imbalanceTomas Winkler2008-05-28
| | | | | | | | | | | | | | | | | | | | | | | | This patch fixes iee80211_rx_bss_put/get imbalance introduced by 'mac80211: enable IBSS merging' patch. Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * net/mac80211: always true conditionalsNicolas Kaiser2008-05-28
| | | | | | | | | | | | | | | | | | | | | Correct always true conditionals. Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | bluetooth: fix locking bug in the rfcomm socket cleanup handlingArjan van de Ven2008-05-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in net/bluetooth/rfcomm/sock.c, rfcomm_sk_state_change() does the following operation: if (parent && sock_flag(sk, SOCK_ZAPPED)) { /* We have to drop DLC lock here, otherwise * rfcomm_sock_destruct() will dead lock. */ rfcomm_dlc_unlock(d); rfcomm_sock_kill(sk); rfcomm_dlc_lock(d); } } which is fine, since rfcomm_sock_kill() will call sk_free() which will call rfcomm_sock_destruct() which takes the rfcomm_dlc_lock()... so far so good. HOWEVER, this assumes that the rfcomm_sk_state_change() function always gets called with the rfcomm_dlc_lock() taken. This is the case for all but one case, and in that case where we don't have the lock, we do a double unlock followed by an attempt to take the lock, which due to underflow isn't going anywhere fast. This patch fixes this by moving the stragling case inside the lock, like the other usages of the same call are doing in this code. This was found with the help of the www.kerneloops.org project, where this deadlock was observed 51 times at this point in time: http://www.kerneloops.org/search.php?search=rfcomm_sock_destruct Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | dccp ccid-3: Fix "t_ipi explosion" bugGerrit Renker2008-05-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The identification of this bug is thanks to Cheng Wei and Tomasz Grobelny. To avoid divide-by-zero, the implementation previously ignored RTTs smaller than 4 microseconds when performing integer division RTT/4. When the RTT reached a value less than 4 microseconds (as observed on loopback), this prevented the Window Counter CCVal value from advancing. As a result, the receiver stopped sending feedback. This in turn caused non-ending expiries of the nofeedback timer at the sender, so that the sending rate was progressively reduced until reaching the minimum of one packet per 64 seconds. The patch fixes this bug by handling integer division more intelligently. Due to consistent use of dccp_sample_rtt(), divide-by-zero-RTT is avoided. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | dccp: Fix to handle short sequence numbers packet correctlyWei Yongjun2008-05-27
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RFC4340 said: 8.5. Pseudocode ... If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet has short sequence numbers), drop packet and return But DCCP has some mistake to handle short sequence numbers packet, now it drop packet only if P.type is Data, Ack, or DataAck and P.X == 0. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-05-26
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (52 commits) vlan: Use bitmask of feature flags instead of seperate feature bits fmvj18x_cs: add NextCom NC5310 rev B support xirc2ps_cs: re-initialize the multicast address in do_reset 3C509: rx_bytes should not be increased when alloc_skb failed NETFRONT: Use __skb_queue_purge() VIRTIO: Use __skb_queue_purge() phylib: do EXPORT_SYMBOL on get_phy_id netlink: Fix nla_parse_nested_compat() to call nla_parse() directly WAN: protect HDLC proto list while insmod/rmmod drivers/net/fs_enet: remove null pointer dereference S2io: Version update for napi and MSI-X patches S2io: Added napi support when MSIX is enabled. S2io: Move all the transmit completions to a single msi-x (alarm) vector drivers/net/ehea - remove unnecessary memset after kzalloc au1000_eth: remove useless check Blackfin EMAC Driver: Removed duplicated include <linux/ethtool.h> cpmac bugfixes and enhancements e1000e: use resource_size_t, not unsigned long, for phys addrs net/usb: add support for Apple USB Ethernet Adapter uli526x: add support for netpoll ...
| * vlan: Use bitmask of feature flags instead of seperate feature bitsPatrick McHardy2008-05-23
| | | | | | | | | | | | | | | | | | | | | | Herbert Xu points out that the use of seperate feature bits for features to be propagated to VLAN devices is going to get messy real soon. Replace the VLAN feature bits by a bitmask of feature flags to be propagated and restore the old GSO_SHIFT/MASK values. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-05-22
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: The world is not perfect patch. tcp: Make prior_ssthresh a u32 xfrm_user: Remove zero length key checks. net/ipv4/arp.c: Use common hex_asc helpers cassini: Only use chip checksum for ipv4 packets. tcp: TCP connection times out if ICMP frag needed is delayed netfilter: Move linux/types.h inclusions outside of #ifdef __KERNEL__ af_key: Fix selector family initialization. libertas: Fix ethtool statistics mac80211: fix NULL pointer dereference in ieee80211_compatible_rates mac80211: don't claim iwspy support orinoco_cs: add ID for SpeedStream wireless adapters hostap_cs: add ID for Conceptronic CON11CPro rtl8187: resource leak in error case ath5k: Fix loop variable initializations
| * net: The world is not perfect patch.Rami Rosen2008-05-21
| | | | | | | | | | | | | | | | | | | | | | | | Unless there will be any objection here, I suggest consider the following patch which simply removes the code for the -DI_WISH_WORLD_WERE_PERFECT in the three methods which use it. The compilation errors we get when using -DI_WISH_WORLD_WERE_PERFECT show that this code was not built and not used for really a long time. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm_user: Remove zero length key checks.David S. Miller2008-05-21
| | | | | | | | | | | | | | | | | | | | | | The crypto layer will determine whether that is valid or not. Suggested by Herbert Xu, based upon a report and patch by Martin Willi. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
| * net/ipv4/arp.c: Use common hex_asc helpersDenis Cheng2008-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here the local hexbuf is a duplicate of global const char hex_asc from lib/hexdump.c, except the hex letters' cases: const char hexbuf[] = "0123456789ABCDEF"; const char hex_asc[] = "0123456789abcdef"; and here to print HW addresses, the hex cases are not significant. Thanks to Harvey Harrison to introduce the hex_asc_hi/hex_asc_lo helpers. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: TCP connection times out if ICMP frag needed is delayedSridhar Samudrala2008-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are seeing an issue with TCP in handling an ICMP frag needed message that is received after net.ipv4.tcp_retries1 retransmits. The default value of retries1 is 3. So if the path mtu changes and ICMP frag needed is lost for the first 3 retransmits or if it gets delayed until 3 retransmits are done, TCP doesn't update MSS correctly and continues to retransmit the orginal message until it timesout after tcp_retries2 retransmits. I am seeing this issue even with the latest 2.6.25.4 kernel. In tcp_retransmit_timer(), when retransmits counter exceeds tcp_retries1 value, the dst cache entry of the socket is reset. At this time, if we receive an ICMP frag needed message, the dst entry gets updated with the new MTU, but the TCP sockets dst_cache entry remains NULL. So the next time when we try to retransmit after the ICMP frag needed is received, tcp_retransmit_skb() gets called. Here the cur_mss value is calculated at the start of the routine with a NULL sk_dst_cache. Instead we should call tcp_current_mss after the rebuild_header that caches the dst entry with the updated mtu. Also the rebuild_header should be called before tcp_fragment so that skb is fragmented if the mss goes down. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * af_key: Fix selector family initialization.Kazunori MIYAZAWA2008-05-21
| | | | | | | | | | | | | | | | | | | | | | | | This propagates the xfrm_user fix made in commit bcf0dda8d2408fe1c1040cdec5a98e5fcad2ac72 ("[XFRM]: xfrm_user: fix selector family initialization") Based upon a bug report from, and tested by, Alan Swanson. Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵David S. Miller2008-05-20
| |\ | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6
| | * mac80211: fix NULL pointer dereference in ieee80211_compatible_ratesHelmut Schaa2008-05-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a possible NULL pointer dereference in ieee80211_compatible_rates introduced in the patch "mac80211: fix association with some APs". If no bss is available just use all supported rates in the association request. Signed-off-by: Helmut Schaa <hschaa@suse.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211: don't claim iwspy supportJohannes Berg2008-05-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | We removed iwspy support a very long time ago because it is useless, but forgot to stop claiming to support it. Apparently, nobody cares, but remove it nonetheless. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | | Merge branch 'for-2.6.26' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2008-05-20
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.26' of git://linux-nfs.org/~bfields/linux: (25 commits) svcrdma: Verify read-list fits within RPCSVC_MAXPAGES svcrdma: Change svc_rdma_send_error return type to void svcrdma: Copy transport address and arm CQ before calling rdma_accept svcrdma: Set rqstp transport address in rdma_read_complete function svcrdma: Use ib verbs version of dma_unmap svcrdma: Cleanup queued, but unprocessed I/O in svc_rdma_free svcrdma: Move the QP and cm_id destruction to svc_rdma_free svcrdma: Add reference for each SQ/RQ WR svcrdma: Move destroy to kernel thread svcrdma: Shrink scope of spinlock on RQ CQ svcrdma: Use standard Linux lists for context cache svcrdma: Simplify RDMA_READ deferral buffer management svcrdma: Remove unused READ_DONE context flags bit svcrdma: Return error from rdma_read_xdr so caller knows to free context svcrdma: Fix error handling during listening endpoint creation svcrdma: Free context on post_recv error in send_reply svcrdma: Free context on ib_post_recv error svcrdma: Add put of connection ESTABLISHED reference in rdma_cma_handler svcrdma: Fix return value in svc_rdma_send svcrdma: Fix race with dto_tasklet in svc_rdma_send ...
| * \ \ Merge branch 'from-tomtucker' into for-2.6.26J. Bruce Fields2008-05-20
| |\ \ \
| | * | | svcrdma: Verify read-list fits within RPCSVC_MAXPAGESTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A RDMA read-list cannot contain more elements than RPCSVC_MAXPAGES or it will overflow the DTO context. Verify this when processing the protocol header. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Change svc_rdma_send_error return type to voidTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The svc_rdma_send_error function is called when an RPCRDMA protocol error is detected. This function attempts to post an error reply message. Since an error posting to a transport in error is ignored, change the return type to void. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Copy transport address and arm CQ before calling rdma_acceptTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This race was found by inspection. Messages can be received from the peer immediately following the rdma_accept call, however, the CQ have not yet been armed and the transport address has not yet been set. Set the transport address in the connect request handler and arm the CQ prior to calling rdma_accept. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Set rqstp transport address in rdma_read_complete functionTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rdma_read_complete function needs to copy the rqstp transport address from the transport. Failure to do so can result in using the wrong authentication method for the RPC or bug checking if the rqstp address is not valid. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Use ib verbs version of dma_unmapTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the ib_verbs version of the dma_unmap service in the svc_rdma_put_context function. This should support providers using software rdma. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Cleanup queued, but unprocessed I/O in svc_rdma_freeTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the transport is closing, the DTO tasklet may queue data that never gets processed. Clean up resources associated with this I/O. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Move the QP and cm_id destruction to svc_rdma_freeTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the destruction of the QP and CM_ID to the free path so that the QP cleanup code doesn't race with the dto_tasklet handling flushed WR. The QP reference is not needed because we now have a reference for every WR. Also add a guard in the SQ and RQ completion handlers to ignore calls generated by some providers when the QP is destroyed. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Add reference for each SQ/RQ WRTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a reference on the transport for every outstanding WR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Move destroy to kernel threadTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some providers may wait while destroying adapter resources. Since it is possible that the last reference is put on the dto_tasklet, the actual destroy must be scheduled as a work item. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
| | * | | svcrdma: Shrink scope of spinlock on RQ CQTom Tucker2008-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rq_cq_reap function is only called from the dto_tasklet. The only resource shared with other threads is the sc_rq_dto_q. Move the spin lock to protect only this list. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>