aboutsummaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAge
* [BRIDGE]: need to ref count the LLC sapStephen Hemminger2006-05-23
| | | | | | | | | | | | Bridge will OOPS on removal if other application has the SAP open. The bridge SAP might be shared with other usages, so need to do reference counting on module removal rather than explicit close/delete. Since packet might arrive after or during removal, need to clear the receive function handle, so LLC only hands it to user (if any). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: SNMP NAT: fix memleak in snmp_object_decodeChris Wright2006-05-23
| | | | | | | | If kmalloc fails, error path leaks data allocated from asn1_oid_decode(). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: H.323 helper: fix sequence extension parsingPatrick McHardy2006-05-23
| | | | | | | | When parsing unknown sequence extensions the "son"-pointer points behind the last known extension for this type, don't try to interpret it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: H.323 helper: fix parser error propagationPatrick McHardy2006-05-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The condition "> H323_ERROR_STOP" can never be true since H323_ERROR_STOP is positive and is the highest possible return code, while real errors are negative, fix the checks. Also only abort on real errors in some spots that were just interpreting any return value != 0 as error. Fixes crashes caused by use of stale data after a parsing error occured: BUG: unable to handle kernel paging request at virtual address bfffffff printing eip: c01aa0f8 *pde = 1a801067 *pte = 00000000 Oops: 0000 [#1] PREEMPT Modules linked in: ip_nat_h323 ip_conntrack_h323 nfsd exportfs sch_sfq sch_red cls_fw sch_hfsc xt_length ipt_owner xt_MARK iptable_mangle nfs lockd sunrpc pppoe pppoxx CPU: 0 EIP: 0060:[<c01aa0f8>] Not tainted VLI EFLAGS: 00210646 (2.6.17-rc4 #8) EIP is at memmove+0x19/0x22 eax: d77264e9 ebx: d77264e9 ecx: e88d9b17 edx: d77264e9 esi: bfffffff edi: bfffffff ebp: de6a7680 esp: c0349db8 ds: 007b es: 007b ss: 0068 Process asterisk (pid: 3765, threadinfo=c0349000 task=da068540) Stack: <0>00000006 c0349e5e d77264e3 e09a2b4e e09a38a0 d7726052 d7726124 00000491 00000006 00000006 00000006 00000491 de6a7680 d772601e d7726032 c0349f74 e09a2dc2 00000006 c0349e5e 00000006 00000000 d76dda28 00000491 c0349f74 Call Trace: [<e09a2b4e>] mangle_contents+0x62/0xfe [ip_nat] [<e09a2dc2>] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat] [<e0a2712d>] set_addr+0x74/0x14c [ip_nat_h323] [<e0ad531e>] process_setup+0x11b/0x29e [ip_conntrack_h323] [<e0ad534f>] process_setup+0x14c/0x29e [ip_conntrack_h323] [<e0ad57bd>] process_q931+0x3c/0x142 [ip_conntrack_h323] [<e0ad5dff>] q931_help+0xe0/0x144 [ip_conntrack_h323] ... Found by the PROTOS c07-h2250v4 testsuite. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2006-05-23
|\ | | | | | | | | | | | | | | | | | | | | * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NETFILTER]: SNMP NAT: fix memory corruption [IRDA]: fixup type of ->lsap_state [IRDA]: fix 16/32 bit confusion [NET]: Fix "ntohl(ntohs" bugs [BNX2]: Use kmalloc instead of array [BNX2]: Fix bug in bnx2_nvram_write() [TG3]: Add some missing rx error counters
| * [NETFILTER]: SNMP NAT: fix memory corruptionPatrick McHardy2006-05-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix memory corruption caused by snmp_trap_decode: - When snmp_trap_decode fails before the id and address are allocated, the pointers contain random memory, but are freed by the caller (snmp_parse_mangle). - When snmp_trap_decode fails after allocating just the ID, it tries to free both address and ID, but the address pointer still contains random memory. The caller frees both ID and random memory again. - When snmp_trap_decode fails after allocating both, it frees both, and the callers frees both again. The corruption can be triggered remotely when the ip_nat_snmp_basic module is loaded and traffic on port 161 or 162 is NATed. Found by multiple testcases of the trap-app and trap-enc groups of the PROTOS c06-snmpv1 testsuite. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [IRDA]: fix 16/32 bit confusionAlexey Dobriyan2006-05-22
| | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [NET]: Fix "ntohl(ntohs" bugsAlexey Dobriyan2006-05-22
| | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [PATCH] knfsd: Fix two problems that can cause rmmod nfsd to dieNeilBrown2006-05-23
|/ | | | | | | | | | | | | | | | | | Both cause the 'entries' count in the export cache to be non-zero at module removal time, so unregistering that cache fails and results in an oops. 1/ exp_pseudoroot (used for NFSv4 only) leaks a reference to an export entry. 2/ sunrpc_cache_update doesn't increment the entries count when it adds an entry. Thanks to "david m. richter" <richterd@citi.umich.edu> for triggering the problem and finding one of the bugs. Cc: "david m. richter" <richterd@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [SCTP]: Allow linger to abort 1-N style sockets.Vladislav Yasevich2006-05-19
| | | | | | | | | Enable SO_LINGER functionality for 1-N style sockets. The socket API draft will be clarfied to allow for this functionality. The linger settings will apply to all associations on a given socket. Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
* [SCTP]: Validate the parameter length in HB-ACK chunk.Vladislav Yasevich2006-05-19
| | | | | | | | | | If SCTP receives a badly formatted HB-ACK chunk, it is possible that we may access invalid memory and potentially have a buffer overflow. We should really make sure that the chunk format is what we expect, before attempting to touch the data. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
* [SCTP]: A better solution to fix the race between sctp_peeloff() andVladislav Yasevich2006-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sctp_rcv(). The goal is to hold the ref on the association/endpoint throughout the state-machine process. We accomplish like this: /* ref on the assoc/ep is taken during lookup */ if owned_by_user(sk) sctp_add_backlog(skb, sk); else inqueue_push(skb, sk); /* drop the ref on the assoc/ep */ However, in sctp_add_backlog() we take the ref on assoc/ep and hold it while the skb is on the backlog queue. This allows us to get rid of the sock_hold/sock_put in the lookup routines. Now sctp_backlog_rcv() needs to account for potential association move. In the unlikely event that association moved, we need to retest if the new socket is locked by user. If we don't this, we may have two packets racing up the stack toward the same socket and we can't deal with it. If the new socket is still locked, we'll just add the skb to its backlog continuing to hold the ref on the association. This get's rid of the need to move packets from one backlog to another and it also safe in case new packets arrive on the same backlog queue. The last step, is to lock the new socket when we are moving the association to it. This is needed in case any new packets arrive on the association when it moved. We want these to go to the backlog since we would like to avoid the race between this new packet and a packet that may be sitting on the backlog queue of the old socket toward the same association. Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
* [SCTP]: Set sk_err so that poll wakes up after a non-blocking connect failure.Sridhar Samudrala2006-05-19
| | | | | | Also fix some other cases where sk_err is not set for 1-1 style sockets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
* [NETFILTER]: nfnetlink_log: fix byteorder confusionPatrick McHardy2006-05-19
| | | | | | | | | | flags is a u16, so use htons instead of htonl. Also avoid double conversion. Noticed by Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix do_add_counters race, possible oops or info leak ↵Solar Designer2006-05-19
| | | | | | | | | | | | | | | | | | | | (CVE-2006-0039) Solar Designer found a race condition in do_add_counters(). The beginning of paddc is supposed to be the same as tmp which was sanity-checked above, but it might not be the same in reality. In case the integer overflow and/or the race condition are triggered, paddc->num_counters might not match the allocation size for paddc. If the check below (t->private->number != paddc->num_counters) nevertheless passes (perhaps this requires the race condition to be triggered), IPT_ENTRY_ITERATE() would read kernel memory beyond the allocation size, potentially causing an oops or leaking sensitive data (e.g., passwords from host system or from another VPS) via counter increments. This requires CAP_NET_ADMIN. Signed-off-by: Solar Designer <solar@openwall.com> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: GRE conntrack: fix htons/htonl confusionAlexey Dobriyan2006-05-19
| | | | | | | | GRE keys are 16 bit. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: fix format specifier for netfilter log targetsPhilip Craig2006-05-19
| | | | | | | | | The prefix argument for nf_log_packet is a format specifier, so don't pass the user defined string directly to it. Signed-off-by: Philip Craig <philipc@snapgear.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix memory leak in ipt_recentJesper Juhl2006-05-19
| | | | | | | | | | | | | | | | The Coverity checker spotted that we may leak 'hold' in net/ipv4/netfilter/ipt_recent.c::checkentry() when the following is true: if (!curr_table->status_proc) { ... if(!curr_table) { ... return 0; <-- here we leak. Simply moving an existing vfree(hold); up a bit avoids the possible leak. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP]: reno sacked_out count fixAngelo P. Castellani2006-05-17
| | | | | | | | | | | | | | | | | | | | | From: "Angelo P. Castellani" <angelo.castellani+lkml@gmail.com> Using NewReno, if a sk_buff is timed out and is accounted as lost_out, it should also be removed from the sacked_out. This is necessary because recovery using NewReno fast retransmit could take up to a lot RTTs and the sk_buff RTO can expire without actually being really lost. left_out = sacked_out + lost_out in_flight = packets_out - left_out + retrans_out Using NewReno without this patch, on very large network losses, left_out becames bigger than packets_out + retrans_out (!!). For this reason unsigned integer in_flight overflows to 2^32 - something. Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6]: Endian fix in net/ipv6/netfilter/ip6t_eui64.c:match().Alexey Dobriyan2006-05-16
| | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TR]: Remove an unused export.Adrian Bunk2006-05-16
| | | | | | | | | This patch removes the unused EXPORT_SYMBOL(tr_source_route). (Note, the usage in net/llc/llc_output.c can't be modular.) Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPX]: Correct return type of ipx_map_frame_type().Alexey Dobriyan2006-05-16
| | | | | | | Casting BE16 to int and back may or may not work. Correct, to be sure. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPX]: Correct argument type of ipxrtr_delete().Alexey Dobriyan2006-05-16
| | | | | | | | A single caller passes __u32. Inside function "net" is compared with __u32 (__be32 really, just wasn't annotated). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PKT_SCHED]: Potential jiffy wrap bug in dev_watchdog().Stephen Hemminger2006-05-16
| | | | | | | | There is a potential jiffy wraparound bug in the transmit watchdog that is easily avoided by using time_after(). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NEIGH]: Fix IP-over-ATM and ARP interaction.Simon Kelley2006-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The classical IP over ATM code maintains its own IPv4 <-> <ATM stuff> ARP table, using the standard neighbour-table code. The neigh_table_init function adds this neighbour table to a linked list of all neighbor tables which is used by the functions neigh_delete() neigh_add() and neightbl_set(), all called by the netlink code. Once the ATM neighbour table is added to the list, there are two tables with family == AF_INET there, and ARP entries sent via netlink go into the first table with matching family. This is indeterminate and often wrong. To see the bug, on a kernel with CLIP enabled, create a standard IPv4 ARP entry by pinging an unused address on a local subnet. Then attempt to complete that entry by doing ip neigh replace <ip address> lladdr <some mac address> nud reachable Looking at the ARP tables by using ip neigh show will reveal two ARP entries for the same address. One of these can be found in /proc/net/arp, and the other in /proc/net/atm/arp. This patch adds a new function, neigh_table_init_no_netlink() which does everything the neigh_table_init() does, except add the table to the netlink all-arp-tables chain. In addition neigh_table_init() has a check that all tables on the chain have a distinct address family. The init call in clip.c is changed to call neigh_table_init_no_netlink(). Since ATM ARP tables are rather more complicated than can currently be handled by the available rtattrs in the netlink protocol, no functionality is lost by this patch, and non-ATM ARP manipulation via netlink is rescued. A more complete solution would involve a rtattr for ATM ARP entries and some way for the netlink code to give neigh_add and friends more information than just address family with which to find the correct ARP table. [ I've changed the assertion checking in neigh_table_init() to not use BUG_ON() while holding neigh_tbl_lock. Instead we remember that we found an existing tbl with the same family, and after dropping the lock we'll give a diagnostic kernel log message and a stack dump. -DaveM ] Signed-off-by: Simon Kelley <simon@thekelleys.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET_SCHED]: HFSC: fix thinko in hfsc_adjust_levels()Patrick McHardy2006-05-11
| | | | | | | | | When deleting the last child the level of a class should drop to zero. Noticed by Andreas Mueller <andreas@stapelspeicher.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6]: skb leakage in inet6_csk_xmitAlexey Kuznetsov2006-05-10
| | | | | | | inet6_csk_xit does not free skb when routing fails. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
* [BRIDGE]: Do sysfs registration inside rtnl.Stephen Hemminger2006-05-10
| | | | | | | | | Now that netdevice sysfs registration is done as part of register_netdevice; bridge code no longer has to be tricky when adding it's kobjects to bridges. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Do sysfs registration as part of register_netdevice.Stephen Hemminger2006-05-10
| | | | | | | | | | | | | | The last step of netdevice registration was being done by a delayed call, but because it was delayed, it was impossible to return any error code if the class_device registration failed. Side effects: * one state in registration process is unnecessary. * register_netdevice can sleep inside class_device registration/hotplug * code in netdev_run_todo only does unregistration so it is simpler. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] linkwatch: Handle jiffies wrap-aroundHerbert Xu2006-05-09
| | | | | | | | | | | | | | The test used in the linkwatch does not handle wrap-arounds correctly. Since the intention of the code is to eliminate bursts of messages we can afford to delay things up to a second. Using that fact we can easily handle wrap-arounds by making sure that we don't delay things by more than one second. This is based on diagnosis and a patch by Stefan Rompf. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stefan Rompf <stefan@loplof.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IRDA]: Removing unused EXPORT_SYMBOLsAdrian Bunk2006-05-09
| | | | | | | | | | | This patch removes the following unused EXPORT_SYMBOL's: - irias_find_attrib - irias_new_string_value - irias_new_octseq_value Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Make netdev_chain a raw notifier.Alan Stern2006-05-09
| | | | | | | | | | | | From: Alan Stern <stern@rowland.harvard.edu> This chain does it's own locking via the RTNL semaphore, and can also run recursively so adding a new mutex here was causing deadlocks. Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: ip_options_fragment() has no effect on fragmentationWei Yongjun2006-05-09
| | | | | | | | Fix error point to options in ip_options_fragment(). optptr get a error pointer to the ipv4 header, correct is pointer to ipv4 options. Signed-off-by: Wei Yongjun <weiyj@soft.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'upstream-fixes' of ↵Stephen Hemminger2006-05-08
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
| * [PATCH] softmac: make non-operational after being stoppedDaniel Drake2006-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | zd1211 with softmac and wpa_supplicant revealed an issue with softmac and the use of workqueues. Some of the work functions actually reschedule themselves, so this meant that there could still be pending work after flush_scheduled_work() had been called during ieee80211softmac_stop(). This patch introduces a "running" flag which is used to ensure that rescheduling does not happen in this situation. I also used this flag to ensure that softmac's hooks into ieee80211 are non-operational once the stop operation has been started. This simply makes softmac a little more robust, because I could crash it easily by receiving frames in the short timeframe after shutting down softmac and before turning off the ZD1211 radio. (ZD1211 is now fixed as well!) Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * [PATCH] softmac: don't reassociate if user asked for deauthenticationDaniel Drake2006-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | When wpa_supplicant exits, it uses SIOCSIWMLME to request deauthentication. softmac then tries to reassociate without any user intervention, which isn't the desired behaviour of this signal. This change makes softmac only attempt reassociation if the remote network itself deauthenticated us. Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | [IPV4]: Remove likely in ip_rcv_finish()Hua Zhong2006-05-06
| | | | | | | | | | | | | | | | | | | | | | | | This is another result from my likely profiling tool (dwalker@mvista.com just sent the patch of the profiling tool to linux-kernel mailing list, which is similar to what I use). On my system (not very busy, normal development machine within a VMWare workstation), I see a 6/5 miss/hit ratio for this "likely". Signed-off-by: Hua Zhong <hzhong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [NET]: Create netdev attribute_groups with class_device_addStephen Hemminger2006-05-06
| | | | | | | | | | | | | | | | | | | | Atomically create attributes when class device is added. This avoids the race between registering class_device (which generates hotplug event), and the creation of attribute groups. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [TCP]: Fix snd_cwnd adjustments in tcp_highspeed.cJohn Heffner2006-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xiaoliang (David) Wei wrote: > Hi gurus, > > I am reading the code of tcp_highspeed.c in the kernel and have a > question on the hstcp_cong_avoid function, specifically the following > AI part (line 136~143 in net/ipv4/tcp_highspeed.c ): > > /* Do additive increase */ > if (tp->snd_cwnd < tp->snd_cwnd_clamp) { > tp->snd_cwnd_cnt += ca->ai; > if (tp->snd_cwnd_cnt >= tp->snd_cwnd) { > tp->snd_cwnd++; > tp->snd_cwnd_cnt -= tp->snd_cwnd; > } > } > > In this part, when (tp->snd_cwnd_cnt == tp->snd_cwnd), > snd_cwnd_cnt will be -1... snd_cwnd_cnt is defined as u16, will this > small chance of getting -1 becomes a problem? > Shall we change it by reversing the order of the cwnd++ and cwnd_cnt -= > cwnd? Absolutely correct. Thanks. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [NETROM/ROSE]: Kill module init version kernel log messages.Ralf Baechle2006-05-05
| | | | | | | | | | | | | | | | | | There are out of date and don't tell the user anything useful. The similar messages which IPV4 and the core networking used to output were killed a long time ago. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [DCCP]: Fix sock_orphan dead lockHerbert Xu2006-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead locks. For example, the inet_diag code holds sk_callback_lock without disabling BH. If an inbound packet arrives during that admittedly tiny window, it will cause a dead lock on bh_lock_sock. Another possible path would be through sock_wfree if the network device driver frees the tx skb in process context with BH enabled. We can fix this by moving sock_orphan out of bh_lock_sock. The tricky bit is to work out when we need to destroy the socket ourselves and when it has already been destroyed by someone else. By moving sock_orphan before the release_sock we can solve this problem. This is because as long as we own the socket lock its state cannot change. So we simply record the socket state before the release_sock and then check the state again after we regain the socket lock. If the socket state has transitioned to DCCP_CLOSED in the time being, we know that the socket has been destroyed. Otherwise the socket is still ours to keep. This problem was discoverd by Ingo Molnar using his lock validator. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [BRIDGE]: keep track of received multicast packetsStephen Hemminger2006-05-05
| | | | | | | | | | | | | | | | It makes sense to add this simple statistic to keep track of received multicast packets. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [SCTP]: Fix state table entries for chunks received in CLOSED state.Sridhar Samudrala2006-05-05
| | | | | | | | | | | | | | Discard an unexpected chunk in CLOSED state rather can calling BUG(). Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [SCTP]: Fix panic's when receiving fragmented SCTP control chunks.Sridhar Samudrala2006-05-05
| | | | | | | | | | | | | | | | Use pskb_pull() to handle incoming COOKIE_ECHO and HEARTBEAT chunks that are received as skb's with fragment list. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [SCTP]: Prevent possible infinite recursion with multiple bundled DATA.Vladislav Yasevich2006-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a rare situation that causes lksctp to go into infinite recursion and crash the system. The trigger is a packet that contains at least the first two DATA fragments of a message bundled together. The recursion is triggered when the user data buffer is smaller that the full data message. The problem is that we clone the skb for every fragment in the message. When reassembling the full message, we try to link skbs from the "first fragment" clone using the frag_list. However, since the frag_list is shared between two clones in this rare situation, we end up setting the frag_list pointer of the second fragment to point to itself. This causes sctp_skb_pull() to potentially recurse indefinitely. Proposed solution is to make a copy of the skb when attempting to link things using frag_list. Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [SCTP]: Allow spillover of receive buffer to avoid deadlock.Neil Horman2006-05-05
|/ | | | | | | | | | | | | | | | | This patch fixes a deadlock situation in the receive path by allowing temporary spillover of the receive buffer. - If the chunk we receive has a tsn that immediately follows the ctsn, accept it even if we run out of receive buffer space and renege data with higher TSNs. - Once we accept one chunk in a packet, accept all the remaining chunks even if we run out of receive buffer space. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Mark Butler <butlerm@middle.net> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DECNET]: Fix level1 router helloPatrick Caulfield2006-05-04
| | | | | | | | | | | | | This patch fixes hello messages sent when a node is a level 1 router. Slightly contrary to the spec (maybe) VMS ignores hello messages that do not name level2 routers that it also knows about. So, here we simply name all the routers that the node knows about rather just other level1 routers. (I hope the patch is clearer than the description. sorry). Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP]: Fix sock_orphan dead lockHerbert Xu2006-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calling sock_orphan inside bh_lock_sock in tcp_close can lead to dead locks. For example, the inet_diag code holds sk_callback_lock without disabling BH. If an inbound packet arrives during that admittedly tiny window, it will cause a dead lock on bh_lock_sock. Another possible path would be through sock_wfree if the network device driver frees the tx skb in process context with BH enabled. We can fix this by moving sock_orphan out of bh_lock_sock. The tricky bit is to work out when we need to destroy the socket ourselves and when it has already been destroyed by someone else. By moving sock_orphan before the release_sock we can solve this problem. This is because as long as we own the socket lock its state cannot change. So we simply record the socket state before the release_sock and then check the state again after we regain the socket lock. If the socket state has transitioned to TCP_CLOSE in the time being, we know that the socket has been destroyed. Otherwise the socket is still ours to keep. Note that I've also moved the increment on the orphan count forward. This may look like a problem as we're increasing it even if the socket is just about to be destroyed where it'll be decreased again. However, this simply enlarges a window that already exists. This also changes the orphan count test by one. Considering what the orphan count is meant to do this is no big deal. This problem was discoverd by Ingo Molnar using his lock validator. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ROSE]: Eleminate HZ from ROSE kernel interfacesRalf Baechle2006-05-04
| | | | | | | Convert all ROSE sysctl time values from jiffies to ms as units. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETROM]: Eleminate HZ from NET/ROM kernel interfacesRalf Baechle2006-05-04
| | | | | | | Convert all NET/ROM sysctl time values from jiffies to ms as units. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>