aboutsummaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAge
* [IPv6] IPsec: fix pmtu calculation of espKazunori MIYAZAWA2005-12-09
| | | | | | | | | | | It is a simple bug which uses the wrong member. This bug does not seriously affect ordinary use of IPsec. But it is important to pass IPv6 ready logo phase-2 conformance test of IPsec SGW. Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Fix NULL pointer deref in checksum debugging.Stephen Hemminger2005-12-08
| | | | | | | | | | | | | | The problem I was seeing turned out to be that skb->dev is NULL when the checksum is being completed in user context. This happens because the reference to the device is dropped (to allow it to be released when packets are in the queue). Because skb->dev was NULL, the netdev_rx_csum_fault was panicing on deref of dev->name. How about this? Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [AF_PACKET]: Convert PACKET_MMAP over to vm_insert_page().David S. Miller2005-12-06
| | | | | | | | | So we can properly use __GFP_COMP and avoid the use of PG_reserved pages. With extremely helpful review from Hugh Dickins. Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] Vegas: timestamp before cloneDavid S. Miller2005-12-06
| | | | | | | | | | | | | We have to store the congestion control timestamp on the SKB before we clone it, not after. Else we get no timestamping information at all. tcp_transmit_skb() has been reworked so that we can do the timestamp still in one spot, instead of at all the call sites. Problem discovered, and initial fix, from Tom Young <tyo@ee.unimelb.edu.au>. Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] Vegas: Remove extra call to tcp_vegas_rtt_calcThomas Young2005-12-06
| | | | | | | | | Remove unneeded call to tcp_vegas_rtt_calc. The more accurate microsecond value has already been registered prior to calling tcp_vegas_cong_avoid. Signed-off-by: Thomas Young <tyo@ee.mu.oz.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] Vegas: stop resetting rtt every ackThomas Young2005-12-06
| | | | | | | | Move the resetting of rtt measurements to inside the once per RTT block of code. Signed-off-by: Thomas Young <tyo@ee.mu.oz.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DECNET]: add memory buffer settings Steven Whitehouse2005-12-05
| | | | | | | | The patch (originally from Steve) simply adds memory buffer settings to DECnet similar to those in TCP. Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: make function pointer argument parseable by kernel-docMartin Waitz2005-12-05
| | | | | | | | | | When a function takes a function pointer as argument it should use the 'return (*pointer)(params...)' syntax used everywhere else in the kernel as this is recognized by kernel-doc. Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Don't use conntrack entry after dropping the referencePatrick McHardy2005-12-05
| | | | | Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix unbalanced read_unlock_bh in ctnetlinkPatrick McHardy2005-12-05
| | | | | | | | | NFA_NEST calls NFA_PUT which jumps to nfattr_failure if the skb has no room left. We call read_unlock_bh at nfattr_failure for the NFA_PUT inside the locked section, so move NFA_NEST inside the locked section too. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Wait for untracked references in nf_conntrack module unloadPatrick McHardy2005-12-05
| | | | | | | Noticed by Pablo Neira <pablo@eurodev.net>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Mark ctnetlink as EXPERIMENTALPatrick McHardy2005-12-05
| | | | | | | | Should have been marked EXPERIMENTAL from the beginning, as the current bunch of fixes show. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix CTA_PROTO_NUM attribute size in ctnetlinkPatrick McHardy2005-12-05
| | | | | | | CTA_PROTO_NUM is a u_int8_t. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix ip_conntrack_flush abuse in ctnetlinkPatrick McHardy2005-12-05
| | | | | | | | | | | ip_conntrack_flush() used to be part of ip_conntrack_cleanup(), which needs to drop _all_ references on module unload. Table flushed using ctnetlink just needs to clean the table and doesn't need to flush the event cache or wait for any references attached to skbs. Move everything but pure table flushing back to ip_conntrack_cleanup(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: nfnetlink: Fix calculation of minimum message lengthYasuyuki Kozakai2005-12-05
| | | | | | | | At least, valid nfnetlink message should have nlmsghdr and nfgenmsg. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: nf_conntrack: Fix missing check for ICMPv6 typeYasuyuki Kozakai2005-12-05
| | | | | | | | | This makes nf_conntrack_icmpv6 check that ICMPv6 type isn't < 128 to avoid accessing out of array valid_new[] and invmap[]. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix incorrect argument to ip_nat_initialized() in ctnetlinkPablo Neira Ayuso2005-12-05
| | | | | | | | | | | ip_nat_initialized() takes enum ip_nat_manip_type as it's second argument, not a hook number. Noticed and initial patch by Marcus Sundberg <marcus@ingate.com>. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* SUNRPC: Fix Oopsable condition in rpc_pipefsTrond Myklebust2005-12-03
| | | | | | | | | | The elements on rpci->in_upcall are tracked by the filp->private_data, which will ensure that they get released when the file is closed. The exception is if rpc_close_pipes() gets called first, since that sets rpci->ops to NULL. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* [IPV6]: Load protocol module dynamically.YOSHIFUJI Hideaki2005-12-02
| | | | | | | | [ Modified to match inet_create() bug fix by Herbert Xu -DaveM ] Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4] Fix EPROTONOSUPPORT error in inet_createHerbert Xu2005-12-02
| | | | | | | | | | | | There is a coding error in inet_create that causes it to always return ESOCKTNOSUPPORT. It should return EPROTONOSUPPORT when there are protocols registered for a given socket type but none of them match the requested protocol. This is based on a patch by Jayachandran C. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IGMP]: workaround for IGMP v1/v2 bugDavid Stevens2005-12-02
| | | | | | | | | | | | | | | | | From: David Stevens <dlstevens@us.ibm.com> As explained at: http://www.cs.ucsb.edu/~krishna/igmp_dos/ With IGMP version 1 and 2 it is possible to inject a unicast report to a client which will make it ignore multicast reports sent later by the router. The fix is to only accept the report if is was sent to a multicast or unicast address. Signed-off-by: David S. Miller <davem@davemloft.net>
* [SCTP]: Fix getsockname for sctp when an ipv6 socket accepts a connection fromNeil Horman2005-12-02
| | | | | | | | an ipv4 socket. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [SCTP]: Return socket errors only if the receive queue is empty.Neil Horman2005-12-02
| | | | | | | | | | This patch fixes an issue where it is possible to get valid data after a ENOTCONN error. It returns socket errors only after data queued on socket receive queue is consumed. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETLINK]: Fix processing of fib_lookup netlink messagesThomas Graf2005-12-01
| | | | | | | | | The receive path for fib_lookup netlink messages is lacking sanity checks for header and payload and is thus vulnerable to malformed netlink messages causing illegal memory references. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix recent match jiffies wrap mismatchesPhil Oester2005-12-01
| | | | | | | | | | | | | | | | | | | | | | Around jiffies wrap time (i.e. within first 5 mins after boot), recent match rules which contain both --seconds and --hitcount arguments experience false matches. This is because the last_pkts array is filled with zeros on creation, and when comparing 'now' to 0 (+ --seconds argument), time_before_eq thinks it has found a hit. Below patch adds a break if the packet value is zero. This has the unfortunate side effect of causing mismatches if a packet was received when jiffies really was equal to zero. The odds of that happening are slim compared to the problems caused by not adding the break however. Plus, the author used this same method just below, so it is "good enough". This fixes netfilter bugs #383 and #395. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Ignore ACKs ACKs on half open connections in TCP conntrackJozsef Kadlecsik2005-12-01
| | | | | | | | | | | | | | | | | | | | | | Mounting NFS file systems after a (warm) reboot could take a long time if firewalling and connection tracking was enabled. The reason is that the NFS clients tends to use the same ports (800 and counting down). Now on reboot, the server would still have a TCB for an existing TCP connection client:800 -> server:2049. The client sends a SYN from port 800 to server:2049, which elicits an ACK from the server. The firewall on the client drops the ACK because (from its point of view) the connection is still in half-open state, and it expects to see a SYNACK. The client will eventually time out after several minutes. The following patch corrects this, by accepting ACKs on half open connections as well. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6]: make two functions staticAdrian Bunk2005-11-29
| | | | | | | This patch makes two needlessly global functions static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER] ipv4: small cleanupsAdrian Bunk2005-11-29
| | | | | | | | | This patch contains the following cleanups: - make needlessly global code static - ip_conntrack_core.c: ip_conntrack_flush() -> ip_conntrack_flush(void) Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: make two functions staticAdrian Bunk2005-11-29
| | | | | | | This patch makes two needlessly global functions static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Add const markers to various variables.Arjan van de Ven2005-11-29
| | | | | | | | | | | the patch below marks various variables const in net/; the goal is to move them to the .rodata section so that they can't false-share cachelines with things that get written to, as well as potentially helping gcc a bit with optimisations. (these were found using a gcc patch to warn about such variables) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ATM]: deregistration removes device from atm_devs list immediatelyStanislaw Gruszka2005-11-29
| | | | | | | | | | | | atm_dev_deregister() removes device from atm_dev list immediately to prevent operations on a phantom device. Decision to free device based only on ->refcnt now. Remove shutdown_atm_dev() use atm_dev_deregister() instead. atm_dev_deregister() also asynchronously releases all vccs related to device. Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ATM]: avoid race conditions related to atm_devs listStanislaw Gruszka2005-11-29
| | | | | | | | | | Use semaphore to protect atm_devs list, as no one need access to it from interrupt context. Avoid race conditions between atm_dev_register(), atm_dev_lookup() and atm_dev_deregister(). Fix double spin_unlock() bug. Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ATM]: attempt to autoload atm driversMitchell Blank Jr2005-11-29
| | | | | | From: Mitchell Blank Jr <mitch@sfgoth.com> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ATM]: atm_pcr_goal() doesn't modify its argument's contents -- mark it as constMitchell Blank Jr2005-11-29
| | | | | | Signed-off-by: Mitchell Blank Jr <mitch@sfgoth.com> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ATM]: always return the first interface for ATM_ITF_ANYMitchell Blank Jr2005-11-29
| | | | | | From: Mitchell Blank Jr <mitch@sfgoth.com> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4] tcp/route: Another look at hash table sizesMike Stroyan2005-11-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tcp_ehash hash table gets too big on systems with really big memory. It is worse on systems with pages larger than 4KB. It wastes memory that could be better used. It also makes the netstat command slow because reading /proc/net/tcp and /proc/net/tcp6 needs to go through the full hash table. The default value should not be larger for larger page sizes. It seems that the effect of page size is an unintended error dating back a long time. I also wonder if the default value really should be a larger fraction of memory for systems with more memory. While systems with really big ram can afford more space for hash tables, it is not clear to me that they benefit from increasing the allocation ratio for this table. The amount of memory allocated is determined by net/ipv4/tcp.c:tcp_init and mm/page_alloc.c:alloc_large_system_hash. tcp_init calls alloc_large_system_hash passing parameters- bucketsize=sizeof(struct tcp_ehash_bucket) numentries=thash_entries scale=(num_physpages >= 128 * 1024) ? (25-PAGE_SHIFT) : (27-PAGE_SHIFT) limit=0 On i386, PAGE_SHIFT is 12 for a page size of 4K On ia64, PAGE_SHIFT defaults to 14 for a page size of 16K The num_physpages test above makes the allocation take a larger fraction of the total memory on systems with larger memory. The threshold size for a i386 system is 512MB. For an ia64 system with 16KB pages the threshold is 2GB. For smaller memory systems- On i386, scale = (27 - 12) = 15 On ia64, scale = (27 - 14) = 13 For larger memory systems- On i386, scale = (25 - 12) = 13 On ia64, scale = (25 - 14) = 11 For the rest of this discussion, I'll just track the larger memory case. The default behavior has numentries=thash_entries=0, so the allocated size is determined by either scale or by the default limit of 1/16 of total memory. In alloc_large_system_hash- | numentries = (flags & HASH_HIGHMEM) ? nr_all_pages : nr_kernel_pages; | numentries += (1UL << (20 - PAGE_SHIFT)) - 1; | numentries >>= 20 - PAGE_SHIFT; | numentries <<= 20 - PAGE_SHIFT; At this point, numentries is pages for all of memory, rounded up to the nearest megabyte boundary. | /* limit to 1 bucket per 2^scale bytes of low memory */ | if (scale > PAGE_SHIFT) | numentries >>= (scale - PAGE_SHIFT); | else | numentries <<= (PAGE_SHIFT - scale); On i386, numentries >>= (13 - 12), so numentries is 1/8196 of bytes of total memory. On ia64, numentries <<= (14 - 11), so numentries is 1/2048 of bytes of total memory. | log2qty = long_log2(numentries); | | do { | size = bucketsize << log2qty; bucketsize is 16, so size is 16 times numentries, rounded down to a power of two. On i386, size is 1/512 of bytes of total memory. On ia64, size is 1/128 of bytes of total memory. For smaller systems the results are On i386, size is 1/2048 of bytes of total memory. On ia64, size is 1/512 of bytes of total memory. The large page effect can be removed by just replacing the use of PAGE_SHIFT with a constant of 12 in the calls to alloc_large_system_hash. That makes them more like the other uses of that function from fs/inode.c and fs/dcache.c Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6]: Implement appropriate dummy rule 4 in ipv6_dev_get_saddr().YOSHIFUJI Hideaki2005-11-29
| | | | | | | | Ensure to update hiscore.rule in dummy rule 4 in ipv6_dev_get_saddr(). Pointed out by Yan Zheng <yanzheng@21cn.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* SUNRPC: Funny looking code in __rpc_purge_upcallTrond Myklebust2005-11-25
| | | | | | | | In __rpc_purge_upcall (net/sunrpc/rpc_pipe.c), the newer code to clean up the in_upcall list has a typo. Thanks to Vince Busam <vbusam@google.com> for spotting this! Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* [BRIDGE]: recompute features when adding a new deviceOlaf Rempel2005-11-23
| | | | | | | | | | We must recompute bridge features everytime the list of underlying devices changes, or we might end up with features that are not supported by all devices (eg. NETIF_F_TSO) This patch adds the missing recompute when adding a device to the bridge. Signed-off-by: Olaf Rempel <razzor@kopf-tisch.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: ip_conntrack_netlink.c needs linux/interrupt.hBenoit Boissinot2005-11-23
| | | | | | | | | | net/ipv4/netfilter/ip_conntrack_netlink.c: In function 'ctnetlink_dump_table': net/ipv4/netfilter/ip_conntrack_netlink.c:409: warning: implicit declaration of function 'local_bh_disable' net/ipv4/netfilter/ip_conntrack_netlink.c:427: warning: implicit declaration of function 'local_bh_enable' Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER] ctnetlink: Fix refcount leak ip_conntrack/nat_protoPablo Neira Ayuso2005-11-22
| | | | | | | | | | | Remove proto == NULL checking since ip_conntrack_[nat_]proto_find_get always returns a valid pointer. Fix missing ip_conntrack_proto_put in some paths. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Fix secondary IP addresses after promotionJamal Hadi Salim2005-11-22
| | | | | | | | | | | | This patch fixes the problem with promoting aliases when: a) a single primary and > 1 secondary addresses b) multiple primary addresses each with at least one secondary address Based on earlier efforts from Brian Pomerantz <bapper@piratehaven.org>, Patrick McHardy <kaber@trash.net> and Thomas Graf <tgraf@suug.ch> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETLINK]: Use tgid instead of pid for nlmsg_pidHerbert Xu2005-11-22
| | | | | Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP]: Add missing no_policy flag to struct net_protocolPatrick McHardy2005-11-21
| | | | | Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: fixed dependencies between modules related with ip_conntrackYasuyuki Kozakai2005-11-21
| | | | | | | | | | | | - IP_NF_CONNTRACK_MARK is bool and depends on only IP_NF_CONNTRACK which is tristate. If a variable depends on IP_NF_CONNTRACK_MARK and doesn't care about IP_NF_CONNTRACK, it can be y. This must be avoided. - IP_NF_CT_ACCT has same problem. - IP_NF_TARGET_CLUSTERIP also depends on IP_NF_MANGLE. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [FIB_TRIE]: Don't show local table in /proc/net/route outputPatrick McHardy2005-11-21
| | | | | | | Don't show local table to behave similar to fib_hash. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+advapi-fix/David S. Miller2005-11-20
|\
| * [IPV6]: Fix sending extension headers before and including routing header.YOSHIFUJI Hideaki2005-11-19
| | | | | | | | | | | | Based on suggestion from Masahide Nakamura <nakam@linux-ipv6.org>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
| * [IPV6]: Fix calculation of AH length during filling ancillary data.Ville Nuorvala2005-11-19
| | | | | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
| * [IPV6]: Fix memory management error during setting up new advapi sockopts.YOSHIFUJI Hideaki2005-11-19
| | | | | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>