aboutsummaryrefslogtreecommitdiffstats
path: root/net/sctp/socket.c
Commit message (Collapse)AuthorAge
* net: reserve ports for applications using fixed port numbersAmerigo Wang2010-05-16
| | | | | | | | | | | | | | | | | | | (Dropped the infiniband part, because Tetsuo modified the related code, I will send a separate patch for it once this is accepted.) This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which allows users to reserve ports for third-party applications. The reserved ports will not be used by automatic port assignments (e.g. when calling connect() or bind() with port number 0). Explicit port allocation behavior is unchanged. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'net-next' of ↵David S. Miller2010-05-03
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vxy/lksctp-dev Add missing linux/vmalloc.h include to net/sctp/probe.c Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: Use correct address family in sctp_getsockopt_peer_addrs()Vlad Yasevich2010-04-30
| | | | | | | | | | | | | | The function should use the address family of the address when trying to determine the length of the structure. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* | Merge branch 'master' of ↵David S. Miller2010-05-03
|\ \ | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * | sctp: per_cpu variables should be in bh_disabled sectionVlad Yasevich2010-04-28
| | | | | | | | | | | | | | | | | | | | | | | | Since the change of the atomics to percpu variables, we now have to disable BH in process context when touching percpu variables. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sctp: avoid irq lock inversion while call sk->sk_data_ready()Wei Yongjun2010-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sk->sk_data_ready() of sctp socket can be called from both BH and non-BH contexts, but the default sk->sk_data_ready(), sock_def_readable(), can not be used in this case. Therefore, we have to make a new function sctp_data_ready() to grab sk->sk_data_ready() with BH disabling. ========================================================= [ INFO: possible irq lock inversion dependency detected ] 2.6.33-rc6 #129 --------------------------------------------------------- sctp_darn/1517 just changed the state of lock: (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80 but this lock took another, SOFTIRQ-unsafe lock in the past: (slock-AF_INET){+.-...} and interrupts could create inverse lock ordering between them. other info that might help us debug this: 1 lock held by sctp_darn/1517: #0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp] Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: sock_def_readable() and friends RCU conversionEric Dumazet2010-05-01
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sk_callback_lock rwlock actually protects sk->sk_sleep pointer, so we need two atomic operations (and associated dirtying) per incoming packet. RCU conversion is pretty much needed : 1) Add a new structure, called "struct socket_wq" to hold all fields that will need rcu_read_lock() protection (currently: a wait_queue_head_t and a struct fasync_struct pointer). [Future patch will add a list anchor for wakeup coalescing] 2) Attach one of such structure to each "struct socket" created in sock_alloc_inode(). 3) Respect RCU grace period when freeing a "struct socket_wq" 4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct socket_wq" 5) Change sk_sleep() function to use new sk->sk_wq instead of sk->sk_sleep 6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside a rcu_read_lock() section. 7) Change all sk_has_sleeper() callers to : - Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock) - Use wq_has_sleeper() to eventually wakeup tasks. - Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock) 8) sock_wake_async() is modified to use rcu protection as well. 9) Exceptions : macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq" instead of dynamically allocated ones. They dont need rcu freeing. Some cleanups or followups are probably needed, (possible sk_callback_lock conversion to a spinlock for example...). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: sk_add_backlog() take rmem_alloc into accountEric Dumazet2010-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current socket backlog limit is not enough to really stop DDOS attacks, because user thread spend many time to process a full backlog each round, and user might crazy spin on socket lock. We should add backlog size and receive_queue size (aka rmem_alloc) to pace writers, and let user run without being slow down too much. Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in stress situations. Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp receiver can now process ~200.000 pps (instead of ~100 pps before the patch) on a 8 core machine. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: sk_sleep() helperEric Dumazet2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define a new function to return the waitqueue of a "struct sock". static inline wait_queue_head_t *sk_sleep(struct sock *sk) { return sk->sk_sleep; } Change all read occurrences of sk_sleep by a call to this function. Needed for a future RCU conversion. sk_sleep wont be a field directly available. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2010-04-11
|\| | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/stmmac/stmmac_main.c drivers/net/wireless/wl12xx/wl1271_cmd.c drivers/net/wireless/wl12xx/wl1271_main.c drivers/net/wireless/wl12xx/wl1271_spi.c net/core/ethtool.c net/mac80211/scan.c
| * include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* | sctp: eliminate useless codeHagen Paul Pfeifer2010-03-31
|/ | | | | | | | | Remove duplicate declaration of symbol: struct hlist_node *node was already declared, the seconds declaration shadows the first one. CC: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: use limited socket backlogZhu Yi2010-03-05
| | | | | | | | | Make sctp adapt to the limited socket backlog change. Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/sctp: Eliminate useless codeJulia Lawall2010-01-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | The variable newinet is initialized twice to the same (side effect-free) expression. Drop one initialization. A simplified version of the semantic match that finds this problem is: (http://coccinelle.lip6.fr/) // <smpl> @forall@ idexpression *x; identifier f!=ERR_PTR; @@ x = f(...) ... when != x ( x = f(...,<+...x...+>,...) | * x = f(...) ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/sctp/socket.c: squish warningAndrew Morton2010-01-04
| | | | | | | | | | | net/sctp/socket.c: In function 'sctp_setsockopt_autoclose': net/sctp/socket.c:2090: warning: comparison is always false due to limited range of data type Cc: Andrei Pelinescu-Onciul <andrei@iptel.org> Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: fix sctp_setsockopt_autoclose compile warningAndrei Pelinescu-Onciul2009-12-02
| | | | | | | | | | | Fix the following warning, when building on 64 bits: net/sctp/socket.c:2091: warning: large integer implicitly truncated to unsigned type Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Move && and || to end of previous lineJoe Perches2009-11-29
| | | | | | | | | | | Not including net/atm/ Compiled tested x86 allyesconfig only Added a > 80 column line or two, which I ignored. Existing checkpatch plaints willfully, cheerfully ignored. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: limit maximum autoclose setsockopt valueAndrei Pelinescu-Onciul2009-11-23
| | | | | | | | | To avoid overflowing the maximum timer interval when transforming the autoclose interval from seconds to jiffies, limit the maximum autoclose value to MAX_SCHEDULE_TIMEOUT/HZ. Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: remove deprecated SCTP_GET_*_OLD stuffsAmerigo Wang2009-11-23
| | | | | | | | SCTP_GET_*_OLD stuffs are schedlued to be removed. Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: allow setting path_maxrxt independent of SPP_PMTUD_ENABLEAndrei Pelinescu-Onciul2009-11-23
| | | | | | | | | | Since draft-ietf-tsvwg-sctpsocket-15.txt, setting the SPP_MTUD_ENABLE flag when changing pathmaxrxt via the SCTP_PEER_ADDR_PARAMS setsockopt is not required any longer. Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* Merge branch 'master' of ↵David S. Miller2009-11-17
|\ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/can/Kconfig
| * sctp: Fix regression introduced by new sctp_connectx apiVlad Yasevich2009-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new (unrealeased to the user) sctp_connectx api c6ba68a26645dbc5029a9faa5687ebe6fcfc53e4 sctp: support non-blocking version of the new sctp_connectx() API introduced a regression cought by the user regression test suite. In particular, the API requires the user library to re-allocate the buffer and could potentially trigger a SIGFAULT. This change corrects that regression by passing the original address buffer to the kernel unmodified, but still allows for a returned association id. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: Set source addresses on the association before adding transportsVlad Yasevich2009-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent commit 8da645e101a8c20c6073efda3c7cc74eec01b87f sctp: Get rid of an extra routing lookup when adding a transport introduced a regression in the connection setup. The behavior was different between IPv4 and IPv6. IPv4 case ended up working because the route lookup routing returned a NULL route, which triggered another route lookup later in the output patch that succeeded. In the IPv6 case, a valid route was returned for first call, but we could not find a valid source address at the time since the source addresses were not set on the association yet. Thus resulted in a hung connection. The solution is to set the source addresses on the association prior to adding peers. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | inet: rename some inet_sock fieldsEric Dumazet2009-10-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to have better cache layouts of struct sock (separate zones for rx/tx paths), we need this preliminary patch. Goal is to transfert fields used at lookup time in the first read-mostly cache line (inside struct sock_common) and move sk_refcnt to a separate cache line (only written by rx path) This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr, sport and id fields. This allows a future patch to define these fields as macros, like sk_refcnt, without name clashes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Generalize socket rx gap / receive queue overflow cmsgNeil Horman2009-10-12
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a new socket level option to report number of queue overflows Recently I augmented the AF_PACKET protocol to report the number of frames lost on the socket receive queue between any two enqueued frames. This value was exported via a SOL_PACKET level cmsg. AFter I completed that work it was requested that this feature be generalized so that any datagram oriented socket could make use of this option. As such I've created this patch, It creates a new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue overflowed between any two given frames. It also augments the AF_PACKET protocol to take advantage of this new feature (as it previously did not touch sk->sk_drops, which this patch uses to record the overflow count). Tested successfully by me. Notes: 1) Unlike my previous patch, this patch simply records the sk_drops value, which is not a number of drops between packets, but rather a total number of drops. Deltas must be computed in user space. 2) While this patch currently works with datagram oriented protocols, it will also be accepted by non-datagram oriented protocols. I'm not sure if thats agreeable to everyone, but my argument in favor of doing so is that, for those protocols which aren't applicable to this option, sk_drops will always be zero, and reporting no drops on a receive queue that isn't used for those non-participating protocols seems reasonable to me. This also saves us having to code in a per-protocol opt in mechanism. 3) This applies cleanly to net-next assuming that commit 977750076d98c7ff6cbda51858bb5a5894a9d9ab (my af packet cmsg patch) is reverted Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Make setsockopt() optlen be unsigned.David S. Miller2009-09-30
| | | | | | | | | | | | This provides safety against negative optlen at the type level instead of depending upon (sometimes non-trivial) checks against this sprinkled all over the the place, in each and every implementation. Based upon work done by Arjan van de Ven and feedback from Linus Torvalds. Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: Fix SCTP_MAXSEG socket option to comply to spec.Vlad Yasevich2009-09-04
| | | | | | | | | | | | | | | | We had a bug that we never stored the user-defined value for MAXSEG when setting the value on an association. Thus future PMTU events ended up re-writing the frag point and increasing it past user limit. Additionally, when setting the option on the socket/endpoint, we effect all current associations, which is against spec. Now, we store the user 'maxseg' value along with the computed 'frag_point'. We inherit 'maxseg' from the socket at association creation and use it as an upper limit for 'frag_point' when its set. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Send user messages to the lower layer as oneVlad Yasevich2009-09-04
| | | | | | | | | | | | | | | | | Currenlty, sctp breaks up user messages into fragments and sends each fragment to the lower layer by itself. This means that for each fragment we go all the way down the stack and back up. This also discourages bundling of multiple fragments when they can fit into a sigle packet (ex: due to user setting a low fragmentation threashold). We introduce a new command SCTP_CMD_SND_MSG and hand the whole message down state machine. The state machine and the side-effect parser will cork the queue, add all chunks from the message to the queue, and then un-cork the queue thus causing the chunks to get transmitted. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Disallow new connection on a closing socketVlad Yasevich2009-09-04
| | | | | | | | | | | If a socket has a lot of association that are in the process of of being closed/aborted, it is possible for a remote to establish new associations during the time period that the old ones are shutting down. If this was a result of a close() call, there will be no socket and will cause a memory leak. We'll prevent this by setting the socket state to CLOSING and disallow new associations when in this state. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: fix warning at inet_sock_destruct() while release sctp socketWei Yongjun2009-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 'net: Move rx skb_orphan call to where needed' broken sctp protocol with warning at inet_sock_destruct(). Actually, sctp can do this right with sctp_sock_rfree_frag() and sctp_skb_set_owner_r_frag() pair. sctp_sock_rfree_frag(skb); sctp_skb_set_owner_r_frag(skb, newsk); This patch not revert the commit d55d87fdff8252d0e2f7c28c2d443aee17e9d70f, instead remove the sctp_sock_rfree_frag() function. ------------[ cut here ]------------ WARNING: at net/ipv4/af_inet.c:151 inet_sock_destruct+0xe0/0x142() Modules linked in: sctp ipv6 dm_mirror dm_region_hash dm_log dm_multipath scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan] Pid: 1808, comm: sctp_test Not tainted 2.6.31-rc2 #40 Call Trace: [<c042dd06>] warn_slowpath_common+0x6a/0x81 [<c064a39a>] ? inet_sock_destruct+0xe0/0x142 [<c042dd2f>] warn_slowpath_null+0x12/0x15 [<c064a39a>] inet_sock_destruct+0xe0/0x142 [<c05fde44>] __sk_free+0x19/0xcc [<c05fdf50>] sk_free+0x18/0x1a [<ca0d14ad>] sctp_close+0x192/0x1a1 [sctp] [<c0649f7f>] inet_release+0x47/0x4d [<c05fba4d>] sock_release+0x19/0x5e [<c05fbab3>] sock_close+0x21/0x25 [<c049c31b>] __fput+0xde/0x189 [<c049c3de>] fput+0x18/0x1a [<c049988f>] filp_close+0x56/0x60 [<c042f422>] put_files_struct+0x5d/0xa1 [<c042f49f>] exit_files+0x39/0x3d [<c043086a>] do_exit+0x1a5/0x5dd [<c04a86c2>] ? d_kill+0x35/0x3b [<c0438fa4>] ? dequeue_signal+0xa6/0x115 [<c0430d05>] do_group_exit+0x63/0x8a [<c0439504>] get_signal_to_deliver+0x2e1/0x2f9 [<c0401d9e>] do_notify_resume+0x7c/0x6b5 [<c043f601>] ? autoremove_wake_function+0x0/0x34 [<c04a864e>] ? __d_free+0x3d/0x40 [<c04a867b>] ? d_free+0x2a/0x3c [<c049ba7e>] ? vfs_write+0x103/0x117 [<c05fc8fa>] ? sys_socketcall+0x178/0x182 [<c0402a56>] work_notifysig+0x13/0x19 ---[ end trace 9db92c463e789fba ]--- Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: correct off-by-one write allocations reportsEric Dumazet2009-06-18
| | | | | | | | | | | | | commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 (net: No more expensive sock_hold()/sock_put() on each tx) changed initial sk_wmem_alloc value. We need to take into account this offset when reporting sk_wmem_alloc to user, in PROC_FS files or various ioctls (SIOCOUTQ/TIOCOUTQ) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: Use frag list abstraction interfaces.David S. Miller2009-06-09
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: support non-blocking version of the new sctp_connectx() APIVlad Yasevich2009-06-03
| | | | | | | | | | | | Prior implementation of the new sctp_connectx() call that returns an association ID did not work correctly on non-blocking socket. This is because we could not return both a EINPROGRESS error and an association id. This is a new implementation that supports this. Originally from Ivan Skytte Jørgensen <isj-sctp@i1.dk Signed-off-by: Ivan Skytte Jørgensen <isj-sctp@i1.dk Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: simplify sctp listening codeVlad Yasevich2009-03-13
| | | | | | | | | | sctp_inet_listen() call is split between UDP and TCP style. Looking at the code, the two functions are almost the same and can be merged into a single helper. This also fixes a bug that was fixed in the UDP function, but missed in the TCP function. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: fix the length check in sctp_getsockopt_maxburst()Wei Yongjun2009-03-03
| | | | | | | | | | | | The code in sctp_getsockopt_maxburst() doesn't allow len to be larger then struct sctp_assoc_value, which is a common case where app writers just pass down the sizeof(buf) or something similar. This patch fix the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: remove dup code in net/sctp/socket.cWei Yongjun2009-03-03
| | | | | | | | Remove dup check of "if (optlen < sizeof(int))". Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: Inherit all socket options from parent correctly.Vlad Yasevich2009-02-16
| | | | | | | | | | | During peeloff/accept() sctp needs to save the parent socket state into the new socket so that any options set on the parent are inherited by the child socket. This was found when the parent/listener socket issues SO_BINDTODEVICE, but the data was misrouted after a route cache flush. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* trivial: fix then -> than typos in comments and documentationFrederik Schwarzer2009-01-06
| | | | | | | - (better, more, bigger ...) then -> (...) than Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* sctp: Add validity check for SCTP_PARTIAL_DELIVERY_POINT socket optionWei Yongjun2008-12-25
| | | | | | | | | | | | | | | | The latest ietf socket extensions API draft said: 8.1.21. Set or Get the SCTP Partial Delivery Point Note also that the call will fail if the user attempts to set this value larger than the socket receive buffer size. This patch add this validity check for SCTP_PARTIAL_DELIVERY_POINT socket option. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: Implement socket option SCTP_GET_ASSOC_NUMBERWei Yongjun2008-12-25
| | | | | | | | | | | | | | Implement socket option SCTP_GET_ASSOC_NUMBER of the latest ietf socket extensions API draft. 8.2.5. Get the Current Number of Associations (SCTP_GET_ASSOC_NUMBER) This option gets the current number of associations that are attached to a one-to-many style socket. The option value is an uint32_t. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: Fix a typo in socket.cWei Yongjun2008-12-25
| | | | | | | | Just fix a typo in socket.c. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: Bring SCTP_MAXSEG socket option into ietf API extension complianceWei Yongjun2008-12-25
| | | | | | | | | Brings maxseg socket option set/get into line with the latest ietf socket extensions API draft, while maintaining backwards compatibility. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Use a percpu_counter for sockets_allocatedEric Dumazet2008-11-26
| | | | | | | | | | | | | Instead of using one atomic_t per protocol, use a percpu_counter for "sockets_allocated", to reduce cache line contention on heavy duty network servers. Note : We revert commit (248969ae31e1b3276fc4399d67ce29a5d81e6fd9 net: af_unix can make unix_nr_socks visbile in /proc), since it is not anymore used after sock_prot_inuse_add() addition Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Make sure BHs are disabled in sock_prot_inuse_add()Eric Dumazet2008-11-25
| | | | | | | | | prot->destroy is not called with BH disabled. So we must add explicit BH disable around call to sock_prot_inuse_add() in sctp_destroy_sock() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Make sure BHs are disabled in sock_prot_inuse_add()David S. Miller2008-11-23
| | | | | | | | | | The rule of calling sock_prot_inuse_add() is that BHs must be disabled. Some new calls were added where this was not true and this tiggers warnings as reported by Ilpo. Fix this by adding explicit BH disabling around those call sites. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: sctp should update its inuse counterEric Dumazet2008-11-17
| | | | | | | | | | | This patch is a preparation to namespace conversion of /proc/net/protocols In order to have relevant information for SCTP protocols, we should use sock_prot_inuse_add() to update a (percpu and pernamespace) counter of inuse sockets. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: try harder to figure out address family when checking wildcardsVlad Yasevich2008-10-01
| | | | | | | | | | | sctp_is_any() function that is used to check for wildcard addresses only looks at the address itself to determine the address family. This function is used in the API to check the address passed in from the user. If the user simply zerroes out the sockaddr_storage and pass that in, we'll end up failing. So, let's try harder to determine the address family by also checking the socket if it's possible. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: fix random memory dereference with SCTP_HMAC_IDENT option.Vlad Yasevich2008-08-27
| | | | | | | | | The number of identifiers needs to be checked against the option length. Also, the identifier index provided needs to be verified to make sure that it doesn't exceed the bounds of the array. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: correct bounds check in sctp_setsockopt_auth_keyVlad Yasevich2008-08-27
| | | | | | | | | | | | The bonds check to prevent buffer overlflow was not exactly right. It still allowed overflow of up to 8 bytes which is sizeof(struct sctp_authkey). Since optlen is already checked against the size of that struct, we are guaranteed not to cause interger overflow either. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: add verification checks to SCTP_AUTH_KEY optionVlad Yasevich2008-08-25
| | | | | | | | | The structure used for SCTP_AUTH_KEY option contains a length that needs to be verfied to prevent buffer overflow conditions. Spoted by Eugene Teo <eteo@redhat.com>. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>