aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux/socket.h
Commit message (Collapse)AuthorAge
* Hoist memcpy_fromiovec/memcpy_toiovec into lib/Rusty Russell2013-05-19
| | | | | | | | | | | | | | | | | | | | ERROR: "memcpy_fromiovec" [drivers/vhost/vhost_scsi.ko] undefined! That function is only present with CONFIG_NET. Turns out that crypto/algif_skcipher.c also uses that outside net, but it actually needs sockets anyway. In addition, commit 6d4f0139d642c45411a47879325891ce2a7c164a added CONFIG_NET dependency to CONFIG_VMCI for memcpy_toiovec, so hoist that function and revert that commit too. socket.h already includes uio.h, so no callers need updating; trying only broke things fo x86_64 randconfig (thanks Fengguang!). Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* NFC: llcp: Implement socket optionsSamuel Ortiz2013-03-10
| | | | | | | | Some LLCP services (e.g. the validation ones) require some control over the LLCP link parameters like the receive window (RW) or the MIU extension (MIUX). This can only be done through socket options. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
* VSOCK: Introduce VM SocketsAndy King2013-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VM Sockets allows communication between virtual machines and the hypervisor. User level applications both in a virtual machine and on the host can use the VM Sockets API, which facilitates fast and efficient communication between guest virtual machines and their host. A socket address family, designed to be compatible with UDP and TCP at the interface level, is provided. Today, VM Sockets is used by various VMware Tools components inside the guest for zero-config, network-less access to VMware host services. In addition to this, VMware's users are using VM Sockets for various applications, where network access of the virtual machine is restricted or non-existent. Examples of this are VMs communicating with device proxies for proprietary hardware running as host applications and automated testing of applications running within virtual machines. The VMware VM Sockets are similar to other socket types, like Berkeley UNIX socket interface. The VM Sockets module supports both connection-oriented stream sockets like TCP, and connectionless datagram sockets like UDP. The VM Sockets protocol family is defined as "AF_VSOCK" and the socket operations split for SOCK_DGRAM and SOCK_STREAM. For additional information about the use of VM Sockets, please refer to the VM Sockets Programming Guide available at: https://www.vmware.com/support/developer/vmci-sdk/ Signed-off-by: George Zhang <georgezhang@vmware.com> Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Andy king <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* UAPI: (Scripted) Disintegrate include/linuxDavid Howells2012-10-13
| | | | | | | | | Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
* net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)Yuchung Cheng2012-07-19
| | | | | | | | | | | | | | | | | | | | | | | | | sendmsg() (or sendto()) with MSG_FASTOPEN is a combo of connect(2) and write(2). The application should replace connect() with it to send data in the opening SYN packet. For blocking socket, sendmsg() blocks until all the data are buffered locally and the handshake is completed like connect() call. It returns similar errno like connect() if the TCP handshake fails. For non-blocking socket, it returns the number of bytes queued (and transmitted in the SYN-data packet) if cookie is available. If cookie is not available, it transmits a data-less SYN packet with Fast Open cookie request option and returns -EINPROGRESS like connect(). Using MSG_FASTOPEN on connecting or connected socket will result in simlar errno like repeating connect() calls. Therefore the application should only use this flag on new sockets. The buffer size of sendmsg() is independent of the MSS of the connection. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: cleanup unsigned to unsigned intEric Dumazet2012-04-15
| | | | | | | Use of "unsigned int" is preferred to bare "unsigned" in net tree. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: tcp_sendpages() should call tcp_push() onceEric Dumazet2012-04-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2f533844242 (tcp: allow splice() to build full TSO packets) added a regression for splice() calls using SPLICE_F_MORE. We need to call tcp_flush() at the end of the last page processed in tcp_sendpages(), or else transmits can be deferred and future sends stall. Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but with different semantic. For all sendpage() providers, its a transparent change. Only sock_sendpage() and tcp_sendpages() can differentiate the two different flags provided by pipe_to_sendpage() Reported-by: Tom Herbert <therbert@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Eric Dumazet <eric.dumazet@gmail>com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: get rid of some pointless casts to sockaddrMaciej Żenczykowski2012-03-11
| | | | | | | | | | | | | | | | The following 4 functions: move_addr_to_kernel move_addr_to_user verify_iovec verify_compat_iovec are always effectively called with a sockaddr_storage. Make this explicit by changing their signature. This removes a large number of casts from sockaddr_storage to sockaddr. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Make userland include of netlink.h more sane.David S. Miller2011-08-08
| | | | | | | | | | | | | | | | | | | | | | | | Currently userland will barf when including linux/netlink.h unless it precisely includes sys/socket.h first. The issue is where the definition of "sa_family_t" comes from. We've been back and forth on how to fix this issue in the past, see: http://thread.gmane.org/gmane.linux.debian.devel.bugs.general/622621 http://thread.gmane.org/gmane.linux.network/143380 Ben Hutchings suggested we take a hint from how we handle the sockaddr_storage type. First we define a "__kernel_sa_family_t" to linux/socket.h that is always defined. Then if __KERNEL__ is defined, we also define "sa_family_t" as equal to "__kernel_sa_family_t". Then in places like linux/netlink.h we use __kernel_sa_family_t in user visible datastructures. Reported-by: Michel Machado <michel@digirati.com.br> Signed-off-by: David S. Miller <davem@davemloft.net>
* NFC: add NFC socket familyAloisio Almeida Jr2011-07-05
| | | | | | Signed-off-by: Lauro Ramos Venancio <lauro.venancio@openbossa.org> Signed-off-by: Aloisio Almeida Jr <aloisio.almeida@openbossa.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* net: Add sendmmsg socket system callAnton Blanchard2011-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a multiple message send syscall and is the send version of the existing recvmmsg syscall. This is heavily based on the patch by Arnaldo that added recvmmsg. I wrote a microbenchmark to test the performance gains of using this new syscall: http://ozlabs.org/~anton/junkcode/sendmmsg_test.c The test was run on a ppc64 box with a 10 Gbit network card. The benchmark can send both UDP and RAW ethernet packets. 64B UDP batch pkts/sec 1 804570 2 872800 (+ 8 %) 4 916556 (+14 %) 8 939712 (+17 %) 16 952688 (+18 %) 32 956448 (+19 %) 64 964800 (+20 %) 64B raw socket batch pkts/sec 1 1201449 2 1350028 (+12 %) 4 1461416 (+22 %) 8 1513080 (+26 %) 16 1541216 (+28 %) 32 1553440 (+29 %) 64 1557888 (+30 %) We see a 20% improvement in throughput on UDP send and 30% on raw socket send. [ Add sparc syscall entries. -DaveM ] Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Fix common misspellingsLucas De Marchi2011-03-31
| | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds2011-01-13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (46 commits) hwrng: via_rng - Fix memory scribbling on some CPUs crypto: padlock - Move padlock.h into include/crypto hwrng: via_rng - Fix asm constraints crypto: n2 - use __devexit not __exit in n2_unregister_algs crypto: mark crypto workqueues CPU_INTENSIVE crypto: mv_cesa - dont return PTR_ERR() of wrong pointer crypto: ripemd - Set module author and update email address crypto: omap-sham - backlog handling fix crypto: gf128mul - Remove experimental tag crypto: af_alg - fix af_alg memory_allocated data type crypto: aesni-intel - Fixed build with binutils 2.16 crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets net: Add missing lockdep class names for af_alg include: Install linux/if_alg.h for user-space crypto API crypto: omap-aes - checkpatch --file warning fixes crypto: omap-aes - initialize aes module once per request crypto: omap-aes - unnecessary code removed crypto: omap-aes - error handling implementation improved crypto: omap-aes - redundant locking is removed crypto: omap-aes - DMA initialization fixes for OMAP off mode ...
| * net - Add AF_ALG macrosHerbert Xu2010-11-19
| | | | | | | | | | | | | | | | | | This patch adds the socket family/level macros for the yet-to-be-born AF_ALG family. The AF_ALG family provides the user-space interface for the kernel crypto API. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: David S. Miller <davem@davemloft.net>
* | net: remove the duplicate #ifdef __KERNEL__Changli Gao2011-01-06
| | | | | | | | | | | | | | | | Since we are already in #ifdef __KERNEL__, we don't need to check it again. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Limit socket I/O iovec total length to INT_MAX.David S. Miller2010-10-28
|/ | | | | | | | | | | | | | | | | | | | | | | This helps protect us from overflow issues down in the individual protocol sendmsg/recvmsg handlers. Once we hit INT_MAX we truncate out the rest of the iovec by setting the iov_len members to zero. This works because: 1) For SOCK_STREAM and SOCK_SEQPACKET sockets, partial writes are allowed and the application will just continue with another write to send the rest of the data. 2) For datagram oriented sockets, where there must be a one-to-one correspondance between write() calls and packets on the wire, INT_MAX is going to be far larger than the packet size limit the protocol is going to check for and signal with -EMSGSIZE. Based upon a patch by Linus Torvalds. Signed-off-by: David S. Miller <davem@davemloft.net>
* socket: localize functionsstephen hemminger2010-10-21
| | | | | | | | A couple of functions in socket.c are only used there and should be localized. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Fix >4GB writes on 64-bit.David S. Miller2010-09-27
| | | | | | | | | | | | | | | | | | | | Fixes kernel bugzilla #16603 tcp_sendmsg() truncates iov_len to an 'int' which a 4GB write to write zero bytes, for example. There is also the problem higher up of how verify_iovec() works. It wants to prevent the total length from looking like an error return value. However it does this using 'int', but syscalls return 'long' (and thus signed 64-bit on 64-bit machines). So it could trigger false-positives on 64-bit as written. So fix it to use 'long'. Reported-by: Olaf Bonorden <bono@onlinehome.de> Reported-by: Daniel Büse <dbuese@gmx.de> Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* sock: Introduce cred_to_ucredEric W. Biederman2010-06-16
| | | | | | | | | | To keep the coming code clear and to allow both the sock code and the scm code to share the logic introduce a fuction to translate from struct cred to struct ucred. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-04-07
|\ | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c drivers/net/via-velocity.c drivers/net/wireless/iwlwifi/iwl-agn.c
| * net: Add MSG_WAITFORONE flag to recvmmsgBrandon L Black2010-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new flag MSG_WAITFORONE for the recvmmsg() syscall. When this flag is specified for a blocking socket, recvmmsg() will only block until at least 1 packet is available. The default behavior is to block until all vlen packets are available. This flag has no effect on non-blocking sockets or when used in combination with MSG_DONTWAIT. Signed-off-by: Brandon L Black <blblack@gmail.com> Acked-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net-caif: add CAIF protocol definitionsSjur Braendeland2010-03-30
|/ | | | | | | | | Add CAIF definitions to existing header files. Files: if_arp.h, if_ether.h, socket.h. Types: ARPHRD_CAIF, ETH_P_CAIF, AF_CAIF, PF_CAIF, SOL_CAIF, N_CAIF Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net,socket: introduce DECLARE_SOCKADDR helper to catch overflow at build timeCyrill Gorcunov2009-10-29
| | | | | | | | | | | | | | | | proto_ops->getname implies copying protocol specific data into storage unit (particulary to __kernel_sockaddr_storage). So when we implement new protocol support we should keep such a detail in mind (which is easy to forget about). Lets introduce DECLARE_SOCKADDR helper which check if storage unit is not overfowed at build time. Eventually inet_getname is switched to use DECLARE_SOCKADDR (to show example of usage). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Introduce recvmmsg socket syscallArnaldo Carvalho de Melo2009-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Support inclusion of <linux/socket.h> before <sys/socket.h>Ben Hutchings2009-10-05
| | | | | | | | | | | | | | | | | | | | The following user-space program fails to compile: #include <linux/socket.h> #include <sys/socket.h> int main() { return 0; } The reason is that <linux/socket.h> tests __GLIBC__ to decide whether it should define various structures and macros that are now defined for user-space by <sys/socket.h>, but __GLIBC__ is not defined if no libc headers have yet been included. It seems safe to drop support for libc 5 now. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Bastian Blank <waldi@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Add constants for the ieee 802.15.4 stackSergey Lapin2009-06-09
| | | | | | | | IEEE 802.15.4 stack requires several constants to be defined/adjusted. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Sergey Lapin <slapin@ossfans.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* af_iucv: add sockopt() to enable/disable use of IPRM_DATA msgsHendrik Brueckner2009-04-23
| | | | | | | | | | Provide the socket operations getsocktopt() and setsockopt() to enable/disable sending of data in the parameter list of IUCV messages. The patch sets respective flag only. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tun: fix tun_chr_aio_write so that aio worksMichael S. Tsirkin2009-04-21
| | | | | | | | | | | | | aio_write gets const struct iovec * but tun_chr_aio_write casts this to struct iovec * and modifies the iovec. As a result, attempts to use io_submit to send packets to a tun device fail with weird errors such as EINVAL. Since tun is the only user of skb_copy_datagram_from_iovec, we can fix this simply by changing the later so that it does not touch the iovec passed to it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: skb_copy_datagram_const_iovec()Michael S. Tsirkin2009-04-21
| | | | | | | | | | | | | There's an skb_copy_datagram_iovec() to copy out of a paged skb, but it modifies the iovec, and does not support starting at an offset in the destination. We want both in tun.c, so let's add the function. It's a carbon copy of skb_copy_datagram_iovec() with enough changes to be annoying. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'header-fixes-for-linus' of ↵Linus Torvalds2009-03-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'header-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits) x86: headers cleanup - setup.h emu101k1.h: fix duplicate include of <linux/types.h> compiler-gcc4: conditionalize #error on __KERNEL__ remove __KERNEL_STRICT_NAMES make netfilter use strict integer types make drm headers use strict integer types make MTD headers use strict integer types make most exported headers use strict integer types make exported headers use strict posix types unconditionally include asm/types.h from linux/types.h make linux/types.h as assembly safe Neither asm/types.h nor linux/types.h is required for arch/ia64/include/asm/fpu.h headers_check fix cleanup: linux/reiserfs_fs.h headers_check fix cleanup: linux/nubus.h headers_check fix cleanup: linux/coda_psdev.h headers_check fix: x86, setup.h headers_check fix: x86, prctl.h headers_check fix: linux/reinserfs_fs.h headers_check fix: linux/socket.h headers_check fix: linux/nubus.h ... Manually fix trivial conflicts in: include/linux/netfilter/xt_limit.h include/linux/netfilter/xt_statistic.h
| * headers_check fix: linux/socket.hJaswinder Singh Rajput2009-02-02
| | | | | | | | | | | | | | | | fix the following 'make headers_check' warning: usr/include/linux/socket.h:29: extern's make no sense in userspace Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
* | RDS: Add AF and PF #defines for RDS socketsAndy Grover2009-02-27
|/ | | | | | | | | RDS is a reliable datagram protocol used for IPC on Oracle database clusters. This adds address and protocol family numbers for it. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: implement GPRS virtual interface over PEP socketRémi Denis-Courmont2008-10-05
| | | | | Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: global definitionsRemi Denis-Courmont2008-09-22
| | | | | Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Define AF_ISDN and PF_ISDNKarsten Keil2008-07-26
| | | | | | Define the address and protocol family value for mISDN. Signed-off-by: Karsten Keil <kkeil@suse.de>
* net: Use standard structures for generic socket address structures.YOSHIFUJI Hideaki2008-07-20
| | | | | | | | | Use sockaddr_storage{} for generic socket address storage and ensures proper alignment. Use sockaddr{} for pointers to omit several casts. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] sysctl: make sysctl_somaxconn per-namespacePavel Emelyanov2008-01-28
| | | | | | | | | | | | Just move the variable on the struct net and adjust its usage. Others sysctls from sys.net.core table are more difficult to virtualize (i.e. make them per-namespace), but I'll look at them as well a bit later. Signed-off-by: Pavel Emelyanov <xemul@oenvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CAN]: Allocate protocol numbers for PF_CANOliver Hartkopp2008-01-28
| | | | | | | | | | This patch adds a protocol/address family number, ARP hardware type, ethernet packet type, and a line discipline number for the SocketCAN implementation. Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* [Bluetooth] Add constant for Bluetooth socket options levelMarcel Holtmann2007-10-22
| | | | | | | Assign the next free socket options level to be used by the Bluetooth protocol and address family. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* O_CLOEXEC for SCM_RIGHTSUlrich Drepper2007-07-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Part two in the O_CLOEXEC saga: adding support for file descriptors received through Unix domain sockets. The patch is once again pretty minimal, it introduces a new flag for recvmsg and passes it just like the existing MSG_CMSG_COMPAT flag. I think this bit is not used otherwise but the networking people will know better. This new flag is not recognized by recvfrom and recv. These functions cannot be used for that purpose and the asymmetry this introduces is not worse than the already existing MSG_CMSG_COMPAT situations. The patch must be applied on the patch which introduced O_CLOEXEC. It has to remove static from the new get_unused_fd_flags function but since scm.c cannot live in a module the function still hasn't to be exported. Here's a test program to make sure the code works. It's so much longer than the actual patch... #include <errno.h> #include <error.h> #include <fcntl.h> #include <stdio.h> #include <string.h> #include <unistd.h> #include <sys/socket.h> #include <sys/un.h> #ifndef O_CLOEXEC # define O_CLOEXEC 02000000 #endif #ifndef MSG_CMSG_CLOEXEC # define MSG_CMSG_CLOEXEC 0x40000000 #endif int main (int argc, char *argv[]) { if (argc > 1) { int fd = atol (argv[1]); printf ("child: fd = %d\n", fd); if (fcntl (fd, F_GETFD) == 0 || errno != EBADF) { puts ("file descriptor valid in child"); return 1; } return 0; } struct sockaddr_un sun; strcpy (sun.sun_path, "./testsocket"); sun.sun_family = AF_UNIX; char databuf[] = "hello"; struct iovec iov[1]; iov[0].iov_base = databuf; iov[0].iov_len = sizeof (databuf); union { struct cmsghdr hdr; char bytes[CMSG_SPACE (sizeof (int))]; } buf; struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1, .msg_control = buf.bytes, .msg_controllen = sizeof (buf) }; struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg); cmsg->cmsg_level = SOL_SOCKET; cmsg->cmsg_type = SCM_RIGHTS; cmsg->cmsg_len = CMSG_LEN (sizeof (int)); msg.msg_controllen = cmsg->cmsg_len; pid_t child = fork (); if (child == -1) error (1, errno, "fork"); if (child == 0) { int sock = socket (PF_UNIX, SOCK_STREAM, 0); if (sock < 0) error (1, errno, "socket"); if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0) error (1, errno, "bind"); if (listen (sock, SOMAXCONN) < 0) error (1, errno, "listen"); int conn = accept (sock, NULL, NULL); if (conn == -1) error (1, errno, "accept"); *(int *) CMSG_DATA (cmsg) = sock; if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0) error (1, errno, "sendmsg"); return 0; } /* For a test suite this should be more robust like a barrier in shared memory. */ sleep (1); int sock = socket (PF_UNIX, SOCK_STREAM, 0); if (sock < 0) error (1, errno, "socket"); if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0) error (1, errno, "connect"); unlink (sun.sun_path); *(int *) CMSG_DATA (cmsg) = -1; if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0) error (1, errno, "recvmsg"); int fd = *(int *) CMSG_DATA (cmsg); if (fd == -1) error (1, 0, "no descriptor received"); char fdname[20]; snprintf (fdname, sizeof (fdname), "%d", fd); execl ("/proc/self/exe", argv[0], fdname, NULL); puts ("execl failed"); return 1; } [akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch] [akpm@linux-foundation.org: build fix] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Michael Buesch <mb@bu3sch.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [L2TP]: Changes to existing ppp and socket kernel headers for L2TPJames Chapman2007-07-11
| | | | | | | | | | | | | | | | | | Add struct sockaddr_pppol2tp to carry L2TP-specific address information for the PPPoX (PPPoL2TP) socket. Unfortunately we can't use the union inside struct sockaddr_pppox because the L2TP-specific data is larger than the current size of the union and we must preserve the size of struct sockaddr_pppox for binary compatibility. Also add a PPPIOCGL2TPSTATS ioctl to allow userspace to obtain L2TP counters and state from the kernel. Add new if_pppol2tp.h header. [ Modified to use aligned_u64 in statistics structure -DaveM ] Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel bothDavid Howells2007-04-26
| | | | | | | | | | | | | | Provide AF_RXRPC sockets that can be used to talk to AFS servers, or serve answers to AFS clients. KerberosIV security is fully supported. The patches and some example test programs can be found in: http://people.redhat.com/~dhowells/rxrpc/ This will eventually replace the old implementation of kernel-only RxRPC currently resident in net/rxrpc/. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Revert socket.h/stat.h ifdef hacks.David S. Miller2007-02-28
| | | | | | | | | | | | | This reverts 57a87bb0720a5cf7a9ece49a8c8ed288398fd1bb. As H. Peter Anvin states, this change broke klibc and it's not very easy to fix things up without duplicating everything into userspace. In the longer term we should have a better solution to this problem, but for now let's unbreak things. Signed-off-by: David S. Miller <davem@davemloft.net>
* [PATCH] scrub non-__GLIBC__ checks in linux/socket.h and linux/stat.hMike Frysinger2007-02-11
| | | | | | | | | | | | Userspace should be worrying about userspace, so having the socket.h and stat.h pollute the namespace in the non-glibc case is wrong and pretty much prevents any other libc from utilizing these headers sanely unless they set up the __GLIBC__ define themselves (which sucks) Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [S390]: Add AF_IUCV socket supportJennifer Hunt2007-02-08
| | | | | | | | | | From: Jennifer Hunt <jenhunt@us.ibm.com> This patch adds AF_IUCV socket support. Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Annotate csum_partial() callers in net/*Al Viro2006-12-03
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Supporting UDP-Lite (RFC 3828) in LinuxGerrit Renker2006-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a revision of the previously submitted patch, which alters the way files are organized and compiled in the following manner: * UDP and UDP-Lite now use separate object files * source file dependencies resolved via header files net/ipv{4,6}/udp_impl.h * order of inclusion files in udp.c/udplite.c adapted accordingly [NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828) This patch adds support for UDP-Lite to the IPv4 stack, provided as an extension to the existing UDPv4 code: * generic routines are all located in net/ipv4/udp.c * UDP-Lite specific routines are in net/ipv4/udplite.c * MIB/statistics support in /proc/net/snmp and /proc/net/udplite * shared API with extensions for partial checksum coverage [NET/IPv6]: Extension for UDP-Lite over IPv6 It extends the existing UDPv6 code base with support for UDP-Lite in the same manner as per UDPv4. In particular, * UDPv6 generic and shared code is in net/ipv6/udp.c * UDP-Litev6 specific extensions are in net/ipv6/udplite.c * MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6 * support for IPV6_ADDRFORM * aligned the coding style of protocol initialisation with af_inet6.c * made the error handling in udpv6_queue_rcv_skb consistent; to return `-1' on error on all error cases * consolidation of shared code [NET]: UDP-Lite Documentation and basic XFRM/Netfilter support The UDP-Lite patch further provides * API documentation for UDP-Lite * basic xfrm support * basic netfilter support for IPv4 and IPv6 (LOG target) Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* Don't include <linux/config.h> and <linux/linkage.h> from linux/socket.hDavid Woodhouse2006-04-25
| | | | Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* [SECURITY]: TCP/UDP getpeersecCatherine Zhang2006-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements an application of the LSM-IPSec networking controls whereby an application can determine the label of the security association its TCP or UDP sockets are currently connected to via getsockopt and the auxiliary data mechanism of recvmsg. Patch purpose: This patch enables a security-aware application to retrieve the security context of an IPSec security association a particular TCP or UDP socket is using. The application can then use this security context to determine the security context for processing on behalf of the peer at the other end of this connection. In the case of UDP, the security context is for each individual packet. An example application is the inetd daemon, which could be modified to start daemons running at security contexts dependent on the remote client. Patch design approach: - Design for TCP The patch enables the SELinux LSM to set the peer security context for a socket based on the security context of the IPSec security association. The application may retrieve this context using getsockopt. When called, the kernel determines if the socket is a connected (TCP_ESTABLISHED) TCP socket and, if so, uses the dst_entry cache on the socket to retrieve the security associations. If a security association has a security context, the context string is returned, as for UNIX domain sockets. - Design for UDP Unlike TCP, UDP is connectionless. This requires a somewhat different API to retrieve the peer security context. With TCP, the peer security context stays the same throughout the connection, thus it can be retrieved at any time between when the connection is established and when it is torn down. With UDP, each read/write can have different peer and thus the security context might change every time. As a result the security context retrieval must be done TOGETHER with the packet retrieval. The solution is to build upon the existing Unix domain socket API for retrieving user credentials. Linux offers the API for obtaining user credentials via ancillary messages (i.e., out of band/control messages that are bundled together with a normal message). Patch implementation details: - Implementation for TCP The security context can be retrieved by applications using getsockopt with the existing SO_PEERSEC flag. As an example (ignoring error checking): getsockopt(sockfd, SOL_SOCKET, SO_PEERSEC, optbuf, &optlen); printf("Socket peer context is: %s\n", optbuf); The SELinux function, selinux_socket_getpeersec, is extended to check for labeled security associations for connected (TCP_ESTABLISHED == sk->sk_state) TCP sockets only. If so, the socket has a dst_cache of struct dst_entry values that may refer to security associations. If these have security associations with security contexts, the security context is returned. getsockopt returns a buffer that contains a security context string or the buffer is unmodified. - Implementation for UDP To retrieve the security context, the application first indicates to the kernel such desire by setting the IP_PASSSEC option via getsockopt. Then the application retrieves the security context using the auxiliary data mechanism. An example server application for UDP should look like this: toggle = 1; toggle_len = sizeof(toggle); setsockopt(sockfd, SOL_IP, IP_PASSSEC, &toggle, &toggle_len); recvmsg(sockfd, &msg_hdr, 0); if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) { cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr); if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) && cmsg_hdr->cmsg_level == SOL_IP && cmsg_hdr->cmsg_type == SCM_SECURITY) { memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext)); } } ip_setsockopt is enhanced with a new socket option IP_PASSSEC to allow a server socket to receive security context of the peer. A new ancillary message type SCM_SECURITY. When the packet is received we get the security context from the sec_path pointer which is contained in the sk_buff, and copy it to the ancillary message space. An additional LSM hook, selinux_socket_getpeersec_udp, is defined to retrieve the security context from the SELinux space. The existing function, selinux_socket_getpeersec does not suit our purpose, because the security context is copied directly to user space, rather than to kernel space. Testing: We have tested the patch by setting up TCP and UDP connections between applications on two machines using the IPSec policies that result in labeled security associations being built. For TCP, we can then extract the peer security context using getsockopt on either end. For UDP, the receiving end can retrieve the security context using the auxiliary data mechanism of recvmsg. Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TIPC] Initial mergePer Liden2006-01-12
| | | | | | | | TIPC (Transparent Inter Process Communication) is a protocol designed for intra cluster communication. For more information see http://tipc.sourceforge.net Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>