| author | Jay Vosburgh <fubar@us.ibm.com> | 2007-11-13 23:25:48 -0500 |
|---|---|---|
| committer | David S. Miller <davem@davemloft.net> | 2008-01-28 18:03:48 -0500 |
| commit | 9a6c686799346b6c95405c9e051f5023873504fa (patch) | |
| tree | f1a8414921b8535a5fb4acdf8d57e69459c48289 /Documentation/networking | |
| parent | 7a47dd7a2f178cc4e87d584b0469eef4b58b7aea (diff) | |
[BONDING]: Documentation update
Update the bonding documentation: more discussion on
initialization and configuration, changes to discussion of packet
reordering in balance-rr, update some out of date information.
Based in part on input from Rick Jones <rick.jones2@hp.com>
and Andy Gospodarek <andy@greyhouse.net>.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'Documentation/networking')
| -rw-r--r-- | Documentation/networking/bonding.txt | 204 |
1 files changed, 143 insertions, 61 deletions
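The BONDING_OPTS syntax documented in the diff below (each ARP target given as its own arp_ip_target option, prefixed with '+') can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name bonding_opts and its argument order are hypothetical and are not part of initscripts or the bonding driver; it merely prints a line suitable for pasting into an ifcfg-bondX file.

```shell
#!/bin/sh
# Hypothetical helper (not part of initscripts): print a BONDING_OPTS line.
# Usage: bonding_opts MODE ARP_INTERVAL [ARP_TARGET...]
bonding_opts() {
    mode=$1
    interval=$2
    shift 2
    opts="mode=$mode arp_interval=$interval"
    # Each target is emitted as a separate option with a leading '+',
    # which is the form the updated documentation describes.
    for target in "$@"; do
        opts="$opts arp_ip_target=+$target"
    done
    printf 'BONDING_OPTS="%s"\n' "$opts"
}

# Example: two ARP targets for an active-backup bond.
bonding_opts active-backup 60 192.168.1.1 192.168.1.2
```

The example call prints a single BONDING_OPTS line containing two arp_ip_target=+... entries, matching the multiple-target form shown in the updated section 3.2.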
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 6cc30e0d5795..a0cda062bc33 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
| @@ -1,7 +1,7 @@ | |||
| 1 | 1 | ||
| 2 | Linux Ethernet Bonding Driver HOWTO | 2 | Linux Ethernet Bonding Driver HOWTO |
| 3 | 3 | ||
| 4 | Latest update: 24 April 2006 | 4 | Latest update: 12 November 2007 |
| 5 | 5 | ||
| 6 | Initial release : Thomas Davis <tadavis at lbl.gov> | 6 | Initial release : Thomas Davis <tadavis at lbl.gov> |
| 7 | Corrections, HA extensions : 2000/10/03-15 : | 7 | Corrections, HA extensions : 2000/10/03-15 : |
| @@ -166,12 +166,17 @@ to use ifenslave. | |||
| 166 | 2. Bonding Driver Options | 166 | 2. Bonding Driver Options |
| 167 | ========================= | 167 | ========================= |
| 168 | 168 | ||
| 169 | Options for the bonding driver are supplied as parameters to | 169 | Options for the bonding driver are supplied as parameters to the |
| 170 | the bonding module at load time. They may be given as command line | 170 | bonding module at load time, or are specified via sysfs. |
| 171 | arguments to the insmod or modprobe command, but are usually specified | 171 | |
| 172 | in either the /etc/modules.conf or /etc/modprobe.conf configuration | 172 | Module options may be given as command line arguments to the |
| 173 | file, or in a distro-specific configuration file (some of which are | 173 | insmod or modprobe command, but are usually specified in either the |
| 174 | detailed in the next section). | 174 | /etc/modules.conf or /etc/modprobe.conf configuration file, or in a |
| 175 | distro-specific configuration file (some of which are detailed in the next | ||
| 176 | section). | ||
| 177 | |||
| 178 | Details on bonding support for sysfs are provided in the | ||
| 179 | "Configuring Bonding Manually via Sysfs" section, below. | ||
| 175 | 180 | ||
| 176 | The available bonding driver parameters are listed below. If a | 181 | The available bonding driver parameters are listed below. If a |
| 177 | parameter is not specified the default value is used. When initially | 182 | parameter is not specified the default value is used. When initially |
| @@ -812,11 +817,13 @@ the system /etc/modules.conf or /etc/modprobe.conf configuration file. | |||
| 812 | 3.2 Configuration with Initscripts Support | 817 | 3.2 Configuration with Initscripts Support |
| 813 | ------------------------------------------ | 818 | ------------------------------------------ |
| 814 | 819 | ||
| 815 | This section applies to distros using a version of initscripts | 820 | This section applies to distros using a recent version of |
| 816 | with bonding support, for example, Red Hat Linux 9 or Red Hat | 821 | initscripts with bonding support, for example, Red Hat Enterprise Linux |
| 817 | Enterprise Linux version 3 or 4. On these systems, the network | 822 | version 3 or later, Fedora, etc. On these systems, the network |
| 818 | initialization scripts have some knowledge of bonding, and can be | 823 | initialization scripts have knowledge of bonding, and can be configured to |
| 819 | configured to control bonding devices. | 824 | control bonding devices. Note that older versions of the initscripts |
| 825 | package have lower levels of support for bonding; this will be noted where | ||
| 826 | applicable. | ||
| 820 | 827 | ||
| 821 | These distros will not automatically load the network adapter | 828 | These distros will not automatically load the network adapter |
| 822 | driver unless the ethX device is configured with an IP address. | 829 | driver unless the ethX device is configured with an IP address. |
| @@ -864,11 +871,31 @@ USERCTL=no | |||
| 864 | Be sure to change the networking specific lines (IPADDR, | 871 | Be sure to change the networking specific lines (IPADDR, |
| 865 | NETMASK, NETWORK and BROADCAST) to match your network configuration. | 872 | NETMASK, NETWORK and BROADCAST) to match your network configuration. |
| 866 | 873 | ||
| 867 | Finally, it is necessary to edit /etc/modules.conf (or | 874 | For later versions of initscripts, such as that found with Fedora |
| 868 | /etc/modprobe.conf, depending upon your distro) to load the bonding | 875 | 7 and Red Hat Enterprise Linux version 5 (or later), it is possible, and, |
| 869 | module with your desired options when the bond0 interface is brought | 876 | indeed, preferable, to specify the bonding options in the ifcfg-bond0 |
| 870 | up. The following lines in /etc/modules.conf (or modprobe.conf) will | 877 | file, e.g. a line of the format: |
| 871 | load the bonding module, and select its options: | 878 | |
| 879 | BONDING_OPTS="mode=active-backup arp_interval=60 arp_ip_target=+192.168.1.254" | ||
| 880 | |||
| 881 | will configure the bond with the specified options. The options | ||
| 882 | specified in BONDING_OPTS are identical to the bonding module parameters | ||
| 883 | except for the arp_ip_target field. Each target should be included as a | ||
| 884 | separate option and should be preceded by a '+' to indicate it should be | ||
| 885 | added to the list of queried targets, e.g., | ||
| 886 | |||
| 887 | arp_ip_target=+192.168.1.1 arp_ip_target=+192.168.1.2 | ||
| 888 | |||
| 889 | is the proper syntax to specify multiple targets. When specifying | ||
| 890 | options via BONDING_OPTS, it is not necessary to edit /etc/modules.conf or | ||
| 891 | /etc/modprobe.conf. | ||
| 892 | |||
| 893 | For older versions of initscripts that do not support | ||
| 894 | BONDING_OPTS, it is necessary to edit /etc/modules.conf (or | ||
| 895 | /etc/modprobe.conf, depending upon your distro) to load the bonding module | ||
| 896 | with your desired options when the bond0 interface is brought up. The | ||
| 897 | following lines in /etc/modules.conf (or modprobe.conf) will load the | ||
| 898 | bonding module, and select its options: | ||
| 872 | 899 | ||
| 873 | alias bond0 bonding | 900 | alias bond0 bonding |
| 874 | options bond0 mode=balance-alb miimon=100 | 901 | options bond0 mode=balance-alb miimon=100 |
| @@ -883,9 +910,10 @@ up and running. | |||
| 883 | 3.2.1 Using DHCP with Initscripts | 910 | 3.2.1 Using DHCP with Initscripts |
| 884 | --------------------------------- | 911 | --------------------------------- |
| 885 | 912 | ||
| 886 | Recent versions of initscripts (the version supplied with | 913 | Recent versions of initscripts (the versions supplied with Fedora |
| 887 | Fedora Core 3 and Red Hat Enterprise Linux 4 is reported to work) do | 914 | Core 3 and Red Hat Enterprise Linux 4, or later versions, are reported to |
| 888 | have support for assigning IP information to bonding devices via DHCP. | 915 | work) have support for assigning IP information to bonding devices via |
| 916 | DHCP. | ||
| 889 | 917 | ||
| 890 | To configure bonding for DHCP, configure it as described | 918 | To configure bonding for DHCP, configure it as described |
| 891 | above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp" | 919 | above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp" |
| @@ -895,18 +923,14 @@ is case sensitive. | |||
| 895 | 3.2.2 Configuring Multiple Bonds with Initscripts | 923 | 3.2.2 Configuring Multiple Bonds with Initscripts |
| 896 | ------------------------------------------------- | 924 | ------------------------------------------------- |
| 897 | 925 | ||
| 898 | At this writing, the initscripts package does not directly | 926 | Initscripts packages that are included with Fedora 7 and Red Hat |
| 899 | support loading the bonding driver multiple times, so the process for | 927 | Enterprise Linux 5 support multiple bonding interfaces by simply |
| 900 | doing so is the same as described in the "Configuring Multiple Bonds | 928 | specifying the appropriate BONDING_OPTS= in ifcfg-bondX where X is the |
| 901 | Manually" section, below. | 929 | number of the bond. This support requires sysfs support in the kernel, |
| 902 | 930 | and a bonding driver of version 3.0.0 or later. Other configurations may | |
| 903 | NOTE: It has been observed that some Red Hat supplied kernels | 931 | not support this method for specifying multiple bonding interfaces; for |
| 904 | are apparently unable to rename modules at load time (the "-o bond1" | 932 | those instances, see the "Configuring Multiple Bonds Manually" section, |
| 905 | part). Attempts to pass that option to modprobe will produce an | 933 | below. |
| 906 | "Operation not permitted" error. This has been reported on some | ||
| 907 | Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels | ||
| 908 | exhibiting this problem, it will be impossible to configure multiple | ||
| 909 | bonds with differing parameters. | ||
| 910 | 934 | ||
| 911 | 3.3 Configuring Bonding Manually with Ifenslave | 935 | 3.3 Configuring Bonding Manually with Ifenslave |
| 912 | ----------------------------------------------- | 936 | ----------------------------------------------- |
| @@ -977,15 +1001,58 @@ initialization scripts lack support for configuring multiple bonds. | |||
| 977 | options, you may wish to use the "max_bonds" module parameter, | 1001 | options, you may wish to use the "max_bonds" module parameter, |
| 978 | documented above. | 1002 | documented above. |
| 979 | 1003 | ||
| 980 | To create multiple bonding devices with differing options, it | 1004 | To create multiple bonding devices with differing options, it is |
| 981 | is necessary to use bonding parameters exported by sysfs, documented | 1005 | preferable to use bonding parameters exported by sysfs, documented in the |
| 982 | in the section below. | 1006 | section below. |
| 1007 | |||
| 1008 | For versions of bonding without sysfs support, the only means to | ||
| 1009 | provide multiple instances of bonding with differing options is to load | ||
| 1010 | the bonding driver multiple times. Note that current versions of the | ||
| 1011 | sysconfig network initialization scripts handle this automatically; if | ||
| 1012 | your distro uses these scripts, no special action is needed. See the | ||
| 1013 | section Configuring Bonding Devices, above, if you're not sure about your | ||
| 1014 | network initialization scripts. | ||
| 1015 | |||
| 1016 | To load multiple instances of the module, it is necessary to | ||
| 1017 | specify a different name for each instance (the module loading system | ||
| 1018 | requires that every loaded module, even multiple instances of the same | ||
| 1019 | module, have a unique name). This is accomplished by supplying multiple | ||
| 1020 | sets of bonding options in /etc/modprobe.conf, for example: | ||
| 1021 | |||
| 1022 | alias bond0 bonding | ||
| 1023 | options bond0 -o bond0 mode=balance-rr miimon=100 | ||
| 1024 | |||
| 1025 | alias bond1 bonding | ||
| 1026 | options bond1 -o bond1 mode=balance-alb miimon=50 | ||
| 1027 | |||
| 1028 | will load the bonding module two times. The first instance is | ||
| 1029 | named "bond0" and creates the bond0 device in balance-rr mode with an | ||
| 1030 | miimon of 100. The second instance is named "bond1" and creates the | ||
| 1031 | bond1 device in balance-alb mode with an miimon of 50. | ||
| 1032 | |||
| 1033 | In some circumstances (typically with older distributions), | ||
| 1034 | the above does not work, and the second bonding instance never sees | ||
| 1035 | its options. In that case, the second options line can be substituted | ||
| 1036 | as follows: | ||
| 1037 | |||
| 1038 | install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \ | ||
| 1039 | mode=balance-alb miimon=50 | ||
| 983 | 1040 | ||
| 1041 | This may be repeated any number of times, specifying a new and | ||
| 1042 | unique name in place of bond1 for each subsequent instance. | ||
| 1043 | |||
| 1044 | It has been observed that some Red Hat supplied kernels are unable | ||
| 1045 | to rename modules at load time (the "-o bond1" part). Attempts to pass | ||
| 1046 | that option to modprobe will produce an "Operation not permitted" error. | ||
| 1047 | This has been reported on some Fedora Core kernels, and has been seen on | ||
| 1048 | RHEL 4 as well. On kernels exhibiting this problem, it will be impossible | ||
| 1049 | to configure multiple bonds with differing parameters (as they are older | ||
| 1050 | kernels, and also lack sysfs support). | ||
| 984 | 1051 | ||
| 985 | 3.4 Configuring Bonding Manually via Sysfs | 1052 | 3.4 Configuring Bonding Manually via Sysfs |
| 986 | ------------------------------------------ | 1053 | ------------------------------------------ |
| 987 | 1054 | ||
| 988 | Starting with version 3.0, Channel Bonding may be configured | 1055 | Starting with version 3.0.0, Channel Bonding may be configured |
| 989 | via the sysfs interface. This interface allows dynamic configuration | 1056 | via the sysfs interface. This interface allows dynamic configuration |
| 990 | of all bonds in the system without unloading the module. It also | 1057 | of all bonds in the system without unloading the module. It also |
| 991 | allows for adding and removing bonds at runtime. Ifenslave is no | 1058 | allows for adding and removing bonds at runtime. Ifenslave is no |
| @@ -1030,9 +1097,6 @@ To enslave interface eth0 to bond bond0: | |||
| 1030 | To free slave eth0 from bond bond0: | 1097 | To free slave eth0 from bond bond0: |
| 1031 | # echo -eth0 > /sys/class/net/bond0/bonding/slaves | 1098 | # echo -eth0 > /sys/class/net/bond0/bonding/slaves |
| 1032 | 1099 | ||
| 1033 | NOTE: The bond must be up before slaves can be added. All | ||
| 1034 | slaves are freed when the interface is brought down. | ||
| 1035 | |||
| 1036 | When an interface is enslaved to a bond, symlinks between the | 1100 | When an interface is enslaved to a bond, symlinks between the |
| 1037 | two are created in the sysfs filesystem. In this case, you would get | 1101 | two are created in the sysfs filesystem. In this case, you would get |
| 1038 | /sys/class/net/bond0/slave_eth0 pointing to /sys/class/net/eth0, and | 1102 | /sys/class/net/bond0/slave_eth0 pointing to /sys/class/net/eth0, and |
| @@ -1622,6 +1686,15 @@ one for each switch in the network). This will insure that, | |||
| 1622 | regardless of which switch is active, the ARP monitor has a suitable | 1686 | regardless of which switch is active, the ARP monitor has a suitable |
| 1623 | target to query. | 1687 | target to query. |
| 1624 | 1688 | ||
| 1689 | Note, also, that of late many switches now support a functionality | ||
| 1690 | generally referred to as "trunk failover." This is a feature of the | ||
| 1691 | switch that causes the link state of a particular switch port to be set | ||
| 1692 | down (or up) when the state of another switch port goes down (or up). | ||
| 1693 | Its purpose is to propagate link failures from logically "exterior" ports | ||
| 1694 | to the logically "interior" ports that bonding is able to monitor via | ||
| 1695 | miimon. Availability and configuration for trunk failover varies by | ||
| 1696 | switch, but this can be a viable alternative to the ARP monitor when using | ||
| 1697 | suitable switches. | ||
| 1625 | 1698 | ||
| 1626 | 12. Configuring Bonding for Maximum Throughput | 1699 | 12. Configuring Bonding for Maximum Throughput |
| 1627 | ============================================== | 1700 | ============================================== |
| @@ -1709,7 +1782,7 @@ balance-rr: This mode is the only mode that will permit a single | |||
| 1709 | interfaces. It is therefore the only mode that will allow a | 1782 | interfaces. It is therefore the only mode that will allow a |
| 1710 | single TCP/IP stream to utilize more than one interface's | 1783 | single TCP/IP stream to utilize more than one interface's |
| 1711 | worth of throughput. This comes at a cost, however: the | 1784 | worth of throughput. This comes at a cost, however: the |
| 1712 | striping often results in peer systems receiving packets out | 1785 | striping generally results in peer systems receiving packets out |
| 1713 | of order, causing TCP/IP's congestion control system to kick | 1786 | of order, causing TCP/IP's congestion control system to kick |
| 1714 | in, often by retransmitting segments. | 1787 | in, often by retransmitting segments. |
| 1715 | 1788 | ||
| @@ -1721,22 +1794,20 @@ balance-rr: This mode is the only mode that will permit a single | |||
| 1721 | interface's worth of throughput, even after adjusting | 1794 | interface's worth of throughput, even after adjusting |
| 1722 | tcp_reordering. | 1795 | tcp_reordering. |
| 1723 | 1796 | ||
| 1724 | Note that this out of order delivery occurs when both the | 1797 | Note that the fraction of packets that will be delivered out of |
| 1725 | sending and receiving systems are utilizing a multiple | 1798 | order is highly variable, and is unlikely to be zero. The level |
| 1726 | interface bond. Consider a configuration in which a | 1799 | of reordering depends upon a variety of factors, including the |
| 1727 | balance-rr bond feeds into a single higher capacity network | 1800 | networking interfaces, the switch, and the topology of the |
| 1728 | channel (e.g., multiple 100Mb/sec ethernets feeding a single | 1801 | configuration. Speaking in general terms, higher speed network |
| 1729 | gigabit ethernet via an etherchannel capable switch). In this | 1802 | cards produce more reordering (due to factors such as packet |
| 1730 | configuration, traffic sent from the multiple 100Mb devices to | 1803 | coalescing), and a "many to many" topology will reorder at a |
| 1731 | a destination connected to the gigabit device will not see | 1804 | higher rate than a "many slow to one fast" configuration. |
| 1732 | packets out of order. However, traffic sent from the gigabit | 1805 | |
| 1733 | device to the multiple 100Mb devices may or may not see | 1806 | Many switches do not support any modes that stripe traffic |
| 1734 | traffic out of order, depending upon the balance policy of the | 1807 | (instead choosing a port based upon IP or MAC level addresses); |
| 1735 | switch. Many switches do not support any modes that stripe | 1808 | for those devices, traffic for a particular connection flowing |
| 1736 | traffic (instead choosing a port based upon IP or MAC level | 1809 | through the switch to a balance-rr bond will not utilize greater |
| 1737 | addresses); for those devices, traffic flowing from the | 1810 | than one interface's worth of bandwidth. |
| 1738 | gigabit device to the many 100Mb devices will only utilize one | ||
| 1739 | interface. | ||
| 1740 | 1811 | ||
| 1741 | If you are utilizing protocols other than TCP/IP, UDP for | 1812 | If you are utilizing protocols other than TCP/IP, UDP for |
| 1742 | example, and your application can tolerate out of order | 1813 | example, and your application can tolerate out of order |
| @@ -1936,6 +2007,10 @@ Failover may be delayed via the downdelay bonding module option. | |||
| 1936 | 13.2 Duplicated Incoming Packets | 2007 | 13.2 Duplicated Incoming Packets |
| 1937 | -------------------------------- | 2008 | -------------------------------- |
| 1938 | 2009 | ||
| 2010 | NOTE: Starting with version 3.0.2, the bonding driver has logic to | ||
| 2011 | suppress duplicate packets, which should largely eliminate this problem. | ||
| 2012 | The following description is kept for reference. | ||
| 2013 | |||
| 1939 | It is not uncommon to observe a short burst of duplicated | 2014 | It is not uncommon to observe a short burst of duplicated |
| 1940 | traffic when the bonding device is first used, or after it has been | 2015 | traffic when the bonding device is first used, or after it has been |
| 1941 | idle for some period of time. This is most easily observed by issuing | 2016 | idle for some period of time. This is most easily observed by issuing |
| @@ -2096,6 +2171,9 @@ The new driver was designed to be SMP safe from the start. | |||
| 2096 | EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes, | 2171 | EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes, |
| 2097 | devices need not be of the same speed. | 2172 | devices need not be of the same speed. |
| 2098 | 2173 | ||
| 2174 | Starting with version 3.2.1, bonding also supports Infiniband | ||
| 2175 | slaves in active-backup mode. | ||
| 2176 | |||
| 2099 | 3. How many bonding devices can I have? | 2177 | 3. How many bonding devices can I have? |
| 2100 | 2178 | ||
| 2101 | There is no limit. | 2179 | There is no limit. |
| @@ -2154,11 +2232,15 @@ switches currently available support 802.3ad. | |||
| 2154 | 2232 | ||
| 2155 | 8. Where does a bonding device get its MAC address from? | 2233 | 8. Where does a bonding device get its MAC address from? |
| 2156 | 2234 | ||
| 2157 | If not explicitly configured (with ifconfig or ip link), the | 2235 | When using slave devices that have fixed MAC addresses, or when |
| 2158 | MAC address of the bonding device is taken from its first slave | 2236 | the fail_over_mac option is enabled, the bonding device's MAC address is |
| 2159 | device. This MAC address is then passed to all following slaves and | 2237 | the MAC address of the active slave. |
| 2160 | remains persistent (even if the first slave is removed) until the | 2238 | |
| 2161 | bonding device is brought down or reconfigured. | 2239 | For other configurations, if not explicitly configured (with |
| 2240 | ifconfig or ip link), the MAC address of the bonding device is taken from | ||
| 2241 | its first slave device. This MAC address is then passed to all following | ||
| 2242 | slaves and remains persistent (even if the first slave is removed) until | ||
| 2243 | the bonding device is brought down or reconfigured. | ||
| 2162 | 2244 | ||
| 2163 | If you wish to change the MAC address, you can set it with | 2245 | If you wish to change the MAC address, you can set it with |
| 2164 | ifconfig or ip link: | 2246 | ifconfig or ip link: |
