Trying to set up a bond using two identical NICs
Intel Corporation 82541GI/PI Gigabit Ethernet Controller
in PCI slots 06 and 07.
Modified /etc/modprobe.conf by adding these lines:
Code:
alias bond0 bonding
options bond0 mode=1 miimon=100 primary=eth0
alias eth2 e1000
(Note: the alias for eth2 was added because this new NIC is configured as our local heartbeat LAN link to the other server.)
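One way to confirm that the options in /etc/modprobe.conf actually took effect is to read the bonding driver's status file. This is only a minimal sketch: it assumes the driver exposes /proc/net/bonding/bond0 with its usual "Key: value" lines, and the function name bond_field is made up for illustration.

```shell
#!/bin/sh
# bond_field KEY [FILE]: print the value of the first "KEY: value" line.
# FILE defaults to the bonding driver's status file for bond0 (assumed path).
bond_field() {
    key=$1
    file=${2:-/proc/net/bonding/bond0}
    # -F': ' splits each line on ": ", so MAC addresses stay in one field.
    awk -F': ' -v k="$key" '$1 == k { print $2; exit }' "$file"
}

# Hypothetical usage on the server:
#   bond_field "Bonding Mode"
#   bond_field "Primary Slave"
#   bond_field "Currently Active Slave"
#   bond_field "MII Polling Interval (ms)"
```

If the reported mode or polling interval differs from mode=1 and miimon=100, the options line is probably not being applied to bond0.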
Modified ifcfg-eth0 and ifcfg-eth1 to show:
Code:
DEVICE=eth0 (or eth1 for ifcfg-eth1)
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
ifcfg-bond0 shows:
Code:
DEVICE=bond0
BOOTPROTO=static
IPADDR=xxx.xx.xx.xx
BROADCAST=xxx.xx.xx.xx
NETMASK=255.255.255.128
ONBOOT=yes
USERCTL=no
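The BROADCAST value has to agree with IPADDR and NETMASK; with a 255.255.255.128 mask each subnet spans 128 addresses. Below is a small sketch for cross-checking the three values. It uses the documentation address 192.0.2.3 as a stand-in because the real address is masked here, and broadcast_of is a hypothetical helper, not part of any tool.

```shell
#!/bin/sh
# broadcast_of IP NETMASK: print the broadcast address for a dotted-quad pair.
broadcast_of() {
    # Split both arguments into octets at the dots: $1-$4 = IP, $5-$8 = mask.
    old_ifs=$IFS; IFS=.
    set -- $1 $2
    IFS=$old_ifs
    # For each octet: keep the network bits, set every host bit to 1.
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}

# Stand-in address with the netmask from ifcfg-bond0:
broadcast_of 192.0.2.3 255.255.255.128
```

The same helper can be run against the real IPADDR to compare the result with the BROADCAST line in ifcfg-bond0.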
dmesg shows:
bond0: duplicate address detected!
and ifconfig -a shows:
Code:
[root@dx3-wncf ~]# ifconfig -a
bond0 Link encap:Ethernet HWaddr 00:11:43:E8:64:5E
inet addr:165.xx.25.3 Bcast:xxx.xx.xx.xx Mask:255.255.255.128
inet6 addr: fe80::211:43ff:fee8:645e/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:562731 errors:0 dropped:0 overruns:0 frame:0
TX packets:580176 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:514711245 (490.8 MiB) TX bytes:584870495 (557.7 MiB)
bond0:0 Link encap:Ethernet HWaddr 00:11:43:E8:64:5E
inet addr:xxx.xx.xx.xx Bcast:xxx.xx.xx.xx Mask:255.255.255.128
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
bond0:1 Link encap:Ethernet HWaddr 00:11:43:E8:64:5E
inet addr:xxx.xx.xx.xx Bcast:xxx.xx.xx.xxx Mask:255.255.255.128
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
eth0 Link encap:Ethernet HWaddr 00:11:43:E8:64:5E
inet6 addr: fe80::211:43ff:fee8:645e/64 Scope:Link
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:561178 errors:0 dropped:0 overruns:0 frame:0
TX packets:580023 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:513350633 (489.5 MiB) TX bytes:584860543 (557.7 MiB)
Base address:0xecc0 Memory:dfae0000-dfb00000
eth1 Link encap:Ethernet HWaddr 00:11:43:E8:64:5E
inet addr:10.0.5.3 Bcast:10.255.255.255 Mask:255.0.0.0
inet6 addr: fe80::211:43ff:fee8:645e/64 Scope:Link
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:1553 errors:0 dropped:0 overruns:0 frame:0
TX packets:153 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1360612 (1.2 MiB) TX bytes:9952 (9.7 KiB)
Base address:0xdcc0 Memory:df8e0000-df900000
eth2 Link encap:Ethernet HWaddr 00:40:F4:73:20:E6
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Interrupt:225 Base address:0xcc00
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:8896 errors:0 dropped:0 overruns:0 frame:0
TX packets:8896 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1888981 (1.8 MiB) TX bytes:1888981 (1.8 MiB)
sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
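One detail in the output above: eth1 is flagged SLAVE but still carries its own inet addr (10.0.5.3), while a bonding slave normally has no address of its own. The following sketch scans ifconfig -a output for that condition. It assumes the classic net-tools layout ("Link encap:", "inet addr:") shown above, and slaves_with_ip is a name invented for this example. Whether the stray address is related to the packet loss is only a guess.

```shell
#!/bin/sh
# slaves_with_ip: read `ifconfig -a` output on stdin and print
# "<interface> <address>" for every SLAVE interface that has an inet addr.
slaves_with_ip() {
    awk '
        function report() {
            if (name != "" && slave && ip != "") print name, ip
        }
        # A "Link encap:" line starts a new interface stanza.
        /Link encap:/ { report(); name = $1; ip = ""; slave = 0; next }
        # Remember the IPv4 address (inet6 lines do not match this pattern).
        /inet addr:/  { sub(/^.*inet addr:/, ""); ip = $1 }
        # The flags line marks enslaved interfaces.
        / SLAVE /     { slave = 1 }
        END           { report() }
    '
}

# Hypothetical usage:
#   ifconfig -a | slaves_with_ip
```

If eth1 shows up here, it may be worth checking ifcfg-eth1 for a leftover IPADDR line, or whether the heartbeat address was meant for eth2 instead.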
======================
[1] We have tried all the modes we would want to use: active-active, active-passive, broadcast, and round-robin (modes 0-3). All of them show the same random, intermittent, and unpredictable inbound and outbound packet loss.
[2] We have looked into the configuration of the ProCurves and tried various trunking options, FEC (Fast EtherChannel) and LACP. Only FEC allows traffic to pass, but with the same packet loss.
[3] We are unsure what the duplicate address errors in dmesg mean, and what the bond0:0 and bond0:1 entries in the ifconfig output are.
[4] When only one interface is plugged into one ProCurve, the LAN functions normally. That is, when eth0 is active and eth1 is down or unplugged from the ProCurve, ping succeeds 100% of the time. We therefore believe the problem is on the Dell server side: it cannot resolve the duplicate inbound traffic, and it jumps from ProCurve to ProCurve, dropping packets in transition.
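If the bond really is hopping between the two switches, the driver should record it. Here is a rough sketch, assuming /proc/net/bonding/bond0 lists a "Slave Interface" line followed by a "Link Failure Count" line for each slave (the usual layout, though it may vary with the driver version); link_failures is a name made up here.

```shell
#!/bin/sh
# link_failures [FILE]: print "<slave> <link failure count>" for each slave.
# FILE defaults to the bonding status file for bond0 (assumed path).
link_failures() {
    awk -F': ' '
        $1 == "Slave Interface"    { slave = $2 }
        $1 == "Link Failure Count" { print slave, $2 }
    ' "${1:-/proc/net/bonding/bond0}"
}

# Hypothetical usage, repeated a few minutes apart with both cables plugged in:
#   link_failures
```

Counts that keep climbing on either slave would support the flapping theory; counts that stay at zero would suggest the loss happens without the bond noticing any link change.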
Summary of setup:
eth0 [slave, primary of bond0] -- into port 6 of lsw1 (ProCurve HP2824)
eth1 [slave, backup of bond0] -- into port 6 of lsw2 (ProCurve HP2824)
When both are plugged in: random packet loss.
When only one is plugged in or active: SUCCESS.
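To put numbers on the two cases, the loss figure can be pulled out of the ping summary for each cabling state. This is a minimal sketch, assuming the iputils-style summary line ("... N% packet loss ..."); loss_pct is a hypothetical helper and 192.0.2.1 stands in for a real target on the LAN.

```shell
#!/bin/sh
# loss_pct: read ping output on stdin and print the packet-loss percentage.
loss_pct() {
    # Capture the number directly in front of "% packet loss".
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Hypothetical usage, once per cabling state (both plugged in, eth0 only, eth1 only):
#   ping -c 100 192.0.2.1 | loss_pct
```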
Sorry for the long post, but any possible insight is appreciated
Thanks