#1 | 10-27-2023, 05:26 PM | Riccardo1987 (LQ Newbie)

Bandwidth LACP bonding

Hi everyone, I'm running bandwidth tests between my CentOS 7 PC and my SAN. I created a bond in mode=4 on the server from three 10 Gb network interfaces, enabled LACP on three 10 Gb network ports on the SAN, and connected all three ports of both the server and the SAN to a managed switch where LACP is configured.

I attached three iSCSI volumes to the server and ran bandwidth tests trying to saturate the network, but I see that only one 10 Gb network card is being used. When I start two copy processes writing to two different volumes, still only one 10 Gb network card carries the traffic. I changed the xmit_hash_policy value to layer3+4, but nothing changed.

Since the SAN has dual controllers, I created two more iSCSI volumes on the second controller and mounted them on the server. I then started two copy processes, one writing to controller 1 and one to controller 2, and this way I see both network cards being used at maximum performance (10 Gb saturated on each). I started four copy processes to also saturate the third network card, but it is not used.

I don't understand how to use all of the network cards configured in LACP on the server; I don't really understand how LACP works. When does it use all network ports? Are there parameters on the server that I need to set to make it use all network cards? Thanks in advance.
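For completeness, this is how the mode and hash policy the kernel actually applied can be checked, straight from /proc/net/bonding (a minimal sketch; it assumes the bond interface is named bond0):

Code:
# Minimal sketch: report the bonding mode, transmit hash policy and slave
# status the kernel is using (assumes the bond interface is named bond0).
from pathlib import Path

def bond_summary(bond="bond0"):
    for line in Path(f"/proc/net/bonding/{bond}").read_text().splitlines():
        if line.startswith(("Bonding Mode:", "Transmit Hash Policy:",
                            "Slave Interface:", "MII Status:")):
            print(line)

if __name__ == "__main__":
    bond_summary()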
 
#2 | 11-18-2023, 10:38 PM | Ser Olmy (Senior Member)

Quote:
Originally Posted by Riccardo1987
I created a bond in mode=4 on the server from three 10 Gb network interfaces
That means the LACP protocol is enabled, and as long as the switch at the other end has this enabled as well, a LAG should be created automatically.
Quote:
Originally Posted by Riccardo1987
I changed the xmit_hash_policy value to layer3+4, but nothing changed.
Do you know what this does?
Quote:
Originally Posted by Riccardo1987
I don't really understand how LACP works.
Well, that's your problem right there.

When you use Link Aggregation Groups, traffic is typically load balanced across the links using some sort of hashing algorithm. "Layer 2" means the MAC address is used, "Layer 3" means the IPv4 or IPv6 address is used, and "Layer 4" means the transport protocol (and possibly the associated port numbers) is used when creating a hash value for a given packet. This value then decides which outbound physical link will be used.

This means that traffic to/from a specific MAC address and/or IP address and/or port number will always be sent across the same physical link. It also means that the hash value decides the link, regardless of link utilization or available bandwidth: If traffic of type X going from address A to address B has hash value N, then it will always be sent over the link associated with N.
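To make that concrete, here is a toy model of the principle (not the kernel's actual hash formula): the flow's addresses and ports are hashed, and the hash picks one member link, so a single iSCSI session stays on one 10 Gb port no matter how much data it moves.

Code:
# Toy model of transmit hashing on a bond: hash the flow, pick one link.
import hashlib

LINKS = ["eth1", "eth2", "eth3"]

def pick_link(src_ip, dst_ip, src_port, dst_port):
    key = f"{src_ip} {dst_ip} {src_port} {dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest(), "big")
    return LINKS[digest % len(LINKS)]

# One iSCSI session is one TCP flow, so it always maps to the same link.
print(pick_link("10.0.0.10", "10.0.0.20", 51234, 3260))
# A second session may land on another link, or collide with the first.
print(pick_link("10.0.0.10", "10.0.0.20", 51235, 3260))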

The one exception to this rule is when you configure the server to use round-robin load balancing, which is supported on Linux. Unfortunately this is not the panacea it may seem to be, as a) it's not supported by any switches I've ever seen, so you'll only get RR for outbound traffic, and b) you could easily end up saturating the link at the other end, meaning massive packet loss and random hangs or delays until TCP or the application detects the issue and performs the required throttling.

Edit: Oh, and please use paragraphs in the future. I suspect that your post being a Massive Block of Text has something to do with you not getting a response sooner.

Last edited by Ser Olmy; 11-18-2023 at 10:41 PM.
 
#3 | 11-19-2023, 01:18 AM | Riccardo1987 (LQ Newbie, Original Poster)

Quote:
Originally Posted by Ser Olmy
[post #2 quoted in full]

Thank you very much. In the meantime, the SAN vendor told me that MPIO must be used rather than LACP. Here too I'm finding it difficult to configure the network to increase throughput: I haven't figured out how to give MPIO more than one network interface so that it gets 30 Gb instead of 10 Gb. I also set it to round robin, but I don't understand how to tell it that the interfaces to use are eth1, eth2 and eth3. Thanks again.
 
#4 | 11-19-2023, 10:44 AM | Ser Olmy (Senior Member)

Quote:
Originally Posted by Riccardo1987
Thank you very much. In the meantime, the SAN vendor told me that MPIO must be used rather than LACP.
LACP is just the protocol that enables the automatic creation of Link Aggregation Groups ("bond" interfaces in Linux); configuration of how the actual group(s) function is a separate matter.

If you connect two servers directly (no switch being involved), you can use a LAG in round-robin mode and utilise the full combined bandwidth of all interfaces. In all other scenarios this is not possible.

The SAN vendor is correct in that Multipath I/O is probably a simpler/better solution, or perhaps even the only practical solution.
Quote:
Originally Posted by Riccardo1987
Here too I'm finding it difficult to configure the network to increase throughput: I haven't figured out how to give MPIO more than one network interface so that it gets 30 Gb instead of 10 Gb.
I assume we're talking about an iSCSI SAN? In that case, the documentation for the SAN should tell you how to configure IP addresses and routing in a multipath scenario.
Quote:
Originally Posted by Riccardo1987
I also set it to round robin, but I don't understand how to tell it that the interfaces to use are eth1, eth2 and eth3.
You create a LAG interface (bondX) out of the relevant physical interfaces, and then set the load balancing mode on the "bond" interface and on the LAG in the switch. No further configuration is necessary (or even possible). Make sure the physical links are enabled but otherwise unconfigured (no IP addresses etc.)

But as I said, you will not be able to use the combined bandwidth of all interfaces in a LAG when transferring data between any two hosts (such as a server and a SAN), as the hashing algorithm in Linux and the switch(es) will select one physical link for any one particular type of traffic.
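Multipath I/O gets around that limitation because it balances at the SCSI request level across independent iSCSI sessions rather than hashing a single flow. A rough sketch of the difference for one LUN (the path names below are just placeholders):

Code:
# Rough contrast for a single LUN: a round-robin path selector hands
# successive I/O requests to different paths (independent iSCSI sessions),
# while a hashed LACP bond pins the LUN's one TCP flow to a single link.
from itertools import cycle

paths = ["eth1 -> controller 1", "eth2 -> controller 1", "eth3 -> controller 2"]
mpio_path = cycle(paths)   # round-robin over independent paths
lacp_link = paths[0]       # the single link the hash happened to pick

for i in range(6):
    print(f"I/O {i}: MPIO -> {next(mpio_path)}, LACP bond -> {lacp_link}")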
 
#5 | 11-20-2023, 08:46 AM | Riccardo1987 (LQ Newbie, Original Poster)

Quote:
Originally Posted by Ser Olmy
[post #4 quoted in full]

Thank you so much.
Reading some forums, it seems that using network card bonding together with MPIO is advised against; apparently they interfere with each other for some reason.

The SAN vendor told me that bonding is not necessary, but I don't see any other way to make two of the server's NICs communicate with two of the SAN's NICs while also providing redundancy of the network ports.

However, the connection between the server and the SAN is iSCSI, and what I plan to do is connect the server's NICs directly to the SAN ports: eth1 of the server to eth1 of controller 1 of the SAN, and eth2 of the server to eth1 of controller 2 of the SAN.
 
#6 | 11-20-2023, 09:20 AM | Ser Olmy (Senior Member)

MPIO literally stands for "multipath I/O." Bonded interfaces act like a single Layer 2 connection, and do not qualify as "multipath."

If you connect the server and the SAN directly, and you configure LAGs at both ends, AND the SAN supports round-robin load balancing then yes, you should be able to use the full bandwidth of all bonded interfaces. But all those prerequisites must be met, otherwise you'll end up with only one interface being used, the others basically acting as active standbys.

I suspect MPIO would be the easier route here.
 
#7 | 11-20-2023, 09:40 AM | Riccardo1987 (LQ Newbie, Original Poster)

Quote:
Originally Posted by Ser Olmy
[post #6 quoted in full]
So do MPIO and Teaming/Bonding go together?
 
  

