LinuxQuestions.org
Old 06-20-2007, 04:06 AM   #1
PhillipHuang
Member
 
Registered: Aug 2006
Location: Shen Zhen
Distribution: Ubuntu 10.04
Posts: 193

Rep: Reputation: 33
Network issue in RHCE/GFS environment


Hello folks,

This post is long, so please bear with me while reading it.


1. Set up Storage-Cluster.

Cluster
---------
node1: eth1 192.168.3.249  -- Connect to Storage
       eth2 192.168.11.249 -- Access IP
       eth0 192.168.13.249 -- HeartBeat
       CentOS 4.4 (kernel 2.6.9-42.0.3.ELsmp)
       cman-kernel-smp-2.6.9-45.8
       cman-devel-1.0.11-0
       cman-kernheaders-2.6.9-45.8
       cman-1.0.11-0
       GFS-6.1.6-1
       GFS-kernel-smp-2.6.9-60.3
       lvm2-cluster-2.02.06-7.0.RHEL4
       iscsi-initiator-utils-4.0.3.0-4
       samba-3.0.10-1.4E.9
node2: eth1 192.168.3.52   -- Connect to Storage
       eth2 192.168.11.52  -- Access IP
       eth0 192.168.13.52  -- HeartBeat
       other settings are the same as on node1

2. Create the LV and mount it

The back-end storage is provided over iSCSI. I created a 500 GB logical volume and then formatted it as a GFS filesystem.
Code:
# gfs_mkfs -p lock_dlm -t real:gfs -j 2 /dev/vg_milan/nesta
Here, the string "real" is the cluster name.
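For readers unfamiliar with the -t argument: it takes a lock table name of the form "clustername:fsname", and -j 2 asks for two journals, one for each node that will mount the filesystem. The sketch below only illustrates how the string splits; the helper names are hypothetical and are not part of any GFS tool.

```shell
#!/bin/sh
# Split a GFS lock table name ("clustername:fsname", as passed to gfs_mkfs -t)
# into its two parts. The function names are made up for this illustration.
lock_table_cluster() { printf '%s\n' "${1%%:*}"; }   # text before the first ':'
lock_table_fsname()  { printf '%s\n' "${1#*:}"; }    # text after the first ':'

# Succeeds only if the cluster part of the lock table equals the expected name.
lock_table_matches() {
    [ "$(lock_table_cluster "$1")" = "$2" ]
}

lock_table_cluster "real:gfs"   # prints: real
lock_table_fsname  "real:gfs"   # prints: gfs
```

The cluster part has to agree with the cluster name the nodes are actually running under, otherwise the mount is refused.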

Then I mounted the formatted LV on the nodes one by one.
On node1:
Code:
[root@node1 ~]# mkdir -p /share
[root@node1 ~]# mount -t gfs /dev/vg_milan/nesta /share
[root@node1 ~]# chmod 777 /share
Repeat the same three steps on node2.

3. Configure Samba on node1 and node2 and export /share as an SMB share named "stress".

I then installed Windows on two other machines:
192.168.11.31 and 192.168.11.32

On 192.168.11.31, map //192.168.11.249/stress as "Z:";
On 192.168.11.32, map //192.168.11.52/stress as "Z:".

4. Run stress programs on 192.168.11.31 and 192.168.11.32 to generate a large number of write operations on the "stress" Samba share.

I used the "dstat" command to monitor network activity on the nodes.

On node1, "eth1 send" and "eth2 recv" are both high, which is what I expected:
Code:
# dstat -N eth0,eth1,eth2 2
----total-cpu-usage---- -dsk/total- --net/eth0----net/eth1----net/eth2->
usr sys idl wai hiq siq|_read _writ|_recv _send:_recv _send:_recv _send>
  0   2  94   4   0   0|4322B 3753k|   0     0 :   0     0 :   0     0 >
  0   1  50  49   0   0| 554k 2202k|   0     0 : 584k   26k: 462B    0 >
  0   2  49  49   0   0| 532k 2098k|   0    35B: 743k 4544k: 809B    0 >
  0   1  50  49   0   0| 484k   80k|  35B    0 : 573k   24k: 569B    0 >
  0   1  50  49   0   0| 500k 2352k|   0     0 : 548k  739k: 440B    0 >
  0   1  50  49   0   0| 510k    0 |  35B   35B: 604k 1775k:1066B    0 >
  0   2  50  49   0   0| 526k 2212k|   0     0 : 575k   25k: 412B    0 >
  0   1  50  49   0   0| 534k  458k|   0    35B: 663k 2804k:1739B    0 >
  0   1  50  49   0   0| 538k    0 |  35B    0 : 574k   37k: 591B    0 >
  0  11  37  51   0   0| 496k   24M| 121k  128k: 864k 6799k:8131B 4978B>
  0   2  53  44   1   0| 494k    0 | 162k  196k:1481k   19M: 806B    0 >
  1  19  58  22   1   0| 408k 9754k| 178k  243k: 597k 5339k:  35M  223k>
  1  17  31  50   1   0| 506k  862k| 132B  158B: 914k 5904k:  60M  378k>
  1  19  29  51   1   0| 300k 7182k|  35B    0 : 435k   19k:  60M  377k>
  1  32  27  39   1   0| 176k   47M|   0     0 :1216k   25M:  51M  323k>
  1  29  27  43   1   0| 192k   42M|  35B   35B:2042k   50M:  42M  249k>
  0  29  38  32   1   0| 198k   41M| 936B 1293B:1748k   40M:  41M  233k>
  1  26  34  38   0   0| 246k   38M|   0    35B:1804k   42M:  41M  231k>
  1  27  33  38   1   0| 234k   41M|  35B    0 :1800k   40M:  40M  250k>

However, node2 behaves very strangely: both "recv" and "send" on eth1 are very high, while eth0 and eth2 show little traffic.
Code:
# dstat -N eth0,eth1,eth2 2
----total-cpu-usage---- -dsk/total- --net/eth0----net/eth1----net/eth2->
usr sys idl wai hiq siq|_read _writ|_recv _send:_recv _send:_recv _send>
  0  25  72   3   1   0|  38k  192k| 125k  119k: 949B  268B: 584B   37k>
  1  21  76   1   1   0|   0   446k| 191k  160k:  18M  339k: 843B  506k>
  1  22  75   2   1   0|  40k  524k| 250k  183k:  69M  694k:1066B  490k>
  1  35  61   1   1   0|   0    51M| 158B  123B:  72M  135k: 611B  467k>
  1  33  61   5   1   0|  94k   52M|   0    35B:  61M   58M: 814B  399k>
  0  19  60  20   0   0|  12k   33M|   0     0 :  54M   47M: 478B  260k>
  1  33  40  25   1   0|   0    52M|  35B   35B:  38M   41M: 874B  576k>
  1  41  19  39   1   0|   0    59M|1293B  936B:  60M   54M: 462B  552k>
  0  25  61  13   0   0|   0    42M|  35B    0 :  62M   62M: 575B  453k>
  1  40  56   2   1   0|   0    56M|   0    35B:  41M   44M: 484B  400k>
  1  39  52   7   1   0|   0    60M|   0     0 :  63M   59M: 442B  636k>
  1  39  58   2   1   0|   0    57M|  35B   35B:  63M   63M: 638B  607k>
  1  25  74   0   1   0|   0    38M|   0     0 :  56M   56M: 847B  221k>
  1  37  60   2   1   0|   0    55M|  35B    0 :  44M   42M:1354B  399k>
  1  40  57   1   1   0|   0    61M|   0    35B:  63M   60M: 713B  447k>
My question is: why are both "recv" and "send" so high on eth1 of node2, while eth2 carries so little traffic? I really do not know why the eth1 recv rate is so high in this test.
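One way to cross-check the dstat figures is to read the kernel's own per-interface byte counters from /proc/net/dev. The following is only a minimal sketch: the function names are hypothetical, and it assumes the counters do not wrap during the sampling interval.

```shell
#!/bin/sh
# Print "rx_bytes tx_bytes" for one interface from a file in /proc/net/dev format.
# $1 = interface name, $2 = file to read (defaults to /proc/net/dev).
iface_bytes() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }              # some kernels print "eth1:123" with no space
        $1 == ifc { print $2, $10 }    # field 2 = rx bytes, field 10 = tx bytes
    ' "${2:-/proc/net/dev}"
}

# Print the average rx/tx rate in bytes per second over an interval
# (default 2 seconds, the same interval as the dstat runs above).
iface_rate() {
    ifc=$1; secs=${2:-2}
    set -- $(iface_bytes "$ifc"); rx1=$1; tx1=$2
    [ -n "$rx1" ] || { echo "no such interface: $ifc" >&2; return 1; }
    sleep "$secs"
    set -- $(iface_bytes "$ifc"); rx2=$1; tx2=$2
    echo "$ifc rx=$(( (rx2 - rx1) / secs ))B/s tx=$(( (tx2 - tx1) / secs ))B/s"
}

# Example (run on a node during the test):
#   iface_rate eth1 2
#   iface_rate eth2 2
```

If these numbers agree with dstat, the traffic on eth1 is real and not a reporting artifact.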

Please give me some hints or suggestions.


Thanks and Regards,
Phillip

Last edited by PhillipHuang; 06-21-2007 at 09:44 PM.
 
Old 06-20-2007, 07:59 AM   #2
brianmcgee
Member
 
Registered: Jun 2007
Location: Munich, Germany
Distribution: RHEL, CentOS, Fedora, SLES (...)
Posts: 398

Rep: Reputation: 36
You are right, this should not happen. However, debugging complex cluster issues remotely is rather difficult. Usually you would ask someone to have a look at your problem. I guess it would take at least half a day's work to examine the configuration and find the bug.

Maybe you should try the Samba and GFS mailing lists for this specific issue. But there you will need a full sysreport of the cluster nodes.

For example I don't see any dlm-1.0* and dlm-kernel-2.* packages. Are they installed on your system?

What stress tools are you using? Do they use their own locking, so that the other node is not able to write the files?

Are you testing the new clustering features that are being implemented in Samba3/4?
 
Old 06-20-2007, 08:42 PM   #3
PhillipHuang
Member
 
Registered: Aug 2006
Location: Shen Zhen
Distribution: Ubuntu 10.04
Posts: 193

Original Poster
Rep: Reputation: 33
Hi brianmcgee,

Thanks for your reply. Here are the dlm packages:
Code:
[root@node1 ~]# rpm -qa | grep dlm
dlm-1.0.1-1
dlm-kernel-smp-2.6.9-44.3
dlm-devel-1.0.1-1
dlm-kernheaders-2.6.9-44.3
The stress tool was written by my customer. It runs on Windows and creates many processes that write random files into the mapped (Samba share) directory. As far as I can see (though I am not completely sure), it does not use its own locking; all the processes run in parallel.
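For anyone who wants to reproduce a similar load without the customer's Windows tool, a rough stand-in is sketched below. It is not the customer's program: the function name, file naming, and parameters are all hypothetical, and it writes from a Linux shell rather than from a Windows SMB client.

```shell
#!/bin/sh
# Hypothetical stand-in for the stress tool: start several parallel writers,
# each creating files of random data in a target directory.
# $1 = target dir, $2 = number of writers, $3 = files per writer, $4 = file size in KiB
stress_write() {
    dir=$1; procs=$2; files=$3; kib=$4
    p=1
    while [ "$p" -le "$procs" ]; do
        (
            f=1
            while [ "$f" -le "$files" ]; do
                dd if=/dev/urandom of="$dir/w${p}_f${f}.dat" bs=1024 count="$kib" 2>/dev/null
                f=$((f + 1))
            done
        ) &                 # each writer runs as its own background process
        p=$((p + 1))
    done
    wait                    # block until every writer has finished
}

# Example: 8 writers, 100 files each, 1 MiB per file, into a mounted share
#   stress_write /mnt/stress 8 100 1024
```

The writers take no locks and do not coordinate with each other, which matches the behaviour described above as far as it is known.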

Quote:
Are you testing the new clustering features that are being implemented in Samba3/4?
No, this test is for fixing my customer's issue. Following your suggestion, I think I will upgrade Samba to the latest stable version (3.0.25a) and try again.

Thanks a lot.

Regards,
Phillip
 
Old 06-21-2007, 01:28 AM   #4
rossonieri#1
Member
 
Registered: Jun 2007
Posts: 359

Rep: Reputation: 34
hi PhillipHuang,

using a cluster doesn't mean that you can split the load equally -- the load is triggered by the handshake between the client request and the server response. what i can see from your output is normal file read/write access -- the client that connected to server2 did a lot of transfer.

HTH.

Last edited by rossonieri#1; 06-21-2007 at 01:30 AM.
 
Old 06-21-2007, 04:40 AM   #5
PhillipHuang
Member
 
Registered: Aug 2006
Location: Shen Zhen
Distribution: Ubuntu 10.04
Posts: 193

Original Poster
Rep: Reputation: 33
Hi Rossonieri,

In this test case I am investigating system performance when a GFS share is accessed over the Samba protocol; I am not concerned with load balancing. Yes, as you wrote, the output indicates a lot of transfer on node2:eth1. What surprises me is that node2:eth1 has such a large "recv" load. I expected node2 to look the same as node1, where eth1 has a large "send" and eth2 has a large "recv".

eth1 is only used for connecting to the storage devices, so it should not have a large "recv".
eth2 carries the IP used to access Samba, so I think it should have a large "recv" to receive the data.
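To see which hosts a node is actually talking to through a given local address, the output of "netstat -tn" can be grouped by remote peer. This is only a sketch with a hypothetical function name, and it assumes plain IPv4 addresses in the usual "ip:port" form.

```shell
#!/bin/sh
# Read "netstat -tn" output on stdin and print, for one local IP address,
# each remote "ip:port" together with the number of TCP connections to it.
# $1 = local IP address (for example the eth1 address of the node)
peers_for_local_ip() {
    awk -v ip="$1" '
        $1 ~ /^tcp/ {
            local_ip = $4
            sub(/:[0-9]+$/, "", local_ip)     # strip the local port
            if (local_ip == ip) count[$5]++   # $5 is the remote ip:port
        }
        END { for (peer in count) print count[peer], peer }
    ' | sort -k2
}

# Example on node2 (3260 is the default iSCSI port, so storage sessions
# should show up as peers on that port):
#   netstat -tn | peers_for_local_ip 192.168.3.52
```

If peers other than the storage target appear on the eth1 address, that would point to traffic that was not expected on the storage network.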

By the way, I posted this issue to the Samba mailing list. Some people kindly mentioned that CTDB is relevant to running Samba across a cluster.

Regards,
Phillip

Last edited by PhillipHuang; 06-21-2007 at 04:42 AM.
 
Old 06-21-2007, 06:54 AM   #6
rossonieri#1
Member
 
Registered: Jun 2007
Posts: 359

Rep: Reputation: 34
hi PhillipHuang,

from this
maybe you can check the cluster service requirement.

from yours :
"In this test case, I did investigate the system performance when access GFS share by Samba protocol, and not care the load balancing"
well, the answer you'll find it there.

i need it to check your problem myself - this is what engineers do right?

keep posting the news.

Cheers.

Last edited by rossonieri#1; 06-21-2007 at 06:56 AM.
 
Old 06-21-2007, 08:40 PM   #7
PhillipHuang
Member
 
Registered: Aug 2006
Location: Shen Zhen
Distribution: Ubuntu 10.04
Posts: 193

Original Poster
Rep: Reputation: 33
Dear rossonieri,

Thanks for your kind suggestion. I will run the CTDB setup test and post the results later.

Regards,
Phillip
 
  

