Old 03-29-2012, 03:51 AM   #1
stardotstar
Member
 
Registered: Nov 2002
Location: /au/qld/bne/4157
Distribution: Gentoo mactel-linux
Posts: 238

Rep: Reputation: 30
advice on how to troubleshoot performance issues - HP DL360G5 + ESXi w/ Centos6x64


Hi all,

Well, I am having a special kind of hell, having moved my HP ProLiant DL360G5 (22 GB RAM, 2x quad-core E5420s) from bare metal to VMware ESXi (built on a RAID 1 mirror pair of 72 GB 10K SAS drives) hosting a dedicated CentOS 6 x64 VPS with cPanel/WHM - essentially a LAMP server.

The sites and payload are all the same as I ran on bare metal, and the server is the same; it just got shipped to a new DC, was spun up with ESXi on the mirror pair of 72 GB 10K SAS drives, and then the VM was built on a mirror pair of 300 GB SATA drives.

I seem to be getting diabolical server loads in the VM. The other thing I am being told is that the disk IO is shot to hell.

This is apparently being measured at the physical level, so I am assured that it is not VM IO that is to blame.

I was told the following:

Quote:
If we copy data from SATA datastore 3 to SATA datastore 2 we get speeds of around 10,000 KBps.

If we copy data from SATA datastore 3 to SATA datastore 3 we get speeds of around 10,000 KBps.

If we copy data from SATA datastore 3 to SAS datastore 1 we get speeds of around 10,000 KBps.

If we copy data from SAS datastore 1 to SAS datastore 1 we get speeds of around 25,000 KBps.

If we copy data from a SATA datastore via 1000 Mbit network from one of our servers to SAS datastore 1 we get speeds of around 25,000 KBps.

If we copy data from a SATA datastore via 1000 Mbit network from one of our servers to SAS datastore 1 we get speeds of around 10,000 KBps.

If we copy data from a SATA datastore on one of our servers to a SATA datastore on the same server we get speeds of around 125,000 KBps.

If we copy data from a SATA datastore on one of our servers to a SATA datastore on a different server via 1000 Mbit network we get speeds of around 125,000 KBps.

If we copy data from a SAS datastore on one of our servers to a SAS datastore on the same server we get speeds of around 200,000 KBps.

So without a doubt there is an IO issue with your server; putting in new SAS or SATA drives will not fix anything, as all three independent RAID 1 arrays on your server are performing extremely badly.
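A rough equivalent of those sequential copy tests can be reproduced from inside the CentOS guest with dd; a minimal sketch, where /root/ddtest is just a hypothetical scratch file on the volume under test:

Quote:
# sequential write: 1 GB of zeros, flushed to disk before dd reports a speed
dd if=/dev/zero of=/root/ddtest bs=1M count=1024 conv=fdatasync

# drop the page cache so the read actually hits the disk, then sequential read
sync; echo 3 > /proc/sys/vm/drop_caches
dd if=/root/ddtest of=/dev/null bs=1M

# clean up the scratch file
rm -f /root/ddtest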
Now I did a UnixBench and got the following results:

Quote:

Version 5.1.3 Based on the Byte Magazine Unix Benchmark

Multi-CPU version Version 5 revisions by Ian Smith,
Sunnyvale, CA, USA
January 13, 2011 johantheghost at yahoo period com


1 x Dhrystone 2 using register variables 1 2 3 4 5 6 7 8 9 10

1 x Double-Precision Whetstone 1 2 3 4 5 6 7 8 9 10

1 x Execl Throughput 1 2 3

1 x File Copy 1024 bufsize 2000 maxblocks 1 2 3

1 x File Copy 256 bufsize 500 maxblocks 1 2 3

1 x File Copy 4096 bufsize 8000 maxblocks 1 2 3

1 x Pipe Throughput 1 2 3 4 5 6 7 8 9 10

1 x Pipe-based Context Switching 1 2 3 4 5 6 7 8 9 10

1 x Process Creation 1 2 3

1 x System Call Overhead 1 2 3 4 5 6 7 8 9 10

1 x Shell Scripts (1 concurrent) 1 2 3

1 x Shell Scripts (8 concurrent) 1 2 3

8 x Dhrystone 2 using register variables 1 2 3 4 5 6 7 8 9 10

8 x Double-Precision Whetstone 1 2 3 4 5 6 7 8 9 10

8 x Execl Throughput 1 2 3

8 x File Copy 1024 bufsize 2000 maxblocks 1 2 3

8 x File Copy 256 bufsize 500 maxblocks 1 2 3

8 x File Copy 4096 bufsize 8000 maxblocks 1 2 3

8 x Pipe Throughput 1 2 3 4 5 6 7 8 9 10

8 x Pipe-based Context Switching 1 2 3 4 5 6 7 8 9 10

8 x Process Creation 1 2 3

8 x System Call Overhead 1 2 3 4 5 6 7 8 9 10

8 x Shell Scripts (1 concurrent) 1 2 3

8 x Shell Scripts (8 concurrent) 1 2 3

========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)

System: solaris.sourcepoint.com.au: GNU/Linux
OS: GNU/Linux -- 2.6.32-220.7.1.el6.x86_64 -- #1 SMP Wed Mar 7 00:52:02 GMT 2012
Machine: x86_64 (x86_64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) CPU E5420 @ 2.50GHz (5000.2 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSCALL/SYSRET
CPU 1: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz (5000.2 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSCALL/SYSRET
CPU 2: Intel(R) Xeon(R) CPU E5420 @ 2.50GHz (5000.2 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSCALL/SYSRET
CPU 3: Intel(R) Xeon(R) CPU E5420 @ 2.50GHz (5000.2 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSCALL/SYSRET
CPU 4: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz (5000.2 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSCALL/SYSRET
CPU 5: Intel(R) Xeon(R) CPU E5420 @ 2.50GHz (5000.2 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSCALL/SYSRET
CPU 6: Intel(R) Xeon(R) CPU E5420 @ 2.50GHz (5000.2 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSCALL/SYSRET
CPU 7: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz (5000.2 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSCALL/SYSRET
17:11:43 up 4:37, 6 users, load average: 7.33, 10.34, 8.48; runlevel 3

------------------------------------------------------------------------
Benchmark Run: Thu Mar 29 2012 17:11:43 - 17:47:10
8 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables 19435624.7 lps (10.0 s, 7 samples)
Double-Precision Whetstone 2765.0 MWIPS (9.7 s, 7 samples)
Execl Throughput 348.5 lps (31.3 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 402652.5 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 132149.1 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 641784.0 KBps (30.0 s, 2 samples)
Pipe Throughput 991098.6 lps (10.0 s, 7 samples)
Pipe-based Context Switching 92890.8 lps (10.0 s, 7 samples)
Process Creation 882.5 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 1263.2 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 502.7 lpm (60.1 s, 2 samples)
System Call Overhead 1170189.3 lps (10.0 s, 7 samples)

System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 19435624.7 1665.4
Double-Precision Whetstone 55.0 2765.0 502.7
Execl Throughput 43.0 348.5 81.1
File Copy 1024 bufsize 2000 maxblocks 3960.0 402652.5 1016.8
File Copy 256 bufsize 500 maxblocks 1655.0 132149.1 798.5
File Copy 4096 bufsize 8000 maxblocks 5800.0 641784.0 1106.5
Pipe Throughput 12440.0 991098.6 796.7
Pipe-based Context Switching 4000.0 92890.8 232.2
Process Creation 126.0 882.5 70.0
Shell Scripts (1 concurrent) 42.4 1263.2 297.9
Shell Scripts (8 concurrent) 6.0 502.7 837.9
System Call Overhead 15000.0 1170189.3 780.1
========
System Benchmarks Index Score 481.1

------------------------------------------------------------------------
Benchmark Run: Thu Mar 29 2012 17:47:10 - 18:23:56
8 CPUs in system; running 8 parallel copies of tests

Dhrystone 2 using register variables 138944790.0 lps (10.0 s, 7 samples)
Double-Precision Whetstone 22968.6 MWIPS (10.0 s, 7 samples)
Execl Throughput 3144.4 lps (29.4 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 295069.4 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 93845.6 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 684232.6 KBps (30.0 s, 2 samples)
Pipe Throughput 6113470.0 lps (10.0 s, 7 samples)
Pipe-based Context Switching 668762.2 lps (10.0 s, 7 samples)
Process Creation 4091.9 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 5655.4 lpm (60.1 s, 2 samples)
Shell Scripts (8 concurrent) 741.8 lpm (60.4 s, 2 samples)
System Call Overhead 2327030.9 lps (10.0 s, 7 samples)

System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 138944790.0 11906.2
Double-Precision Whetstone 55.0 22968.6 4176.1
Execl Throughput 43.0 3144.4 731.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 295069.4 745.1
File Copy 256 bufsize 500 maxblocks 1655.0 93845.6 567.0
File Copy 4096 bufsize 8000 maxblocks 5800.0 684232.6 1179.7
Pipe Throughput 12440.0 6113470.0 4914.4
Pipe-based Context Switching 4000.0 668762.2 1671.9
Process Creation 126.0 4091.9 324.8
Shell Scripts (1 concurrent) 42.4 5655.4 1333.8
Shell Scripts (8 concurrent) 6.0 741.8 1236.3
System Call Overhead 15000.0 2327030.9 1551.4
========
System Benchmarks Index Score 1494.1
The last benchmark results I got on the same server on bare metal were 669 and 2267.



I don't really know where to go with troubleshooting this.

I am told that if I go back to bare metal it won't make a difference because the IO is sluggish at the physical layer.

They say that it could be anything from BIOS to backplane to mixing SAS with SATA on the same hardware.

This server never ever gave me a hint of trouble.

In iLO 2 there are no warnings. All drives appear normal.

hdparm shows this:

Quote:
root@solaris [~]# hdparm -tT /dev/sda

/dev/sda:
Timing cached reads: 6636 MB in 2.00 seconds = 3321.44 MB/sec
Timing buffered disk reads: 574 MB in 3.00 seconds = 191.31 MB/sec
I don't have the experience to troubleshoot this efficiently, guys. The server is also remote to me, so they want to charge to troubleshoot and guess and fiddle around - which I don't want them to do.

I would like to work out how to run some solid tests to see where and what the problem may be.
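One way to get some hard numbers is the sysstat tools; a minimal sketch for CentOS 6, assuming the package isn't already installed:

Quote:
# install iostat/mpstat/sar (CentOS 6)
yum install -y sysstat

# per-device throughput, average wait (await) and utilisation, every 5 seconds
iostat -dxm 5

# per-CPU breakdown including %iowait and %steal (steal matters inside a VM)
mpstat -P ALL 5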

The server load, for what the machine is doing, always seems abnormally high - like 12 and 15...

I hope I can get some help diagnosing this. Some people tell me this is all down to virtualisation and that the IO will be poor regardless, but the results I have been given above show that the slow performance is at the physical server level anyway.

Best regards,
W

Last edited by stardotstar; 03-29-2012 at 06:23 PM.
 
Old 03-30-2012, 12:00 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,138

Rep: Reputation: 4122
Quote:
Originally Posted by stardotstar View Post
I seem to be getting diabolical server loads in the vm. The other thing that I am being told is that the disk IO is shot to hell.
The latter will be causing the former, in all likelihood. Do you see any %wa in top (or sar ...)?
I'm presuming you are really talking about "loadavg".
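Something along these lines would show it (assuming sysstat is installed for sar):

Quote:
# "wa" column = time the CPUs sit idle waiting on disk I/O (12 samples, 5 s apart)
vmstat 5 12

# live sar sampling gives the same thing as %iowait
sar -u 5 12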

It's possible your provider is correct. *And* your mates as well. Maybe something got bumped in the move - maybe it was always like that, and you weren't pushing the kit hard enough to find out.
Everything (especially the I/O) being virtualized might be enough to bump it over the (performance) cliff.

Did you ever run hdparm when you had it as bare metal? Do you have historical sar data to ensure the I/O loads are comparable?
I'd be inclined to get a liveCD booted and see some numbers from there. But then I don't have to justify the cost.
 
1 member found this post helpful.
Old 03-30-2012, 12:15 AM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,138

Rep: Reputation: 4122
Just re-reading your post ...
Quote:
Originally Posted by stardotstar View Post
I am told that if I go back to bare metal it won't make a difference because the IO is sluggish at the physical layer.
If your provider said that, I'd make them go back to bare-metal and prove it.

As I said, they may be right, but it may also prove your case that it was o.k. in the past. Maybe they'd get some upgrade business out of it, so they may be inclined to agree to the test.
 
1 member found this post helpful.
Old 04-02-2012, 04:47 AM   #4
stardotstar
Member
 
Registered: Nov 2002
Location: /au/qld/bne/4157
Distribution: Gentoo mactel-linux
Posts: 238

Original Poster
Rep: Reputation: 30
Hi syg00 - thanks for the replies, and sorry for being slow getting back to you.
What you say makes perfect sense.
No, I don't have hdparm results or anything other than the UnixBench data from the old install.
The possibility is that there is a problem, just as you say - but somehow it just doesn't tally.
I am thinking it's RAID 0 in combination with slower disks (SATA vs SAS) - I was using SAS 10K drives in RAID 1+0 previously, and now RAID 0 on SATA.
I just wish I had the flexibility to do some other testing - I may need to get my provider to move my VM onto their hardware and do some testing on mine in the meantime - maybe add a pair of SAS drives for the OS and then put the data and DB volumes on a pair of 90 GB SSDs; even in RAID 0 they would be significantly faster all round...
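If I can get the box back onto bare metal (or booted from a liveCD) for testing, it would also be worth checking the Smart Array controller and its write cache while I'm at it; a sketch using HP's hpacucli utility, assuming it is installed and the controller is in slot 0 (a failed or missing BBWC battery on these controllers can cripple write speeds):

Quote:
# overall controller status, including cache and battery condition
hpacucli ctrl all show status

# full logical/physical drive layout for the controller in slot 0
hpacucli ctrl slot=0 show config detail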

BTW, I have attached screen grabs of the bare-metal UnixBench results (all I have) and a current, typically representative, top:

Quote:
top - 19:44:21 up 4 days, 8:10, 3 users, load average: 8.38, 6.14, 6.08
Tasks: 342 total, 19 running, 322 sleeping, 0 stopped, 1 zombie
Cpu(s): 60.6%us, 26.5%sy, 0.0%ni, 12.1%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st
Mem: 18402604k total, 10205456k used, 8197148k free, 389048k buffers
Swap: 4128760k total, 11564k used, 4117196k free, 7731956k cached
Not much in the way of wait really.
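Given the time is going to %us and %sy rather than %wa, it may be worth seeing which processes are actually burning the CPU; a minimal sketch:

Quote:
# top CPU consumers, one batch-mode snapshot
top -b -n 1 | head -n 25

# or a one-shot process list sorted by CPU usage
ps -eo pid,user,pcpu,pmem,args --sort=-pcpu | head -n 15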
Attached Thumbnails: 3.png (58.1 KB), 4.png (100.7 KB)

Last edited by stardotstar; 04-02-2012 at 05:05 AM.
 
  

