Old 04-11-2010, 07:54 AM   #1
joko007
LQ Newbie
 
Registered: Jun 2009
Posts: 4

Rep: Reputation: 0
Wait for I/O blocks everything (Debian stable)


Hi,

I'm running a home server on Debian stable with DHCP, DNS, mail, VDR, file sharing and my weather station as the main services. The file sharing is used to mount home directories on the clients.
The machine features an Athlon BE-2300, 3GB RAM, Gigabit LAN, and 1TB and 1.5TB SATA HDDs, plus further HDDs for backups. The mainboard has an NVIDIA chipset:
Code:
nVidia Corporation MCP65 SATA Controller (rev a3)
The primary disks are running in RAID1 + LVM.
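For completeness, this is how the layout can be verified (standard mdadm/LVM commands, nothing exotic):
Code:
# Show the md arrays and their member partitions
cat /proc/mdstat
# Show LVM physical volumes, volume groups and logical volumes
pvs && vgs && lvs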

My problem is that I/O, especially to the HDDs, blocks the whole system: logging in via ssh becomes nearly impossible, and clients using the shared home directories become practically unusable. Typical causes of high I/O load in my case are the daily backup or copying large files (for VDR) over the network.

I googled a lot, checked the output of tools like atop, top, and dstat, and found the following:
  • One of the disks in my RAID1 (the 1.5TB one, a WD EARS) uses 4kB sectors, which I had somewhat ignored so far (see the alignment check sketched after this list). Accordingly, atop shows one disk (/dev/sda) at nearly 100% utilization on average while the other (/dev/sdc) is mostly idling when copying files.

    Performance of my drives looks as follows (hdparm -t buffered reads; values vary by +/-5MB/s):
    1.5TB (4kB blocks)
    Code:
    /dev/sda:
     Timing buffered disk reads:  258 MB in  3.01 seconds =  85.65 MB/sec
    1TB
    Code:
    /dev/sdc:
     Timing buffered disk reads:  276 MB in  3.01 seconds =  91.58 MB/sec
    So in terms of raw read throughput the 4kB drive does not compare too badly...

    RAID
    Code:
    /dev/md1:
     Timing buffered disk reads:  248 MB in  3.01 seconds =  82.29 MB/sec
    LVM with and w/o RAID
    Code:
    /dev/mapper/home-home:
     Timing buffered disk reads:  270 MB in  3.01 seconds =  89.71 MB/sec
    
    /dev/mapper/unsafe_data-unsafe_data:
     Timing buffered disk reads:  132 MB in  3.01 seconds =  43.84 MB/sec
    Looking at atop while copying a 150MB file to the server via Samba:
    Code:
    ATOP - red                2010/04/11  14:41:26               10 seconds elapsed
    PRC | sys   1.72s | user   0.67s | #proc    162 | #zombie    0 | #exit      0 |
    CPU | sys      7% | user      3% | irq       6% | idle    112% | wait     72% |
    cpu | sys      6% | user      2% | irq       5% | idle     16% | cpu001 w 71% |
    cpu | sys      1% | user      1% | irq       0% | idle     98% | cpu000 w  0% |
    CPL | avg1   3.40 | avg5    1.18 | avg15   0.54 | csw     8050 | intr   55853 |
    MEM | tot    3.0G | free  659.1M | cache   1.8G | buff   39.9M | slab  208.4M |
    SWP | tot    2.8G | free    2.8G |              | vmcom 736.2M | vmlim   4.3G |
    DSK |         sda | busy     97% | read       0 | write    246 | avio   40 ms |
    DSK |         sdc | busy      3% | read       0 | write     67 | avio    4 ms |
    NET | transport   | tcpi   39141 | tcpo   13703 | udpi       2 | udpo       0 |
    NET | network     | ipi    39143 | ipo    13703 | ipfrw      0 | deliv  39143 |
    NET | lan      4% | pcki   39139 | pcko   13699 | si   46 Mbps | so  814 Kbps |
    NET | lo     ---- | pcki       4 | pcko       4 | si    0 Kbps | so    0 Kbps |
    
      PID  SYSCPU  USRCPU  VGROW  RGROW  RDDSK  WRDSK  ST EXC S  CPU CMD     1/1
     3433   0.62s   0.61s   -24K   -20K     0K     0K  --   - S  12% vdr-kbd
    25398   0.96s   0.06s     0K     0K     0K 54196K  --   - D  10% smbd
    27078   0.06s   0.00s     0K     0K     0K     0K  --   - R   1% atop
     1169   0.03s   0.00s     0K     0K     0K     0K  --   - D   0% md4_raid1
     2544   0.03s   0.00s     0K     0K     0K   152K  --   - S   0% kjournald
     2536   0.02s   0.00s     0K     0K     0K     0K  --   - S   0% xfsdatad/1
     2546   0.00s   0.00s     0K     0K     0K     0K  --   - S   0% kjournald
    26972   0.00s   0.00s     0K     0K     0K     0K  --   - D   0% pdflush
    27077   0.00s   0.00s     0K     0K     0K  4080K  --   - D   0% pdflush
    Note the difference between sda and sdc, which are members of the same RAID array.
  • The dstat output shows what I would call hiccups. For example, copying a 150MB file to the server via Samba looks like this:
    Code:
    ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
    usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
      2   1  95   2   0   0|4056k  866k|   0     0 | 2.1B  2.1B| 728   844
      1   0  99   0   0   0|   0     0 |  60B  302B|   0     0 | 605   496
      1   0  93   5   0   0|   0  2937k|  60B  522B|   0     0 | 663   537
      1   7  85   0   3   4|   0   552k|  13M  255k|   0     0 |  13k 1078
      2  28  42  15   5   8|   0    18M|  38M  653k|   0     0 |  35k 2027
      1   0  48  50   0   0|   0  6144k|  60B  476B|   0     0 | 676   643
      1   1  49  50   0   0|   0  5936k| 125k 3720B|   0     0 | 760   546
      1   1  48  50   0   0|   0    15M| 376k   38k|   0     0 | 994   613
      1   2  48  49   0   0|   0    14M| 126k 3152B|   0     0 | 782   684
      0   2  46  51   0   0|   0  6856k| 627k   14k|   0     0 |1206   591
      1   0   0  99   0   0|   0    21M|  63k  624B|   0     0 | 693   589
      1   0   0  99   0   0|   0  1416k|  60B  318B|   0     0 | 609   500
      0   1   0  99   0   0|   0     0 |  60B  318B|   0     0 | 609   489
      1   1   5  92   0   0|   0    24M|  60B  318B|   0     0 | 665   527
      0   1  48  50   0   0|   0    32M| 244B  318B|   0     0 | 673   593
      0   1  49  50   0   0|   0  8624k| 336B  318B|   0     0 | 628   491
      1   1  49  49   0   0|   0    21M| 244B  428B|   0     0 | 640   495
      0   0  49  50   0   0|   0     0 | 336B  412B|   0     0 | 519   836
      5   4  41  50   0   0|   0     0 | 244B  302B|   0     0 | 510   659
      3  14  27  45   4   6|   0    43M|  24M  420k|   0     0 |  22k 1682
      2  19  45  21   5   8|   0    89M|  27M  472k|   0     0 |  25k 1705
      0   1  48  50   0   0|   0  8632k|  60B  302B|   0     0 | 612   460
      1  10  49  30   2   7|   0    51M|  22M  379k|   0     0 |  20k 1465
      3  16  32  40   3   7|   0    74M|  24M  462k|   0     0 |  22k 1470
      0   0   0  99   0   0|   0     0 |  60B  318B|   0     0 | 612   428
      0   1   0  99   0   0|   0  9880k|  60B  318B|   0     0 | 629   453
      1   1  49  48   0   0|   0    35M|  60B  318B|   0     0 | 655   475
      0   0  77  22   0   0|   0  3888k|  60B  318B|   0     0 | 714   490
      0   0 100   0   0   0|   0  8192B|  60B  318B|   0     0 | 609   421
      0   4  78  18   0   0|   0    32M|  60B  318B|   0     0 | 668   498
      1   0  69  29   0   0|   0    76M|  60B  318B|   0     0 | 689   439
      0   1  99   0   0   0|   0  1728k|  60B  318B|   0     0 | 628   470
    Why are there such gaps? Is this normal?
  • Google led me to discussions such as http://www.linuxquestions.org/questi...y-high-794777/ or http://forum.ubuntuusers.de/topic/fe.../#post-2360287 (in German, sorry), but I found nothing on how to deal with high I/O load blocking the whole system.
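Regarding the 4kB drive from the first point: WD EARS disks have 4kB physical sectors but report 512-byte logical ones, so a partition that does not start on an 8-sector boundary turns every write into a read-modify-write cycle. A sketch of how to check the alignment (device name as above):
Code:
# List the partition start positions in 512-byte sectors.
fdisk -lu /dev/sda
# A partition is 4kB-aligned if its start sector is divisible by 8
# (8 x 512B = 4096B); the classic misaligned layout starts at sector 63.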

So I guess my problem is twofold:
  1. Can I tell the kernel to balance better between different I/O tasks, so that e.g. the backup does not render my clients completely unusable? This would at least mitigate the symptoms (see the sketch below for the kind of knobs I mean).
  2. Any suggestions regarding the general performance, or regarding the (in my opinion) strange values above? To be honest, I would like to avoid setting the system up from scratch...
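To illustrate what I mean by "balance" in point 1 (a sketch only; I assume the dirty-page writeback limits are the relevant knobs here, and the values are purely illustrative, not recommendations):
Code:
# Start background writeback earlier and cap the amount of dirty page
# cache, so one large copy cannot stall the disks for seconds at a time.
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_ratio=10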

Thanks in advance,
joko
 
Old 04-11-2010, 03:25 PM   #2
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,824

Rep: Reputation: 615
Try installing schedutils and prefixing your I/O-intensive commands with "nice ionice -c3".
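For example, wrapping a backup invocation could look like this (the script path is just a placeholder):
Code:
# Lowest CPU priority plus the "idle" I/O class: the backup then only
# gets disk time when no other process is waiting for I/O.
nice -n 19 ionice -c3 /usr/local/bin/daily-backup.sh
Note that the idle class (-c3) is honored by the CFQ I/O scheduler, so it only helps if CFQ is active on the disks.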
 
Old 04-13-2010, 04:28 PM   #3
joko007
LQ Newbie
 
Registered: Jun 2009
Posts: 4

Original Poster
Rep: Reputation: 0
Hi AlucardZero,

thanks for your answer! I gave it a quick try while copying a large file, and the system at least remains usable, so it does seem to reduce the blocking symptoms. I'll investigate further and add this to my backup scripts over the next few days.
BTW: On Debian the ionice command is included in the util-linux package.
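Since the idle I/O class depends on CFQ, I also want to check which scheduler is active (a sketch, per disk):
Code:
# The scheduler shown in brackets is the active one, e.g. "noop ... [cfq]"
cat /sys/block/sda/queue/scheduler
# Switch to CFQ at runtime if needed:
echo cfq > /sys/block/sda/queue/scheduler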

Furthermore, I found that tweaking the NFS configuration might also help the clients that mount their home directories from the server. I will try to check this in the next few days, too. I guess the interesting parameters are

Code:
RPCNFSDCOUNT=8
RPCNFSDPRIORITY=0
which define the number of nfsd processes and their priority; see the NFS-Howto: http://nfs.sourceforge.net/nfs-howto/.
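Applying and checking changed values would look roughly like this (the thread-usage check is described in the NFS-Howto linked above):
Code:
# Restart the server so new values from /etc/default/nfs-kernel-server
# take effect:
/etc/init.d/nfs-kernel-server restart
# The "th" line reports the thread count and how often all threads were
# busy at once; if the last columns keep growing, raise RPCNFSDCOUNT.
grep ^th /proc/net/rpc/nfsd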

Nevertheless, the underlying performance issues remain.

Best regards,
joko
 