LinuxQuestions.org
Linux - Enterprise This forum is for all items relating to using Linux in the Enterprise.

Old 10-09-2009, 03:15 AM   #1
asimba
Member
 
Registered: Mar 2005
Location: 127.0.0.0
Distribution: Red Hat / Fedora
Posts: 355

Rep: Reputation: 42
SAN, NAS and IO Scheduling


Hi,

I am looking for some insight into IO scheduling (for an Oracle Database) on Red Hat with a SAN/NAS back end.
 
Old 10-09-2009, 08:25 PM   #2
edenCC
Member
 
Registered: May 2006
Location: China
Distribution: Debian
Posts: 198
Blog Entries: 1

Rep: Reputation: 32
Before that, you might need to know the difference here:
http://planet.admon.org/2009/09/a-co...io-schedulers/
 
Old 10-09-2009, 09:38 PM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,103

Rep: Reputation: 4117
I thought Oracle used O_DIRECT ... (not that I use Oracle)
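
For context, O_DIRECT is the open(2) flag that asks the kernel to bypass the page cache, so an application that does its own buffering avoids caching the same data twice. A minimal Python sketch of a direct read on Linux follows. The 4096-byte block size and the helper names are assumptions for illustration, nothing here is Oracle-specific, and some filesystems reject O_DIRECT outright.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on the device


def aligned_length(nbytes, block=BLOCK):
    """Round nbytes up to a multiple of block, as O_DIRECT transfers require."""
    return ((nbytes + block - 1) // block) * block


def read_direct(path, nbytes):
    """Read up to nbytes from the start of path, bypassing the page cache."""
    length = aligned_length(nbytes)
    buf = mmap.mmap(-1, length)  # anonymous mmap gives a page-aligned buffer
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)  # O_DIRECT is Linux-specific
    try:
        got = os.readv(fd, [buf])  # read straight into the aligned buffer
        return bytes(buf[:min(got, nbytes)])
    finally:
        os.close(fd)
        buf.close()


# Hypothetical usage (path is a placeholder):
# data = read_direct("/path/to/datafile", 8192)
```

With direct IO the page cache is out of the picture, so request ordering and merging are left to the IO scheduler and whatever caching the storage controller does.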
 
Old 10-13-2009, 03:47 AM   #4
asimba
Member
 
Registered: Mar 2005
Location: 127.0.0.0
Distribution: Red Hat / Fedora
Posts: 355

Original Poster
Rep: Reputation: 42
Hi,

Thank you for your responses.

I was looking for some more information though.

Specifically, I mean IO scheduling in relation to IO controllers. We usually have multipath, so if one group fires a read request it gets split across multiple channels (the same goes for writes). How do you really measure latency then, especially where a parent process runs multiple worker threads and those threads may have been assigned to different queues/IO channels?

I do not think there is anything reliable right now that gives proper feedback on this kind of IO.

I was also looking for something in a typical DBMS context, where you have different read/write threads. I was trying different schedulers and trying to work out WHY a given scheduler was good, i.e. a 360-degree view of everything that affects IO, such as:
- disk type - SATA/SCSI
- disk block size - 4 KB etc.
- OS buffering
- DBMS buffering
- IO controller buffering
- journaling in use - whether that really affects reads/writes
- queue depth
- queue interval / service time
- channel bandwidth etc.
- any others I am missing
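
Several of the items above (queue depth, service time) can be approximated per block device from the kernel's counters in /proc/diskstats. A rough Python sketch, assuming two snapshots taken a known interval apart. The function and key names are made up for illustration, and the numbers say nothing about what happens inside a SAN array or a controller cache.

```python
def parse_diskstats(text):
    """Parse /proc/diskstats content into {device: counters}."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # skip short or malformed lines
        stats[f[2]] = {
            "reads": int(f[3]),         # reads completed
            "read_ms": int(f[6]),       # time spent reading (ms)
            "writes": int(f[7]),        # writes completed
            "write_ms": int(f[10]),     # time spent writing (ms)
            "in_flight": int(f[11]),    # IOs currently in progress
            "io_ms": int(f[12]),        # time the device had IO in flight (ms)
            "weighted_ms": int(f[13]),  # io time weighted by queue length (ms)
        }
    return stats


def interval_metrics(before, after, interval_ms):
    """Derive iostat-style figures for one device from two snapshots."""
    ios = (after["reads"] - before["reads"]) + (after["writes"] - before["writes"])
    wait_ms = (after["read_ms"] - before["read_ms"]) + (
        after["write_ms"] - before["write_ms"]
    )
    return {
        # average time per completed request, queueing included
        "await_ms": wait_ms / ios if ios else 0.0,
        # average number of requests queued or in service
        "avg_queue": (after["weighted_ms"] - before["weighted_ms"]) / interval_ms,
        # fraction of the interval the device was busy
        "util": (after["io_ms"] - before["io_ms"]) / interval_ms,
    }


# Hypothetical usage: read /proc/diskstats, sleep for the interval, read it
# again, then call interval_metrics(first["sda"], second["sda"], interval_ms).
```

Each block device is reported separately, so taking the same snapshots for every path device gives at least a per-path view of latency.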


I need some more insight into how to really benchmark any scheduler - noop/as/cfq/deadline.
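
One practical first step when comparing schedulers is confirming which one a device is actually using. The file /sys/block/<dev>/queue/scheduler lists the available schedulers with the active one in square brackets, and writing a name to it as root switches the scheduler. A small Python sketch that parses that format (the helper names are hypothetical):

```python
def parse_scheduler(text):
    """Return (active, available) from the contents of a queue/scheduler file.

    Example content: "noop anticipatory deadline [cfq]"
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # strip the brackets marking the active one
            active = name
        available.append(name)
    return active, available


def read_scheduler(device):
    """Read and parse the scheduler file for a device name such as 'sda'."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return parse_scheduler(f.read())


# Hypothetical usage:
# active, available = read_scheduler("sda")
```

Recording the active scheduler alongside each benchmark run makes it harder to attribute a result to the wrong scheduler.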

Last edited by asimba; 10-13-2009 at 04:26 AM.
 
Old 10-16-2009, 08:52 PM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,103

Rep: Reputation: 4117
Have a read of this - it's reasonably old. I had seen a more expansive paper (this is almost an executive overview), but I can't find it now.
With outboard (hardware) caching controller(s) I would have expected NOOP to be right up there - especially if using O_DIRECT. Maybe try fio as a benchmarking tool - article here.
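
As a starting point with fio, a job file along these lines exercises random reads with the page cache bypassed. The Python sketch below only generates the job text. The file path, size, and queue depth are placeholder assumptions to be replaced with values that match the real workload.

```python
def fio_job(name, filename, rw="randread", bs="4k", iodepth=32,
            runtime=30, size="1g"):
    """Build the text of a simple fio job file (all values are placeholders)."""
    lines = [
        "[global]",
        "ioengine=libaio",       # Linux native asynchronous IO
        "direct=1",              # O_DIRECT, bypass the page cache
        f"runtime={runtime}",    # seconds
        "time_based",            # keep running until runtime expires
        f"size={size}",
        f"filename={filename}",
        "",
        f"[{name}]",
        f"rw={rw}",              # e.g. randread, randwrite, read, write
        f"bs={bs}",
        f"iodepth={iodepth}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical path; point it at a scratch file on the storage under test.
job_text = fio_job("rand-reader", "/mnt/test/fio.dat")
print(job_text)
```

Saving the output to a file and running fio against it once per scheduler keeps the workload identical between runs, so differences can be attributed to the scheduler rather than the test.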
 
Old 10-17-2009, 12:29 PM   #6
asimba
Member
 
Registered: Mar 2005
Location: 127.0.0.0
Distribution: Red Hat / Fedora
Posts: 355

Original Poster
Rep: Reputation: 42
I read it when I was googling around.

But it doesn't address a few more questions. I read a few kernel.org emails on IO scheduling - Vivek Goyal had some good material on this, on multi IO and IO controllers, and related topics.

I will paste an excerpt and the link itself shortly.
 
Old 10-17-2009, 12:34 PM   #7
asimba
Member
 
Registered: Mar 2005
Location: 127.0.0.0
Distribution: Red Hat / Fedora
Posts: 355

Original Poster
Rep: Reputation: 42
[EXCERPT]

http://people.redhat.com/~vgoyal/io-...ller-v10.patch


Fairness at logical device level vs at physical device level
------------------------------------------------------------

IO scheduler based controller has the limitation that it works only with the
bottom most devices in the IO stack where IO scheduler is attached.

For example, assume a user has created a logical device lv0 using three
underlying disks sda, sdb and sdc. Also assume there are two tasks T1 and T2
in two groups doing IO on lv0. Also assume that weights of groups are in the
ratio of 2:1 so T1 should get double the BW of T2 on lv0 device.

     T1    T2
       \  /
       lv0
      / | \
   sda sdb sdc


Now resource control will take place only on devices sda, sdb and sdc and
not at lv0 level. So if IO from two tasks is relatively uniformly
distributed across the disks then T1 and T2 will see the throughput ratio
in proportion to weight specified. But if IO from T1 and T2 is going to
different disks and there is no contention then at higher level they both
will see same BW.

Here a second level controller can produce better fairness numbers at
logical device but most likely at reduced overall throughput of the system,
because it will try to control IO even if there is no contention at physical
possibly leaving disks unused in the system.

Hence, question comes that how important it is to control bandwidth at
higher level logical devices also. The actual contention for resources is
at the leaf block device so it probably makes sense to do any kind of
control there and not at the intermediate devices. Secondly probably it
also means better use of available resources.
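
The lv0 example can be made concrete with a toy model. The Python sketch below assumes each disk shares its bandwidth among the tasks actively using it in proportion to their weights, and that every task can saturate any disk it touches. The 90 MB/s per-disk figure is invented for illustration and is not taken from the patch.

```python
def group_bandwidth(weights, placement, disk_bw):
    """Total bandwidth per task when control happens only at the leaf disks.

    weights:   {task: weight}
    placement: {task: [disks the task is doing IO on]}
    disk_bw:   {disk: bandwidth in MB/s}
    """
    totals = {task: 0.0 for task in weights}
    for disk, bw in disk_bw.items():
        active = [t for t in weights if disk in placement[t]]
        weight_sum = sum(weights[t] for t in active)
        for t in active:
            # each disk is split by weight among the tasks contending for it
            totals[t] += bw * weights[t] / weight_sum
    return totals


weights = {"T1": 2, "T2": 1}
disks = {"sda": 90, "sdb": 90, "sdc": 90}

# IO from both tasks spread uniformly over all three disks: 2:1 as configured.
uniform = group_bandwidth(
    weights,
    {"T1": ["sda", "sdb", "sdc"], "T2": ["sda", "sdb", "sdc"]},
    disks,
)
print(uniform)   # {'T1': 180.0, 'T2': 90.0}

# Each task on its own disk: no contention, so both see the same bandwidth.
disjoint = group_bandwidth(weights, {"T1": ["sda"], "T2": ["sdb"]}, disks)
print(disjoint)  # {'T1': 90.0, 'T2': 90.0}
```

In the second case sdc sits idle and the 2:1 weights have no visible effect at the lv0 level, which is the trade-off the excerpt describes.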

Limited Fairness
----------------
Currently CFQ idles on a sequential reader queue to make sure it gets its
fair share. A second level controller will find it tricky to anticipate.
Either it will not have any anticipation logic and in that case it will not
provide fairness to single readers in a group (as dm-ioband does) or if it
starts anticipating then we should run into these strange situations where
second level controller is anticipating on one queue/group and underlying
IO scheduler might be anticipating on something else.

Need of device mapper tools
---------------------------
A device mapper based solution will require creation of a ioband device
on each physical/logical device one wants to control. So it requires usage
of device mapper tools even for the people who are not using device mapper.
At the same time creation of ioband device on each partition in the system to
control the IO can be cumbersome and overwhelming if system has got lots of
disks and partitions with-in.


IMHO, IO scheduler based IO controller is a reasonable approach to solve the
problem of group bandwidth control, and can do hierarchical IO scheduling
more tightly and efficiently.

But I am all ears to alternative approaches and suggestions how doing things
can be done better and will be glad to implement it.

TODO
====
- code cleanups, testing, bug fixing, optimizations, benchmarking etc...
- More testing to make sure there are no regressions in CFQ.

Testing
=======

Environment
==========
A 7200 RPM SATA drive with queue depth of 31. Ext3 filesystem. I am mostly
running fio jobs which have been limited to 30 seconds run and then monitored
the throughput and latency.

Test1: Random Reader Vs Random Writers
======================================
Launched a random reader and then increasing number of random writers to see
the effect on random reader BW and max latencies.



 
Old 10-18-2009, 01:14 AM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,103

Rep: Reputation: 4117
That describes a very specific situation - do you even use cgroups? Do you create different pv's on the same (physical) disk - and assign them to different lv's? ...
As I suggested, fio might be the way to go - you can set up the test as you desire. If it's good enough for the guy writing those patches, it should work for you.
 
Old 10-19-2009, 03:38 AM   #9
asimba
Member
 
Registered: Mar 2005
Location: 127.0.0.0
Distribution: Red Hat / Fedora
Posts: 355

Original Poster
Rep: Reputation: 42
Sure,

Specific situation - maybe, maybe not.

As of now, no cgroups - but there is a strong possibility if we can prove our case (different benchmarks in different scenarios).

"Do you create different pv's on the same (physical) disk - and assign them to different lv's?"

In a few cases - yes.


[I won't be able to give more details on the existing scenario, since it's confidential.]
 
Old 10-22-2009, 09:02 AM   #10
SteveInTallyFL
Member
 
Registered: May 2008
Location: Tallahassee, FL
Distribution: RHEL4, RHEL 5, OEL4, OEL5
Posts: 65

Rep: Reputation: 17
Interesting discussion. Oracle prefers / suggests using ASM over raw devices for the database. In this way you accomplish multiple objectives:
  • Let the database and ASM determine and manage I/O scheduling using their heuristic models.
  • Manage database backup and recovery with RMAN.
 
Old 10-26-2009, 05:40 AM   #11
asimba
Member
 
Registered: Mar 2005
Location: 127.0.0.0
Distribution: Red Hat / Fedora
Posts: 355

Original Poster
Rep: Reputation: 42
Understandably, ASM might be a solution - but what about raw devices and SAN? Wouldn't the SAN controller still be merging read and write requests? Can ASM bypass the Fibre controller and manage IO itself? What about bottlenecks then - would it be easier to diagnose and isolate IO issues?

Going further [maybe unrelated], what about TCQ/NCQ [Tagged Command Queuing / Native Command Queuing]? Would ASM go further and interact with SAS/SCSI disks?

I appreciate all posters [in this thread] for their valuable time and feedback.

Regards,
a
 
  

