Old 07-31-2013, 09:27 AM   #16
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,109

Original Poster
Rep: Reputation: 807

Wow. Lots of good information to get my head around -- thank all of you for your input.

Bearing in mind that I've never done RAID of any kind and that what I've got is what I've got -- two 500 GB drives in each of two boxes, both of which are raw (delivered with no operating system) -- and that Slackware is the distribution of choice (stable, dependable, rock-solid), I've done some reading and studying and pretty much "know" that I would not want RAID-0, but I'm not sure whether some other RAID level would be useful. Never done it, think it's a good idea, but... I've just never done it.

I have a lot of experience with data base management systems (DBMS) where a basic tenet is that the data base doesn't live on the root disk (or root array), if for no other reason than that a data base will grow (sometimes to astonishing sizes) and I/O can be intense (try updating a 10 million row table joined to 10 or so other tables sometime and you'll get the idea). These two boxes are going to be running DSpace, which is an institutional repository application, one a production server, the other a development/back-up server (the only way that I really know how to do this sort of thing). Spread the load around is a basic rule of thumb. DSpace uses PostgreSQL as the DBMS (it can also use Oracle, but my experience with Oracle has been painful so that ain't gonna happen). As far as I can tell PostgreSQL is efficient and perfectly capable for the long haul (and there's none of the monkey business with MySQL, MariaDB and Oracle screwing around in the mix).

DSpace is both a catalog and a repository of documents, video, audio, images and whatever else you can think of -- you're pointing from the data base at a PDF, JPEG, MPEG, whatever -- a file somewhere on a storage device; not to overstate the obvious, but you don't store the images or PDFs or whatever you have in the data base itself. We have to catalog over 60,000 books (only a few published in the 20th century), lord knows how many historical documents, postage stamps from the beginning of postage stamps, coins from ancient to modern, scientific instruments from the beginning of those, you name it, we got it and we have to do something with it.

My thinking is that 500 GB is a helluva lot of gigabytes and two of 'em adds up to a terabyte spread over two drives and that's a bigger helluva lot. On my own servers I have a great deal of geographic data -- the entire world in vector, 10x10 degree "patches" of surface definition, road vectors, geographic names for every feature in every country, all that stuff. It only adds up to about 50 GB. The geographic names are text, they all live in a MySQL data base (plus the raw text data files) and there's a few million of those and it's no big deal. Used to be a big deal (did CD-ROM swapping) but on one itty-bitty 250 GB drive all is well.

Basically, I like the idea of the two drives (plus a spare or two in a box on a shelf). I'm fairly certain that I can do the entire thing on one 500 GB drive and, maybe, mirror or RAID the other drive. From what I've been reading (and I could have misread this) you can RAID two drives without striping (striping just seems silly to me given two drives) but I'm not all that clear about which/what/how to do that or if it's even worth it given the redundant systems.

And there is the other requirement -- off the shelf Slackware stable, no additional software added to do a fancy-schmancy trick. I don't want to rebuild the kernel, I don't want to install and rely on something that may or may not exist next week, I want to work with Slackware only. Yeah, I'm picky but I've got good reason for being so -- I'll buy stability any day of the week and I'll avoid proprietary like the plague (these boxes are stock Intel graphics for that reason). Gonna be bad enough just figuring out how to use DSpace (and get the users up to speed, too). I'm going to have to unload FoxBase data (good lord!) and import it for a base-line; that sounds like a whole lot of fun (never even seen FoxBase let alone used it and it went obsolete in... what, 1997? Ugh!).

On top of all that, DSpace relies on Apache: Ant, Maven, Tomcat (it runs in Tomcat). Holy toot, yet another learning curve. Oh, yeah, thanks to Apache for OpenOffice 4, gonna go that way, too -- I do like Apache software and trust it, gotta send those folks a check.

So, bottom line, it's RAID-something on the two drives to split the load, rsync to the second server, and keep my toes crossed.
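The rsync side of it I can at least picture -- something along these lines for the document store, plus a pg_dump of the data base rather than copying live data files (the host name and paths here are just placeholders, nothing is decided yet):
Code:
# sync the asset store to the second box (names are placeholders)
rsync -aH --delete /documents/ devbox:/documents/
# dump the PostgreSQL data base and ship the dump, rather than copying live files
su - postgres -c "pg_dump dspace" | gzip > /var/tmp/dspace.sql.gz
rsync -a /var/tmp/dspace.sql.gz devbox:/var/tmp/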

Tiz a puzzlement.
 
Old 07-31-2013, 10:12 AM   #17
zhjim
Senior Member
 
Registered: Oct 2004
Distribution: Debian Squeeze x86_64
Posts: 1,467
Blog Entries: 11

Rep: Reputation: 184
A lot of stuff is coming at you. So either dig into RAID as well, if you have the time, or just skip it if you don't.

As you already found out, RAID 0 is good for nothing besides getting even more speed out of fast disks. But RAID 1 makes your life easier when you're not in the office and a drive failure comes by. Okay, you could have somebody swap the broken system disk for the rsynced backup disk, but that needs someone around who knows what he's doing. With RAID 1, one drive fails and the server keeps going. The next day you just plug in the spare drive and go about daily business.
Just keep in mind that RAID is not backup. RAID is for keeping things going by means of redundancy; backup is for keeping your data safe. So even if you rsync for backup, when one drive fails you lose one half of that setup -- either the system disk or the backup disk. No different from RAID in that respect, I'd say.
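If it ends up being Linux software RAID, keeping an eye on the mirror is as simple as this (the md device name is just an example):
Code:
cat /proc/mdstat
mdadm --detail /dev/md0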

I think you already have a lot of things sorted out, so maybe refine this to the point where you're really, really sure about it, and then see if time allows for some stunts.
 
Old 07-31-2013, 11:25 AM   #18
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,013

Rep: Reputation: Disabled
Quote:
Originally Posted by tronayne View Post
I have a lot of experience with data base management systems (DBMS) where a basic tenet is that the data base doesn't live on the root disk (or root array), if for no other reason than that a data base will grow (sometimes to astonishing sizes) and I/O can be intense (try updating a 10 million row table joined to 10 or so other tables sometime and you'll get the idea). These two boxes are going to be running DSpace, which is an institutional repository application, one a production server, the other a development/back-up server (the only way that I really know how to do this sort of thing).
If it's a production server, you really want to use RAID.

Even if the application can fail over gracefully to the second server, having to reinstall a server and restore from backup because of something as trivial and commonplace as a hard drive failure could be considered an indication of suboptimal design.

The Dell PowerEdge T110 II has a built-in PERC H200 SAS controller. According to the specifications, this is a true hardware RAID controller (as opposed to a "fakeRAID" controller where the driver does all the heavy lifting). You're likely to see a slight to moderate performance increase for write operations compared to a single drive due to caching, and you may get significantly higher read performance if the controller distributes the I/O load between RAID members (not all controllers do that with RAID 1 arrays, though). Enable writeback caching (and put the server on UPS!) for maximum effect.

Quote:
Originally Posted by tronayne View Post
Spread the load around is a basic rule of thumb.
I quite agree. However, on modern systems with plenty of RAM and no other demanding services running, the OS may not represent a significant I/O load on a database server. You may very well see a higher increase in performance if you put the databases and log files on separate spindles than you would by isolating the OS on one set of spindles and having the DB and logs on another set.
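If you go that route with PostgreSQL, a tablespace on the second set of spindles is one way to do it -- a rough sketch, with the mount point and tablespace name made up:
Code:
# create a directory on the second array and hand it to PostgreSQL as a tablespace
mkdir -p /dbdisk/pgdata
chown postgres:postgres /dbdisk/pgdata
su - postgres -c "psql -c \"CREATE TABLESPACE dspace_ts LOCATION '/dbdisk/pgdata';\""
# new databases (or individual tables) can then be created in that tablespace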

If you really need the best possible performance, use SSDs. No spinning drives come even close to touching SSDs for random read and write performance, since the seek time is non-existent. Be aware, though, that SSDs fail in sudden and spectacular ways, as I mentioned earlier. RAID is mandatory in such setups.
Quote:
Originally Posted by tronayne View Post
Basically, I like the idea of the two drives (plus a spare or two in a box on a shelf). I'm fairly certain that I can do the entire thing on one 500 GB drive and, maybe, mirror or RAID the other drive. From what I've been reading (and I could have misread this) you can RAID two drives without striping (striping just seems silly to me given two drives) but I'm not all that clear about which/what/how to do that or if it's even worth it given the redundant systems.
You could use four drives and set up two separate RAID arrays on the same controller. And as for spare drives, if the controller supports Hot Spares, you can even plug the spare drive into the server. The drive will just sit there without even spinning up its platters until another drive fails and the controller needs to activate the spare.

"Striping" just refers to the fact that the data is split into "stripes" of a certain (configurable) size, which are then spread across the array members. The opposite of striping would be a non-redundant JBOD (Just a Bunch Of Disks (yes, really)) setup, where data simply spills over from the first drive to the next and so on.

Quote:
Originally Posted by tronayne View Post
And there is the other requirement -- off the shelf Slackware stable, no additional software added to do a fancy-schmancy trick. I don't want to rebuild the kernel, I don't want to install and rely on something that may or may not exist next week, I want to work with Slackware only.
The PERC H200 uses the mpt2sas driver, so that shouldn't be a problem. You may still want to consider installing the Dell System Management software, as it does temperature, fan, RAM (ECC) and drive health monitoring and can alert you of any issues via mail.

Last edited by Ser Olmy; 07-31-2013 at 11:26 AM.
 
Old 07-31-2013, 12:46 PM   #19
perbh
Member
 
Registered: May 2008
Location: Republic of Texas
Posts: 259

Rep: Reputation: 39
- and plueeeze, let us not forget a UPS. The RAID problems I have had have all been of this type: a power outage (if you're really (un)lucky, in the middle of a write), then rebooting as the power comes back -- and then another outage in the middle of the boot, and that's when your troubles start. I have had disks (RAID and non-RAID) where the partition table has been completely screwed up because of successive outages within a short time. I might well be unlucky where I live/work -- but we sure have a most unreliable power supply.... With non-RAID I can recover (if lucky) 'cuz I make a note/backup of the partition table.
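The note/backup of the partition table is nothing fancy -- roughly this (just a sketch, /dev/sda is an example):
Code:
# save the partition table to a text file, restore it the same way if needed
sfdisk -d /dev/sda > sda-partition-table.txt
sfdisk /dev/sda < sda-partition-table.txt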

Last edited by perbh; 07-31-2013 at 12:49 PM.
 
Old 07-31-2013, 12:56 PM   #20
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,013

Rep: Reputation: Disabled
Quote:
Originally Posted by perbh View Post
- and plueeeze, let us not forget a UPS. The RAID problems I have had have all been of this type: a power outage (if you're really (un)lucky, in the middle of a write), then rebooting as the power comes back -- and then another outage in the middle of the boot, and that's when your troubles start.
Some controllers have battery-backed cache. Provided they also disable the cache in the drive (and they usually do), sudden power outages are less likely to cause data corruption. You'll also see a message similar to this on the POST screen when the power comes back: "Valid data found in cache, writing to disk."
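If you end up with plain drives and software RAID instead, you can check and turn off the on-drive write cache yourself; a sketch, with the device name as an example only:
Code:
hdparm -W /dev/sda     # show whether the drive's write cache is enabled
hdparm -W0 /dev/sda    # turn the drive's write cache off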

Of course, having battery-backed cache memory on the RAID controller does not in any way mean that a UPS is not required.
 
Old 07-31-2013, 01:27 PM   #21
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,109

Original Poster
Rep: Reputation: 807
Quote:
Originally Posted by perbh View Post
- and plueeeze, let us not forget a UPS. The RAID problems I have had have all been of this type: a power outage (if you're really (un)lucky, in the middle of a write), then rebooting as the power comes back -- and then another outage in the middle of the boot, and that's when your troubles start. I have had disks (RAID and non-RAID) where the partition table has been completely screwed up because of successive outages within a short time. I might well be unlucky where I live/work -- but we sure have a most unreliable power supply.... With non-RAID I can recover (if lucky) 'cuz I make a note/backup of the partition table.
Oh, I won't (didn't?) -- there's a brand-spanking new APC Back-UPS XS 1000 (impressive name, huh?) sitting there getting its battery all charged up to keep the cable box, these boxes and the router going when the power goes away (with the apcupsd package from SlackBuilds going on the box first thing to do a clean shutdown when the power stays gone for longer than the battery can do). Got that at home, got that at the building. Love it.
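The apcupsd setup is minimal, as far as I can tell -- the handful of lines I expect to touch in /etc/apcupsd/apcupsd.conf look about like this (the values are just what I'm planning on, not gospel):
Code:
# /etc/apcupsd/apcupsd.conf -- the bits that matter for a USB Back-UPS
UPSCABLE usb
UPSTYPE usb
# DEVICE is left empty; apcupsd finds a USB UPS on its own
DEVICE
# shut down at 10% battery or 5 minutes of runtime left, whichever comes first
BATTERYLEVEL 10
MINUTES 5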

'Round here power outage is not an unusual event -- happens all the time, even if for a couple of seconds. One of the joys of living in the country -- matter of fact, my damned land line went down Saturday (storms) and won't be back up until tomorrow. Maybe. Cripes.

Now, if I was really serious, I'd have a diesel-powered generator that gets kicked in when the power goes off... come to think of it, I might propose a natural gas generator... the building is in town where there's natural gas, versus my house where it's propane or nuthin'.

Last edited by tronayne; 07-31-2013 at 01:37 PM. Reason: Forgot the last line, dang it.
 
Old 07-31-2013, 01:43 PM   #22
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,109

Original Poster
Rep: Reputation: 807
Quote:
Originally Posted by Ser Olmy View Post
If it's a production server, you really want to use RAID.
OK, I think I'm convinced -- RAID-1 with two disks? I do have two boxes, I'll do the first one RAID and see how that goes (lordy, I do hope it's easy to set the thing up following the README on the distribution DVD). Gotta lose my virginity someday. And I'm hoping the built-in hardware gets recognized without a lot of fiddling and twiddling. We'll see.

On a more serious note, thank you for the time and trouble you've taken -- a good read.
 
Old 07-31-2013, 01:50 PM   #23
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,013

Rep: Reputation: Disabled
Quote:
Originally Posted by tronayne View Post
OK, I think I'm convinced -- RAID-1 with two disks? I do have two boxes, I'll do the first one RAID and see how that goes (lordy, I do hope it's easy to set the thing up following the README on the distribution DVD).
That's the beauty of hardware RAID: You don't have to set it up on the client OS. All you do is press the right key on the POST screen to enter the RAID controller configuration tool, and define your RAID array there. It's a simple matter of selecting both disks and choosing "RAID 1" from a pulldown menu. You could also use the Unified Server Configurator to do this.

Once the RAID array is defined, the mpt2sas driver will pick it up and the drive will appear in Linux as /dev/sda, just like an ordinary disk. The kernel doesn't "see" the individual disks in the array at all, just the logical drive.
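If you want to double-check what the kernel ended up seeing, something like this will do (just a sketch):
Code:
cat /proc/partitions
fdisk -l /dev/sda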
 
Old 08-01-2013, 05:03 AM   #24
cascade9
Senior Member
 
Registered: Mar 2011
Location: Brisneyland
Distribution: Debian, aptosid
Posts: 3,718

Rep: Reputation: 904
Quote:
Originally Posted by tronayne View Post
Yeah, I'm picky but I've got good reason for being so -- I'll buy stability any day of the week and I'll avoid proprietary like the plague (these boxes are stock Intel graphics for that reason).
The Dell PowerEdge T110 II actually has Matrox G200eW video... and many of the Xeon CPUs that can be installed in that system do not have Intel video.

Quote:
Originally Posted by Ser Olmy View Post
The Dell PowerEdge T110 II has a built-in PERC H200 SAS controller. According to the specifications, this is a true hardware RAID controller (as opposed to a "fakeRAID" controller where the driver does all the heavy lifting). You're likely to see a slight to moderate performance increase for write operations compared to a single drive due to caching, and you may get significantly higher read performance if the controller distributes the I/O load between RAID members (not all controllers do that with RAID 1 arrays, though). Enable writeback caching (and put the server on UPS!) for maximum effect.
PERC H200 is an optional extra and may not be installed.
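Easy enough to check what actually shipped once the box is up -- a sketch:
Code:
lspci | grep -i -e raid -e sas
lsmod | grep mpt2sas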
 
Old 08-01-2013, 10:29 AM   #25
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,013

Rep: Reputation: Disabled
Quote:
Originally Posted by cascade9 View Post
PERC H200 is an optional extra and may not be installed.
In that case, the technical specifications are misleading. The H200 is listed under "RAID Controllers" in the same way the Broadcom NIC is listed under "Network Controller" immediately below.

Are you certain the H200 is an optional extra?

Edit: It seems you're right. Page 36 of the technical guide clearly shows that the H200 is only present in some configurations. On the other hand, under "Storage Controllers" it says:
"T110 II supports software RAID (PERC S100, PERC S300) and hardware RAID (PERC H200) for internal storage."
Does that mean that the H200 add-in card is installed in the basic model (as was common on earlier PowerEdge models) or not?

Edit II: Curiosity got the better of me and I had a chat with Dell support. The presence of the H200 depends on the configuration. If you select the "11 ASSCBL", "12 ASSR0CBL", "13 ASSR1CBL" or "14 ASSR10CBL" configuration, you get the H200.

@tronayne: Which version of the T110 II do you have?

Last edited by Ser Olmy; 08-01-2013 at 11:58 AM.
 
Old 08-10-2013, 10:59 AM   #26
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,109

Original Poster
Rep: Reputation: 807
I actually don't know yet exactly what configuration I have -- haven't opened the boxes yet, doing a lot of parallel build-test-dammit-fix-test... on the software on my existing server.

I have decided that RAID 1 is going to be the way to go (two 500 GB drives in each server). However, I don't know much of anything about RAID -- never had a need and never have done it. I've been reading (and re-reading) the README_RAID.TXT file on the distribution media and still haven't quite got it.

I partition systems so that /opt, /usr/local, /home, /var/lib/pgsql, /var/lib/mysql and /var/lib/virtual are separate mounted partitions. I do that so I can do a clean install of Slackware at each release: I format the root partition but do not format any of the others during setup, just add them to fstab. There won't be a separate partition for MySQL (or MariaDB) on these servers because the application they're for (DSpace) uses a PostgreSQL data base and there's no need for MySQL/MariaDB. DSpace is going to live in /opt, the data base(s) are going to live in /var/lib/pgsql and all add-on software source will live in /usr/local/packages (so I can quickly and easily install the required and optional packages; e.g., Apache Ant, Maven and Tomcat, OpenOffice, and SlackBuilds utilities, libraries and whatnot). Basically, I don't want a single giant partition.

There most likely will not be any virtual machines on these guys either -- no need, no want.

I'm pretty sure I'm going to have, roughly, root (20GB), swap, /home (20GB), /opt (20GB), /usr/local (20GB), /var/lib/pgsql (300GB), /documents (100GB). There's 16GB RAM, so swap may want to be 32GB, but that seems to be overkill and I'll settle on 16GB (I don't expect these guys to start swapping). The documents directory is for linked documents, audio, video and images (that don't get stored in the data base).

I want to use the default kernel, don't want to fine-tune, don't want to fiddle with the kernel in any way, the SMP kernel is just fine. I'm pretty sure that the boxes have hardware RAID and I ought to be able to turn that on with the built-in Dell configuration software.

Where I'm getting a little confused is with the partitions -- I can't imagine that I'll have trouble doing root plus four or five additional partitions and it looks like the partition types need to be RAID Autodetect (type FD) -- with the exception of swap (and will swap be duplicated on the second drive)?
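My reading of the type FD business is that it's just the partition type flag in fdisk -- something like this for each RAID member, if I've got it right:
Code:
# in fdisk on /dev/sda (and the same again on /dev/sdb):
#   t   - change a partition's type
#   fd  - "Linux raid autodetect"
#   w   - write the table and exit
fdisk /dev/sda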

The recommendation is 100MB unformatted space at the end of the drive (just in case); OK, it's a 500GB drive, 100MB is chump change.

And that's where I'm getting a little lost -- is it really that easy to
Code:
mdadm --create /dev/md1 --level 1 --raid-devices 2 /dev/sda2 /dev/sdb2
repeating that for each partition?

And, when running setup, assign each of them into /etc/fstab?

Sorry to be so ignorant but, as I said, I've never done this and don't want to screw it up, and if I've missed something important I'd really appreciate a heads-up about it.
 
Old 08-10-2013, 11:41 AM   #27
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,013

Rep: Reputation: Disabled
Quote:
Originally Posted by tronayne View Post
And that's where I'm getting a little lost -- is it really that easy to
Code:
mdadm --create /dev/md1 --level 1 --raid-devices 2 /dev/sda2 /dev/sdb2
repeating that for each partition?
Yes, it actually is, if you have to use software RAID (which, as my chat with Dell support indicated, you will unless your servers are equipped with the separate RAID controller).
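In that case the whole thing is essentially that one command repeated -- a sketch following your device names:
Code:
mdadm --create /dev/md1 --level 1 --raid-devices 2 /dev/sda2 /dev/sdb2
mdadm --create /dev/md2 --level 1 --raid-devices 2 /dev/sda3 /dev/sdb3
# ...one array per partition pair, then record them so they assemble at boot:
mdadm --detail --scan > /etc/mdadm.conf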

If you have a true hardware RAID controller, you configure the RAID using the BIOS utility and partition /dev/sda as if it were a single disk.

Quote:
Originally Posted by tronayne View Post
And, when running setup, assign each of them into /etc/fstab?
Exactly. You may want to use UUIDs or labels in /etc/fstab rather than the /dev/mdx devices, as I have seen at least one case where the numbering changed after a kernel (or perhaps udev?) upgrade.
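Finding the UUIDs is straightforward (the fstab line is just an illustration):
Code:
blkid /dev/md1
# then in /etc/fstab, something along the lines of:
# UUID=<uuid-from-blkid>   /opt   ext4   defaults   1 2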

You can stick with the standard Slackware kernel, but creating an initrd is probably a good idea as the RAID auto-detect feature has been marked as deprecated.
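The stock Slackware mkinitrd can do it; something along these lines, where the kernel version and root device are examples only (and /usr/share/mkinitrd/mkinitrd_command_generator.sh should suggest the exact command for your setup):
Code:
mkinitrd -c -k 3.2.45 -m ext4 -f ext4 -r /dev/md1 -R
# then add "initrd = /boot/initrd.gz" to the image section of /etc/lilo.conf and re-run lilo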

I would recommend using a RAID volume for swap as well, if nothing else then just to make sure a faulty disk doesn't cause a kernel panic. As for the size of the swap partition, keep in mind that dumping 16 GB worth of paging data to swap would take ages; the system will have become practically unusable long before you get that much swap usage. I'd stick with a gigabyte or two; the old rule of "twice the size of internal RAM" doesn't make sense on a system with 16 GB of RAM.
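The swap array itself is created the same way as the others; once it exists (md4 is just an example):
Code:
mkswap /dev/md4
swapon /dev/md4
# and in /etc/fstab:
# /dev/md4   swap   swap   defaults   0 0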

Last edited by Ser Olmy; 08-10-2013 at 11:45 AM.
 
  

