LinuxQuestions.org > Slackware (https://www.linuxquestions.org/questions/slackware-14/)
Two New Servers, UEFI, Am I Going to Have Trouble? (https://www.linuxquestions.org/questions/slackware-14/two-new-servers-uefi-am-i-going-to-have-trouble-4175471322/)

tronayne 07-31-2013 08:27 AM

Wow. Lots of good information to get my head around -- thank all of you for your input.

Bearing in mind that I've never done RAID of any kind and that what I've got is what I've got -- two 500 GB drives in each of two boxes, both raw, delivered with no operating system -- and that Slackware is the distribution of choice (stable, dependable, rock-solid), I've done some reading and studying and pretty much "know" that I would not want RAID-0, but I'm not sure whether some other RAID level would be useful. Never done it, think it's a good idea, but... I've just never done it.

I have a lot of experience with data base management systems (DBMS) where a basic tenet is that the data base doesn't live on the root disk (or root array), if for no other reason than that a data base will grow (sometimes to astonishing sizes) and I/O can be intense (try updating a 10 million row table joined to 10 or so other tables sometime and you'll get the idea). These two boxes are going to be running DSpace, which is an institutional repository application -- one a production server, the other a development/backup server (the only way that I really know how to do this sort of thing). Spread the load around is a basic rule of thumb. DSpace uses PostgreSQL as the DBMS (it can also use Oracle, but my experience with Oracle has been painful so that ain't gonna happen). As far as I can tell PostgreSQL is efficient and perfectly capable for the long haul (and there's none of the monkey business with MySQL, MariaDB and Oracle screwing around in the mix).

DSpace is both a catalog and repository of documents, video, audio, images and whatever else you can think of -- you're pointing from the data base at a PDF, JPEG, MPEG, whatever, to a file somewhere on a storage device; not to overstate the obvious, but you don't store images or PDF or whatever you have in the data base itself. We have to catalog over 60,000 books (only a few published in the 20th century), lord knows how many historical documents, postage stamps from the beginning of postage stamps, coins from ancient to modern, scientific instruments from the beginning of those, you name it, we got it and we have to do something with it.

My thinking is that 500 GB is a helluva lot of gigabytes and two of 'em adds up to a terabyte spread over two drives and that's a bigger helluva lot. On my own servers I have a great deal of geographic data -- the entire world in vector, 10x10 degree "patches" of surface definition, road vectors, geographic names for every feature in every country, all that stuff. It only adds up to about 50 GB. The geographic names are text, they all live in a MySQL data base (plus the raw text data files) and there's a few million of those and it's no big deal. Used to be a big deal (did CD-ROM swapping) but on one itty-bitty 250 GB drive all is well.

Basically, I like the idea of the two drives (plus a spare or two in a box on a shelf). I'm fairly certain that I can do the entire thing on one 500 GB drive and, maybe, mirror or RAID the other drive. From what I've been reading (and I could have misread this) you can RAID two drives without striping (striping just seems silly to me given two drives) but I'm not all that clear about which/what/how to do that or if it's even worth it given the redundant systems.

And there is the other requirement -- off the shelf Slackware stable, no additional software added to do a fancy-schmancy trick. I don't want to rebuild the kernel, I don't want to install and rely on something that may or may not exist next week, I want to work with Slackware only. Yeah, I'm picky but I've got good reason for being so -- I'll buy stability any day of the week and I'll avoid proprietary like the plague (these boxes are stock Intel graphics for that reason). Gonna be bad enough just figuring out how to use DSpace (and get the users up to speed, too). I'm going to have to unload FoxBase data (good lord!) and import it for a base-line; that sounds like a whole lot of fun (never even seen FoxBase let alone used it and it went obsolete in... what, 1997? Ugh!).

On top of all that, DSpace relies on Apache: Ant, Maven, Tomcat (it runs in Tomcat). Holy toot, yet another learning curve. Oh, yeah, thanks to Apache for OpenOffice 4, gonna go that way, too -- I do like Apache software and trust it, gotta send those folks a check.

So, bottom line, it's RAID-something on two drives to split the load, rsync to the second server, and keep my toes crossed.

Tiz a puzzlement.

zhjim 07-31-2013 09:12 AM

A lot of stuff coming at you. So either dig into RAID as well, if you have the time, or if not, just skip it.

As you already found out, RAID 0 is good for nothing besides getting even more speed out of fast disks. But RAID 1 makes your life easier when you're not in the office and a drive failure comes by. Okay, you could have somebody exchange the broken system disk with the rsynced backup disk, but you need someone around who knows what he's doing. With RAID 1, if one drive fails the server goes on, and the next day you just plug in the spare drive and go about daily business.
Just keep in mind that RAID is not backup. RAID is for keeping things going by means of redundancy. Backup is for keeping your things safe. So even if you rsync for backup, if one drive fails you lose one part of the backup -- either the system disk or the backup disk. No difference from the RAID, I'd say.
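
If it ends up being Linux software RAID rather than a hardware controller, a minimal sketch of keeping an eye on the array could look like this (the md device and the mail address are just placeholders):

Code:

# quick health check of the md arrays
cat /proc/mdstat
mdadm --detail /dev/md1

# have mdadm mail you when a member fails; the address can also be set
# with a MAILADDR line in /etc/mdadm.conf
mdadm --monitor --scan --daemonise --mail=root@localhost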

I think you have a lot of things already sorted out, so maybe refine this to the point where you're really, really sure about it and then see if time allows to do some stunts.

Ser Olmy 07-31-2013 10:25 AM

Quote:

Originally Posted by tronayne (Post 5000244)
I have a lot of experience with data base management systems (DBMS) where a basic tenet is that the data base doesn't live on the root disk (or root array), if for no other reason than that a data base will grow (sometimes to astonishing sizes) and I/O can be intense (try updating a 10 million row table joined to 10 or so other tables sometime and you'll get the idea). These two boxes are going to be running DSpace, which is an institutional repository application -- one a production server, the other a development/backup server (the only way that I really know how to do this sort of thing).

If it's a production server, you really want to use RAID.

Even if the application can fail over gracefully to the second server, having to reinstall a server and restore from backup because of something as trivial and commonplace as a hard drive failure could be considered an indication of suboptimal design.

The Dell PowerEdge T110 II has a built-in PERC H200 SAS controller. According to the specifications, this is a true hardware RAID controller (as opposed to a "fakeRAID" controller where the driver does all the heavy lifting). You're likely to see a slight to moderate performance increase for write operations compared to a single drive due to caching, and you may get significantly higher read performance if the controller distributes the I/O load between RAID members (not all controllers do that with RAID 1 arrays, though). Enable writeback caching (and put the server on UPS!) for maximum effect.

Quote:

Originally Posted by tronayne (Post 5000244)
Spread the load around is a basic rule of thumb.

I quite agree. However, on modern systems with plenty of RAM and no other demanding services running, the OS may not represent a significant I/O load on a database server. You may very well see a higher increase in performance if you put the databases and log files on separate spindles than you would by isolating the OS on one set of spindles and having the DB and logs on another set.
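
If you ever do have a second spindle (or array) to play with, a minimal sketch of moving the PostgreSQL write-ahead log off the data disk might look like this -- the /spindle2 mount point and the rc script name are assumptions, not DSpace or Slackware defaults:

Code:

# stop the cluster first (script name depends on how PostgreSQL was installed)
/etc/rc.d/rc.postgresql stop

# relocate pg_xlog (the WAL directory in the 8.x/9.x series) and symlink it back
mv /var/lib/pgsql/data/pg_xlog /spindle2/pg_xlog
ln -s /spindle2/pg_xlog /var/lib/pgsql/data/pg_xlog

/etc/rc.d/rc.postgresql start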

If you really need the best possible performance, use SSDs. No spinning drives come even close to touching SSDs for random read and write performance, since the seek time is non-existent. Be aware, though, that SSDs fail in sudden and spectacular ways, as I mentioned earlier. RAID is mandatory in such setups.

Quote:

Originally Posted by tronayne (Post 5000244)
Basically, I like the idea of the two drives (plus a spare or two in a box on a shelf). I'm fairly certain that I can do the entire thing on one 500 GB drive and, maybe, mirror or RAID the other drive. From what I've been reading (and I could have misread this) you can RAID two drives without striping (striping just seems silly to me given two drives) but I'm not all that clear about which/what/how to do that or if it's even worth it given the redundant systems.

You could use four drives and set up two separate RAID arrays on the same controller. And as for spare drives, if the controller supports Hot Spares, you can even plug the spare drive into the server. The drive will just sit there without even spinning up its platters until another drive fails and the controller needs to activate the spare.

"Striping" just refers to the fact that the data is split into "stripes" of a certain (configurable) size, which are then spread across the array members. The opposite of striping would be a non-redundant JBOD (Just a Bunch Of Disks (yes, really)) setup, where data simply spills over from the first drive to the next and so on.

Quote:

Originally Posted by tronayne (Post 5000244)
And there is the other requirement -- off the shelf Slackware stable, no additional software added to do a fancy-schmancy trick. I don't want to rebuild the kernel, I don't want to install and rely on something that may or may not exist next week, I want to work with Slackware only.

The PERC H200 uses the mpt2sas driver, so that shouldn't be a problem. You may still want to consider installing the Dell System Management software, as it does temperature, fan, RAM (ECC) and drive health monitoring and can alert you of any issues via mail.

perbh 07-31-2013 11:46 AM

- and plueeeze, let us not forget a UPS. Those RAID problems I have had have all been of the type: power outage (if you're really (un)lucky, in the middle of a write), then rebooting as the power comes back -- and then another outage in the middle of the boot, and that's when your troubles start. I have had disks (RAID and non-RAID) where the partition table has been completely screwed up because of successive outages within a short time. I might well be unlucky where I live/work -- but we sure have a most unreliable power supply.... With non-RAID I can recover (if lucky) 'cuz I make a note/backup of the partition table.
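
For what it's worth, a minimal sketch of making that note of the partition table (device and file names are just examples):

Code:

# dump the MBR partition table to a file you can restore from later
sfdisk -d /dev/sda > /root/sda-partition-table.txt
# restore with:  sfdisk /dev/sda < /root/sda-partition-table.txt

# for GPT disks (likely on a UEFI box), sgdisk can do the same:
sgdisk --backup=/root/sda-gpt-backup.bin /dev/sda
# restore with:  sgdisk --load-backup=/root/sda-gpt-backup.bin /dev/sda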

Ser Olmy 07-31-2013 11:56 AM

Quote:

Originally Posted by perbh (Post 5000378)
- and plueeeze, let us not forget a UPS. Those RAID problems I have had have all been of the type: power outage (if you're really (un)lucky, in the middle of a write), then rebooting as the power comes back -- and then another outage in the middle of the boot, and that's when your troubles start.

Some controllers have battery-backed cache. Provided they also disable the cache in the drive (and they usually do), sudden power outages are less likely to cause data corruption. You'll also see a message similar to this on the POST screen when the power comes back: "Valid data found in cache, writing to disk."

Of course, having battery-backed cache memory on the RAID controller does not in any way mean that a UPS is not required.

tronayne 07-31-2013 12:27 PM

Quote:

Originally Posted by perbh (Post 5000378)
- and plueeeze, let us not forget a UPS. Those RAID problems I have had have all been of the type: power outage (if you're really (un)lucky, in the middle of a write), then rebooting as the power comes back -- and then another outage in the middle of the boot, and that's when your troubles start. I have had disks (RAID and non-RAID) where the partition table has been completely screwed up because of successive outages within a short time. I might well be unlucky where I live/work -- but we sure have a most unreliable power supply.... With non-RAID I can recover (if lucky) 'cuz I make a note/backup of the partition table.

Oh, I won't (didn't?) -- there's a brand-spanking new APC Back-UPS XS 1000 (impressive name, huh?) sitting there getting its battery all charged up to keep the cable box, these boxes and the router going when the power goes away (with the apcupsd package from SlackBuilds going on the box first thing to do a clean shutdown when the power stays gone for longer than the battery can do). Got that at home, got that at the building. Love it.
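
For anyone following along, the part of /etc/apcupsd/apcupsd.conf that handles the clean shutdown looks roughly like this for a USB-connected Back-UPS -- a sketch only, and the values are examples rather than recommendations:

Code:

UPSCABLE usb
UPSTYPE usb
DEVICE
# shut down when 20% battery remains, or when about 5 minutes of runtime remain
BATTERYLEVEL 20
MINUTES 5
# with TIMEOUT 0, rely on BATTERYLEVEL/MINUTES rather than a fixed delay
TIMEOUT 0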

'Round here power outage is not an unusual event -- happens all the time, even if for a couple of seconds. One of the joys of living in the country -- matter of fact, my damned land line went down Saturday (storms) and won't be back up until tomorrow. Maybe. Cripes.

Now, if I was really serious, I'd have a diesel-powered generator that gets kicked in when the power goes off... come to think of it, I might propose a natural gas generator... the building is in town where there's natural gas, versus my house where it's propane or nuthin'.

tronayne 07-31-2013 12:43 PM

Quote:

Originally Posted by Ser Olmy (Post 5000336)
If it's a production server, you really want to use RAID.

OK, I think I'm convinced -- RAID-1 with two disks? I do have two boxes, I'll do the first one RAID and see how that goes (lordy, I do hope it's easy to set the thing up following the README on the distribution DVD). Gotta lose my virginity someday. And I'm hoping the built-in hardware gets recognized without a lot of fiddling and twiddling. We'll see.

On a more serious note, thanks for the time and trouble you've taken -- a good read.

Ser Olmy 07-31-2013 12:50 PM

Quote:

Originally Posted by tronayne (Post 5000417)
OK, I think I'm convinced -- RAID-1 with two disks? I do have two boxes, I'll do the first one RAID and see how that goes (lordy, I do hope it's easy to set the thing up following the README on the distribution DVD).

That's the beauty of hardware RAID: You don't have to set it up on the client OS. All you do is press the right key on the POST screen to enter the RAID controller configuration tool, and define your RAID array there. It's a simple matter of selecting both disks and choosing "RAID 1" from a pulldown menu. You could also use the Unified Server Configurator to do this.

Once the RAID array is defined, the mpt2sas driver will pick it up and the drive will appear in Linux as /dev/sda, just like an ordinary disk. The kernel doesn't "see" the individual disks in the array at all, just the logical drive.
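
A quick, hedged way to confirm that from the running system (with a PERC-style controller the model column usually shows a single virtual/logical disk rather than the two physical members):

Code:

lsblk -o NAME,SIZE,TYPE,MODEL
cat /proc/scsi/scsi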

cascade9 08-01-2013 04:03 AM

Quote:

Originally Posted by tronayne (Post 5000244)
Yeah, I'm picky but I've got good reason for being so -- I'll buy stability any day of the week and I'll avoid proprietary like the plague (these boxes are stock Intel graphics for that reason).

The Dell PowerEdge T110 II actually has Matrox G200eW video... and many of the Xeon CPUs that can be installed in that system do not have Intel video.

Quote:

Originally Posted by Ser Olmy (Post 5000336)
The Dell PowerEdge T110 II has a built-in PERC H200 SAS controller. According to the specifications, this is a true hardware RAID controller (as opposed to a "fakeRAID" controller where the driver does all the heavy lifting). You're likely to see a slight to moderate performance increase for write operations compared to a single drive due to caching, and you may get significantly higher read performance if the controller distributes the I/O load between RAID members (not all controllers do that with RAID 1 arrays, though). Enable writeback caching (and put the server on UPS!) for maximum effect.

PERC H200 is an optional extra and may not be installed.

Ser Olmy 08-01-2013 09:29 AM

Quote:

Originally Posted by cascade9 (Post 5000818)
PERC H200 is an optional extra and may not be installed.

In that case, the technical specifications are misleading. The H200 is listed under "RAID Controllers" in the same way the Broadcom NIC is listed under "Network Controller" immediately below.

Are you certain the H200 is an optional extra?

Edit: It seems you're right. Page 36 of the technical guide clearly shows that the H200 is only present in some configurations. On the other hand, under "Storage Controllers" it says:
"T110 II supports software RAID (PERC S100, PERC S300) and hardware RAID (PERC H200) for internal storage."
Does that mean that the H200 add-in card is installed in the basic model (as was common on earlier PowerEdge models) or not?

Edit II: Curiosity got the better of me and I had a chat with Dell support. The presence of the H200 depends on the configuration. If you select the "11 ASSCBL", "12 ASSR0CBL", "13 ASSR1CBL" or "14 ASSR10CBL" configuration, you get the H200.

@tronayne: Which version of the T110 II do you have?

tronayne 08-10-2013 09:59 AM

I actually don't know yet exactly what configuration I have -- haven't opened the boxes yet, doing a lot of parallel build-test-dammit-fix-test... on the software on my existing server.

I have decided that RAID 1 is going to be the way to go (two 500 GB drives in each server). However, I don't know much of anything about RAID -- never had a need and never have done it. I've been reading (and re-reading) the README_RAID.TXT file on the distribution media and still haven't quite got it.

I partition systems so that /opt, /usr/local, /home, /var/lib/pgsql, /var/lib/mysql and /var/lib/virtual are separate mounted partitions. I do that so I can do a clean install of Slackware at each release: I format the root partition but do not format any of the others during setup, just add them to fstab. There won't be a separate partition for MySQL (or MariaDB) on these servers because the application they're for (DSpace) uses a PostgreSQL data base and there's no need for MySQL/MariaDB. DSpace is going to live in /opt, the data base(s) are going to live in /var/lib/pgsql, and all add-on software source will live in /usr/local/packages (so I can quickly and easily install the required and optional packages; e.g., Apache Ant, Maven and Tomcat, OpenOffice, and SlackBuilds utilities, libraries and whatnot). Basically, I don't want a single giant partition.

There most likely will not be any virtual machines on these guys either -- no need, no want.

I'm pretty sure I'm going to have, roughly, root (20 GB), swap, /home (20 GB), /opt (20 GB), /usr/local (20 GB), /var/lib/pgsql (300 GB), /documents (100 GB). There's 16 GB RAM, so swap may want to be 32 GB, but that seems to be overkill and I'll settle on 16 GB (don't expect these guys to start swapping). The documents directory is for linked documents, audio, video and images (that don't get stored in the data base).

I want to use the default kernel, don't want to fine-tune, don't want to fiddle with the kernel in any way, the SMP kernel is just fine. I'm pretty sure that the boxes have hardware RAID and I ought to be able to turn that on with the built-in Dell configuration software.

Where I'm getting a little confused is with the partitions -- I can't imagine that I'll have trouble doing root plus four or five additional partitions and it looks like the partition types need to be RAID Autodetect (type FD) -- with the exception of swap (and will swap be duplicated on the second drive)?

The recommendation is 100MB unformatted space at the end of the drive (just in case); OK, it's a 500GB drive, 100MB is chump change.

And that's where I'm getting a little lost -- is it really that easy to
Code:

mdadm --create /dev/md1 --level 1 --raid-devices 2 /dev/sda2 /dev/sdb2
repeating that for each partition?

And then, when running setup, I just assign each of them in /etc/fstab?

Sorry to be so ignorant, but, as I said, I've never done this and don't want to screw it up, and if I've missed something important I'd really appreciate a heads-up about it.

Ser Olmy 08-10-2013 10:41 AM

Quote:

Originally Posted by tronayne (Post 5006800)
And that's where I'm getting a little lost -- is it really that easy to
Code:

mdadm --create /dev/md1 --level 1 --raid-devices 2 /dev/sda2 /dev/sdb2
repeating that for each partition?

Yes, it actually is, if you have to use software RAID (which, as my chat with Dell Support indicated, you will unless your servers are equipped with a separate RAID controller).

If you have a true hardware RAID controller, you configure the RAID using the BIOS utility and partition /dev/sda as if it were a single disk.

Quote:

Originally Posted by tronayne (Post 5006800)
And then, when running setup, I just assign each of them in /etc/fstab?

Exactly. You may want to use UUIDs or labels in /etc/fstab rather than the /dev/mdx devices, as I have seen at least one case where the numbering changed after a kernel (or perhaps udev?) upgrade.
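
A minimal sketch of the label approach -- the label, md device and mount point below are examples tied to the partition plan above, not anything setup does for you:

Code:

# give the filesystem a label when formatting, then check what blkid reports
mkfs.ext4 -L PGSQL /dev/md4
blkid /dev/md4

# /etc/fstab can then refer to the label (or the UUID) instead of /dev/md4:
# LABEL=PGSQL   /var/lib/pgsql   ext4   defaults   1 2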

You can stick with the standard Slackware kernel, but creating an initrd is probably a good idea as the RAID auto-detect feature has been marked as deprecated.
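
A sketch of what that looks like on Slackware; the kernel version, filesystem and root device are examples only, and /usr/share/mkinitrd/mkinitrd_command_generator.sh will suggest the right values for your install:

Code:

# build an initrd that assembles the md arrays at boot (-R)
mkinitrd -c -k 3.2.45 -f ext4 -r /dev/md0 -m ext4 -R

# then add "initrd = /boot/initrd.gz" to the image section in /etc/lilo.conf
# and re-run lilo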

I would recommend using a RAID volume for swap as well, if nothing else then just to make sure a faulty disk doesn't cause a kernel panic. As for the size of a swap partition, keep in mind that dumping 16 GB worth of paging data to swap would take ages. The system will have become practically unusable long before you get that much swap usage. I'd stick with a gigabyte or two; the old rule of "twice the size of internal RAM" doesn't make sense on a system with 16 GB of RAM.
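
And a minimal sketch of the swap-on-RAID part, with the device names and partition numbers as examples only:

Code:

mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mkswap /dev/md2
swapon /dev/md2

# /etc/fstab entry:
# /dev/md2   swap   swap   defaults   0 0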

