LinuxQuestions.org

-   -   A few ZFS questions for new media server build (https://www.linuxquestions.org/questions/linux-newbie-8/a-few-zfs-questions-for-new-media-server-build-4175615890/)

road hazard 10-18-2017 07:56 AM

A few ZFS questions for new media server build
 
First off, I'd still consider myself a Linux newbie.

On one of my media servers, I've been thinking about upgrading the drives to 8TB models and switching over from using Mint/mdadm (raid 6) to Antergos (or Debian Stretch) and ZFS (RaidZ2).

By nature, the version of ZFS in Debian is a few months old (0.6.5.9-?) and as time goes by, it will get more and more out of date. I REALLY like rolling distros and this is why I've been playing around with Antergos. Being a rolling distro, Antergos is quick on the draw with ZFS updates. But do you think it's worth trading POTENTIAL stability issues for newer ZFS versions? I'm not saying Antergos isn't stable (knock on wood), but since Debian Stretch is considered Mount Rushmore rock solid and stable, and my media server needs 24x7 up-time, I'm not sure which route to go. I also agree that sometimes newer software can have newer bugs that are just as dangerous as keeping older versions.

RAM. If I'm not going to be using dedupe, do I really need a crap ton of RAM? When my RaidZ2 pool is built, I'll have roughly 40+TB of usable space.

Speaking of RAM, after much research, I think I'm coming down on the side of NOT using ECC. I know, I know.

My main reason for wanting to switch over to ZFS is reducing the rebuild time. A few months ago, I expanded my array and while I maintained 24x7 up-time, the rebuild/expansion/etc. took about a full week. With ZFS, the only feature I'll use/care about is that during a rebuild (resilver), only the blocks actually in use are copied. I'm not too worried about data rot; a quick rebuild is the only thing I'm after.

I set 'dedup=off' on my pool and took care of that. I also turned off compression. When I look at the status of the pool, I see 1.0x under dedup. Does that mean it's off? (Want to make sure I did it right.)
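For reference, the 1.00x figure that `zpool list` prints under DEDUP is the pool's deduplication ratio, not the on/off setting; the settings themselves are the `dedup` and `compression` properties, which `zfs get` reports. Below is a minimal sketch in Python that reads both. It assumes a pool named `tank` (a placeholder, not the actual pool name from this thread) and that the `zfs` binary is on the PATH; the helper names are made up for illustration.

```python
import subprocess


def parse_zfs_get(output):
    """Parse the output of `zfs get -H -o property,value ...` into a dict."""
    props = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # With -H, zfs prints one tab-separated record per line and no header.
        name, value = line.split("\t")[:2]
        props[name] = value
    return props


def read_pool_settings(dataset="tank"):
    """Return the dedup and compression settings of `dataset` (placeholder name)."""
    result = subprocess.run(
        ["zfs", "get", "-H", "-o", "property,value", "dedup,compression", dataset],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_zfs_get(result.stdout)
```

Calling `read_pool_settings("tank")` on a machine with that pool should return a dict with one entry for `dedup` and one for `compression`; if both read `off`, the properties are set as intended.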

Sanity check: snapshots don't happen unless I create one, right?

I think this covers my main talking points. Thanks.

IsaacKuo 10-18-2017 10:52 AM

Unless there's some specific feature you want or anticipate, I'd go with Debian Stable. If a feature pops up which you are just really itching to get, you can always dist-upgrade to Debian Testing or Debian Unstable to get it (bearing in mind the potential risk, of course).

So, you're talking 8 8TB drives? Although it's not as technically exciting, I'd consider mirror vdevs or simple RAID1 pairs. Yeah, you only get 32TB of storage space that way, but you'll get very fast rebuilds and minimal performance impact during rebuilds.

What sort of backup do you have? I understand that it can be disappointing to only get 1/3 of your drive space as "usable" when you have a backup in addition to RAID1, but that's really a reasonable price to pay in order to get 24/7 uptime. Although really, this doesn't get you 24/7 uptime. If you really want 24/7 uptime you need some sort of cluster (to account for motherboard/CPU/power supply/etc failure).

I haven't had a need for true 24/7 uptime, so my preference has been to use rsync to maintain a pseudo-mirror backup on another computer rather than RAID1. When my primary server fails, then...well, it's down until I physically swap in my secondary (physically moving the computers, swapping my PCI Hauppauge card, plugging in the cables, etc). Down time is minutes. Obviously not good for some sort of business use, but it's okay for me. It's a drop in the bucket compared to more frequent down time from power outages and internet service outages.

But if I needed true 24/7 uptime, I'd be looking at something like GlusterFS. With 4 8TB drives in one server, and the other 4 8TB drives in another server, it would provide 32TB of storage...

AwesomeMachine 10-18-2017 05:01 PM

24/7 uptime also requires some alternative to utility power.

road hazard 10-18-2017 09:01 PM

Quote:

Originally Posted by IsaacKuo (Post 5771296)
So, you're talking 8 8TB drives? Although it's not as technically exciting, I'd consider mirror vdevs or simple RAID1 pairs. Yeah, you only get 32TB of storage space that way, but you'll get very fast rebuilds and minimal performance impact during rebuilds.

I'm leaning towards seven 8TB drives. RaidZ2 will give me 40TB of space to play with. Last time I did an expansion, nobody complained that Plex wasn't able to play back movies but it took about a week. Non-stop I/O had my heart racing until it was finished but it crossed the finish line with no problems. (Had six 4TB drives and added 2 more. Raid 6.) One dumb thing I forgot to do was point my cache folder to something other than my array. I'm sure this would have sped things up a hair.
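The capacity figures traded above (40TB for seven 8TB drives in RaidZ2, 32TB for eight 8TB drives as mirror pairs) can be checked with a few lines of Python. This is only nominal arithmetic: it ignores ZFS metadata and padding overhead and the TB/TiB difference, and the function names are made up for illustration.

```python
def raidz_usable_tb(drives, drive_tb, parity):
    """Nominal usable space of one RAIDZ vdev: data drives times drive size."""
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    return (drives - parity) * drive_tb


def mirror_pairs_usable_tb(drives, drive_tb):
    """Nominal usable space of two-way mirror pairs: one drive of each pair."""
    if drives % 2:
        raise ValueError("two-way mirrors need an even number of drives")
    return (drives // 2) * drive_tb


print(raidz_usable_tb(7, 8, parity=2))  # seven 8TB drives, RaidZ2 -> 40
print(mirror_pairs_usable_tb(8, 8))     # eight 8TB drives, mirror pairs -> 32
```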

Quote:

Originally Posted by IsaacKuo (Post 5771296)
What sort of backup do you have? I understand that it can be disappointing to only get 1/3 of your drive space as "usable" when you have a backup in addition to RAID1, but that's really a reasonable price to pay in order to get 24/7 uptime. Although really, this doesn't get you 24/7 uptime. If you really want 24/7 uptime you need some sort of cluster (to account for motherboard/CPU/power supply/etc failure).

I haven't had a need for true 24/7 uptime, so my preference has been to use rsync to maintain a pseudo-mirror backup on another computer rather than RAID1. When my primary server fails, then...well, it's down until I physically swap in my secondary (physically moving the computers, swapping my PCI Hauppauge card, plugging in the cables, etc). Down time is minutes. Obviously not good for some sort of business use, but it's okay for me. It's a drop in the bucket compared to more frequent down time from power outages and internet service outages.

But if I needed true 24/7 uptime, I'd be looking at something like GlusterFS. With 4 8TB drives in one server, and the other 4 8TB drives in another server, it would provide 32TB of storage...

Right now, backups are done every night to a 2nd array in the same box. I know this isn't the best situation but due to money constraints, this is my only option. I'm in the process of putting together a list of parts for my next media server and it will probably be a Ryzen 1700X CPU and an AM4 board with 8 SATA ports. I'll use 1 SATA port for my boot SSD and plug the seven 8TB drives into the other ports. While the Ryzen CPU supports ECC, the board I was looking at does not. :( If I moved over to Threadripper and the sTR4? boards (X399), I saw some boards that got good reviews and support ECC RAM but they're in the $300+ range, which is a bit more than I want to spend on a board. After this "2.0" media server gets built, my current one will be re-purposed as a backup server.

By "24x7 up-time", I mean: if I need to replace a failed disk and rebuild the array, I want everyone in the house to still be able to watch movies. If my power supply fails, everyone understands that Plex will be off-line until I get to Microcenter and get a replacement. :)

AwesomeMachine 10-18-2017 10:27 PM

Do you have many movies?

road hazard 10-18-2017 10:56 PM

Quote:

Originally Posted by AwesomeMachine (Post 5771507)
Do you have many movies?

Yep. Between movies, MP3s, home movies, digital photos and TV shows recorded OTA with my trusty HDHR, I'm using about 14TB of storage.

IsaacKuo 10-19-2017 09:57 AM

Question - how are the devices configured to connect to your Plex server? Is it by IP address or by host name? Do you have a local nameserver (like dnsmasq, or maybe your router)?

If it's by host name, then there are some nice options for doing a pseudo-cluster...

If I were in your situation, my thought would be to make a pseudo-cluster of two media servers. The "primary" would be the new box. The "secondary" would be the old box. I'd split the drives between the two...something like 4x8TB + 4x4TB in each. To save on SATA ports and also money, I'd put the OS/cache on fast USB3.0 thumbdrives rather than SSDs. For these purposes, I don't think the performance difference will be noticeable (read speed on a fast USB3.0 thumbdrive will greatly exceed the maximum speed of gigabit ethernet).

This isn't a true cluster, though. It's just two servers where the "secondary" is actually a backup - using rsnapshot or simple rsync to regularly copy over files from the primary. I prefer customizing my own rsnapshot style rsync/hardlink scripts, but the basic idea is to be a backup that also stores some historical deltas.
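As a rough illustration of the rsnapshot-style rsync/hardlink idea, here is a minimal Python sketch that only builds the rsync command line for one dated snapshot. It relies on rsync's `--link-dest` option, which hard-links files that are unchanged since the previous snapshot instead of copying them again. The host name `primary`, the paths, and the function name are all hypothetical; this is not the actual script described above.

```python
import datetime
import os


def build_snapshot_command(source, backup_root, previous=None, today=None):
    """Build an rsync command that writes a dated snapshot under backup_root.

    `previous` is the directory name of the last snapshot, if there is one.
    """
    today = today or datetime.date.today()
    target = os.path.join(backup_root, today.isoformat())
    cmd = ["rsync", "-a"]
    if previous:
        # Unchanged files become hard links into the previous snapshot,
        # so each new snapshot only costs the space of what changed.
        cmd.append("--link-dest=" + os.path.join(backup_root, previous))
    # The trailing slash on the source copies the directory's contents.
    cmd += [source.rstrip("/") + "/", target + "/"]
    return cmd


print(build_snapshot_command("primary:/srv/media", "/backup/media",
                             previous="2017-10-18"))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on the secondary server; deleting the oldest snapshot directories is then enough to expire old history.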

But it's like a cluster in the sense that you can have some devices point to the secondary rather than the primary. This lets you spread transcoding load so it's not all on the primary. We're not talking fancy dynamic load balancing, but you can manually adjust load balance by choosing which device points to which server.

Assuming you have a local nameserver, you can handle the case when one server goes down by pointing both host names to the remaining server. If things are configured by IP address, though, the remaining server would have to take over the failed server's IP address as an additional address on its NIC. It may be simpler to just have all devices pointed to the same IP address (no load spreading), and change the secondary's IP address in case the primary breaks.

And what about a single drive failure? My preference is to just have plain old independent ext4 partitions on each data drive, and use symlinks to point to the various drives. If one drive fails, I just remove that symlink and replace it with a symlink to an nfs mount of the other server's equivalent drive. Hopefully I have enough spare drive space somewhere to make a just-in-case backup of that drive, while waiting for a replacement drive to arrive.
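The symlink swap described above could look something like the following Python sketch. It creates the new link under a temporary name and renames it over the old one, so there is no moment when the path is missing. The paths in the usage note are hypothetical, and the function name is made up for illustration.

```python
import os


def repoint_symlink(link_path, new_target):
    """Point link_path at new_target, replacing any existing link in one rename."""
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)  # clear a leftover from an interrupted earlier run
    os.symlink(new_target, tmp_path)
    # rename() replaces the old symlink itself; it does not follow it.
    os.replace(tmp_path, link_path)
```

For example, `repoint_symlink("/srv/media/tv", "/mnt/nfs/secondary/disk3/tv")` would switch a share from the failed local drive to the other server's NFS mount, and the same call with the local path switches it back once the replacement drive is filled.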

When the replacement drive arrives, I'd just swap in the drive and use rsync to copy over the relevant files. Over gigabit ethernet, it's practically as fast as a local drive copy.
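To put a rough number on a refill over the network: gigabit ethernet tops out at 1 Gbit/s, i.e. 125 MB/s before protocol overhead. The Python sketch below turns that into a best-case copy time. It assumes decimal terabytes, and `efficiency` is a hypothetical fudge factor for overhead, not a measured value.

```python
def network_copy_hours(data_tb, link_gbit_per_s=1.0, efficiency=1.0):
    """Best-case hours to move data_tb decimal terabytes over the link."""
    bytes_total = data_tb * 1e12
    bytes_per_s = link_gbit_per_s * 1e9 / 8 * efficiency
    return bytes_total / bytes_per_s / 3600


print(round(network_copy_hours(8), 1))  # a completely full 8TB drive -> 17.8
```

A drive that is only partly full scales down linearly, since rsync only copies the files that are actually there.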

I just have a preference for using boring ext partitions rather than various fancy methods of combining multiple drives into one file system. Over the years, I have used various things like software RAID, aufs, and so on. I keep coming back to boring ext partitions because of simplicity and because it plays particularly nicely with spinning down drives (reduced noise/heat/power). I can put the most used files on just one drive, so the other drives can spin down.


All times are GMT -5.