Home Server: RAID-GREED and Why Mirroring is Still Best
After moving my blog to its new home and getting my hands dirty with Drupal, it's time to continue my series of blog articles about setting up a home server. Remember? We talked about home server requirements, then I presented to you my small and energy-efficient, still ECC-protected and powerful AMD-based home server. Now it's time to explore some different ZFS disk pool RAID strategies.
The great thing about ZFS for home servers is that it gives you the power of RAID without the need to pay for expensive RAID-cards. In fact, the RAID options offered by ZFS rival those of really big, powerful and expensive enterprise disk systems! And all that with cheap, consumer-grade disks.
A lot of this has been written already, so let me focus on ZFS RAID options in a home server context (you can skip this short ZFS intro and jump straight to the RAID-Greed discussion below if you know this already):
Home Server ZFS RAID Options
For home servers, you have the same options as your big server brothers in datacenter-land. Enterprise RAID power to the people!
- RAID-0: Basic striping. More disks, more space. Simple but dangerous: One disk breaks, and your whole pool is lost. Only recommended in combination with a RAID level that provides fault tolerance. See the zpool(1M) add man page.
- RAID-1: Basic mirroring. More disks, more reliability, same capacity. If a disk breaks, you still have enough disks left with all the data. You also get more read speed: ZFS will fetch data blocks in a round-robin fashion from all disks in the mirror, in parallel. While 2-way mirrors are most often used, 3- or more-way mirrors are possible, too. See zpool(1M) attach for details.
- RAID-Z: Similar to RAID-5, you get n-1 disks worth of space, and spend 1 parity disk for fault tolerance. One disk breaks, you still get to your data. Performance is more complicated: Writes are spread across all disks, so essentially, per I/O, you'll always get the performance of a single disk. Same for reads. If you're lucky and the stars align and you read a lot of data at once, you may get more than that. I'll let the zpool(1M) man page explain the rest.
- RAID-Z2 and -Z3: Same as RAID-Z but you get to spend 2 or 3 disks for parity. This means that 2 or 3 disks may fail before you lose any data. You guessed it: The zpool(1M) man page is your friend.
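The capacity and fault-tolerance rules in the list above can be written down as a small sketch. The function names and the layout strings are made up for illustration, the disks are assumed to be equal-sized, and metadata overhead is ignored:

```python
# Rough sketch of the rules of thumb from the list above, for ONE group of
# equal-sized disks. Names are hypothetical; ZFS overhead is ignored.

PARITY_DISKS = {"raidz": 1, "raidz2": 2, "raidz3": 3}


def usable_disks(layout: str, disks: int) -> int:
    """How many disks' worth of space the group provides."""
    if disks < 1:
        raise ValueError("need at least one disk")
    if layout == "stripe":
        return disks                      # RAID-0: every disk adds space
    if layout == "mirror":
        return 1                          # RAID-1: all disks hold the same data
    if layout in PARITY_DISKS:
        parity = PARITY_DISKS[layout]
        if disks <= parity:
            raise ValueError("need more disks than parity disks")
        return disks - parity             # RAID-Z: n minus the parity disks
    raise ValueError(f"unknown layout: {layout}")


def tolerated_failures(layout: str, disks: int) -> int:
    """How many disks of the group may break before data is lost."""
    if layout == "stripe":
        return 0
    if layout == "mirror":
        return disks - 1
    if layout in PARITY_DISKS:
        return PARITY_DISKS[layout]
    raise ValueError(f"unknown layout: {layout}")
```

For example, usable_disks("raidz", 3) gives 2 disks of space with 1 tolerated failure, while a 3-way mirror gives 1 disk of space with 2 tolerated failures.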
What ZFS Gives You that Controllers Can't
So far, so good. There are a couple of things that ZFS RAID levels offer that traditional RAID controllers can't:
- ZFS RAID is about as fast as hardware RAID. That's right, there's no need to pay money for RAID controllers any more. The explanation is hidden in the algorithms behind ZFS and there are a couple of articles out there to explain why. You can trust me on this or ask me to write an extra article on ZFS performance.
- ZFS can detect and recover from partial failures. It's easy to detect a disk that is completely broken: It won't respond to any SATA/SCSI commands. It's harder in the cases where the disk seems happy, but returns bad data. That's where you lose data without knowing, and these cases are much more common! ZFS will detect any bad block and be able to recover from it, if you configured any RAID level above 0. Before ZFS, nothing could give you that level of data integrity. In my home server experience with consumer-grade disks, I typically see checksum errors every 4-6 months and about 1 fully broken disk every 2 years, with a total population of about 4-6 disks.
- ZFS is open-source and cross-platform. This is important, because you can rip out ZFS disks from any server, mix them if you like, then put them into any other server that speaks ZFS, and your pool will be readable. Try that with a 3 year old RAID controller when it's broken and you can't buy any replacement one!
There's still more but let's focus on a different question, one that home server builders tend to neglect too often:
RAID-GREED: Is RAID-Z the Right Choice?
I've seen many people, enterprise customers, developers, consultants and home server builders blindly deciding to use RAID-Z (or RAID-5 if they use a controller). It seems like a natural choice: You buy disks by the Gigabyte, you want to get the most out of them, so you configure them for maximum space, because all that counts is capacity. Really?
Let's say we want to create a 2TB pool. What are the options?
- A single 2TB disk or multiple smaller, striped disks: Forget it. We're talking consumer disks here, they'll break sooner rather than later and then all of your data will be lost. If this is a backup, or scratch space and you have other copies elsewhere, this may work, but not for the main data pool of your server.
- Mirroring: Buy 2 x 2 TB disks and mirror them together. Today, a pair of Samsung F3 HD203WI (the cheapest 2 TB disks I could find on Amazon.de) will cost you EUR 280 and you're done. You could also buy 4 x 1 TB disks and stripe together 2 mirrors of 2 disks to get a total of 2 TB, but that would be slightly more expensive (around EUR 300). If you're after performance, this is still a good option, because 4 disks in a striped mirror tend to be twice as fast as 2.
- RAID-Z: Buy 3 x 1 TB (for example the Samsung F3 HD103SJ, which is the small brother of the 2TB drive above), and you'll get a 2TB RAID-Z pool for about EUR 225.
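To put the prices of these options side by side, here is a small sketch of the arithmetic. The prices are the ones quoted above; the dictionary and function names are just for illustration:

```python
# Prices (EUR) and usable capacity (TB) of the 2 TB options quoted above.
OPTIONS = {
    "2-way mirror (2 x 2 TB)": {"price_eur": 280, "usable_tb": 2},
    "striped mirrors (4 x 1 TB)": {"price_eur": 300, "usable_tb": 2},
    "RAID-Z (3 x 1 TB)": {"price_eur": 225, "usable_tb": 2},
}


def price_per_usable_tb(option: str) -> float:
    """EUR spent per TB of usable pool space."""
    entry = OPTIONS[option]
    return entry["price_eur"] / entry["usable_tb"]


def saving_percent(baseline_eur: float, alternative_eur: float) -> float:
    """How much cheaper the alternative is, in percent of the baseline."""
    return 100 * (baseline_eur - alternative_eur) / baseline_eur


if __name__ == "__main__":
    for name in OPTIONS:
        print(f"{name}: EUR {price_per_usable_tb(name):.2f} per usable TB")
    print(f"RAID-Z vs. 2-way mirror: {saving_percent(280, 225):.1f}% cheaper")
```

With these numbers, RAID-Z comes out a bit under 20% cheaper than the simple 2-way mirror.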
Mirroring and RAID-Z Compared
So how do Mirroring and RAID-Z compare?
- Price: RAID-Z wins by about 20%. Nice.
- Performance: We'll assume a random read usage pattern. The mirror will write at about the same performance as a single disk, and so will the RAID-Z pool. But for reads, the mirror will be up to twice as fast, because data blocks can be fetched in parallel from both disks. The RAID-Z configuration needs to access all disks for every single block of data, so there's not much to gain here. There is a corner case where streaming large amounts of data (if you're lucky) can take advantage of all the disks in parallel, but this doesn't apply to the regular usage pattern of a home server.
  The difference gets worse if you increase the number of disks: A 2 x 2 mirror will be roughly twice as fast for writes and 4 times as fast for reads, while a 2+1 RAID-Z is still stuck at the speed of a single disk for both writes and reads.
  Does performance matter for a home server? Maybe, maybe not: Average performance for a single drive of this class is around 60-70 MB/s (and I've seen that on my current home server, too), but this is only about 50% of what Gigabit Ethernet offers, so you won't be able to run a backup at full network speed. But I agree that 60 MB/s may be enough for most use cases.
- Fault-Tolerance: In the simplest case, there's almost a draw: Both the mirror and the RAID-Z config can compensate for 1 broken disk. But since the RAID-Z pool has more disks (and assuming roughly the same per-disk probability of disk failure), there's a bigger chance that a disk breaks and that you'll need to order a replacement (and replace it quickly enough!) for RAID-Z. The difference increases with the number of disks: In a 2 x 2 mirror scenario with 2 TB disks, up to 2 disks may break, if they're the right ones, before data is lost. In a 3+1 RAID-Z scenario (same number of disks), still only one disk may break before you're in trouble.
- Flexibility: So you want to upgrade your pool because your new DSLR filled the old one too quickly with all those RAWs. In the 2-way mirror case, you just buy 2 extra disks (of your choice) and zpool add them, and you're done. Or you buy 2 bigger disks and replace your old disks with them, anticipating that the old one may break soon. With RAID-Z, the entry hurdle to replacement starts at 3 disks and goes up with stripe size, unless you want to end up with a chaotically mixed RAID-Z+Mirror configuration (this works, but is not recommended). This may become unwieldy if your server case isn't the biggest and it introduces yet more disks into your pool that may break and add to your gray hair count.
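The "if they're the right ones" part of the fault-tolerance comparison can be checked by brute force. The following is only a sketch: the disk names and the pool descriptions are invented, and it simply counts which combinations of failed disks each 4-disk layout survives, with no replacement or resilvering in between:

```python
from itertools import combinations

# A pool is a list of groups; each group is (disk names, failures it tolerates).
# Disk names a..d are placeholders for illustration.
MIRROR_2X2 = [(("a", "b"), 1), (("c", "d"), 1)]    # two striped 2-way mirrors
RAIDZ_3PLUS1 = [(("a", "b", "c", "d"), 1)]         # one 3+1 RAID-Z group


def pool_survives(pool, failed):
    """The pool lives as long as no group loses more disks than it tolerates."""
    return all(len(set(disks) & failed) <= tolerance for disks, tolerance in pool)


def surviving_combinations(pool, failures):
    """Return (survived, total) over all ways that `failures` disks can break."""
    all_disks = [disk for disks, _ in pool for disk in disks]
    combos = list(combinations(all_disks, failures))
    survived = sum(pool_survives(pool, set(combo)) for combo in combos)
    return survived, len(combos)


if __name__ == "__main__":
    for name, pool in (("2 x 2 mirror", MIRROR_2X2), ("3+1 RAID-Z", RAIDZ_3PLUS1)):
        ok, total = surviving_combinations(pool, 2)
        print(f"{name}: survives {ok} of {total} double failures")
```

Both layouts survive every single-disk failure; the difference only shows up once a second disk breaks before the first one has been replaced.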
If you want to dig deeper, Richard Elling wrote a more complete discussion (with more disks) of Price/Performance/Fault Tolerance of RAID-Levels which I highly recommend reading.
To summarize: The 20% price advantage that RAID-Z gives us doesn't buy us very much:
- We get slower read performance,
- we get less fault-tolerance (read: A higher probability that we lose data), and
- we get less flexibility and more clutter.
More disks in a RAID-Z set mean bigger savings, but performance, fault tolerance and flexibility get worse the bigger the stripes get.
Conclusion and a Bonus Strategy
You already guessed it: I like mirroring! Especially for home servers. It's simple. It's fast. It does a better job at protecting my data. I can expand my pool in increments of 2, not 3 disks. As Richard concludes in his article: Life's happier with mirrors.
I even added an extra bonus to my home server pool strategy:
- In the summer of 2009, when I built my home system, I bought 2 x 1.5 TB disks (the older Samsung F2 HD154UI) and mirrored them (today, I'd buy two same-sized disks from different vendors, just to be sure not to run into any serial production issues).
- In January, one of the disks showed 14 read errors. ZFS was able to fix them, because the disks were mirrored, and I got a warning: Time to think about hot spares. I bought the 2 TB Western Digital WD20EADS for two reasons:
  - Just in case Samsung had a bad year, I wanted to switch vendors. Over time, I'll spread risk over different vendors in my pool.
  - When attaching the new disk to the 1.5 TB pool, I'll avoid any "sorry, your disk is just a few blocks too small" errors. While two disks may be sold with the same number of GB, they may still differ by small amounts. Adding a slightly smaller disk to a ZFS mirror simply doesn't work, but adding a bigger one always does.
- I kept the disk that showed the read errors for now (it still works ok after resilvering), and attached the new disk to the mirror, forming a 3-way mirror. This is like a hot-spare that is already sync'ed in, providing an extra layer of fault-tolerance (I wouldn't complain about the extra 4W of power consumption, this is still as good as the cheaper 2+1 disk RAID-Z configuration from a power perspective, but much more fault-tolerant).
- I'm now waiting for one of the 1.5 TB disks to really fail. This will give me an excuse to buy a second 2 TB disk, get rid of both 1.5 TB drives and ZFS will automatically grow the pool size to 2 TB. Automatic, organic pool growth through faulty drive replacement!
- Once any of the 2 TB drives starts showing the first signs of failure, the whole cycle will start again, with bigger drives. Or, I may decide to add the next bigger hot spare sooner, rather than later, and upgrade to a 3-way mirror again, before any new errors start nagging me.
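The growth strategy above rests on one simple rule: a mirror is only as big as its smallest disk. Here is a small sketch that replays the steps with that rule. The step descriptions and names are illustrative, and the assumption is that the pool picks up the extra space once the last small disk is gone:

```python
def mirror_capacity_tb(disk_sizes_tb):
    """A mirror can only hold as much as its smallest member disk."""
    if not disk_sizes_tb:
        raise ValueError("a mirror needs at least one disk")
    return min(disk_sizes_tb)


# The upgrade path described above, as (step, disks in the mirror in TB).
HISTORY = [
    ("initial 2-way mirror", [1.5, 1.5]),
    ("attach the 2 TB disk as third mirror half", [1.5, 1.5, 2.0]),
    ("replace a failed 1.5 TB disk with a second 2 TB disk", [1.5, 2.0, 2.0]),
    ("retire the last 1.5 TB disk", [2.0, 2.0]),
]

if __name__ == "__main__":
    for step, disks in HISTORY:
        print(f"{step}: {mirror_capacity_tb(disks)} TB usable")
```

The usable size stays at 1.5 TB until the last small disk leaves the mirror, then jumps to 2 TB.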
All Good Things Come in Three
Now there you have it: I'm proposing 3-way mirrors for home servers. Really, there's no reason not to!
- Contrary to widespread GB-greed practice, disks are cheap!
- Disk performance and disk fault-tolerance are not cheap. While I may settle for 60 MB/s vs. the maximum of roughly 125 MB/s that Gigabit Ethernet offers, I really don't want to lose data. This is the real goal: Don't lose data.
- You should always have a hot-spare. Why not sync it in already, save resilvering time and avoid windows of vulnerability?
- As an extra bonus, a 3-way mirror gives you up to 3x read performance. Cool!
- And finally, mirroring always gives you a granularity of 2 for expansion, which is useful. No need to save money to buy that 5-drive set! Using a 3rd drive is optional with mirroring, it really is just a hotspare that is sync'ed in already, and it lets you sleep really well!
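As a quick sanity check on the network numbers: Gigabit Ethernet moves 1 Gbit per second, which works out to 125 MB/s of raw line rate before any protocol overhead. A tiny sketch of that conversion (function names are made up, decimal megabytes assumed):

```python
def line_rate_mb_per_s(gigabits_per_s: float = 1.0) -> float:
    """Raw link speed in MB/s (decimal), ignoring protocol overhead."""
    return gigabits_per_s * 1000 / 8   # 1000 Mbit per Gbit, 8 bit per byte


def link_utilisation(disk_mb_per_s: float, gigabits_per_s: float = 1.0) -> float:
    """Fraction of the link that a disk streaming at this rate can fill."""
    return disk_mb_per_s / line_rate_mb_per_s(gigabits_per_s)


if __name__ == "__main__":
    print(f"Gigabit Ethernet: {line_rate_mb_per_s():.0f} MB/s raw")
    print(f"A 60 MB/s disk fills {link_utilisation(60):.0%} of it")
```

So a single 60 MB/s disk fills about half of a Gigabit link, which matches the rough 50% figure used earlier.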
What's your RAID strategy for home servers? What's your rationale behind it? What experiences have you had with broken disks, hot-spares and replacements? Let me know by leaving a comment!
Comments
Dumb question about boot drives
This may be a dumb question, but are your mirrored boot drives mirrored through the motherboard's RAID controller or through OpenSolaris? If they're mirrored through OpenSolaris, did you have to install the OS to a single drive first and then use the OS to create the mirror?
Thanks for an excellent series! I'm more-or-less duplicating your setup. For what it's worth, my offsite backup plan is to use Crashplan to back up to another server that I will install at a relative's house.
Cheers,
Brian
Yes, mirrored with OpenSolaris
Hi Brian,
yes, my boot drive is mirrored through OpenSolaris, not the motherboard. This gives you all the benefits of ZFS mirroring (data integrity, self-healing, etc.) and you're independent of the motherboard manufacturer's HW RAID solution.
If you're quick, you can mirror while you install: Start a terminal before running the installer, then run zpool status often while the installer is running. As soon as you can see the pool, do a zpool attach rpool <existing-device> c1d0 (or whatever your second drive is; zpool attach expects the device that is already in the pool first, then the new one).
After installation, make sure you also add a GRUB bootloader to the mirror disk, see the installgrub(1M) man page.
Hope this helps!
Hello Constantin, thank you
Hello Constantin,
thank you very much for your interesting blog - right now I am trying to learn OpenSolaris Dev 132 after my first contact with Solaris 10 (SunOS 5.10) years ago.
I would like to know your opinion about
a) zfs crypto - performance hit and
b) slicing (partitioning) the hard disks
for a home file server.
I want to use a 3 disk setup whereas one part is used for fast read / writes without fault tolerance. The other one should be raidz1.
In your recommended setup one has to use 3 disks for mirroring, where one is larger than the others. So basically, when replacing a faulty drive, the other one also becomes obsolete when upgrading the overall capacity - where is the benefit?
Thanks
Jason
BTW:
Is there really no ZFS GUI like in Solaris or in SXCE?
Disk space is cheap :).
Hi Jason,
thanks for your comment!
ZFS crypto shouldn't pose a big performance hit. People are using FileVault on their Macs and other crypto solutions for their disks like TrueCrypt on a daily basis. If something is important enough to be encrypted, then the performance penalty doesn't really count.
Slicing the disks is of course possible and you can mix and match if you like, but it's not recommended as this will prevent ZFS from optimizing serial block access on a per disk basis.
Basically my argument is: Disk space is cheap! Put it at your last priority, behind performance and availability.
The reason I recommend using a bigger drive as the 3rd one is that once the smaller ones break and you buy the next of the bigger size, your pool will automatically grow, presumably together with your storage needs. That's a nice, organic growth strategy. It also plays well with a regular spending of approx. 1 disk per year at the same cost (which automatically translates into bigger disks per year as prices go down and capacity per price goes up over time).
If you only have 3 disks, then I recommend you mirror two of them and use the third for testing etc. But then the mirror will be faster than the single disk, YMMV. At the scale of 3-4 disks, RAID-Z doesn't really help much, it only makes your setup less flexible.
But that's an opinion, you should play out the expected performances with different setups. Check out the articles by Richard Elling and Roch Bourbonnais that I mention in the article.
Cheers,
Constantin
Hello Constantin, thanks for
Hello Constantin,
thanks for your answer.
I am wondering:
how will OpenSolaris tell me when a hard drive has gone bad? via Email?
Is there an option to see the Serial Number of a hard drive so I know for sure which one I have to replace physically?
Have a nice weekend
Jason
Fault Management
Hi Jason,
there's a technology in OpenSolaris called Fault Management Architecture (FMA) that will try to keep the system alive if anything bad happens and you have given the system enough resources to counter any broken hardware with.
For example, you can tell ZFS that you have an extra drive and that this is to be used as a hot spare. If you then use a mirror and one of the drives fails (or produces a more than acceptable number of read or write errors), then FMA will tell ZFS to exchange the bad drive with the good one automatically by copying over all data, then using the spare drive and leaving the broken drive alone.
This will all be reported in the /var/adm/messages file, which is the default place to log system messages in. The messages generated by FMA are clearly formulated and nicely documented. More information on FMA is available on OpenSolaris.org.
That all happens at the system level, but it does not take care of monitoring or system reporting, because users, customers, sysadmins etc. have different ways of dealing with monitoring and system messages. This means you'll have to set up your own way of monitoring the system or receive alerts, or check the logfiles and status outputs regularly.
If you want to get an email, SMS or whatever other action when something happens, I suggest you install one of the available system monitoring packages. Tom mentioned logwatch in the comment thread that sounds like a nice solution. There are also more advanced solutions like Nagios or Big Brother. Or you can write your own script (and learn something).
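If you go the "write your own script" route, a minimal sketch could look like the following. It relies on zpool status -x, which prints "all pools are healthy" when there is nothing to report. Everything else here (the function names, the notify callback, how you actually deliver the alert) is an assumption for illustration:

```python
import subprocess

HEALTHY_MESSAGE = "all pools are healthy"


def pools_are_healthy(status_output: str) -> bool:
    """True if the text is exactly what `zpool status -x` prints when all is well."""
    return status_output.strip() == HEALTHY_MESSAGE


def read_zpool_status() -> str:
    """Run `zpool status -x` and return its output (needs ZFS on this machine)."""
    result = subprocess.run(
        ["zpool", "status", "-x"], capture_output=True, text=True, check=False
    )
    return result.stdout


def check_and_report(status_output: str, notify=print) -> bool:
    """Call `notify` with the status text if anything looks wrong.

    `notify` is a placeholder: swap in whatever sends you a mail or an SMS.
    Returns True when the pools are healthy.
    """
    if pools_are_healthy(status_output):
        return True
    notify("ZFS needs attention:\n" + status_output)
    return False
```

Run from cron, something like check_and_report(read_zpool_status()) would stay silent as long as all pools are healthy and hand the full status text to your notifier otherwise.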
Hope this helps,
Constantin
Cool, thanks for that! Right
Cool,
thanks for that!
Right now I am stuck at a much more basic issue:
Install to USB thumb drive and booting from it -
http://opensolaris.org/jive/thread.jspa?threadID=124212
Is this issue still around
http://mail.opensolaris.org/pipermail/ug-mucosug/2009-April/000054.html
after this much time?
Bye
Jason
Check out the comment thread for the bugreport and Eon
Hi,
check out the comment thread attached to the bug report you mentioned. There are a few suggestions for workarounds there. I know Tschokko managed to get around this so you may try to contact him (he's in Germany, too).
Also, check out the Eon OpenSolaris distribution which specializes in running a NAS from a USB device: http://sites.google.com/site/eonstorage/
Cheers,
Constantin
Incorrect comment about performance
Hi Constantin,
You should really edit your post. The comment entitled "ZFS RAID is about as fast as hardware RAID" is obviously incorrect, as you point out at several points in the rest of your post. You could perhaps say "fast enough for most home servers", but a 2x2 RAID is going to be much faster for both reads and writes, as you point out yourself. That could get important for things like media streaming, so I'd not be so quick to say "fast enough" either. YMMV of course :)
I'd think for a NAS server, with possibly several clients pulling data, this would be even more dramatic.
Overall good post however. Thanks!
-Bob
RAID-Z Performance Clarifications
Hi Bob,
thank you for your comment!
Sorry if my post has been unclear, I'll try to clarify it with this comment:
Traditional RAID-5 has to go through read-modify-write cycles whenever it updates only part of a stripe. ZFS RAID-Z avoids this problem completely and without the need for extra controller hardware: Every transaction (even the modification of an existing file) eventually gets written as its own, full stripe, thanks to variable block sizes. Therefore, there's no need for read-modify-write cycles in ZFS. This is explained in more detail in Jeff Bonwick's blog post about RAID-Z.
In contrast to RAID implementations that overwrite blocks in place, ZFS RAID-Z avoids random writes completely through its copy-on-write nature: All writes happen on new, free blocks. This gives ZFS the opportunity to choose the blocks it's going to write to (provided a reasonable number of blocks are free, of course) and arrange for them to be sequential, rather than random, leveraging the disk drive's maximum performance.
This is not particular to ZFS, this follows from the way the disks are addressed and behaves about the same with traditional mirrors or RAID-5. There's a nice performance comparison of RAID-Z vs. mirrors for reads and writes in Richard Elling's article: ZFS RAID recommendations: space, performance, and MTTDL.
Bottom line (and I think we agree here): Mirrors are faster than RAID-Z (and RAID-5), for the same number of total disks.
However, the surprise to some readers is that a software RAID-5 technology like ZFS RAID-Z can be as fast as hardware RAID-5, provide better fault-tolerance and add to it data integrity and many more features that traditional RAID controllers can't offer, essentially for free. There's really no reason to buy a hardware RAID controller any more.
I hope this clarifies the "ZFS RAID is about as fast as hardware RAID" part a bit. Next time, I should probably split up the article into multiple segments to be more clear :).
Thanks,
Constantin
ZFS is open-source and cross-platform
Hi Constantin!
> This is important, because you can rip out ZFS disks from any server, mix them if you like, then put them into any other server that speaks ZFS, and your pool will be readable.
This sounds too good to be true (and sadly it's more complex). I would bet that you will run into trouble with this approach.
I learned it the hard way. I created zpools with an SXCE installation and copied large amounts of data to them. I was dissatisfied with some corners of the distribution and switched to Solaris 10, only to discover that the zpool format isn't as interchangeable as you assume (I saw versions ranging from 5 to 7 if I remember correctly. I missed a table of zpool format interchangeability, but was way too lazy to create one myself). So remember: The only way is up ;-) . Maybe the zpool format is now more stable than in those days, but I would be more cautious (more so on other OSes).
Ciao,
Niels
The Only Way Is Up :)
Hi Niels,
thank you for your comment.
Yes, the importing system must support the ZFS version of the pool to be imported.
An overview of Zpool versions is available from the ZFS community pages on opensolaris.org. This is close to the table you were looking for. See also the zpool upgrade section of the ZFS Administration Guide.
You can also ask a running system what pool version it uses by running the zpool upgrade -v command. The current Solaris version 10 10/09 runs zpool version 15, whereas OpenSolaris 2009.06 runs zpool version 14.
So, for released versions of Solaris, ironically Solaris 10 is more advanced in terms of zpool version than OpenSolaris :).
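The compatibility rule boils down to a single comparison: a system can import a pool if the pool's version is not newer than the highest version that system supports. As a sketch, using only the two version numbers mentioned above (the table and function names are made up):

```python
# Highest zpool version supported by each release, as quoted above.
SUPPORTED_POOL_VERSION = {
    "Solaris 10 10/09": 15,
    "OpenSolaris 2009.06": 14,
}


def can_import(pool_version: int, importing_os: str) -> bool:
    """A pool is importable if it is not newer than what the OS understands."""
    return pool_version <= SUPPORTED_POOL_VERSION[importing_os]


if __name__ == "__main__":
    for os_name in SUPPORTED_POOL_VERSION:
        for version in (14, 15):
            verdict = "ok" if can_import(version, os_name) else "too new"
            print(f"pool v{version} on {os_name}: {verdict}")
```

In other words, moving a pool to a system with a newer zpool version works, but the other direction may not: the only way is up.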
Back to your example: Yes, there is some risk in trying out a developer version of Solaris, and you may then run into a situation where you'd need to migrate your data by means other than a simple zpool import. Fortunately, current OSes that support ZFS are very quick in adopting new zpool versions, so that is not a lasting issue. You could have stayed on SXCE for a couple of months until Solaris 10 caught up to the zpool version.
Anyway, I hope that you're now using the Solaris version you want with the ZFS pool features that you want :).
Thanks,
Constantin
You are incorrectly assuming
You are incorrectly assuming that, like on RAID5, RAID-Z must be made in upgrades of threes. It does not; since all writes are full stripe writes, disk sizes may be mixed and matched, and capacity to the pool can be added in single disks, of varying sizes.
Not only that, but ZFS will detect this and (in words of Jeff Bonwick) "gently rebalance" the data across added space (like Oracle's ASM!)
Lastly, a RAID-Z pool may contain as few as two disks; this is a unique feature of RAID-Z and is a side effect of every write being a full stripe write of a non-fixed size. Try it, seeing is believing!
So, while both of us might like mirrors, if one needs storage capacity without sacrificing 50% of your total disk capacity, RAID-Z is king. As the number of disks reaches five or higher, RAID-Z2 starts to make sense, since it offers double-parity, allowing for two disks to fail before data is lost.
Still not enough? An article by Roch Bourbonnais, "When to (and not to) use RAID-Z", details calculations for maximum availability and capacity in a STRIPED RAID-Z configuration; incidentally, this is the exact config we've been running for the past four years, since ZFS version 2 first came out in Solaris 10 u2 (6/06):
config:
        NAME         STATE     READ WRITE CKSUM
        pool1        ONLINE       0     0     0
          raidz      ONLINE       0     0     0
            c0t1d0   ONLINE       0     0     0
            c0t2d0   ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
          raidz      ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
If one has thought about this carefully, and done one's homework, it is possible to have one's cake and eat it too: a striped RAID-Z configuration will offer striped performance while tolerating lost disks, without the costly 50% mirroring penalty.
It depends on the priorities
Hi UX-admin,
thanks for your comment!
Just to clarify: For a given RAID-Z pool, there are two ways to increase its capacity:
Yes, in a home server setting, beauty or balanced performance might not matter, and in fact, one of my earlier home server storage setups was very mixed. But in hindsight, I'd like to advocate for simplicity vs. squeezing the last bit of space out of a pool. After all: Disks are cheap, your data and your sanity are not.
Thanks for bringing up Roch's excellent article When to (and not to) use RAID-Z, another favorite of mine. You and Roch and my customers work in an enterprise environment, and thus can use a larger number of disks. And this is indeed where you can have your cake and eat it too (or more correctly: Balance performance and capacity on a finer granularity, as needed). But in a home server environment (which this article focuses on), the number of disks is limited. The typical motherboard comes with 6 SATA ports. Two of them are typically taken by boot disks (mirrored of course) and that leaves us with 4 disks for data. In this case, I wouldn't advocate for RAID-Z, I'd consider RAID-Z2 (like Andreas suggested) but for the cases where pool size can be 2 TB or less, I'd still go with a 2-way mirror and use a bigger disk as hot-spare or as a third mirror half (like described in the article), opening a door into the next bigger pool size. That leaves me with a spare SATA port to use for a scratch disk or for other purposes.
This is also driven by disk quality: Consumer disks are of low quality, and they are darn cheap. Therefore home server builders should place quality above capacity and performance, and avoid GB-greed. In the enterprise world, the setting is slightly different (although still, I don't see many cases where customers really need to place capacity above performance or fault-tolerance).
Thanks,
Constantin
Oh, and yes please to another
Oh, and yes please to another post about backup :)
I'll Try :).
Thanks,
Constantin
Brilliant - have been waiting
Brilliant - have been waiting for a clear, point by point explanation of this since about forever. Bravo!
Thanks!
Thanks, Dave!
And make sure you take Andreas Jantos' take on RAID-Z2 into account. He has a point on fault-tolerance, but I'd still try to fit everything into a 3-way mirror instead of a 2+2 RAID-Z2 for the simplicity and speed.
Cheers,
Constantin
Fault-Tolerance
I had done some thinking lately about that, but I came to the opposite conclusion regarding fault-tolerance. And I believe I may have found a flaw in your reasoning:
You compare a 2x2 mirroring solution to a 3+1 RAID-Z configuration, but you do not consider the larger space for RAID-Z(1). In your example (with 2TB drives) you have 4TB (mirror) vs 6TB (RAID-Z).
If I had 4 drives and would accept 4TB usable space, I'd use a RAID-Z2 configuration with 2+2 drives. The first failure is no problem in either configuration. But the 2nd failure would be fatal for the mirroring configuration in 1 out of 3 cases (assuming *no* correlation between disk failures). RAID-Z2 does not run that risk.
If you compare a 6 drives solution (3 two-way mirrors = 6TB vs RAID-Z2 3+2+1(hot spare)= 6TB), I still think the RAID-Z2 solution offers a higher safety:
The 2nd drive failure would have a 20% chance of rendering the RAID1 solution useless, the 3rd failure would have a fifty-fifty chance. RAID-Z2 would survive the 2nd failure and - if the time between the first and 3rd failure is sufficient for the resilvering - also the 3rd failure. Unfortunately the resilvering would put a lot of stress on the remaining drives (and take a long time), so that the chances of further failures are increased during resilver.
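The percentages above are easy to re-derive. Here is a small sketch, assuming a pool made only of two-way mirrors, no resilvering between failures, and every remaining disk being equally likely to fail next (the function name is made up):

```python
from fractions import Fraction


def fatal_chance_next_failure(mirror_pairs: int, degraded_pairs: int) -> Fraction:
    """Chance that the next disk to fail kills a pool of two-way mirrors.

    `degraded_pairs` mirrors have already lost one disk each and the pool is
    still alive. The next failure is fatal only if it hits the surviving
    partner of one of those degraded mirrors.
    """
    if not 0 <= degraded_pairs <= mirror_pairs:
        raise ValueError("degraded_pairs must be between 0 and mirror_pairs")
    remaining_disks = 2 * mirror_pairs - degraded_pairs
    return Fraction(degraded_pairs, remaining_disks)


if __name__ == "__main__":
    print("2 x 2 mirror, 2nd failure:", fatal_chance_next_failure(2, 1))   # 1/3
    print("3 x 2 mirror, 2nd failure:", fatal_chance_next_failure(3, 1))   # 1/5
    print("3 x 2 mirror, 3rd failure:", fatal_chance_next_failure(3, 2))   # 1/2
```

This reproduces the 1 in 3, 20% and fifty-fifty figures quoted in this comment.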
My current configuration is 3+1 RAID-Z1 (unfortunately), which I'd like to migrate to a RAID-Z2 configuration, preferably to 4+2 (increasing the available space). But not only costs of drives are a point to consider, but I'm also running short of available SATA ports and I do not want to migrate ∼4TB to USB drives...
So flexibility and expandability are a big plus for the mirroring solution (sigh)...
Oh, and as a side note: I bought three different 1.5 TB drives to check for the "sorry-your-disk-is-just-a-few-blocks-too-small" error and can report that the 1.5TB drives from WD, Samsung (F2) and Seagate (7200 rpm) are indeed the same size and can be used interchangeably.
Cheers
Andreas
Yes, RAID-Z2 may be Better - If You Have The Drives
Hi Andreas,
yes, your analysis is correct and RAID-Z2 provides better fault tolerance than a 2x2 mirror. I should have worked it into the analysis, but didn't consider it because RAID-Z2 tends to use quite a few disks to be interesting, so thanks for providing it.
And for home servers, number of drives/ports is a limiting factor to keep in mind, so the bigger the disks, the better. Frankly, I'm glad that my data requirements fit in 2 TB now, so I can have a 3-way mirror (which beats RAID-Z2 on speed and flexibility and provides the same "2 drives may fail" resiliency). Once you are forced to have many drives, RAID-Z2 becomes interesting, albeit with the performance/flexibility penalty.
Given the low cost of pure harddisk space and accepting a resilvering window of 4-8 hours, perhaps a 2x2 mirror with a hot-spare is still better overall than RAID-Z2? Granted, it costs 5 ports, but you get almost everything...
Thanks for the disk size data point. I know this has been an issue in the past with enterprise drives and didn't want to run that risk with consumer ones, perhaps they learned to be RAID-friendly?
Thanks for your comment,
Constantin
What's your RAID strategy for home servers?
Hello,
I currently use a RAID1 with 2 x 120GB (2.5 inch) for the system with VMs.
For my important stuff I use 2 x 160GB disks in RAID1, and one 120GB disk (7200rpm) for unimportant things.
Later I'll switch to 2 x 1TB disks in RAID1 and take out the two 160GB disks plus the 120GB disk.
If space gets tight, I'll add another 1TB disk and thereby switch over to a RAID-Z.
So from RAID1 with 1TB to RAID-Z with 2TB.
In addition, as on my old Linux server, I'll rely on an online backup.
If there were a fire in the house, it would be a real pity if all my really important data, like private photos, were destroyed.
I'll try to get the backup script from Memopal running.
This script checks whether data has changed and then uploads that data.
At the moment I have 5GB of free storage there...
Backups to the Net
Hi,
Cloud backups are quite expensive at the moment once they reach tens of GB. That's just about enough for important documents etc., but not for a dump of a whole pool...
But who knows, maybe in 5 years :).
Many thanks for your contribution!
Ciao,
Constantin
Hmm... the question is how much
Hmm... the question is how much your data is worth to you and how much convenience you want... I use a cloud backup and pay $4.50 per month for unlimited storage (I'm currently using 250 GByte without any problems). Sure, the first upload can take forever, but after that it's very comfortable.
If you give storage back to the 'cloud', Wuala even credits you that storage space 1:1 without you having to spend any money.
Best regards,
harryd
Hello harryd, let me guess
Hello harryd,
let me guess ;-)
http://sourceforge.net/apps/phpbb/freenas/viewtopic.php?f=10&t=4300&p=20...
The box is the SPOF!
Hi,
:-). Well, if you happen to have a handful of those left over and enough time to shovel the data from one box to the next in time...
Ciao,
Constantin
www.backblaze.com
Hi,
I meant that more in reference to harryd71's online backup
www.backblaze.com
Have fun ;-)
Sounds cool!
Hi harryd71,
sounds cool, I'll gladly take a look!
Ciao,
Constantin
One more thing...
those are of course two different providers... Unfortunately Wuala has no client for Solaris, but Crashplan has one...
Best regards,
harryd
Resilvering time and Backups
Hi Constantin,
How about putting your comment "save resilvering time and avoid windows of vulnerability" in bold when using 2TB disks and upwards ;-)
I am curious about your thoughts on backups for a home server. Good size tape drives can be expensive, USB attached disks may be easily portable and ok but a bit slow, securing the data with encryption while 'off site' would be nice. Any ideas?
Excellent Blog by the way ;-)
Cheers
Neil
Good point!
Hi Neil,
thanks for your comment!
You're right, resilvering time can be an important factor. My old pool consisted of 2 x 2 x 1 TB WD MyBook Essential Edition USB drives. There was a time when I had to do a lot of resilvering and it took about a day for a half-full pool. One time, a USB connection dropped to USB 1.1 speed and then resilvering time was really bad. And that was a mirror; it would have been worse for a RAID-Z.
When I resilvered my new 2TB disk into the 2 x 1.5 TB pool, which is about 1 TB full, resilvering time was about 4.5 hours. This is about half the maximum write bandwidth quoted for the drive, so this is ok.
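The arithmetic behind that estimate can be sketched in a few lines of Python. This assumes "about 1 TB" means 10^12 bytes and that resilver time scales linearly with the amount of data, which is a simplification; the function names are made up for illustration.

```python
def resilver_rate_mb_s(data_bytes, hours):
    """Average resilver throughput in MB/s (1 MB = 10**6 bytes)."""
    return data_bytes / (hours * 3600) / 1e6


def estimate_hours(data_bytes, rate_mb_s):
    """Hours needed to resilver `data_bytes` at a constant `rate_mb_s`."""
    return data_bytes / (rate_mb_s * 1e6) / 3600


if __name__ == "__main__":
    TB = 10**12  # decimal terabyte, as used by drive vendors

    rate = resilver_rate_mb_s(1 * TB, 4.5)
    print(f"observed rate: {rate:.1f} MB/s")

    # Same rate, but with 2 TB of data on the pool (linear-scaling assumption).
    print(f"2 TB at that rate: {estimate_hours(2 * TB, rate):.1f} hours")
```

At roughly 62 MB/s, every additional terabyte of data adds about 4.5 hours to the resilver, as long as nothing else is competing for the disks.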
As drive sizes continue to increase exponentially while IOPS essentially stagnate, I expect the resilver time per disk to become worse and worse. This is why some people declared the death of RAID-5 and they're right. If anybody thinks about RAID-5, they should skip it and go straight for RAID-6 (or RAID-Z2, respectively).
For a home server, 4.5 hours of resilver time is fine. I've heard of enterprise customers running large RAID-Z stripes (think 10 drives) and then wondering why resilvering takes many days, while the system was still suffering "normal production load". This is when your greedy storage strategy becomes really ugly and where choosing mirrors over RAID-Z becomes a career move. Mind you: This is not ZFS' fault, just the pure physics of using drives in a RAID-Z fashion under load while resilvering.
Backups are another can of worms, perhaps I should write a whole article on them... For home servers, I only see three possibilities: 1. USB disks (cheap, not too slow, but have to be on site), 2. Cloud storage (expensive, slow, but off-site) and 3. another home server (maybe less expensive, fast, at least in a different room). zfs send/receive (over SSH if needed) is an excellent way to do full/incremental backups, but in some scenarios you really want a file-based backup.
Perhaps a combination of the above is the way to go, depending on data type (music, photos, documents).
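As a minimal sketch of the zfs send/receive option, the Python helper below only builds the shell pipelines for a full and an incremental backup over SSH; it does not run anything. The dataset, snapshot, and host names (`tank/data`, `backup/data`, `backuphost`, `monday`, `tuesday`) are hypothetical, and the snapshots are assumed to exist already (created with `zfs snapshot`).

```python
import shlex


def send_receive_pipeline(source, snapshot, target, host=None, previous=None):
    """Build a `zfs send | zfs receive` pipeline as a shell string.

    source   -- dataset to back up, e.g. "tank/data" (hypothetical name)
    snapshot -- snapshot to send
    target   -- dataset on the receiving side
    host     -- if given, receive on that host via ssh
    previous -- if given, send only the changes since this earlier snapshot
    """
    send = ["zfs", "send"]
    if previous is not None:
        # -i sends an incremental stream from `previous` to `snapshot`.
        send += ["-i", f"{source}@{previous}"]
    send.append(f"{source}@{snapshot}")

    receive = ["zfs", "receive", target]
    if host is not None:
        receive = ["ssh", host] + receive

    def quote(words):
        return " ".join(shlex.quote(w) for w in words)

    return f"{quote(send)} | {quote(receive)}"


if __name__ == "__main__":
    # Full backup first, then an incremental one on top of it.
    print(send_receive_pipeline("tank/data", "monday", "backup/data",
                                host="backuphost"))
    print(send_receive_pipeline("tank/data", "tuesday", "backup/data",
                                host="backuphost", previous="monday"))
```

Printing the commands instead of executing them makes it easy to check them by hand before trusting them with real data.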
Thanks for reading my blog, I hope I can put together a good article on backup strategies once my own is somehow finalized :).
Cheers,
Constantin
Hot spare
You should always have a hot-spare. Why not sync it in already, save resilvering time and avoid windows of vulnerability?
Sure, if you have only 3 disks...
But as soon as you add another mirror vdev to your pool, as you mentioned above, then you'll want that hot spare unsilvered!
Good call on mirrors vs. raidz though. I saw the light on this a few months ago and converted my home server's single 4x750GB raidz-1 vdev to a pair of mirrors. A much better idea.
Agreed!
Hi Al,
thanks for your comment!
Yes, you're right. A 3-way mirror vs. an unsync'ed hot spare is more practical if you're only mirroring a single disk. For data pools up to 2 TB, this works well and it's probably a better deal than trying to save a couple of bucks by splitting the mirror into 2 x 2 x 1TB disks.
OTOH, no matter how many mirror vdevs you run, the 3-way strategy is always 50% of your mirror cost. And with falling disk prices, it's becoming less and less of a burden over time.
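To put numbers on that, here is a small Python sketch comparing the extra disks needed for 3-way mirrors against a single shared hot spare, both relative to a plain pool of 2-way mirrors. It assumes one hot spare for the whole pool and equally priced disks; the function name is made up for illustration.

```python
def extra_disk_overhead(mirror_vdevs):
    """Return (three_way, hot_spare) overhead relative to plain 2-way mirrors.

    Each value is the number of extra disks divided by the disks in the
    baseline pool of `mirror_vdevs` 2-way mirrors.
    """
    baseline = 2 * mirror_vdevs
    three_way_extra = mirror_vdevs  # one more disk in every mirror vdev
    hot_spare_extra = 1             # assumption: one spare shared by the pool
    return three_way_extra / baseline, hot_spare_extra / baseline


if __name__ == "__main__":
    for vdevs in (1, 2, 4):
        three_way, spare = extra_disk_overhead(vdevs)
        print(f"{vdevs} mirror vdev(s): 3-way +{three_way:.0%}, "
              f"hot spare +{spare:.1%}")
```

With a single mirror both options cost the same one extra disk; from two mirror vdevs on, the shared hot spare gets relatively cheaper while the 3-way strategy stays at 50%.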
Glad to hear your pair of mirrors runs well!
Cheers,
Constantin
The third and fourth links in
The third and fourth links in this blog entry both point to the same URL, http://blogs.sun.com/constantin/entry/new_opensolaris_zfs_home_server. Was that intentional? I'd like to see the AMD based home server you created.
Thanks, it's fixed now.
Hi Josef,
thanks for your comment! It was indeed an error. I shouldn't post that late in the evening :). I've fixed the fourth link that points to my home server config. It's at: http://blogs.sun.com/constantin/entry/a_small_and_energy_efficient
Cheers,
Constantin