Introducing Sparse Encrypted ZFS Pools

Sparse ZFS Pools

Ever since I started using a Mac, I've enjoyed using sparse encrypted disk images for a variety of tasks, for instance securely storing data that can be backed up somewhere else, say on a hosting server.

In fact, most of my project/personal data on my Mac sits on sparse encrypted disk images that are regularly rsynced to an external storage service, Strato's in particular.

The beauty of this solution lies in its simplicity:

Sparse encrypted disk images show up just like any other hard drive. But on the back end, they translate into a bunch of flat files that store all the data in an encrypted manner. By rsyncing the backing store, sparse encrypted disk images can be easily backed up across the net, while ensuring privacy and convenience.

Here's how to do similar things with Solaris and ZFS, including some extra data integrity magic.

So what we're looking for is a solution that is:

  • Encrypted, so data remains private,
  • Robust, so data survives even if the transmission/rsync/backup process is flawed,
  • Convenient, so data can be written or read easily on the home server side,
  • Fine-grained, so we can make our storage as big as we want, without worrying about file sizes.

The solution is to leverage a couple of Solaris mechanisms. Note that this has not been blessed by Solaris Engineering, but it works for me and is fun to play with:

  • Using mkfile, we can create individual 1GB files to store our data in.
  • We will mount these files using lofiadm as encrypted block devices, so they can be accessed by the system just like a regular disk would be accessed.
  • We can then combine a bunch of such lofi devices into a zpool using any of the redundancy schemes (RAID-Z2 in this particular case) to turn a bunch of files into a regular ZFS zpool.
  • Later, we can either back up or rsync our backing store files to some cloud service without worrying about security (since all data is encrypted at the block level) or data integrity (since we leverage ZFS RAID-Z in an end-to-end manner). Even if our cloud storage service, or whatever we use to back up our stuff, screws up the occasional file, we can get our data back since it's protected by RAID-Z2.
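
To put a number on the RAID-Z2 trade-off, here's a rough back-of-the-envelope sketch. The function name raidz_usable_gb is made up for this example, and the arithmetic ignores ZFS metadata and padding overhead, so treat the result as an upper bound rather than what zpool list will report.

```shell
#!/bin/sh
# Rough usable capacity of a single RAID-Z group built from equal-sized files.
# Arguments: number of files, size of each file in GB, parity level (2 for RAID-Z2).
# Ignores ZFS metadata and padding overhead, so the real figure is a bit lower.
raidz_usable_gb() {
    files=$1
    size_gb=$2
    parity=$3
    echo $(( (files - parity) * size_gb ))
}

# Six 1GB backing files in RAID-Z2: two files' worth of space goes to parity.
raidz_usable_gb 6 1 2
```

With six 1GB files this prints 4: roughly 4GB of usable space, in exchange for surviving the loss or corruption of any two backing files.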

Here's what to do:

  • Create a bunch of 1GB files to serve as backing stores:
      mkdir /export/szpools
      cd /export/szpools
      mkfile 1g szpool_1 szpool_2 szpool_3 szpool_4 szpool_5 szpool_6
     
  • Use lofiadm to turn those files into encrypted block devices:
      lofiadm -c aes-256-cbc -a /export/szpools/szpool_1
      Enter passphrase: 
      Re-enter passphrase: 
      /dev/lofi/1
      lofiadm -c aes-256-cbc -a /export/szpools/szpool_2
      Enter passphrase: 
      Re-enter passphrase: 
      /dev/lofi/2
      lofiadm -c aes-256-cbc -a /export/szpools/szpool_3
      Enter passphrase: 
      Re-enter passphrase: 
      /dev/lofi/3
      lofiadm -c aes-256-cbc -a /export/szpools/szpool_4
      Enter passphrase: 
      Re-enter passphrase: 
      /dev/lofi/4
      lofiadm -c aes-256-cbc -a /export/szpools/szpool_5
      Enter passphrase: 
      Re-enter passphrase: 
      /dev/lofi/5
      lofiadm -c aes-256-cbc -a /export/szpools/szpool_6
      Enter passphrase: 
      Re-enter passphrase: 
      /dev/lofi/6
     
  • Create a zpool out of those encrypted block devices:
      zpool create szpool raidz2 /dev/lofi/1 /dev/lofi/2 /dev/lofi/3 /dev/lofi/4 /dev/lofi/5 /dev/lofi/6
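
The three steps above can be rolled into a loop. Here's a minimal sketch, not a polished tool: it defaults to a dry run that only prints the commands, and the RUN, DIR and COUNT variables and the setup_commands function are names I made up for this example. It uses raidz2, the keyword zpool create accepts for double-parity RAID-Z, and it assumes the lofi devices come back numbered 1 through COUNT, which only holds if no other lofi devices are attached.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Set RUN= (empty) to execute for real on a Solaris host, as root.
RUN=${RUN-echo}
DIR=${DIR-/export/szpools}   # backing store directory from the steps above
COUNT=${COUNT-6}             # number of 1GB backing files

setup_commands() {
    $RUN mkdir -p "$DIR"
    i=1
    devices=""
    while [ "$i" -le "$COUNT" ]; do
        $RUN mkfile 1g "$DIR/szpool_$i"
        # Each lofiadm call prompts for the passphrase when run for real.
        $RUN lofiadm -c aes-256-cbc -a "$DIR/szpool_$i"
        # Assumption: devices are handed out as /dev/lofi/1, /dev/lofi/2, ...
        devices="$devices /dev/lofi/$i"
        i=$((i + 1))
    done
    # $devices is left unquoted on purpose so it splits into separate arguments.
    $RUN zpool create szpool raidz2 $devices
}

setup_commands
```

Running it as-is just lists what it would do, which is a cheap way to check the device names before committing to anything.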

Now you can access the pool through regular ZFS means and at the same time back up or rsync the backing store files to some cloud/web hosting service.

We can now create more devices and add them to the pool to grow its capacity, and set up a regular cron job that rsyncs the backing store files to an external service for backup.
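
The backup half of that could look something like the sketch below. It's a dry run by default, and the destination account, host and script path are placeholders I invented, not anything a particular hosting service prescribes. Keep in mind that files copied while the pool is being written to may be captured at slightly different points in time; running zpool export szpool before the copy is the cautious variant.

```shell
#!/bin/sh
# Dry-run sketch: prints the rsync command instead of running it.
# Set RUN= (empty) to execute for real.
RUN=${RUN-echo}
SRC=${SRC-/export/szpools/}
# Hypothetical destination; replace with your own account and path.
DEST=${DEST-backup@backup.example.com:szpools/}

backup_backing_store() {
    # -a preserves file attributes; --partial keeps interrupted transfers
    # so a rerun does not have to restart a 1GB file from scratch.
    $RUN rsync -a --partial "$SRC" "$DEST"
}

backup_backing_store

# Possible crontab entry (nightly at 03:00), assuming this file is saved
# as /export/szpools-backup.sh (a made-up path):
# 0 3 * * * RUN= /bin/sh /export/szpools-backup.sh
```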

Should your local system go up in flames, you can restore the data on a new system, recreate the lofi devices and import the pool.
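
Here's a sketch of what that restore might look like once the backing files are copied back into place. Again it only prints the commands by default, and RUN, DIR, COUNT and restore_commands are names made up for this example. The scrub at the end is my own addition: it asks ZFS to read every block and repair whatever RAID-Z2 can still reconstruct, which seems like a sensible first thing to do with files that have made a round trip through a hosting service.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Set RUN= (empty) to execute for real on a Solaris host, as root.
RUN=${RUN-echo}
DIR=${DIR-/export/szpools}   # where the restored backing files live
COUNT=${COUNT-6}             # number of backing files

restore_commands() {
    i=1
    while [ "$i" -le "$COUNT" ]; do
        # Use the same cipher and passphrase as when the file was first attached.
        $RUN lofiadm -c aes-256-cbc -a "$DIR/szpool_$i"
        i=$((i + 1))
    done
    # -d makes zpool import search /dev/lofi rather than the default device directory.
    $RUN zpool import -d /dev/lofi szpool
    # Verify every block and let RAID-Z2 repair what it can, then report.
    $RUN zpool scrub szpool
    $RUN zpool status szpool
}

restore_commands
```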

A few questions and answers:

  • Why not use ZFS encryption? Yes, ZFS encryption would work as well, but I prefer to use block encryption in this case because it more effectively hides the structure of the pool from the backup service. They should not know what's in the files, not even that they're being used as virtual block devices.
  • Can the asynchronous nature of rsync and other mechanisms compromise data integrity? Yes and no. As long as there's a regular rsync process running that backs up all block devices, data can be restored reliably. There are a couple of mechanisms in ZFS that help import the pool even if the underlying devices have been backed up/restored in an unknown order.
  • Why go through this trouble if you can also use zfs send/receive for backup? zfs send/receive is fine, but it creates very large files, which are unwieldy for your typical hosting service. They're also not encrypted. This is a much more compatible solution because all it does is map ZFS magic onto a bunch of 1GB files.
  • Shouldn't this be scripted? Yes, it should.
