OpenSolaris Home Server Scripting Howto Part One: Intro and a Simple ZFS Auto-Snapshot Enabling Script

One of my OpenSolaris Home Server Tips is to script everything. That tip triggered quite a bit of interest, so let's start a short series on OpenSolaris home server scripting.

Today, we'll talk a little bit about the "why?" of home server scripting, then run into a small surprise while we write a small script that will enable/disable the OpenSolaris ZFS Auto-Snapshot Service for us.

Wait, Why Would I Script Something I Could Easily Do By Hand?

Of course, we're real system heroes, aren't we? So why should we bother scripting? Just give us half an hour with the machine, then it'll do everything we want, right?

Only it doesn't take half an hour. It takes a lot more. Days, weekends, months, years. And then try to remember what exactly you configured where. Or how that thing with the power management configuration really worked. That's why scripting is a good idea:

  • Scripting is documentation: If you configure your home server with a script and not by hand, the script becomes the documentation. Nobody wants to write documentation, and nobody does (unless forced to). Still, documentation is a good thing. By vowing to script everything and not to do anything by hand, you automatically force yourself to document what you did.
  • Scripts make stuff repeatable: After hours of studying man pages and tutorials, you manage to set up your first OpenSolaris zone. Woohoo! Now do it again... See? A script helps you pour all your hard-earned knowledge and myriads of little steps into one file that makes everything easily repeatable.
  • Scripting will make you aware of what you really do: Some things sound easy. But only after starting to put them together do you discover that they're really quite complex. A script helps you keep stuff under control, even if it becomes more complicated than you anticipated. And it will...
  • You'll learn something new: Part of the fun of running a home server is to learn something new. And with OpenSolaris, there are lots of cool things to learn. By scripting, we dig deeper into the guts of the OS, enabling us to learn what really goes on behind the scenes.
  • More confidence, more control and more chicks: Overall, the more scripts you write to administer your server, the more confidence you'll build up as a sysadmin, as your control of the system becomes better and more powerful.
    Sorry, I lied about the "chicks" part, but it did get your attention, eh?
    The good news is: Scripting also saves time which you can invest in bars, restaurants and other social activities. Maybe I was right after all?
    (Note to my female home server admin readers: s/chick/hunk/g)

Our First Script: Enabling the ZFS Auto-Snapshot Service

So let's illustrate the above points while writing our first script. One of the first things everybody should do after a fresh OpenSolaris install is to enable the ZFS Automatic Snapshot Service. It does the bulk of the work behind the much touted Time-Slider feature of OpenSolaris by automatically creating periodic ZFS snapshots, and deleting old ones before your disk becomes full.

In the GUI, it's easy to set up: You tick the "Enable Time Slider" box in the "System->Administration->Time Slider Setup" panel and that's it. Let's see how to do it the scripted way.

Behind the scenes, multiple SMF services work in concert to make this happen:

admin@krengi:~$ svcs -a | egrep "snapshot|time-slider"
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:frequent
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:hourly
disabled       15:24:14 svc:/system/filesystem/zfs/auto-snapshot:weekly
disabled       15:24:14 svc:/system/filesystem/zfs/auto-snapshot:monthly
disabled       15:24:14 svc:/system/filesystem/zfs/auto-snapshot:daily
disabled       15:24:15 svc:/application/time-slider:default
admin@krengi:~$ 

The five "auto-snapshot" services create the snapshots at specific intervals, and the "time-slider" service checks if the disk is becoming too full and deletes the oldest snapshots to make room if necessary. By default, all services are disabled. Our job is simple: Enable them.

But wait: For a home server, we want to avoid the frequent (= every 15 minutes) and the hourly snapshots as they would force the disks to spin up and down very often during the day, consuming unnecessary power. Daily, weekly and monthly will do just fine. So let's create a basic script that enables just the services we want:

#!/bin/ksh93
#
# autosnap
#
# Switch ZFS Automatic Snapshots and Time-Slider on or off.
#
 
# Get parameters
CMD=$1
 
# Services we want/don't want
AUTOSNAPSVCS="auto-snapshot|time-slider" # All related services.
SVCSEXCLUDE="frequent|hourly"            # We don't want the frequent ones.
 
# Find the services interesting to us.
function findsvcs {
        svcs -aHo FMRI | egrep "$AUTOSNAPSVCS" | egrep -v "$SVCSEXCLUDE"
}
 
# Enable auto snapshot related SMF services
function enable_auto_snapshot {
        svcadm enable $(findsvcs)
}
 
enable_auto_snapshot
 
exit 0
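
Before letting svcadm loose on whatever findsvcs returns, it can be reassuring to check the two patterns against a known list of services. Here's a minimal sketch of that idea. The names filter_svcs and sample_fmris are made up for this illustration, and I'm using grep -E instead of egrep, assuming a grep that understands -E (the GNU one does):

```shell
# Hypothetical helper: apply the include/exclude patterns from autosnap to a
# list of FMRIs on standard input, without calling svcs or svcadm at all.
AUTOSNAPSVCS="auto-snapshot|time-slider" # Same patterns as in autosnap.
SVCSEXCLUDE="frequent|hourly"

filter_svcs() {
        grep -E "$AUTOSNAPSVCS" | grep -Ev "$SVCSEXCLUDE"
}

# Sample input: the FMRIs from the svcs listing shown earlier.
sample_fmris() {
        cat <<'EOF'
svc:/system/filesystem/zfs/auto-snapshot:frequent
svc:/system/filesystem/zfs/auto-snapshot:hourly
svc:/system/filesystem/zfs/auto-snapshot:weekly
svc:/system/filesystem/zfs/auto-snapshot:monthly
svc:/system/filesystem/zfs/auto-snapshot:daily
svc:/application/time-slider:default
EOF
}

# Prints the weekly, monthly and daily instances plus time-slider.
sample_fmris | filter_svcs
```

On the real machine, the same check would be "svcs -aHo FMRI | filter_svcs", which is exactly what findsvcs does in the script.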

An Unexpected Obstacle

Let's try it out:

admin@krengi:/krongi/config/scripts$ ./autosnap
admin@krengi:/krongi/config/scripts$ svcs -a | egrep "snapshot|time-slider"
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:frequent
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:hourly
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:weekly
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:daily
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:monthly
offline        15:36:44 svc:/application/time-slider:default

Wait, why is the time-slider service still offline?

admin@krengi:~$ svcs -x time-slider
svc:/application/time-slider:default (GNOME Desktop Snapshot Management Service)
 State: offline since Sat Mar 06 15:36:44 2010
Reason: Service svc:/system/filesystem/zfs/auto-snapshot:frequent is disabled.
   See: http://sun.com/msg/SMF-8000-GE
Reason: Service svc:/system/filesystem/zfs/auto-snapshot:hourly is disabled.
   See: http://sun.com/msg/SMF-8000-GE
   See: zfs(1M)
   See: /var/svc/log/application-time-slider:default.log
Impact: This service is not running.

Damn! The time-slider service insists on depending on the frequent and hourly instances of the auto-snapshot service. In fact, you can look it up in its manifest at /var/svc/manifest/application/time-slider.xml. But we really don't want the server to wake up every 15 minutes and burn precious power all the time.
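
If a script should react to this situation on its own, the "Reason:" lines of svcs -x are the interesting part. The following is only a sketch: blocking_svcs and sample_svcs_x are hypothetical names, and the parsing assumes the message wording shown in the output above, which may differ between releases:

```shell
# Hypothetical helper: read "svcs -x" output on stdin and print the FMRI of
# every dependency that is reported as disabled.
blocking_svcs() {
        sed -n 's/^Reason: Service \(.*\) is disabled\.$/\1/p'
}

# Sample input: part of the svcs -x output shown above.
sample_svcs_x() {
        cat <<'EOF'
svc:/application/time-slider:default (GNOME Desktop Snapshot Management Service)
 State: offline since Sat Mar 06 15:36:44 2010
Reason: Service svc:/system/filesystem/zfs/auto-snapshot:frequent is disabled.
   See: http://sun.com/msg/SMF-8000-GE
Reason: Service svc:/system/filesystem/zfs/auto-snapshot:hourly is disabled.
   See: http://sun.com/msg/SMF-8000-GE
Impact: This service is not running.
EOF
}

# Prints the frequent and hourly auto-snapshot FMRIs, one per line.
sample_svcs_x | blocking_svcs
```

On the server itself, this would be used as "svcs -x time-slider | blocking_svcs".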

Fortunately, we can fix this by relaxing the dependencies somewhat. The current "grouping" setting for the dependencies is:

admin@krengi:~$ svccfg -s time-slider listprop auto-snapshot-svcs/grouping
auto-snapshot-svcs/grouping  astring  require_all

Looking at the smf(5) man page, we learn that we can also set this to require_any, which seems to be much more appropriate to our needs. This can be done through:

admin@krengi:~$ svccfg -s time-slider setprop auto-snapshot-svcs/grouping = require_any
admin@krengi:~$ svcadm refresh time-slider

And everything is alright:

admin@krengi:~$ svcs -a | egrep "snapshot|time-slider"
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:frequent
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:hourly
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:weekly
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:daily
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:monthly
online         16:28:17 svc:/application/time-slider:default
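
To see why this one property makes the difference, here's a toy model of the two groupings. It is a simplified reading of smf(5), not how SMF is implemented: the function name grouping_satisfied is made up, only require_all and require_any are covered, and only the "online" state counts as running (the real thing also accepts "degraded"):

```shell
# Toy model of two SMF dependency groupings, loosely following smf(5).
# Usage: grouping_satisfied GROUPING STATE...
# Returns 0 (true) if the dependency group would count as satisfied.
grouping_satisfied() {
        grouping=$1
        shift
        running=0
        total=0
        for state in "$@"; do
                total=$((total + 1))
                if [ "$state" = "online" ]; then
                        running=$((running + 1))
                fi
        done
        case $grouping in
                require_all) [ "$running" -eq "$total" ] ;;
                require_any) [ "$running" -ge 1 ] ;;
                *) echo "unsupported grouping: $grouping" >&2; return 2 ;;
        esac
}

# States of frequent, hourly, weekly, daily and monthly from the listing above.
for g in require_all require_any; do
        if grouping_satisfied "$g" disabled disabled online online online; then
                echo "$g: satisfied"
        else
                echo "$g: not satisfied"
        fi
done
```

With two of the five snapshot services disabled, require_all fails and require_any succeeds, which matches what we just saw time-slider do.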

Putting It All Together

But that was manual, that doesn't count. Let's work this into our script. And while we're at it, let's also implement the reverse, so we can undo our special setting anytime:

#!/bin/ksh93
#
# autosnap
#
# Switch ZFS Automatic Snapshots and Time-Slider on or off.
#
 
# Get parameters
CMD=$1
 
# Services we want/don't want
AUTOSNAPSVCS="auto-snapshot|time-slider" # All related services.
SVCSEXCLUDE="frequent|hourly"            # We don't want the frequent ones.
 
# Find the services interesting to us.
function findsvcs {
	svcs -aHo FMRI | egrep "$AUTOSNAPSVCS" | egrep -v "$SVCSEXCLUDE"
}
 
# Enable auto snapshot related SMF services
function enable_auto_snapshot {
	# Change the dependency requirement for time-slider to "optional_all" so
	# it starts even if not all snapshot services are online. (The
	# "require_any" setting we tried by hand above works, too.)
	svccfg -s time-slider setprop auto-snapshot-svcs/grouping = optional_all
	svcadm refresh time-slider # So it picks up the new setting.
	svcadm enable $(findsvcs)
}
 
# Disable auto snapshot related SMF services
function disable_auto_snapshot {
	svcadm disable $(findsvcs)
	# Change the dependency requirement for time-slider back to the default.
	svccfg -s time-slider setprop auto-snapshot-svcs/grouping = require_all
	svcadm refresh time-slider # So it picks up the new setting.
}
 
# Show a help message.
function show_help {
	echo "Usage: $0 command"
	echo "Supported commands are:"
	echo "  on"
	echo "  off"
}
 
#
# Main program.
#
case $CMD in
        "on" )
                enable_auto_snapshot
        ;;
        "off" )
                disable_auto_snapshot
        ;;
        * )
                show_help
        ;;
esac
 
exit 0
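
Since this script now changes service configuration and not just service state, a dry-run mode can be handy while developing it. This is an optional extra and not part of the script above: the run wrapper and the DRYRUN variable are names I made up for this sketch.

```shell
# Hypothetical wrapper: if DRYRUN is set to a non-empty value, print the
# command instead of executing it. Otherwise run it unchanged.
run() {
        if [ -n "${DRYRUN:-}" ]; then
                echo "would run: $*"
        else
                "$@"
        fi
}

# Example: show what would happen without touching SMF.
DRYRUN=1
run svcadm refresh time-slider
```

To use it, you would prefix the svccfg and svcadm calls inside enable_auto_snapshot and disable_auto_snapshot with run, then start the script with DRYRUN set to see the commands it would issue.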

Does it work?

admin@krengi:~$ svcs -a | egrep "snapshot|time-slider"
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:frequent
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:hourly
disabled       16:32:56 svc:/application/time-slider:default
disabled       16:32:56 svc:/system/filesystem/zfs/auto-snapshot:daily
disabled       16:32:56 svc:/system/filesystem/zfs/auto-snapshot:weekly
disabled       16:32:57 svc:/system/filesystem/zfs/auto-snapshot:monthly
admin@krengi:~$ ./autosnap on
admin@krengi:~$ svcs -a | egrep "snapshot|time-slider"
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:frequent
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:hourly
online         16:35:38 svc:/application/time-slider:default
online         16:35:40 svc:/system/filesystem/zfs/auto-snapshot:weekly
online         16:35:41 svc:/system/filesystem/zfs/auto-snapshot:daily
online         16:35:41 svc:/system/filesystem/zfs/auto-snapshot:monthly
admin@krengi:~$ ./autosnap off
admin@krengi:~$ svcs -a | egrep "snapshot|time-slider"
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:frequent
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:hourly
disabled       16:36:42 svc:/application/time-slider:default
disabled       16:36:42 svc:/system/filesystem/zfs/auto-snapshot:weekly
disabled       16:36:43 svc:/system/filesystem/zfs/auto-snapshot:daily
disabled       16:36:43 svc:/system/filesystem/zfs/auto-snapshot:monthly

Nice. Easier than ticking that GUI box, eh?
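
Eyeballing the svcs output works, but a script can do the checking for us, too. Here's a small sketch that lists every service we care about that is not online. The name not_online is hypothetical, and it assumes an awk that supports -v (on older Solaris systems that would be nawk rather than /usr/bin/awk):

```shell
# Hypothetical helper: read "STATE STIME FMRI" lines (svcs output without the
# header) on stdin and print the FMRIs we want running that are not online.
# Services matching the skip pattern are ignored, as in autosnap.
not_online() {
        awk -v skip="frequent|hourly" \
                '$3 !~ skip && $1 != "online" { print $3 }'
}

# Sample input: the listing from our first attempt, where time-slider
# stayed offline.
sample_states() {
        cat <<'EOF'
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:frequent
disabled       Jan_09   svc:/system/filesystem/zfs/auto-snapshot:hourly
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:weekly
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:daily
online         15:36:47 svc:/system/filesystem/zfs/auto-snapshot:monthly
offline        15:36:44 svc:/application/time-slider:default
EOF
}

# Prints svc:/application/time-slider:default
sample_states | not_online
```

On the server, the input would come from svcs -a | egrep "snapshot|time-slider" instead of the sample. Empty output then means everything we asked for is up.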

Conclusion

Even with our very simple example, we saw all the benefits that scripting brings to our home server, as opposed to configuring stuff by hand:

  • Our script documents exactly what we did to switch on the ZFS Auto-Snapshot Service.
  • It makes our configuration action repeatable: We can now switch the ZFS Auto-Snapshot Service on and off at will.
  • We became aware of what really happens behind the covers of enabling the ZFS Auto-Snapshot Service and that gave us the opportunity to better adapt it to our needs. Then we realized that stuff wasn't so simple after all, and we created a workaround for the overzealous dependency of the time-slider SMF service.
  • While we wrote our script, we not only learned about the ZFS Auto-Snapshot service, we also learned a little bit of SMF along the way.
  • We can now tick off the ZFS Auto-Snapshot Service configuration from our TODO list for good and sleep well. We know we've completely nailed it. Of course, this is just a very little step, but they'll all add up to our confidence over time.

Now we're ready to build more complex scripts that help us administer our server. Remember: The goal is to be able to take a fresh installation of OpenSolaris, start our scripts, then watch as everything configures itself completely automatically!

In the following posts, we'll look at configuring power management, package repositories, users, etc., then work towards setting up and cloning complete zones in a fully scripted way.

Man Pages to Check Out

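If the above Solaris SMF related commands are new to you, here are a few man pages to check out:

  • svcs(1): Shows the status of SMF services.
  • svcadm(1M): Enables, disables and refreshes services.
  • svccfg(1M): Inspects and changes service configuration.
  • smf(5): Overview of the Service Management Facility, including the dependency groupings.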

Your Turn

I hope this example was interesting and that it motivated you to stop tweaking stuff by hand and instead start scripting your admin work on your own.

Or did you write yourself some useful admin scripts already? What are your experiences with home server scripting? Leave a comment and share your views with others!

Update

Antoon pointed out in the comments that it would be cleaner to set up our own instance of the time-slider service, then modify it. He's right: In my original script, I just took the path of least resistance :).

To make up for it, here's the code that needs to be modified in the script to create our own instance while leaving the default one as is.

Since the time-slider service will get more special attention, we'll just look for the auto-snapshot services when building our list:

AUTOSNAPSVCS="auto-snapshot"  # All snapshot services.

Note: The |time-slider part is now missing from the grep pattern.

Then, we'll create a new instance of the time-slider service and name it "homeserver". We add our own instance of the auto-snapshot-svcs property group and populate it with our special list of dependencies:

# Enable auto snapshot related SMF services
function enable_auto_snapshot {
        # Create a new instance of the time-slider service with a modified
        # list of dependencies.
        dependencies=$(findsvcs)
        svccfg -s time-slider add homeserver
        svccfg -s time-slider:homeserver addpg auto-snapshot-svcs dependency
        svccfg -s time-slider:homeserver \
                "setprop auto-snapshot-svcs/entities = fmri: ($dependencies)"
        svcadm enable $dependencies time-slider:homeserver
}

Note: The rest of the auto-snapshot-svcs property group as well as all the other property groups will be automatically inherited from the default.
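
The value we hand to setprop above is a parenthesized, whitespace-separated list of FMRIs. If you'd rather build that value on a single line, for instance to log it, something like the following sketch would do. The name entities_value is made up for this example:

```shell
# Hypothetical helper: turn a newline-separated list of FMRIs on stdin into
# the multi-value form "fmri: (a b c)" that we pass to svccfg setprop.
entities_value() {
        printf 'fmri: (%s)\n' "$(paste -s -d ' ' -)"
}

# Example with two of our snapshot services.
printf '%s\n' \
        svc:/system/filesystem/zfs/auto-snapshot:daily \
        svc:/system/filesystem/zfs/auto-snapshot:weekly | entities_value
```

In the script, this would be fed from findsvcs, as in findsvcs | entities_value, with the result placed after "setprop auto-snapshot-svcs/entities =".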

Finally, we need to update the "disable" function to clean up after ourselves:

# Disable auto snapshot related SMF services
function disable_auto_snapshot {
        svcadm disable -s $(findsvcs) time-slider:homeserver
        # Get rid of our special homeserver instance of the time-slider service.
        svccfg -s time-slider delete homeserver
}

Note: The disable -s part is necessary so SMF waits until all services are properly disabled. Only then can we go ahead and delete our homeserver instance of the time-slider service.

That's it. Our script is now slightly more complex, but it is also cleaner, as it uses SMF's service instances and inheritance features to model our special requirements as opposed to hacking the defaults. As Antoon points out, using our own SMF instances is also a better way to document our special configuration needs.

Thanks to Antoon for pointing us in a cleaner direction!
