A while ago, we introduced the ability to take snapshots of a drive. This is a very handy feature that allows you to quickly and efficiently save drive states even on live systems.
Thanks to ZFS, snapshots will only consume the delta between the current state and where the snapshot was taken. This means that if your original drive was 15GB and only 1MB of data has changed between the snapshot and the current state, the size of the snapshot would be 1MB. If you write another megabyte to the disk, the snapshot will grow by another megabyte.
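To make the delta idea concrete, here is a toy model (not how ZFS actually accounts for space). It assumes the drive is split into 1MB blocks and that a snapshot only has to hold on to blocks that have changed since it was taken. The function name and the block numbering are made up for this illustration.

```python
def snapshot_size_mb(written_blocks):
    """Toy model: snapshot size in MB after each write.

    Assumes 1MB blocks. written_blocks is the sequence of block numbers
    written since the snapshot was taken (hypothetical input).
    """
    changed = set()  # blocks that differ from the snapshot
    sizes = []
    for block in written_blocks:
        changed.add(block)
        sizes.append(len(changed))
    return sizes


# Writing two different blocks grows the snapshot twice.
print(snapshot_size_mb([10, 11]))
```

In this simplified model, writing to a block that has already changed since the snapshot does not grow the snapshot any further.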
Another beauty of this system is that any snapshot can be promoted (cloned) into a full disk drive. This means that you can create an independent copy that can be mounted on a server potentially on a different storage system entirely. As such, this forms the foundation for a storage management strategy (depending on your workload).
While using periodic snapshots can be a part of your backup strategy, it is unwise to rely on snapshots as your sole strategy.
There are also numerous situations where using these snapshots will not work, such as snapshotting a running database server. The snapshot functionality may still be useful on stopped database servers (to create a point-in-time restore), but again, it should not be your sole backup strategy.
Using our Python library, automating snapshots is really simple. However, since the CloudSigma credentials need to be stored on the system that triggers the snapshots, we’d strongly discourage you from storing production credentials anywhere that isn’t properly secured. If you want to run this on a cloud server, for example, please make sure that it is shielded off from the rest of the infrastructure (for instance by using our network policies feature) and that it is fully locked down.
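One small part of locking things down is making sure the file holding your credentials is readable only by its owner. The sketch below is a generic check using Python's standard library, assuming a Unix-like system. The helper name is made up for this example, and the path you pass in depends on where your credentials actually live.

```python
import os
import stat


def is_private(path):
    """Return True if group and others have no permissions on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

You could call is_private() on your credentials file before triggering a snapshot and refuse to continue if it returns False. Running chmod 600 on the file is the usual fix.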
After installing the Python library, you can download and run the script as follows:
$ wget https://raw.githubusercontent.com/cloudsigma/pycloudsigma/master/samples/snapshot.py
$ python snapshot.py drive-uuid my-snapshot
snapshot.py takes two arguments: the UUID of the drive to snapshot and a name for the snapshot.
After you’ve manually created a snapshot and verified that it works (you can see this under the ‘snapshot’ section of the drive), you can automate the process.
The most suitable and standardized way of running a task like this is to use the crontab (assuming you’re on Linux or Mac OS X).
As the same user that created the snapshot above, run:
$ crontab -e
If you want to take a snapshot every night at 1AM, add the following line:
0 1 * * * python /path/to/snapshot.py drive-uuid my-snapshot >> $HOME/snapshot.log 2>&1
You’ll also notice that the script will log to a file named snapshot.log in the home directory of the user running the script.
Since snapshots grow over time, you will likely want to delete them after a while. To solve this problem, we’ve created another script that can do this for you. The script is called snapshot_purge.py and takes two arguments: the UUID of the drive and the number of days’ worth of snapshots to keep.
For instance, if you want to keep 30 days’ worth of snapshots, you can simply run:
$ wget https://raw.githubusercontent.com/cloudsigma/pycloudsigma/master/samples/snapshot_purge.py
$ python snapshot_purge.py drive-uuid 30
You can of course automate this too. For instance, if we want to purge snapshots older than 30 days, we can add the following to our crontab (which will run every night at 1:30AM):
30 1 * * * python /path/to/snapshot_purge.py drive-uuid 30 >> $HOME/snapshot_purge.log 2>&1
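If you’re curious what the “older than N days” rule boils down to, here is a minimal sketch of that kind of retention check. It is not the code from snapshot_purge.py. The function name and the (name, created_at) tuples are assumptions made for illustration, and the sketch only picks out which snapshots have expired without deleting anything.

```python
from datetime import datetime, timedelta, timezone


def select_expired(snapshots, keep_days, now=None):
    """Return the names of snapshots created more than keep_days ago.

    snapshots is an iterable of (name, created_at) pairs, where created_at
    is a timezone-aware datetime (a hypothetical input format).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=keep_days)
    return [name for name, created_at in snapshots if created_at < cutoff]
```

Passing in now explicitly makes the rule easy to test against a fixed date before you let anything loose on real snapshots.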
That’s it, folks. Using these two scripts, you will be able to automate your drive snapshots. If you need to snapshot multiple drives, simply add more of the snapshot.py lines to your crontab with different UUIDs.
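If you have more than a handful of drives, typing those lines by hand gets tedious. Here is a small sketch that prints one crontab line per drive, following the same pattern as the line shown earlier. The helper name, the placeholder UUIDs and the "nightly-" snapshot naming scheme are all made up for this example, so adjust them to taste.

```python
def crontab_lines(drive_uuids, script_path="/path/to/snapshot.py", hour=1, minute=0):
    """Build one crontab entry per drive UUID (illustrative helper)."""
    template = (
        "{minute} {hour} * * * python {script} {uuid} {name} "
        ">> $HOME/snapshot.log 2>&1"
    )
    return [
        template.format(
            minute=minute,
            hour=hour,
            script=script_path,
            uuid=uuid,
            name="nightly-" + uuid,  # hypothetical naming scheme
        )
        for uuid in drive_uuids
    ]


# Placeholder UUIDs; replace them with your own drive UUIDs.
for line in crontab_lines(["drive-uuid-1", "drive-uuid-2"]):
    print(line)
```

You can paste the printed lines straight into the editor opened by crontab -e.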
We’re of course just scratching the surface of what can be done with snapshots, but I hope this serves as a quick crash course in using snapshots for your storage management routines.
If you have more sophisticated data retention needs, you can hopefully reuse some of the code in the scripts above.