ZFS Snapshots

Some of the workgroup-owned clusters and all the UD community clusters make use of ZFS to provide resilient shared storage space. ZFS is a Copy-On-Write (COW) storage technology, meaning that changes are always written to unused blocks in the underlying storage pool rather than overwriting the blocks currently occupied by the files and directories being changed. At any time a snapshot of the file system can be created very quickly and efficiently. Each snapshot references the blocks in use at that point in time, and future changes naturally do not affect those blocks (thanks to COW).

The primary caveat to snapshots: they reference (consume) storage pool capacity that would otherwise get reused over time. Thus, it is often necessary to destroy older snapshots to make space available for new changes to the file system (the only other option being growing the underlying storage pool to add new capacity).

Each snapshot of a ZFS file system represents a point-in-time copy, making it an online backup copy of the file system. In many instances, the same snapshots are periodically transmitted to a secondary (often off-site) server where they guarantee recoverability of the file system should the primary server suffer a catastrophic failure.
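Replication of this kind is typically driven by the zfs send and zfs receive commands. A rough sketch of the idea (the pool/file-system names and the backup host below are hypothetical, and these commands require administrative privileges on both servers):

```shell
# Create a named snapshot of the file system (hypothetical dataset name):
zfs snapshot rpool/home/frey@20190315-0615

# Full send of that snapshot to a secondary server over ssh:
zfs send rpool/home/frey@20190315-0615 | \
    ssh backup-server zfs receive backuppool/home/frey

# Later transfers need only send the incremental delta
# between the previous snapshot and the new one:
zfs send -i rpool/home/frey@20190314-1815 rpool/home/frey@20190315-0615 | \
    ssh backup-server zfs receive backuppool/home/frey
```

Because only changed blocks travel in an incremental send, periodic replication stays cheap even for large file systems.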

Any user having access to the root mount point of a ZFS file system can see a list of available snapshots. For example, my home directory on the Caviness community cluster is backed by a ZFS file system. To display all available snapshots:

[frey@login00 ~]$ cd ~/.zfs/snapshot
[frey@login00 snapshot]$ ls -1
20180925-1815
20181023-1815
20181120-1815
20181218-1815
20190108-1815
20190115-1815
20190122-1815
20190129-1815
   :
20190314-0615
20190314-1815
20190315-0615

Each directory displayed is named according to the date and time the snapshot in question was created. Note that the names at the top of the listing have a by-month granularity, which tightens to by-week, by-day, and finally twice-daily granularity moving down the list. This demonstrates that for my home directory on Caviness:

  • snapshots are created twice each day, at 6:15 a.m. and 6:15 p.m.
  • as time goes by, snapshots are thinned out so that eventually only a single snapshot per week, then per month (and, in time, per year) is retained

This pattern holds only so long as my usage remains well below my quota; as I consume more capacity, older snapshots must be destroyed sooner and the number retained decreases.
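Behind the scenes, a schedule like this is implemented with the zfs snapshot and zfs destroy commands, typically run periodically from cron. A minimal sketch, assuming a hypothetical dataset name rpool/home/frey and administrative privileges:

```shell
# Create a new snapshot named for the current date and time,
# matching the YYYYMMDD-HHMM names seen in the listing above:
zfs snapshot "rpool/home/frey@$(date +%Y%m%d-%H%M)"

# List the file system's snapshots, oldest first, with the
# space each one holds onto:
zfs list -t snapshot -o name,used,creation -s creation rpool/home/frey

# Destroy an old snapshot, returning its unique blocks to the pool:
zfs destroy rpool/home/frey@20180925-1815
```

Note that destroying a snapshot frees only the blocks referenced by no other snapshot and not by the live file system.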

Each snapshot under .zfs/snapshot is itself a directory, which can be navigated using the standard Linux file system commands (like cd and ls). The files and directories present in a snapshot cannot be modified (rm and mv will not work, nor will attempts to edit files in place), but they can be read (e.g. with less or head):

[frey@login00 ~]$ ls -l ~/trace.txt
ls: cannot access /home/1001/trace.txt: No such file or directory
[frey@login00 ~]$ ls -l ~/.zfs/snapshot/20190206-0115/trace.txt
-rw-r--r-- 1 frey everyone 336436 Jan 25 09:39 /home/1001/.zfs/snapshot/20190206-0115/trace.txt
[frey@login00 ~]$ head -5 ~/.zfs/snapshot/20190206-0115/trace.txt 
#
# This file contains a code trace
# from a run of my program that is
# failing...
#

or copied to a writable location using cp:

[frey@login00 ~]$ cp -a ~/.zfs/snapshot/20190206-0115/trace.txt ~/trace.txt
[frey@login00 ~]$ ls -l ~/trace.txt
-rw-r--r-- 1 frey everyone 336436 Jan 25 09:39 /home/1001/trace.txt
[frey@login00 ~]$ head -5 ~/trace.txt 
#
# This file contains a code trace
# from a run of my program that is
# failing...
#
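The -a flag asks cp to preserve the file's mode and timestamps, which is why the restored copy above still shows its original Jan 25 modification date. A small self-contained demonstration with ordinary files (the file names here are arbitrary examples, not snapshot contents):

```shell
#!/bin/bash
# Show that cp -a preserves the modification timestamp of a copied file.
workdir=$(mktemp -d)
echo "example data" > "$workdir/original.txt"

# Give the file a distinctive (old) modification time:
touch -t 201901250939 "$workdir/original.txt"

cp -a "$workdir/original.txt" "$workdir/restored.txt"

# Both files now report the same modification time (seconds since epoch):
stat -c '%Y' "$workdir/original.txt"
stat -c '%Y' "$workdir/restored.txt"

rm -rf "$workdir"
```

A plain cp without -a would instead stamp the copy with the current time, making it harder to tell which backup generation a restored file came from.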
  • abstract/zfs-snapshots.txt
  • Last modified: 2019/03/15 17:18
  • by frey