cluster:squidward.che:plankton [2014/03/18 18:12]
frey
— (current)
Line 1: Line 1:
-====== CCEI Storage Appliance ====== 
- 
-The ''​plankton.che.udel.edu''​ appliance is a home-grown medium-scale storage system. ​ Under the hood it uses a ZOL (ZFS on Linux) filesystem. ​ ZFS has several benefits: 
- 
-  * **Per-directory quota control**: ​ Any directory (be it a user's storage, content to be visible on the web, whatever) can have a quota (maximum size) or a reservation (guaranteed size). 
-  * **Storage pools**: The filesystem doesn't live on a single physical hard disk; it is spread across groups of disks, and it is easily grown by adding more disks.
-  * **Increased integrity**: ZFS has advanced data-integrity features like triple-parity RAID and self-healing of silent corruption. Triple-parity means that up to three hard disks in the same RAID group can fail without any data being lost. Silent corruption is the mangling of bits on the hard disk itself, such that when you later read the data off disk you cannot tell whether it is correct; ZFS detects this with checksums stored alongside the data and repairs the damage from parity data.
-  * **Snapshots**: ​ More on this [[#​snapshots|below]]. 
- 
-ZFS on Linux is an open source project that derives from the Sun (now Oracle) ZFS code that is a part of the Solaris operating system. 
- 
-With ZOL providing the storage, the appliance needs interfaces through which users (CCEI staff and students) can access it.  The following file-sharing mechanisms are currently configured on ''​plankton'':​ 
- 
-  * **[[#samba_access|Samba]]**: Samba implements the SMB protocol (also known as CIFS), which allows you to mount directories on ''plankton'' on your Windows/Mac/Linux desktop and work with them as you would any other disk (drag and drop to copy, double-click to open and edit). Samba is NOT a secure file transfer protocol, though, so it is available ONLY when you are on-campus. Even when on-campus, don't use it for any data you consider to be private.
-  * **SFTP/SCP**: The SFTP (SSH File Transfer Protocol) and SCP (secure copy) programs, both part of the SSH client suite, can be used from anywhere on the Internet to connect to ''plankton'' and manipulate files.
-  * **NFS**: Each CCEI student with an account on ''squidward'' as well as on ''plankton'' can access their ''plankton'' directory directly from the head node of ''squidward''.
- 
-<WRAP center round important 60%> 
-Please note that the storage appliance is currently a demonstration unit and all information on this page is subject to change. 
-</​WRAP>​ 
- 
-===== Samba Access ===== 
- 
-Students: ​ Samba URLs look like ''​smb://​plankton.che.udel.edu/​students-[username]/''​ where ''​[username]''​ is your username (e.g. for me, ''​frey''​). 
- 
-Staff: Samba URLs look like ''​smb://​plankton.che.udel.edu/​staff-[username]/''​ where ''​[username]''​ is your username (e.g. for me, ''​frey''​). 
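
The two URL patterns above differ only in the share prefix. As a minimal sketch (the function name ''samba_url'' and its ''role'' parameter are illustrative assumptions, not part of any tool installed on ''plankton''), the URL could be built like this:

```python
def samba_url(username: str, role: str) -> str:
    """Build a plankton Samba URL from a username and a role.

    `role` is a hypothetical parameter; it must match one of the two
    share prefixes described on this page: "students" or "staff".
    """
    if role not in ("students", "staff"):
        raise ValueError("role must be 'students' or 'staff'")
    return f"smb://plankton.che.udel.edu/{role}-{username}/"


print(samba_url("frey", "students"))
```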
- 
-==== Mac ==== 
- 
-In the Finder choose **Connect to Server…** from the **Go** menu. Enter your ''plankton'' URL and click the "Connect" button. You will be prompted for your username and password. If the connection succeeds, your ''plankton'' directory will appear on the desktop and/or in the sidebar of Finder windows.
- 
-==== Windows ==== 
- 
-Given the URL mentioned above, you can find your Windows "folder name" as follows:
- 
-  - Replace all forward slashes with backslashes 
-  - Remove the leading ''​smb:''​ 
- 
-E.g. ''​\\plankton.che.udel.edu\students-frey\''​. ​ Given your "​folder name," follow the directions presented on [[http://​windows.microsoft.com/​en-us/​windows/​create-shortcut-map-network-drive#​1TC=windows-7|this Microsoft support page]]. ​ You'll need to enable the checkbox for "//​Connect using different credentials//​."​ 
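
The two conversion steps above can also be written as a short function. This is only an illustrative sketch; the name ''smb_url_to_unc'' is hypothetical and nothing on ''plankton'' or Windows requires it:

```python
def smb_url_to_unc(url: str) -> str:
    """Convert an smb:// URL into a Windows "folder name" (UNC path)."""
    prefix = "smb:"
    if not url.lower().startswith(prefix):
        raise ValueError("expected a URL that starts with 'smb:'")
    # Step 1: remove the leading "smb:"
    without_scheme = url[len(prefix):]
    # Step 2: replace all forward slashes with backslashes
    return without_scheme.replace("/", "\\")


print(smb_url_to_unc("smb://plankton.che.udel.edu/students-frey/"))
```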
- 
-===== Snapshots ===== 
- 
-When a file is written on ZFS, the data is always written to unused blocks: ​ this is known as //​copy-on-write//​. ​ So long as there are enough unused blocks available, the blocks containing the old copy of the file will not be overwritten with the new data.  This also serves to increase write performance,​ since it usually means that a contiguous set of blocks can be allocated and written in one pass (versus a disparate set of blocks that must each be located and written). 
- 
-Another benefit of copy-on-write and the retention of older blocks is that for some period of time one or more older copies of a file will still be present. ​ Imagine that on Wednesday the filesystem creates a copy of the metadata that maps filenames to blocks they occupy. ​ On Thursday I delete a page from ''​charts.xls''​ and save it, then realize I wanted to keep that page!  If I could consult Wednesday'​s metadata copy, I could discover the blocks that contained the file prior to my mistake and recover it.  This is exactly what ZFS snapshots are:  a point-in-time copy of the metadata associated with the files/​directories on the filesystem. 
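
The idea can be illustrated with a toy model. The sketch below is not how ZFS is implemented; the class ''ToyCowFilesystem'' and its methods are invented for illustration only. It shows how writing new data to a fresh block, plus keeping a point-in-time copy of the filename-to-block map, lets an older copy of a file be read back:

```python
class ToyCowFilesystem:
    """A toy copy-on-write store; an illustration, not real ZFS behavior."""

    def __init__(self):
        self.blocks = {}     # block id -> data; existing blocks are never overwritten
        self.metadata = {}   # filename -> block id holding the current copy
        self.snapshots = {}  # snapshot name -> point-in-time copy of the metadata
        self._next_block = 0

    def write(self, filename, data):
        # Copy-on-write: always place new data in an unused block.
        block_id = self._next_block
        self._next_block += 1
        self.blocks[block_id] = data
        self.metadata[filename] = block_id

    def snapshot(self, name):
        # A snapshot copies only the metadata, never the data blocks.
        self.snapshots[name] = dict(self.metadata)

    def read(self, filename, snapshot=None):
        table = self.metadata if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[filename]]


fs = ToyCowFilesystem()
fs.write("charts.xls", "pages 1-3")   # the file as it stood on Wednesday
fs.snapshot("Wed")                    # Wednesday's metadata copy
fs.write("charts.xls", "pages 1-2")   # Thursday: a page is deleted and the file saved

print(fs.read("charts.xls"))                  # current copy
print(fs.read("charts.xls", snapshot="Wed"))  # copy recovered via the snapshot
```

Unlike this toy, which never frees a block, a real filesystem eventually reuses blocks that no snapshot refers to any longer.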
- 
-There is a hidden directory in every ''plankton'' file share called ''.zfs'' that contains any snapshots that are available. For example, if I SFTP to ''plankton'' I can use the commands ''cd .zfs/snapshot'' and ''ls'' to see what snapshots are available:
-<​code>​ 
-sftp> cd .zfs/​snapshot 
-sftp> ls -l 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 0200 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 0800 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 1400 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 2000 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 Fri 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 Mon 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 Sat 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 Sun 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 Thu 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 Tue 
-dr-xr-xr-x ​   1 root     ​root ​           0 Mar 18 13:42 Wed 
-</​code>​ 
-As ''plankton'' is currently configured, a snapshot is taken every six hours starting at 2 a.m. (at 2 a.m., 8 a.m., 2 p.m., and 8 p.m.) and is named for the time of day. A daily snapshot is also taken at 11 p.m. and is named for the day of the week. Each snapshot replaces the previous one with the same name, so the snapshot directory ''0200'' represents the filesystem as of the most recent 2 a.m. and ''1400'' as of the most recent 2 p.m. Likewise, ''Wed'' is the snapshot from the most recent Wednesday at 11 p.m.
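
To work out which moment a given snapshot name refers to, the schedule described above can be turned into a small calculation. This is a sketch that assumes the schedule is exactly as described (six-hourly snapshots at 02:00, 08:00, 14:00 and 20:00, daily snapshots at 23:00); the function name ''snapshot_taken_at'' is hypothetical:

```python
from datetime import datetime, timedelta

# Snapshot names mapped to the hour they are taken (schedule as described above).
SIX_HOURLY = {"0200": 2, "0800": 8, "1400": 14, "2000": 20}
# Daily snapshot names, in the same order as datetime.weekday() (Monday is 0).
DAILY = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
DAILY_HOUR = 23


def snapshot_taken_at(name: str, now: datetime) -> datetime:
    """Return the most recent time at or before `now` that snapshot `name` was taken."""
    if name in SIX_HOURLY:
        candidate = now.replace(hour=SIX_HOURLY[name], minute=0, second=0, microsecond=0)
        if candidate > now:
            # Today's snapshot has not happened yet, so the directory holds yesterday's.
            candidate -= timedelta(days=1)
        return candidate
    if name in DAILY:
        candidate = now.replace(hour=DAILY_HOUR, minute=0, second=0, microsecond=0)
        # Step back to the most recent calendar day with the requested weekday.
        candidate -= timedelta(days=(now.weekday() - DAILY.index(name)) % 7)
        if candidate > now:
            # It is that weekday today, but 11 p.m. has not arrived yet.
            candidate -= timedelta(days=7)
        return candidate
    raise ValueError(f"unknown snapshot name: {name}")


print(snapshot_taken_at("Wed", datetime.now()))
```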
- 
-Inside a particular snapshot directory you will find all files and directories that existed at that point in time and are still present on disk: 
-<​code>​ 
-sftp> cd Wed 
-sftp> ls -al 
-drwx------ ​   2 frey     ​cadmin ​         2 Feb 27 13:51 . 
-dr-xr-xr-x ​   3 root     ​root ​           3 Mar 18 08:00 .. 
--rw-r--r-- ​   2 frey     ​cadmin ​      4105 Mar 01 09:13 charts.xls 
-</​code>​ 
-From this snapshot directory I can download that (older) copy of the file the same as any other file: 
-<​code>​ 
-sftp> get charts.xls 
-</​code>​ 
-<WRAP center round important 60%> 
-The ''​.zfs''​ directory is accessible when using Samba or SFTP/SCP to access ''​plankton''​. ​ It currently does not work properly for NFS access from ''​squidward''​. 
-</​WRAP>​ 
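
When a share is mounted on your desktop via Samba, the same lookup can be scripted. The following is a minimal sketch that assumes the share is mounted as an ordinary directory and that ''.zfs/snapshot'' can be listed through the mount; the function name ''snapshot_copies'' is hypothetical:

```python
from pathlib import Path


def snapshot_copies(share_root, relative_path):
    """Return (snapshot name, path) pairs for every snapshot that still holds the file.

    `share_root` is wherever the plankton share is mounted locally;
    `relative_path` is the file's path relative to the top of the share.
    """
    snapshot_dir = Path(share_root) / ".zfs" / "snapshot"
    copies = []
    for snap in sorted(snapshot_dir.iterdir()):
        candidate = snap / relative_path
        if candidate.is_file():
            copies.append((snap.name, candidate))
    return copies
```

For example, calling ''snapshot_copies'' with the local mount point of the share and ''charts.xls'' would return one entry per snapshot directory that contains a copy of that file, which can then be copied back like any other file.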
  