# Introduction

Ugarit is a backup/archival system based around content-addressable storage.

This allows it to upload incremental backups to a remote server or a local filesystem such as an NFS share or a removable hard disk, yet have the archive instantly able to produce a full snapshot on demand rather than needing to download a full snapshot plus all the incrementals since. The content-addressable storage technique means that the incrementals can be applied to a snapshot on various kinds of storage without needing intelligence in the storage itself - so the snapshots can live within Amazon S3 or on a removable hard disk.

Also, the same storage can be shared between multiple systems that all back up to it - and the incremental upload algorithm will mean that any files shared between the servers will only need to be uploaded once. If you back up a complete server, then go and back up another that is running the same distribution, then all the files in `/bin` and so on that are already in the storage will not need to be backed up again; the system will automatically spot that they're already there, and not upload them again.

## So what's that mean in practice?

You can run Ugarit to back up any number of filesystems to a shared archive, and on every backup, Ugarit will only upload files or parts of files that aren't already in the archive - be they from the previous snapshot, earlier snapshots, snapshots of entirely unrelated filesystems, etc. Every time you do a snapshot, Ugarit builds an entire complete directory tree of the snapshot in the archive - but reusing any parts of files, files, or entire directories that already exist anywhere in the archive, and only uploading what doesn't already exist.

The support for parts of files means that, in many cases, gigantic files like database tables and virtual disks for virtual machines will not need to be uploaded entirely every time they change, as only the changed sections will be identified and uploaded.

Because a complete directory tree exists in the archive for any snapshot, the extraction algorithm is incredibly simple - and, therefore, incredibly reliable and fast. Simple, reliable, and fast are just what you need when you're trying to reconstruct the filesystem of a live server.

Also, it means that you can do lots of small snapshots. If you run a snapshot every hour, then only a megabyte or two might have changed in your filesystem, so you only upload a megabyte or two - yet you end up with a complete history of your filesystem at hourly intervals in the archive.

Conventional backup systems usually store a full backup followed by incrementals in their archives, meaning that a restore involves reading the full backup and then reading and applying every incremental since. So you either have to download *every version* of the filesystem you've ever uploaded, or you have to do periodic full backups (even though most of your filesystem won't have changed since the last full backup) to reduce the number of incrementals required for a restore. Better results are had from systems that use a special backup server to look after the archive storage, accepting incremental backups and applying them to a stored snapshot in order to maintain a most-recent snapshot that can be downloaded in a single run; but they then restrict you to using dedicated servers as your archive stores, ruling out cheap scalable solutions like Amazon S3, or just backing up to a removable USB or eSATA disk you attach to your system whenever you do a backup. And dedicated backup servers are complex pieces of software; can you rely on something complex for the fundamental foundation of your data security system?

## System Requirements

Ugarit should run on any POSIX-compliant system that can run [Chicken Scheme](http://www.call-with-current-continuation.org/). It stores and restores all the file attributes reported by the `stat` system call - POSIX mode permissions, UID, GID, mtime, and optionally atime and ctime (although the ctime cannot be restored due to POSIX restrictions). Ugarit will store files, directories, block and character device files, symlinks, and FIFOs.

Support for extended filesystem attributes - ACLs, alternate streams, forks and other metadata - is possible, due to the extensible directory entry format; support for such metadata will be added as required.

Currently, only local filesystem-based archive storage backends are complete: these are suitable for backing up to a removable hard disk or a filesystem shared via NFS or other protocols. They can also be used to snapshot to local disks, although this is obviously then vulnerable to local system failures; if the computer that's being backed up catches fire, you won't be able to restore it from archives that were also ruined!

However, the next backends to be implemented will be one for Amazon S3 and an SFTP backend for storing archives anywhere you can ssh to. Other backends will be implemented on demand; an archive can, in principle, be stored on anything that can store files by name, report on whether a file already exists, and efficiently download a file by name. This rules out magnetic tapes, due to their requirement for sequential access.

Although we need to trust that a backend won't lose data (for now), we don't need to trust the backend not to snoop on us, as Ugarit optionally encrypts everything sent to the archive.

## What's in an archive?

An Ugarit archive contains a load of blocks, each up to a maximum size (usually 1MiB, although some backends might impose smaller limits). Each block is identified by the Tiger hash of its contents; this is how Ugarit avoids ever uploading the same data twice: before uploading anything, it looks up the hash to see whether that data already exists in the archive. The contents of the blocks are compressed and then encrypted before upload.

Every file uploaded is, unless it's small enough to fit in a single block, chopped into blocks, and each block is uploaded. This way, the entire contents of your filesystem can be uploaded - or, at least, only the parts of it that aren't already there! The blocks are then tied together to create a snapshot by uploading index blocks full of the Tiger hashes of the data blocks, and directory blocks listing the names and attributes of files in directories, along with the hashes of the blocks that contain the files' contents. Even the blocks that contain lists of hashes of other blocks are checked for pre-existence in the archive; if only a few MiB of your hundred-GiB filesystem have changed, then even the index blocks and directory blocks are re-used from previous snapshots.
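
To make the pre-existence check concrete, here's a minimal sketch in Scheme of the store-don't-reupload logic; the helper names (`tiger-hash-of`, `block-exists?`, `upload-block!`) are illustrative, not Ugarit's actual internals:

      ;; Illustrative sketch only - not Ugarit's real code.
      ;; tiger-hash-of, block-exists? and upload-block! are hypothetical helpers.
      (define (store-block! archive data)
        (let ((hash (tiger-hash-of data)))      ; the hash is the block's name
          (unless (block-exists? archive hash)  ; already in the archive? skip the upload
            (upload-block! archive hash data))
          hash))                                ; callers refer to the block by hash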

Once uploaded, a block in the archive is never again changed. After all, if its contents changed, its hash would change, so it would no longer be the same block! However, every block has a reference count, tracking the number of index blocks that refer to it. This means that the archive knows which blocks are shared between multiple snapshots (or shared *within* a snapshot - if a filesystem has more than one copy of the same file, still only one copy is uploaded), so that if a given snapshot is deleted, the blocks that only that snapshot uses can be deleted to free up space, without corrupting other snapshots by deleting blocks they share. Bear in mind, however, that not all storage backends support this - there are certain advantages to being an append-only archive. For a start, you can't delete something by accident! The supplied filesystem backend supports deletion, while the logfile backend does not. However, the actual deletion command hasn't been implemented yet either, so it's a moot point for now...

Finally, the archive contains objects called tags. Unlike the blocks, a tag's contents can change, and tags have meaningful names rather than being identified by hash. Tags identify the top-level blocks of snapshots within the system, from which (by following the chain of hashes down through the index blocks) the entire contents of a snapshot may be found. Unless you happen to have recorded the hash of a snapshot somewhere, the tags are where you find snapshots when you want to do a restore!

Whenever a snapshot is taken, as soon as Ugarit has uploaded all the files, directories, and index blocks required, it looks up the tag you have identified as the target of the snapshot. If the tag already exists, then the snapshot it currently points to is recorded in the new snapshot as the "previous snapshot"; then a snapshot header containing the previous snapshot's hash, along with the date and time and any comments you provide for the snapshot, is uploaded (as another block, identified by its hash). The tag is then updated to point to the new snapshot.

This way, each tag actually identifies a chronological chain of snapshots. Normally, you would use a tag to identify a filesystem being archived; you'd keep snapshotting the filesystem to the same tag, resulting in all the snapshots of that filesystem hanging from the tag. But if you want to remember any particular snapshot (perhaps the one you take before a big upgrade or other risky operation), you can duplicate the tag, in effect 'forking' the chain of snapshots much like a branch in a version control system.

# Using Ugarit

## Installation

Install [Chicken Scheme](http://www.call-with-current-continuation.org/) using their [installation instructions](http://chicken.wiki.br/Getting%20started#Installing%20Chicken).

Ugarit can then be installed by typing (as root):

    chicken-install ugarit

See the [chicken-install manual](http://wiki.call-cc.org/manual/Extensions#chicken-install-reference) for details if you have any trouble, or wish to install into your home directory.

## Setting up an archive

Firstly, you need to know the archive identifier for the place you'll be storing your archives. This depends on your backend.

### Filesystem backend

The filesystem backend creates archives by storing each block or tag in its own file, in a directory. To keep the objects-per-directory count down, it'll split the files into subdirectories.

To set up a new filesystem-backend archive, just create an empty directory that Ugarit will have write access to when it runs. It will probably run as root in order to be able to access the contents of files that aren't world-readable (although that's up to you), so be careful of NFS mounts that have `maproot=nobody` set!

You can then refer to it using the following archive identifier:

      fs "...path to directory..."
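
For example, after `mkdir /mnt/backupdisk/ugarit-archive` (the path is illustrative), the corresponding storage line in your `ugarit.conf` would be:

      (storage fs "/mnt/backupdisk/ugarit-archive")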

### New Logfile backend

The logfile backend works much like the original Venti system. It's append-only - you won't be able to delete old snapshots from a logfile archive, even when I implement deletion. It stores the archive in two sets of files; one is a log of data blocks, split at a specified maximum size, and the other is the metadata: a GDBM file used as an index to locate blocks in the logfiles and to store the blocks' types, a GDBM file of tags, and a counter file used in naming logfiles.

To set up a new logfile archive, just choose where to put the two sets of files. It would be nice to put the metadata on a different physical disk to the logs, to reduce seeking. Create a directory for each, or if you only have one disk, you can put them all in the same directory.

You can then refer to it using the following archive identifier:

      splitlog "...log directory..." "...metadata directory..." max-logfile-size

For most platforms, a max-logfile-size of 900000000 (900 MB) should suffice. For now, don't go much bigger than that on 32-bit systems until Chicken's `file-position` function is fixed to work with files >1GB in size.
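
For example, with the logs and metadata on two different disks (paths illustrative), the storage line would be:

      (storage splitlog "/disk1/archive/logs" "/disk2/archive/metadata" 900000000)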

### Old Logfile backend

The old logfile backend works much like the original Venti system. It's append-only - you won't be able to delete old snapshots from a logfile archive, even when I implement deletion. It stores the archive in three files: one is a log of data blocks, one is a GDBM index that remembers where in the log each block resides, and one is a GDBM file of tags.

This worked well, but exposed a bug in Chicken when dealing with files larger than about a gigabyte on 32-bit platforms. I fixed that in short order, but it reminded me that some platforms don't like files larger than 2GB anyway, so I wrote a new logfile backend that splits the log file into chunks at a specified point. You probably want to use the new backend - the old backend is kept for compatibility only.

To set up an old logfile archive, just choose where to put the three files. It would be nice to put the index and tags on a different physical disk to the log, to reduce seeking.

You can then refer to it using the following archive identifier:

      log "...logfile..." "...indexfile..." "...tagsfile..."

None of the three files need to exist in advance; Ugarit will create them.
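
For example (paths illustrative):

      (storage log "/archive/ugarit.log" "/archive/ugarit.idx" "/archive/ugarit.tags")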

## Writing a ugarit.conf

`ugarit.conf` should look something like this:

      (storage <archive identifier>)
      (hash tiger "<A secret string>")
      [(compression [deflate|lzma])]
      [(encryption aes <key>)]
      [(file-cache "<path>")]
      [(rule ...)]

The hash line chooses a hash algorithm. Currently Tiger-192 (`tiger`), SHA-256 (`sha256`), SHA-384 (`sha384`) and SHA-512 (`sha512`) are supported. If you omit the line entirely, Tiger will still be used, but as a simple hash of the block with the block type appended - which reveals to attackers what blocks you have, as the hash is of the unencrypted block and the hash itself is not encrypted. That is useful for development and testing, or for use with trusted archives, but not advised for archives that attackers may snoop at. Providing a secret string produces a hash function that hashes the block, the type of block, and the secret string, producing hashes that attackers who can snoop the archive cannot use to find known blocks. Whichever hash function you use, you will need to install the required Chicken egg with one of the following commands:

    sudo chicken-install tiger-hash  # for tiger
    sudo chicken-install sha2        # for the SHA hashes
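
For example, to use keyed SHA-256 instead of the default Tiger (the secret string below is a placeholder - choose your own long random string):

      (hash sha256 "<your own long random secret>")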

`lzma` is the recommended compression option for low-bandwidth backends or when space is tight, but it's very slow to compress; deflate or no compression at all are better for fast local archives. To have no compression at all, just remove the `(compression ...)` line entirely. Likewise, to use compression, you need to install a Chicken egg:

       sudo chicken-install z3       # for deflate
       sudo chicken-install lzma     # for lzma

Likewise, the `(encryption ...)` line may be omitted to have no encryption; the only currently supported algorithm is aes (in CBC mode), with a key given in hex, as a passphrase (hashed to get a key), or as a passphrase read from the terminal on every run. The key may be 16, 24, or 32 bytes for 128-bit, 192-bit or 256-bit AES. To specify a hex key, just supply it as a string, like so:

      (encryption aes "00112233445566778899AABBCCDDEEFF")

...for 128-bit AES,

      (encryption aes "00112233445566778899AABBCCDDEEFF0011223344556677")

...for 192-bit AES, or

      (encryption aes "00112233445566778899AABBCCDDEEFF00112233445566778899AABBCCDDEEFF")

...for 256-bit AES.

Alternatively, you can provide a passphrase, and specify how large a key you want it turned into, like so:

      (encryption aes ([16|24|32] "We three kings of Orient are, one in a taxi one in a car, one on a scooter honking his hooter and smoking a fat cigar. Oh, star of wonder, star of light; star with royal dynamite"))

Finally, the extra-paranoid can request that Ugarit prompt for a passphrase on every run and hash it into a key of the specified length, like so:

      (encryption aes ([16|24|32] prompt))

(note the lack of quotes around `prompt`, distinguishing it from a passphrase)

Again, as it is an optional feature, to use encryption you must install the appropriate Chicken egg:

       sudo chicken-install aes

A file cache, if enabled, significantly speeds up subsequent snapshots of a filesystem tree. The file cache is a file (which Ugarit will create if it doesn't already exist) mapping filenames to (mtime,hash) pairs; as it scans the filesystem, if it finds a file in the cache whose mtime has not changed, it will assume it is already archived under the specified hash. This saves it from having to read the entire file to hash it and then check if the hash is present in the archive. In other words, if only a few files have changed since the last snapshot, then snapshotting a directory tree becomes an O(N) operation, where N is the number of files, rather than an O(M) operation, where M is the total size of the files involved.
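
For example (the path is illustrative; Ugarit will create the file if it doesn't already exist):

      (file-cache "/var/ugarit/file-cache")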

For example:

      (storage splitlog "/net/spiderman/archive/logs" "/net/spiderman/archive/index" 900000000)
      (hash tiger "Giung0ahKahsh9ahphu5EiGhAhth4eeyDahs2aiWAlohr6raYeequ8uiUr3Oojoh")
      (encryption aes (32 "deing2Aechediequohdo6Thuvu0OLoh6fohngio9koush9euX6el9iesh6Aef4augh3WiY7phahmesh2Theeziniem5hushai5zigushohnah1quae1ooXo0eingu1Aifeo1eeSheaz9ieSie9tieneibeiPho0quu6um8weiyagh4kaeshooThooNgeyoul2Ahsahgh8imohw3hoyazai9gaph5ohhaechiedeenusaeghahghipe8ii3oo9choh5cieth5iev3jiedohquai4Thiedah5sah5kohcepheixai3aiPainozooc6zohNeiy6Jeigeesie5eithoo0ciiNae8Nee3eiSuKaiza0VaiPai2eeFooNgeengaif9yaiv9rathuoQuohy0ohth6OiL9aisaetheeWoh9aiQu0yoo6aequ3quoiChi7joonohwuvaipeuh2eiPoogh1Ie8tiequesoshaeBue5ieca8eerah0quieJoNoh3Jiesh1chei8weidixeen1yah1ioChie0xaimahWeeriex5eetiichahP9iey5ux7ahGhei7eejahxooch5eiqu0Pheir9Reiri4ahqueijuchae8eeyieMeixa4ciisioloe9oaroof1eegh4idaeNg5aepeip8mah7ixaiSohtoxaiH4oe5eeGoh4eemu7mee8ietaecu6Zoodoo0hoP5uquaish2ahc7nooshi0Aidae2Zee4pheeZee3taerae6Aepu2Ayaith2iivohp8Wuikohvae2Peange6zeihep8eC9mee8johshaech1Ubohd4Ko5caequaezaigohyai1TheeN6Gohva6jinguev4oox2eet5auv0aiyeo7eJieGheebaeMahshifaeDohy8quut4ueFei3eiCheimoechoo2EegiveeDah1sohs7ezee3oaWa2iiv2Chi1haiS5ahph4phu5su0hiocee3ooyaeghang7sho7maiXeo5aex"))
      (compression lzma)

Be careful to put a set of parentheses around each configuration entry. White space isn't significant, so feel free to indent things and wrap them over lines if you want.

Keep copies of this file safe - you'll need it to do extractions! Print a copy out and lock it in your fire safe! Ok, you might be able to recreate the storage line if you remember where you put the archive, but if you use the `(encryption ...)` option, there's an encryption key you can't afford to lose as well.

## Your first backup

Think of a tag to identify the filesystem you're backing up. If it's `/home` on the server `gandalf`, you might call it `gandalf-home`. If it's the entire filesystem of the server `bilbo`, you might just call it `bilbo`.

Then from your shell, run (as root):

      # ugarit snapshot <ugarit.conf> [-c] [-a] <tag> <path to root of filesystem>

For example, if we have a `ugarit.conf` in the current directory:

      # ugarit snapshot ugarit.conf -c localhost-etc /etc

Specify the `-c` flag if you want to store ctimes in the archive; since it's impossible to restore ctimes when extracting from an archive, doing this is useful only for informational purposes, so it's not done by default. Similarly, atimes aren't stored in the archive unless you specify `-a`, because storing them means a lot of directory blocks get uploaded on every snapshot, as the atime of every file will have been changed by the previous snapshot - so with `-a` specified, on every snapshot, every directory in your filesystem will be uploaded! Ugarit will happily restore atimes if they are found in an archive; their storage is made optional simply because uploading them is costly and rarely useful.
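
Because snapshots are incremental and cheap, frequent snapshots from cron work well. Here's a sketch of a root crontab entry for hourly snapshots, assuming `ugarit` is on root's `PATH` and your configuration lives in `/etc` (both assumptions - adjust to taste):

      0 * * * * ugarit snapshot /etc/ugarit.conf gandalf-home /home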

## Exploring the archive

Now you have a backup, you can explore the contents of the archive. This need not be done as root, as long as you can read `ugarit.conf`; however, if you want to extract files, run it as root.

      $ ugarit explore <ugarit.conf>

This will put you into an interactive shell exploring a virtual filesystem. The root directory contains an entry for every tag; if you type `ls` you should see your tag listed, and within that tag, you'll find a list of snapshots, in descending date order, with a special entry `current` for the most recent snapshot. Within a snapshot, you'll find the root directory of your snapshot, and will be able to `cd` into subdirectories, and so on:

      > ls
      Test <tag>
      > cd Test
      /Test> ls
      2009-01-24 10:28:16 <snapshot>
      2009-01-24 10:28:16 <snapshot>
      current <snapshot>
      /Test> cd current
      /Test/current> ls
      README.txt <file>
      LICENCE.txt <symlink>
      subdir <dir>
      .svn <dir>
      FIFO <fifo>
      chardev <character-device>
      blockdev <block-device>
      /Test/current> ls -ll LICENCE.txt
      lrwxr-xr-x 1000 100 2009-01-15 03:02:49 LICENCE.txt -> subdir/LICENCE.txt
      target: subdir/LICENCE.txt
      ctime: 1231988569.0

As well as exploring around, you can also extract files or directories (or entire snapshots) by using the `get` command. Ugarit will do its best to restore the metadata of files, subject to the rights of the user you run it as.
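
For instance, from inside a snapshot you might fetch a single file back onto the real filesystem (the exact argument syntax shown here is a guess for illustration):

      /Test/current> get README.txt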

Type `help` to get help in the interactive shell.

## Duplicating tags

As mentioned above, you can duplicate a tag, creating two tags that refer to the same snapshot and its history, each of which can then accumulate its own subsequent history of snapshots independently. Use the following command:

      $ ugarit fork <ugarit.conf> <existing tag> <new tag>
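
For example, to preserve the current state of the `gandalf-home` tag before a risky upgrade (tag names illustrative):

      $ ugarit fork ugarit.conf gandalf-home gandalf-home-pre-upgrade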

## `.ugarit` files

By default, Ugarit will archive everything it finds in the filesystem tree you tell it to snapshot. However, this might not always be desired; so we provide the facility to override this with `.ugarit` files, or global rules in your `.conf` file.

Note: the syntax of these files is provisional; I want to experiment with usability, as the current syntax is ugly. So please don't be surprised if the format changes in incompatible ways in subsequent versions!

In quick summary, if you want to ignore all files or directories matching a glob in the current directory and below, put the following in a `.ugarit` file in that directory:

      (* (glob "*~") exclude)

You can write quite complex expressions as well as just globs (a combined example follows the list below). The full set of rules is:

* `(glob "`*pattern*`")` matches files and directories whose names match the glob pattern
* `(name "`*name*`")` matches files and directories with exactly that name (useful for files called `*`...)
* `(modified-within ` *number* ` seconds)` matches files and directories modified within the given number of seconds
* `(modified-within ` *number* ` minutes)` matches files and directories modified within the given number of minutes
* `(modified-within ` *number* ` hours)` matches files and directories modified within the given number of hours
* `(modified-within ` *number* ` days)` matches files and directories modified within the given number of days
* `(not ` *rule*`)` matches files and directories that do not match the given rule
* `(and ` *rule* *rule...*`)` matches files and directories that match all the given rules
* `(or ` *rule* *rule...*`)` matches files and directories that match any of the given rules
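
For example, this illustrative rule combines `and`, `not`, and `modified-within` to exclude temporary files, but only once they've been left untouched for an hour:

      (* (and (glob "*.tmp") (not (modified-within 1 hours))) exclude)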

Also, you can override a previous exclusion with an explicit include in a lower-level directory:

    (* (glob "*~") include)

Also, you can bind rules to specific directories, rather than to "this directory and all beneath it", by specifying an absolute or relative path instead of the `*`:

    ("/etc" (name "passwd") exclude)

If you use a relative path, it's taken relative to the directory of the `.ugarit` file.

You can also put some rules in your `.conf` file, although relative paths are illegal there, by adding lines of this form to the file:

    (rule * (glob "*~") exclude)
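
For example, a global rule bound to an absolute path (the path is illustrative):

    (rule "/var/cache" (glob "*") exclude)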

# Questions and Answers

## What happens if a snapshot is interrupted?

Nothing! Whatever blocks have been uploaded stay uploaded, but the snapshot is only added to the tag once the entire filesystem has been snapshotted. So just start the snapshot again. Any files that have already been uploaded won't need to be uploaded again, so the second snapshot should proceed quickly to the point where it failed before, and continue from there.

Unless the archive ends up with a partially-uploaded corrupted block due to being interrupted during upload, you'll be fine. The filesystem backend has been written to avoid this by writing the block to a file with the wrong name, then renaming it to the correct name only when it's entirely uploaded.
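
The trick relies on the POSIX guarantee that renaming a file within a filesystem is atomic. A minimal sketch of the idea in Scheme (procedure names vary between Chicken versions, so treat this as illustrative rather than Ugarit's actual code):

      ;; Sketch of write-then-rename; names are illustrative.
      (define (write-block-safely path data)
        (let ((tmp (string-append path ".part")))
          (with-output-to-file tmp
            (lambda () (display data)))  ; write under a temporary name first
          (rename-file tmp path)))       ; the atomic rename exposes the finished block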

## Should I share a single large archive between all my filesystems?

I think so. Using a single large archive means that blocks shared between servers - eg, software installed from packages and that sort of thing - will only ever need to be uploaded once, saving storage space and upload bandwidth.

# Future Directions

Here's a list of planned developments, in approximate priority order:

## Backends

* Support for SFTP as a storage backend. Store one file per block, as
  per `backend-fs`, but remotely. See
  http://tools.ietf.org/html/draft-ietf-secsh-filexfer-13 for sftp
  protocol specs; popen an `ssh -s sftp` connection to the server then
  talk that simple binary protocol. Tada!

* Support for S3 as a storage backend. What's the best way to get at
  the S3 API? Write our own client, or find a C library to wrap?

* Support for recreating the index and tags on a backend-log or
  backend-splitlog if they get corrupted, from the headers left in the
  log.

* Support for remote backends. This will involve splitting the
  backends into separate executables, and having the frontend talk to
  them via a simple protocol over standard input and output. Then it
  will be possible to use ssh to talk to backends on remote machines,
  as well as various other interesting integration opportunities.

* Support for replicated archives. This will involve a special storage
  backend that can wrap any number of other archives, each tagged with
  a trust percentage and read and write load weightings. Each block
  will be uploaded to enough archives to make the total trust be at
  least 100%, by randomly picking the archives weighted by their write
  load weighting. A local cache will be kept of which backends carry
  which blocks, and reads will be serviced by picking the archive that
  carries it and has the highest read load weighting. If that archive
  is unavailable or has lost the block, then they will be tried in
  read load order; and if none of them have it, an exhaustive search
  of all available archives will be performed before giving up, and
  the cache updated with the results if the block is found. Users will
  be recommended to delete the cache if an archive is lost, so it gets
  recreated in usage, as otherwise the system may assume blocks are
  present when they are not, and thus fail to upload them when
  snapshotting.

## Core

* More `.ugarit` actions. Right now we just have exclude and include;
  we might specify less-safe operations such as commands to run before
  and after snapshotting certain subtrees, or filters (don't send this
  SVN repository; instead send the output of `svnadmin dump`),
  etc. Running arbitrary commands is a security risk if random users
  write their own `.ugarit` files - so we'd need some trust-based
  mechanism; they'd need to be explicitly enabled in `ugarit.conf`,
  then a `.ugarit` option could disable all unsafe operations in a
  subtree.

* Support for FFS flags, Mac OS X extended filesystem attributes, NTFS
  ACLs/streams, FAT attributes, etc... Ben says to look at Box Backup
  for some code to do that sort of thing.

* Implement lock-tag! etc. in backend-fs, as a precaution against two
  concurrent snapshots racing over updating the tag, where concurrent
  access to the archive is even possible.

* Deletion support - letting you remove snapshots. Perhaps you might
  want to remove all snapshots older than a given number of days on a
  given tag. Or just remove X out of Y snapshots older than a given
  number of days on a given tag. We have the core support for this;
  just find a snapshot and `unlink-directory!` it, leaving a dangling
  pointer from the snapshot, and write the snapshot handling code to
  expect this. Again, check Box Backup for that.

* Some kind of accounting for storage usage by snapshot. It'd be nice
  to track, as we write a snapshot to the archive, how many bytes we
  reuse and how many we back up. We can then store this in the
  snapshot metadata, and so report them somewhere. The blocks uploaded
  by a snapshot may well then be reused by other snapshots later on,
  so it wouldn't be a true measure of 'unique storage', nor a measure
  of what you'd reclaim by deleting that snapshot, but it'd be
  interesting anyway.

* Option, when backing up, to not cross mountpoints

* Option, when backing up, to store inode number and mountpoint path
  in directory entries, and then when extracting, keeping a dictionary
  of this unique identifier to pathname, so that if a file to be
  extracted is already in the dictionary and the hash is the same, a
  hardlink can be created.

* Archival mode as well as snapshot mode. Whereas a snapshot record
  takes a filesystem tree and adds it to a chain of snapshots of the
  same filesystem tree, archival mode takes a filesystem tree and
  inserts it into a search tree anchored on the specified tag,
  indexing it on a list of key+value properties supplied at archival
  time. An archive tag is represented in the virtual filesystem as a
  directory full of archive objects, each identified by their full
  hash; each archive object references the filesystem root as well as
  the key+value properties, and optionally a parent link like a
  snapshot, as an archive can be made that explicitly replaces an
  earlier one and should replace it in the index; there is also a
  virtual directory for each indexed property which contains a
  directory for each value of the property, full of symlinks to the
  archive objects, and subdirectories that allow multi-property
  searches on other properties. The index itself is stored as a B-Tree
  with a reasonably small block size; when it's updated, the modified
  index blocks are replaced, thereby gaining new hashes, so their
  parents need replacing, all the way up the tree until a new root
  block is created. The existing block unlink mechanism in the
  backends will reclaim storage for blocks that are superseded, if the
  backend supports it. When this is done, ugarit will offer the option
  of snapshotting to a snapshot tag, or archiving to an archive tag,
  or archiving to an archive tag while replacing a specified archive
  object (nominated by path within the tag), which causes it to be
  removed from the index (except from the directory listing all
  archives by hash), and the new archive object is inserted,
  referencing the old one as a parent.

## Front-end

* Better error messages

* Archive transfer: a command to open two archives. From the source
  one, it lists all tags, then for each tag, walks the history, and
  for each snapshot, copies it to the destination archive. For
  migrating archives to a new backend.

* FUSE support. Mount it as a read-only filesystem :-D Then consider
  adding Fossil-style writing to the `current` of a snapshot, with
  copy-on-write of blocks to a buffer area on the local disk, then the
  option to make a snapshot of `current`.

* More explicit support for archival usage: really, a different kind
  of tag. Rather than having a chain of snapshots of the same
  filesystem, the tag would have some kind of database of snapshots,
  with more emphasis on metadata and searchability.

* Filesystem watching. Even with the hash-caching trick, a snapshot
  will still involve walking the entire directory tree and looking up
  every file in the hash cache. We can do better than that - some
  platforms provide an interface for receiving real-time notifications
  of changed or added files. Using this, we could allow ugarit to run
  in continuous mode, keeping a log of file notifications from the OS
  while it does an initial full snapshot. It can then wait for a
  specified period (one hour, perhaps?), accumulating names of files
  changed since it started, before then creating a new snapshot by
  uploading just the files it knows to have changed, while subsequent
  file change notifications go to a new list.

## Testing

* An option to verify a snapshot, walking every block in it checking
  there are no dangling references, and that everything matches its
  hash, without needing to put it into a filesystem, and applying any
  other sanity checks we can think of en route. Optionally compare it
  to an on-disk filesystem, while we're at it.

* A more formal test corpus with a unit test script around the
  `ugarit` command-line tool; the corpus should contain a mix of tiny
  and huge files and directories, awkward cases for sharing of blocks
  (many identical files in the same dir, etc), complex forms of file
  metadata, and so on. It should archive and restore the corpus
  several times over with each hash, compression, and encryption
  option.

# Acknowledgements

The original idea came from Venti, a content-addressed storage system
from Plan 9. Venti is usable directly by user applications, and is
also integrated with the Fossil filesystem to support snapshotting the
status of a Fossil filesystem. Fossil allows references to either be
to a block number on the Fossil partition or to a Venti key; so when a
filesystem has been snapshotted, all it now contains is a "root
directory" pointer into the Venti archive, and any files modified
thereafter are copied-on-write into Fossil where they may be modified
until the next snapshot.

We're nowhere near that exciting yet, but using FUSE, we might be able
to do something similar, which might be fun. However, Venti inspired
me when I read about it years ago; it showed me how elegant
content-addressed storage is. Finding out that the Git version control
system used the same basic tricks really just confirmed this for me.

Also, I'd like to tip my hat to Duplicity. With the changing economics
of storage presented by services like Amazon S3 and rsync.net, I
looked to Duplicity as it provided both SFTP and S3 backends. However,
it worked in terms of full and incremental backups, a model that I
think made sense for magnetic tapes, but loses out to
content-addressed snapshots when you have random-access
media. Duplicity inspired me by its adoption of multiple backends, the
very backends I want to use, but I still hungered for a
content-addressed snapshot store.

I'd also like to tip my hat to Box Backup. I've only used it a little,
because it requires a special server to manage the storage (and I want
to get my backups *off* of my servers), but it also inspires me with
directions I'd like to take Ugarit. It's much more aware of real-time
access to random-access storage than Duplicity, and has a very
interesting continuous background incremental backup mode, moving away
from the tape-based paradigm of backups as something you do on a
special day of the week, like some kind of religious observance. I
hope the author Ben, who is a good friend of mine, won't mind me
plundering his source code for details on how to request real-time
notification of changes from the filesystem, and how to read and write
extended attributes!

Moving on from the world of backup, I'd like to thank the Chicken Team
for producing Chicken Scheme. Felix, Peter, Elf, and Alex have
particularly inspired me with their can-do attitudes to combining
programming-language elegance and pragmatic engineering - two things
many would think un-unitable enemies. Of course, they didn't do it all
themselves - R5RS Scheme and the SRFIs provided a solid foundation to
build on, and there's a cast of many more in the Chicken community,
working on other bits of Chicken or just egging everyone on. And I
can't not thank Henry Baker for writing the seminal paper on the
technique Chicken uses to implement full tail-calling Scheme with
cheap continuations on top of C; Henry already had my admiration for
his work on combining elegance and pragmatism in linear logic. Why
doesn't he return my calls? I even sent flowers.

Thanks to the early adopters who brought me useful feedback, too!

And I'd like to thank my wife for putting up with me spending several
evenings working on this thing...

# Version history

* 0.6: .ugarit support

* 0.5: Keyed hashing so attackers can't tell what blocks you have,
  markers in logs so the index can be reconstructed, sha2 support, and
  passphrase support.

* 0.4: AES encryption

* 0.3: Added splitlog backend, and fixed a .meta file typo

* 0.2: Initial public release

* 0.1: Internal development release