Changeset 25527 in project


Timestamp:
11/20/11 18:11:42
Author:
Alaric Snell-Pym
Message:

ugarit: Tracking archive space usage stats. Also migrated to using the miscmacros inc! macro to increment all those pesky counters nicely.

Location:
release/4/ugarit/trunk
Files:
6 edited

  • release/4/ugarit/trunk/README.txt

    r25525 r25527  
    797797  hash collisions. Maybe have levels of double-check-ness.
    798798
    799 * Everywhere I use (sql ...) to create an sqlite prepared statement,
    800   don't. Create them all up-front and reuse the resulting statement
    801   objects, it'll save memory and time. (done for backend-fs/splitlog
    802   and backend/cache, file-cache still needs it).
    803 
    804799* Migrate the source repo to Fossil (when there's a
    805800  kitten-technologies.co.uk migration to Fossil), and update the egg
    806801  locations thingy.
    807802
     803* Profile the system. As of 1.0.1, having done the periodic SQLite
     804  commits improvement, Ugarit is doing around 250KiB/sec on my home
     805  fileserver, but using 87% CPU in the ugarit process and 25% in the
     806  backend-fs process, when dealing with large files (so full 1MiB
     807  blocks are being processed). This suggests that the main
     808  block-handling loop in `store-file!` is less than efficient; reading
     809  via `current-input-port` rather than using the POSIX egg `file-read`
     810  functions may be a mistake, and there is probably more copying afoot
     811  than we need.
     812
    808813## Backends
    809 
    810 * Look at http://bugs.call-cc.org/ticket/492 - can this help?
    811814
    812815* Extend the backend protocol with a special "admin" command that
     
    833836
    834837* Support for flushing the cache on a backend-cache, via an admin
    835   command.
     838  command, rather than having to delete the cache file.
    836839
    837840* Support for unlinking in backend-splitlog, by marking byte ranges as
     
    845848  interface, along with the option to compact any or all files.
    846849
    847 * Have read-only and unlinkable config flags in the backend-split
    848   metadata file, settable via admin commands.
     850* Have read-only and unlinkable and block size config flags in the
     851  backend-split metadata file, settable via admin commands.
    849852
    850853* For people doing remote backups who want to not hog resources, write
     
    910913## Core
    911914
     915* SIGINFO support. Add a SIGINFO handler that sets a flag, and make
     916  the `store-file!` and `store-directory!` main loops look for the
     917  flag and, if set, display what path we're working on, and perhaps a
     918  quick summary of the bytes/blocks stored/skipped stats.
     919
     920* Look at http://bugs.call-cc.org/ticket/492 - can we now ditch our
     921  own POSIX wrappers and use this egg?
     922
    912923* Log all WARNINGs produced during a snapshot job, and attach them to
    913924  the snapshot object as a text file.
     
    920931
    921932* Clarify what characters are legal in block keys. Ugarit will only
    922   issue hex characters for normal blocks, but may use other characters
    923   for special metadata blocks; establish a contract of what backends
    924   must support (a-z, A-Z, 0-9, hyphen?)
     933  issue [a-zA-Z0-9] for normal blocks, but may use other characters
     934  (hash?) for special metadata blocks; establish a contract of what
     935  backends must support (a-z, A-Z, 0-9, hash?)
    925936
    926937* API documentation for the modules we export
     
    977988  pointer from the snapshot, and write the snapshot handling code to
    978989  expect this. Again, check Box Backup for that.
    979 
    980 * Some kind of accounting for storage usage by snapshot. It'd be nice
    981   to track, as we write a snapshot to the archive, how many bytes we
    982   reuse and how many we back up. We can then store this in the
    983   snapshot metadata, and so report them somewhere. The blocks uploaded
    984   by a snapshot may well then be reused by other snapshots later on,
    985   so it wouldn't be a true measure of 'unique storage', nor a measure
    986   of what you'd reclaim by deleting that snapshot, but it'd be
    987   interesting anyway.
    988990
    989991* Option, when backing up, to not cross mountpoints
     
    11521154
    11531155* 1.0.2: Made the file cache also commit periodically, rather than on
    1154   every write, in order to improve performance.
     1156  every write, in order to improve performance. Counting blocks and
     1157  bytes uploaded / reused, and file cache bytes as well as hits;
     1158  reporting same in snapshot UI and logging same to snapshot metadata.
    11551159
    11561160* 1.0.1: Consistency check on read blocks by default. Removed warning
  • release/4/ugarit/trunk/backend-cache.scm

    r25521 r25527  
    22(use sql-de-lite)
    33(use matchable)
     4(use miscmacros)
    45
    56(define cache-sql-schema
     
    2728     (set! *updates-since-last-commit* 0))
    2829   (define (maybe-flush!)
    29      (set! *updates-since-last-commit*
    30            (+ *updates-since-last-commit* 1))
     30     (inc! *updates-since-last-commit*)
    3131     (when (> *updates-since-last-commit* commit-interval)
    3232           (flush!)))
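
The counter change above can be tried in isolation. The following is a minimal sketch, assuming CHICKEN 4 with the miscmacros egg installed; `commit-interval` is given a small made-up value and `flush-count` is a hypothetical helper that exists only to make the behaviour observable. It is not the backend's actual flush logic.

```scheme
;; CHICKEN 4 syntax, matching the (use ...) forms in this changeset.
(use miscmacros)

;; Hypothetical stand-alone state, for illustration only.
(define *updates-since-last-commit* 0)
(define commit-interval 3)   ; made-up value, not the backend's setting
(define flush-count 0)       ; hypothetical: counts how often we flushed

(define (flush!)
  (inc! flush-count)
  (set! *updates-since-last-commit* 0))

(define (maybe-flush!)
  ;; Same effect as (set! x (+ x 1)), as in the removed lines.
  (inc! *updates-since-last-commit*)
  (when (> *updates-since-last-commit* commit-interval)
    (flush!)))

;; Five updates: the fourth exceeds the interval and triggers one flush.
(do ((i 0 (+ i 1))) ((= i 5))
  (maybe-flush!))
```

With these assumed values, one flush happens and the counter restarts from the fifth update.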
  • release/4/ugarit/trunk/backend-fs.scm

    r25521 r25527  
    44(use matchable)
    55(use regex)
     6(use miscmacros)
    67
    78(define (backend-fs base)
     
    215216                   (set! *updates-since-last-commit* 0)))
    216217         (maybe-flush! (lambda ()
    217                          (set! *updates-since-last-commit*
    218                                (+ *updates-since-last-commit* 1))
     218                         (inc! *updates-since-last-commit*)
    219219                         (when (> *updates-since-last-commit* commit-interval)
    220220                             (flush!))))
  • release/4/ugarit/trunk/test/run.scm

    r25479 r25527  
    478478                                                     (cons snapshot acc))
    479479                                                   '()))
     480                 (pp result)
    480481                 (test-assert "History has expected form"
    481482                              (match result
    482483                                     (((('previous . sk1*)
    483484                                        ('mtime . _)
    484                                         ('contents . dir2-key*))
     485                                        ('contents . dir2-key*)
     486                                        ('stats . _))
    485487                                       (('mtime . _)
    486                                         ('contents . dir-key*)))
     488                                        ('contents . dir-key*)
     489                                        ('stats . _)))
    487490                                      (and (string=? sk1 sk1*)
    488491                                           (string=? dir2-key dir2-key*)
     
    502505                 (test-define-values "Walk the history of tag 'Test' with fold-archive-node" (tag)
    503506                                     (fold-archive-node a '(tag . "Test") (lambda (name dirent acc) (cons (cons name dirent) acc)) '()))
     507                 (pp tag)
    504508                 (test-assert "Tag history has expected form"
    505509                              (match tag
     
    509513                                        ('previous . sk1*)
    510514                                        ('mtime . _)
    511                                         ('contents . dir-key*))
     515                                        ('contents . dir-key*)
     516                                        ('stats . _))
    512517                                       (dir-key-c**
    513518                                        _
     
    515520                                        ('previous . sk1**)
    516521                                        ('mtime . _)
    517                                         ('contents . dir-key**))
     522                                        ('contents . dir-key**)
     523                                        ('stats . _))
    518524                                       (dir-key-c***
    519525                                        _
    520526                                        'snapshot
    521527                                        ('mtime . _)
    522                                         ('contents . dir-key***)))
     528                                        ('contents . dir-key***)
     529                                        ('stats . _)))
    523530                                      (and
    524531                                       (string=? sk1 sk1*)
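
The test changes above extend each snapshot pattern with a `('stats . _)` entry. As a rough illustration of how such a pattern behaves, here is a small sketch using the matchable egg under CHICKEN 4; the alists and the `snapshot-shape-ok?` helper are hypothetical and far simpler than the real test data.

```scheme
(use matchable)

;; Hypothetical snapshot alists; all values are made up.
(define new-style
  '((mtime . 1000) (contents . "dir-key") (stats . ((blocks-stored . 2)))))
(define old-style
  '((mtime . 1000) (contents . "dir-key")))

;; Hypothetical helper: #t only if the alist has exactly the three
;; expected entries, in order, with the stats entry last.
(define (snapshot-shape-ok? snapshot)
  (match snapshot
    ((('mtime . _) ('contents . _) ('stats . _)) #t)
    (_ #f)))
```

Because the pattern is positional, a snapshot written before this changeset (without `stats`) no longer matches, which is why every pattern in the test needed the extra entry.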
  • release/4/ugarit/trunk/ugarit-core.scm

    r25525 r25527  
    44         archive-hash
    55         archive-global-directory-rules
     6         archive-snapshot-blocks-stored
     7         archive-snapshot-bytes-stored
     8         archive-snapshot-blocks-skipped
     9         archive-snapshot-bytes-skipped
    610         archive-file-cache-hits
     11         archive-file-cache-bytes
    712         archive-writable?
    813         archive-unlinkable?
     
    105110;;
    106111;; THE ARCHIVE
     112;; This thing is becoming a bit of a God Object. Figure out how to
     113;; refactor it a bit, perhaps?
    107114;;
    108115
     
    118125  decrypt ; the decryptor, inverse of the above
    119126  global-directory-rules ; top-level directory rules
     127
     128  ; Snapshot counters
     129  (setter snapshot-blocks-stored)              ; Blocks written to storage
     130  (setter snapshot-bytes-stored)               ; Bytes written to storage
     131  (setter snapshot-blocks-skipped)             ; Blocks already in storage and reused (not including file cache wins)
     132  (setter snapshot-bytes-skipped)              ; Bytes already in storage and reused (not including file cache wins)
     133
     134  ; File cache
    120135  file-cache ; sqlite db storing filesystem cache (see store-file! procedure); #f if not enabled
    121136  file-cache-get-query ; sqlite stored procedure
    122137  file-cache-set-query ; sqlite stored procedure
    123   file-cache-updates-uncommitted ; count of updates since last commit
    124   file-cache-hits ; count of file cache hits
     138  (setter file-cache-updates-uncommitted) ; count of updates since last commit
     139  (setter file-cache-hits)              ; count of file cache hits
     140  (setter file-cache-bytes)                 ; count of file cache bytes saved
    125141  )
    126142
     
    133149        (exec (sql (archive-file-cache archive) "commit;"))
    134150        (exec (sql (archive-file-cache archive) "begin;"))
    135         (archive-file-cache-updates-uncommitted-set! archive 0))
     151        (set! (archive-file-cache-updates-uncommitted archive) 0))
    136152  (exec (archive-file-cache-set-query archive)
    137         file-path mtime size key))
     153        file-path mtime size key)
     154  (inc! (archive-file-cache-updates-uncommitted archive)))
    138155
    139156(define (file-cache-get archive file-path mtime size)
     
    295312       decrypt
    296313       *global-rules*
     314       ; Snapshot counters
     315       0 0 0 0
     316       ; File cache
    297317       *file-cache*
    298318       (if *file-cache* (sql *file-cache* "SELECT key FROM files WHERE path = ? AND mtime = ? AND size = ?") #f)
    299319       (if *file-cache* (sql *file-cache* "INSERT OR REPLACE INTO files (path,mtime,size,key) VALUES (?,?,?,?)") #f)
    300        0 0))))
     320       0 0 0))))
    301321
    302322                                        ; Take a block, and return a compressed and encrypted block
     
    327347      (signal (make-property-condition 'exn 'location 'check-archive-unlinkable 'message "This isn't an unlinkable archive - it's append-only"))))
    328348
     349(define (archive-log-reuse! archive data)
     350  (inc! (archive-snapshot-blocks-skipped archive))
     351  (inc! (archive-snapshot-bytes-skipped archive) (u8vector-length data)))
     352
    329353(define (archive-put! archive key data type)
    330   (if (not (archive-writable? archive))
     354  (when (not (archive-writable? archive))
    331355      (signal (make-property-condition 'exn 'location 'archive-put! 'message "This isn't a writable archive")))
    332   ((storage-put! (archive-storage archive)) key (wrap-block archive data) type))
     356  ((storage-put! (archive-storage archive)) key (wrap-block archive data) type)
     357  (inc! (archive-snapshot-blocks-stored archive))
     358  (inc! (archive-snapshot-bytes-stored archive) (u8vector-length data))
     359  (void))
    333360
    334361(define (archive-exists? archive key)
     
    426453
    427454    (if (archive-exists? archive hash)
    428         (values (reusing hash) #t)
     455        (begin
     456          (archive-log-reuse! archive data)
     457          (values (reusing hash) #t))
    429458        (begin
    430459          (archive-put! archive hash data type)
     
    477506                             (set! *key-buffer-bytes* 0)
    478507                             (set! *key-buffer-reused?* #t)
     508                             (archive-log-reuse! archive keys-serialised)
    479509                             (values (reusing hash) #t)) ; We, too, are reused
    480510                           (begin ; We are unique and new and precious!
     
    599629          (if cache-result ;; FIXME: This assumes that the cached file IS in the archive. Give a configurable option to make it check this, making the file-cache a file hash cache rather than also being an archive presence cache like backend-cache as well, for safety.
    600630              (begin
    601                 (archive-file-cache-hits-set! archive
    602                                               (+ (archive-file-cache-hits archive) 1))
     631                (inc! (archive-file-cache-hits archive))
     632                (inc! (archive-file-cache-bytes archive) size)
    603633                (values cache-result #t)) ; Found in cache! Woot!
    604634              (store-file-and-cache! mtime size))) ; not in cache
     
    666696                                 (set! *key-buffer* '())
    667697                                 (set! *key-buffer-reused?* #t)
     698                                 (archive-log-reuse! archive serialised-buffer)
    668699                                 (values (reusing hash) #t)) ; We, too, are reused
    669700                               (begin ; We are unique and new and precious!
     
    9931024
    9941025    (if (archive-exists? archive hash)
    995         (values (reusing hash) #t)
     1026        (begin
     1027          (archive-log-reuse! archive data)
     1028          (values (reusing hash) #t))
    9961029        (begin
    9971030          (for-each (lambda (key)
     
    10171050;; 'notes (user-supplied notes)
    10181051;; 'previous (hash of previous snapshot)
     1052;; 'stats (alist of stats:
     1053;;         'blocks-stored
     1054;;         'bytes-stored
     1055;;         'blocks-skipped
     1056;;         'bytes-skipped
     1057;;         'file-cache-hits
     1058;;         'file-cache-bytes
    10191059;; Returns the snapshot's key.
    10201060(define (tag-snapshot! archive tag contents-key contents-reused? snapshot-properties)
    10211061  (check-archive-writable archive)
    10221062  (archive-lock-tag! archive tag)
    1023   (let ((previous (archive-tag archive tag))
    1024         (snapshot
    1025          (append
    1026           (list
    1027            (cons 'mtime (current-seconds))
    1028            (cons 'contents contents-key))
    1029           snapshot-properties))
    1030         (keys
    1031          (list ; We do not list the previous snapshot - since we are about to overwrite the tag that points to it, which would be a decrement.
    1032           (cons contents-key contents-reused?))))
     1063  (let* ((previous (archive-tag archive tag))
     1064         (stats (list
     1065                 (cons 'blocks-stored (archive-snapshot-blocks-stored archive))
     1066                 (cons 'bytes-stored (archive-snapshot-bytes-stored archive))
     1067                 (cons 'blocks-skipped (archive-snapshot-blocks-skipped archive))
     1068                 (cons 'bytes-skipped (archive-snapshot-bytes-skipped archive))
     1069                 (cons 'file-cache-hits (archive-file-cache-hits archive))
     1070                 (cons 'file-cache-bytes (archive-file-cache-bytes archive))))
     1071         (snapshot
     1072          (append
     1073           (list
     1074            (cons 'mtime (current-seconds))
     1075            (cons 'contents contents-key)
     1076            (cons 'stats stats))
     1077           snapshot-properties))
     1078         (keys
     1079          (list ; We do not list the previous snapshot - since we are about to overwrite the tag that points to it, which would be a decrement.
     1080           (cons contents-key contents-reused?))))
    10331081    (if previous
    10341082        (begin
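
The combination of `(setter ...)` record slots and `inc!` used throughout this file can be sketched on its own. This is a minimal example assuming CHICKEN 4 and the miscmacros egg; the `counters` record and `log-store!` procedure are hypothetical stand-ins for the much larger archive record and `archive-put!`.

```scheme
(use miscmacros)

;; Hypothetical cut-down record. Declaring a slot as (setter name)
;; lets it be updated with (set! (counters-name c) ...), which is
;; what allows inc! to treat the accessor call as a place.
(define-record counters
  (setter blocks-stored)
  (setter bytes-stored))

(define c (make-counters 0 0))

;; Hypothetical stand-in for the bookkeeping added to archive-put!.
(define (log-store! counters len)
  (inc! (counters-blocks-stored counters))
  (inc! (counters-bytes-stored counters) len))

(log-store! c 1024)
(log-store! c 512)

;; Build a stats alist in the same style as tag-snapshot!.
(define stats
  (list (cons 'blocks-stored (counters-blocks-stored c))
        (cons 'bytes-stored (counters-bytes-stored c))))
```

The second argument to `inc!` is the amount to add, so byte counts accumulate by block length while block counts go up by one.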
  • release/4/ugarit/trunk/ugarit.scm

    r25479 r25527  
    274274               (cons 'notes *snapshot-notes*)))))
    275275            (printf "Successfully archived ~A to tag ~A\n" fspath tag)
     276            (printf "Snapshot hash: ~A\n" snapshot-key)
     277            (printf "Written ~A bytes to the archive in ~A blocks, and reused ~A bytes in ~A blocks (before compression)\n"
     278                    (archive-snapshot-bytes-stored archive)
     279                    (archive-snapshot-blocks-stored archive)
     280                    (archive-snapshot-bytes-skipped archive)
     281                    (archive-snapshot-blocks-skipped archive))
    276282            (if (positive? (archive-file-cache-hits archive))
    277                 (printf "File cache has saved us ~A file hashings\n"
    278                          (archive-file-cache-hits archive)))
    279             (printf "Snapshot hash: ~A\n" snapshot-key)
     283                (printf "File cache has saved us ~A file hashings / ~A bytes (before compression)\n"
     284                         (archive-file-cache-hits archive)
     285                         (archive-file-cache-bytes archive)))
    280286            (archive-close! archive))))
    281      
     287
    282288   (("explore" confpath)
    283289      (let ((archive (open-archive