Backshift


rsync --link-dest wrappers


Git wrappers


Bup


Lessfs


Opendedup


Tahoe-LAFS (v1.9)


BackupPC


Amanda


Bacula


Burp


obnam
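    Link
  • Backshift: backshift
  • rsync --link-dest wrappers: rsync; EG: Backup.rsync
  • Git wrappers: Git; EG: Cold Storage
  • Bup: Bup
  • Lessfs: Lessfs
  • Opendedup: Opendedup (SDFS)
  • Tahoe-LAFS: Tahoe-LAFS
  • BackupPC: BackupPC
  • Amanda: Amanda
  • Bacula: Bacula
  • Burp: Burp
  • obnam: obnam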
    The gist
  • Backshift: Hybrid fullsave/incremental, variable-length block, deduplication and compression system
  • rsync --link-dest wrappers: File synchronization tool that also supports a form of deduplication; gives hybrid fullsave/incrementals
  • Git wrappers: Backups using Git, a Source Code Control System
  • Bup: Variable-length block deduplication, uses some of git's internals
  • Lessfs: Deduplicating filesystem
  • Opendedup: POSIX filesystem that supports deduplication
  • Tahoe-LAFS: Deduplicating LAFS web filesystem, CLI client, FUSE filesystem
  • BackupPC: Deduplicating, multilevel backup system with scheduling layered over rsync and other tools
  • Amanda: Traditional fullsave/incremental backup system using GNU tar or dump
  • Bacula: Aimed-at-the-Enterprise traditional fullsave/incremental backup system with scheduling
  • Burp: Uses librsync, does deduplication, Windows client support, client-side encryption, multiple retention periods, automatic client upgrades
  • obnam: Python, GnuPG encrypted backups, deduplication, push or pull operation, snapshot backups (hybrid full/incremental)
    Backs up hardlinks?
  • Backshift: Backs up hardlinks very well
  • rsync --link-dest wrappers: Backs up hardlinks with the -H/--hard-links option; takes more memory
  • Git wrappers: Git's object store does not grow due to hardlinks, but neither does it preserve them
  • Bup: ?
  • Lessfs: It sounds like it does
  • Opendedup: Probably not - as of Jan 2011 didn't support hardlinks
  • Tahoe-LAFS: Backs up hardlinks fine, but on restore they are distinct files
  • BackupPC: ?
  • Amanda: Yes
  • Bacula: Yes
  • Burp: Yes
  • obnam: Yes
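    Retransmits a file on permissions/ownership change
  • Backshift: No
  • rsync --link-dest wrappers: No
  • Git wrappers: Probably not - I don't think git stores permissions or ownerships
  • Bup: ?
  • Lessfs: ?
  • Opendedup: ?
  • Tahoe-LAFS: ?
  • BackupPC: ?
  • Amanda: ?
  • Bacula: ?
  • Burp: ?
  • obnam: ?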
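    Re-saves a file on permissions/ownership change
  • Backshift: No
  • rsync --link-dest wrappers: Yes
  • Git wrappers: Probably not - I don't think git stores permissions or ownerships
  • Bup: ?
  • Lessfs: ?
  • Opendedup: ?
  • Tahoe-LAFS: ?
  • BackupPC: ?
  • Amanda: ?
  • Bacula: ?
  • Burp: ?
  • obnam: ?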
    Uses hardlinks for deduplication?
  • Backshift: Uses no hardlinks at all, so it is relatively easy to copy to a new server
  • rsync --link-dest wrappers: rsync --link-dest often requires a very large number of hardlinks, complicating matters when it comes time to move your backups to another host
  • Git wrappers: Probably not. It uses a hash, similar to backshift
  • Bup: ?
  • Lessfs: Probably not
  • Opendedup: Probably not
  • Tahoe-LAFS: No
  • BackupPC: Yes, complicating matters when it comes time to move your backups to another host
  • Amanda: No
  • Bacula: ?
  • Burp: Yes?
  • obnam: No?
    Transmitting small changes to large files (EG: Linux DVD torrent, log files)
  • Backshift: Pretty good: Only sends changed blocks
  • rsync --link-dest wrappers: Very good: Only sends changed regions of the file
  • Git wrappers: Probably good
  • Bup: Pretty good
  • Lessfs: ?
  • Opendedup: ?
  • Tahoe-LAFS: Transmits the whole file unless that whole file already exists
  • BackupPC: Very good, at least if the rsync algorithm is used
  • Amanda: ?
  • Bacula: ?
  • Burp: Uses librsync
  • obnam: Only the changed chunks are transmitted
    Storing small changes to large files (EG: Linux DVD torrent, log files)
  • Backshift: Pretty good
  • rsync --link-dest wrappers: Not so good: Stores the old and new files in their entirety. For a slow torrent, rsync will require space proportionate to the square of the file's final size
  • Git wrappers: Does well
  • Bup: Pretty good
  • Lessfs: ?
  • Opendedup: ?
  • Tahoe-LAFS: Transmits the whole file unless that whole file already exists
  • BackupPC: ?
  • Amanda: ?
  • Bacula: ?
  • Burp: Uses librsync
  • obnam: Only the changed chunks are stored
    Compression over the wire
  • Backshift: Uses xz (with a bzip2 fallback), but if compression makes the file larger, an uncompressed copy will be used instead
  • rsync --link-dest wrappers: Uses gzip/zlib, skipping file extensions known to compress poorly
  • Git wrappers: Appears to do this reasonably well
  • Bup: ?
  • Lessfs: Probably not
  • Opendedup: ?
  • Tahoe-LAFS: No
  • BackupPC: Can use ssh's compression over the wire with a small modification
  • Amanda: ?
  • Bacula: ?
  • Burp: zlib
  • obnam: ?
    Compression at rest
  • Backshift: Compresses files and most metadata using xz (with a bzip2 fallback) - IOW slow but hard
  • rsync --link-dest wrappers: rsync does not compress files at rest
  • Git wrappers: Compresses using the older zlib algorithm
  • Bup: ?
  • Lessfs: Yes - it compresses fast, but not hard
  • Opendedup: ?
  • Tahoe-LAFS: No
  • BackupPC: Has optional gzip/zlib compression
  • Amanda: ?
  • Bacula: ?
  • Burp: zlib
  • obnam: gzip by default
    Browseability
  • Backshift: Pretty good; it allows browsing in a tar-like manner, though you can jump to a specific directory without waiting for earlier files to be read/uncompressed
  • rsync --link-dest wrappers: Very browseable; it uses a bunch of hardlink trees, so you just cd and cp
  • Git wrappers: Many git browsers exist
  • Bup: ?
  • Lessfs: Very good - it's a filesystem
  • Opendedup: Very browseable, it's a filesystem - but depends on what you're doing backups with on top of SDFS
  • Tahoe-LAFS: Very browseable; it's a filesystem
  • BackupPC: Via a web GUI. Other ways too?
  • Amanda: amrecover - a somewhat involved client for browsing tape indexes
  • Bacula: bconsole - a somewhat involved client for working with bacula
  • Burp: Not yet?
  • obnam: ?
    Data format
  • Backshift: Custom, but accessible via an open Python API
  • rsync --link-dest wrappers: Just another filesystem
  • Git wrappers: Packed blobs - mostly git specific but open
  • Bup: Packed blobs - shared with git
  • Lessfs: Unique to lessfs on disk (?)
  • Opendedup: Just presents a POSIX filesystem
  • Tahoe-LAFS: Custom, but accessible via an open Python API
  • BackupPC: Custom, but presumably open
  • Amanda: Recoverable without Amanda
  • Bacula: Custom, but accessible via an open API
  • Burp: Filesystem?
  • obnam: Repository
    Incremental behavior
  • Backshift: Always backs up relative to the immediately prior backup, the largest backup, and the largest completed backup - simultaneously (if they exist). Deleted files are always detected
  • rsync --link-dest wrappers: A good rsync wrapper will back up relative to the most recent incomplete backup (if any), as well as the most recent completed backup, though not all will. Deleted files are always detected
  • Git wrappers: Pretty nice
  • Bup: ?
  • Lessfs: No incrementals. It's not a backup product
  • Opendedup: No incrementals - it's not a backup product
  • Tahoe-LAFS: Deduplicates entire files; keeps a client-side cache of timestamps of backed-up files so that it doesn't attempt to back up files that haven't changed since the last backup
  • BackupPC: Traditional, multilevel fullsave/incrementals. BackupPC's rsync incrementals detect deleted files; its SMB and tar incrementals do not
  • Amanda: Traditional fullsave/incremental - terminology just like *ix "dump". Deleted files are always detected
  • Bacula: Full, Differential, Incremental, Consolidation. Deleted files are always detected
  • Burp: ?
  • obnam: All backups are full with deduplication
    Deduplication
  • Backshift: Deduplicates down to content-based, variable-length blocks of 2 megabytes on average for large files, across the same or different machines
  • rsync --link-dest wrappers: Stores a single copy for a given filename if its content doesn't change
  • Git wrappers: Pretty nice
  • Bup: Deduplicates down to content-based, variable-length blocks of $SOMETHING_OR_OTHER megabytes on average for large files, across the same or different machines
  • Lessfs: Yes - but fixed-width block aligned?
  • Opendedup: Yes, but fixed width blocks?
  • Tahoe-LAFS: Yes, but whole files
  • BackupPC: Stores a single copy of a given file content across the same or different systems
  • Amanda: Probably not
  • Bacula: Kind of, using "base jobs"
  • Burp: On server, with bedup command
  • obnam: Yes
    Implementation language
  • Backshift: Python: easy to read, easy to write. Optional Cython. Pretty fast with Pypy. Runs on many different Python interpreters
  • rsync --link-dest wrappers: C: relatively inflexible internally, very fast
  • Git wrappers: C: relatively inflexible internally, very fast
  • Bup: Mostly Python, C for performance-critical portions
  • Lessfs: C: relatively inflexible internally, very fast
  • Opendedup: Java: portable, relatively flexible, reasonably fast. C: relatively inflexible internally, very fast
  • Tahoe-LAFS: Python, C, C++
  • BackupPC: Perl: write-mostly. C: relatively inflexible internally, very fast
  • Amanda: Perl: write-mostly. C: relatively inflexible internally, very fast
  • Bacula: C++, but mostly using C features: relatively inflexible internally, very fast
  • Burp: C: relatively inflexible internally, very fast
  • obnam: Python
    Client-side software?
  • Backup: Can back up from a remote filesystem (sshfs, NFS, CIFS, etc.), but will be faster (especially on the initial fullsave) if client-side software is used (especially if there are many duplicate blocks)
  • Restore: No client is necessary other than tar, but a local client install can be used to avoid retransmitting duplicate blocks. Can back up from a remote system using a remote filesystem, but remote filesystems (especially on the initial fullsave) will be slower without a client
  • Can back up from a remote system using native rsync, or network filesystem. Network filesystems imply --whole-file. Yes ? Perhaps NFS or similar ? Yes (?) One needs something, but it could be as simple as cp -r ? Unix and Windows (with VSS), supports encryption/SSL, automatic upgrade UNIX only?
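    CLI?
  • Backshift: Yes
  • rsync --link-dest wrappers: Yes
  • Git wrappers: Yes
  • Bup: ?
  • Lessfs: Yes
  • Opendedup: Yes
  • Tahoe-LAFS: Yes
  • BackupPC: Yes
  • Amanda: Yes
  • Bacula: Yes
  • Burp: Yes
  • obnam: Yes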
    GUI?
  • Backshift: No, backshift is an engine only
  • rsync --link-dest wrappers: No, rsync is an engine only
  • Git wrappers: Probably depends on what wrapper you're using
  • Bup: ?
  • Lessfs: Probably not - it's a filesystem
  • Opendedup: Probably not? - it's a filesystem
  • Tahoe-LAFS: Yes (Web)
  • BackupPC: Yes: Web
  • Amanda: Yes (but only in the Enterprise Edition, not the Opensource Edition?)
  • Bacula: Yes: Qt
  • Burp: ncurses server monitor
  • obnam: No
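    DHCP supported (interrupted backup resumption)?
  • Backshift: Yes, very well
  • rsync --link-dest wrappers: Yes, pretty well
  • Git wrappers: Probably
  • Bup: ?
  • Lessfs: ?
  • Opendedup: ?
  • Tahoe-LAFS: ?
  • BackupPC: Yes, pretty well
  • Amanda: ?
  • Bacula: ?
  • Burp: Yes, resumes
  • obnam: ?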
    Scheduling?
  • Backshift: Only via cron or launchd or task scheduler
  • rsync --link-dest wrappers: Only via cron or launchd or task scheduler
  • Git wrappers: Probably depends on what wrapper you're using
  • Bup: ?
  • Lessfs: No (not really a backup product)
  • Opendedup: No (not really a backup product)
  • Tahoe-LAFS: Only via cron or launchd or task scheduler
  • BackupPC: Yes (builtin or cron)
  • Amanda: Yes (sounds builtin)
  • Bacula: Yes (builtin)
  • Burp: Yes, with cron
  • obnam: ?
    Failed backup notices?
  • Backshift: Only via wrapper scripts
  • rsync --link-dest wrappers: Only via wrapper scripts
  • Git wrappers: Probably depends on what wrapper you're using
  • Bup: ?
  • Lessfs: No (not really a backup product)
  • Opendedup: No (not really a backup product)
  • Tahoe-LAFS: Only via wrapper scripts
  • BackupPC: Yes
  • Amanda: ?
  • Bacula: ?
  • Burp: Yes, via email
  • obnam: ?
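    Documented?
  • Backshift: Yes, well
  • rsync --link-dest wrappers: Yes, well
  • Git wrappers: Yes, probably quite well
  • Bup: Yes
  • Lessfs: I didn't find much
  • Opendedup: Yes (well?)
  • Tahoe-LAFS: Yes
  • BackupPC: ?
  • Amanda: Yes, well
  • Bacula: ?
  • Burp: Good man page
  • obnam: Good man page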
    Transmits data encrypted?
  • Backshift: Not on its own, but it will if you use sshfs, Secure NFS or similar
  • rsync --link-dest wrappers: ssh: yes. rsyncd: no? rsh: obsolete, don't use
  • Git wrappers: Maybe
  • Bup: ?
  • Lessfs: Not a network transport
  • Opendedup: ?
  • Tahoe-LAFS: Yes
  • BackupPC: ?
  • Amanda: Yes
  • Bacula: ?
  • Burp: Yes, SSL
  • obnam: ?
    Stores data encrypted?
  • Backshift: Only if saving to an encrypted filesystem
  • rsync --link-dest wrappers: Only if saving to an encrypted filesystem
  • Git wrappers: With some configuration it becomes possible
  • Bup: ?
  • Lessfs: Yes
  • Opendedup: ?
  • Tahoe-LAFS: Yes
  • BackupPC: ?
  • Amanda: Yes
  • Bacula: ?
  • Burp: Implemented at client, disables delta differencing
  • obnam: Yes
    Means of restores?
  • Backshift: Writes GNU tar to stdout, creating a tarball on the fly from file chunks and metadata
  • rsync --link-dest wrappers: Filesystem to filesystem transfer
  • Git wrappers: git pull or similar
  • Bup: ?
  • Lessfs: cp, tar, rsync... It's a filesystem
  • Opendedup: cp, tar, rsync... It's a filesystem
  • Tahoe-LAFS: Recursively download the files (therefore hardlinks aren't saved and restored, nor is metadata such as file ownership, permission bits, timestamps)
  • BackupPC: Web GUI: Download .tar or .zip. Other?
  • Amanda: amrestore
  • Bacula: Console (bconsole?) restore command
  • Burp: Scripts, more later?
  • obnam: obnam saves to filesystem location
    Permissions / ownership
  • Backshift: Backs up POSIX permissions bits, owners and groups, without requiring root on the backup server
  • rsync --link-dest wrappers: Modern versions back up permission bits, owners and groups. No longer requires root on server with --fake-super
  • Git wrappers: Commonly not backed up
  • Bup: Yes
  • Lessfs: ?
  • Opendedup: ?
  • Tahoe-LAFS: Not stored
  • BackupPC: ?
  • Amanda: ?
  • Bacula: ?
  • Burp: Linux acl and xattr support
  • obnam: Yes, mentioned in man page
    Media (disk to disk, disk to tape)
  • Backshift: Disk to disk only
  • rsync --link-dest wrappers: Disk to disk only
  • Git wrappers: Disk to disk
  • Bup: Disk to disk
  • Lessfs: Disk to disk - it's a filesystem
  • Opendedup: Disk to disk only
  • Tahoe-LAFS: Disk to cloud :-)
  • BackupPC: Disk to disk only
  • Amanda: Disk to tape, Disk to disk too
  • Bacula: Disk to tape normally, also supports Disk to disk but treats the disk kind of like multiple tapes
  • Burp: Disk
  • obnam: Disk
    Supported platforms
  • Backshift: All manner of *ix, including Mac OS/X, plus Cygwin (Windows)
  • rsync --link-dest wrappers: Nearly anything with a C compiler, including Cygwin
  • Git wrappers: Windows, Mac, Linux, likely other *ix
  • Bup: Linux, OS/X, Solaris, Windows with Cygwin
  • Lessfs: FUSE
  • Opendedup: Linux, Windows
  • Tahoe-LAFS: All manner of *ix, including Mac OS/X, plus Cygwin (Windows)
  • BackupPC: *ix: Perl, C compiler. Windows over CIFS
  • Amanda: An extensive list
  • Bacula: A variety of *ix's and Windows
  • Burp: UNIX server, UNIX or Windows client
  • obnam: UNIX only?
    Can expire old files?
  • Backshift: Yes, but only on a repo granularity, not a host granularity
  • rsync --link-dest wrappers: Yes, easily
  • Git wrappers: Usually not
  • Bup: I don't believe so
  • Lessfs: Depends on what you layer over it
  • Opendedup: Depends on what you layer over it
  • Tahoe-LAFS: Yes, on a per-file granularity
  • BackupPC: ?
  • Amanda: ?
  • Bacula: ?
  • Burp: Supports multiple retention periods
  • obnam: Supports multiple retention policies
    License(s)
  • Backshift: Mostly GPLv3, but also UCI, Apache and MIT
  • rsync --link-dest wrappers: GPLv3
  • Git wrappers: Git is GPLv2, wrappers may have their own licenses
  • Bup: LGPLv2, 2 clause BSD
  • Lessfs: GPLv3
  • Opendedup: GPLv2
  • Tahoe-LAFS: GPLv2, GPLv3, Transitive Grace Period Public Licence
  • BackupPC: GPLv2 or later, at your option
  • Amanda: Amanda license, GPLv2, LGPLv2.1
  • Bacula: GPLv2, LGPLv?, Public domain, registered trademark
  • Burp: AGPLv3
  • obnam: GNU GPLv3+
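
The Deduplication row mentions "content-based, variable-length blocks" for backshift and Bup. The comparison does not say how either tool picks block boundaries, so the following is only a minimal sketch of the general idea, assuming a gear-style rolling hash (my choice for illustration, not necessarily what any tool listed here uses). The names GEAR, chunk and store, and the sizes (about 4 KB average instead of 2 MB), are hypothetical and scaled down for readability.

```python
import hashlib
import random

# Hypothetical table of 256 pseudo-random 64-bit values, one per byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunk(data, avg_bits=12, min_size=256, max_size=65536):
    """Yield variable-length chunks whose boundaries depend on content.

    A boundary is declared where the top avg_bits bits of the rolling
    hash are all zero, which happens about once every 2**avg_bits bytes.
    """
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Older bytes are shifted out after 64 steps, so the hash only
        # reflects the most recent 64 bytes of input.
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        if length < min_size:
            continue
        if (h >> (64 - avg_bits)) == 0 or length >= max_size:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]


def store(data, repo):
    """Save each chunk once, keyed by its SHA-256; return the list of keys."""
    recipe = []
    for piece in chunk(data):
        key = hashlib.sha256(piece).hexdigest()
        repo.setdefault(key, piece)  # an already-known chunk is not stored again
        recipe.append(key)
    return recipe
```

Calling store() on two versions of a large file with the same dict as repo gives two key lists. Because boundaries follow content rather than fixed offsets, a small insertion near the start of the file usually changes only the chunks around the edit, and the remaining chunks are found in repo already. Fixed-width blocks, as guessed at for Lessfs and Opendedup above, would instead shift every later block.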