• At http://burp.grke.org/burp2/08results1.html you can find a performance comparison between some backup applications.
  • The comparison did not compare backshift, because backshift was believed to have prohibitively slow deduplication.
  • Backshift is truly not a speed-demon. It is designed to:
    1. minimize storage requirements
    2. minimize bandwidth requirements
    3. emphasize parallel (concurrent backups of different computers) performance to some extent
    4. allow expiration of old data that is no longer needed
  • If you have a need for speed, you might check into an rsync wrapper like Backup.rsync. It requires considerably more storage, but it is very fast.
  • Also, it was almost certainly not backshift's deduplication that was slow, it was:
    1. backshift's variable-length, content-based blocking algorithm. This makes python inspect every byte of the backup, one byte at a time.
    2. backshift's use of xz compression. xz packs files very hard, reducing storage and bandwidth requirements, but it is known to be slower than something like gzip that doesn't compress as well.
  • Also, while the initial fullsave is slow, subsequent backups are much faster because they do not reblock or recompress any files that still have the same mtime and size as found in 1 of (up to) 3 previous backups.
  • Also, if you run backshift on Pypy, its variable-length, content-based blocking algorithm is many times faster than if you run it on CPython. Pypy is not only faster than CPython, it's also much faster than CPython augmented with Cython.
  • I received an e-mail from G.P.E Keeling in Feb 2016. He intends to spend a little more time on a similar comparison in the future.