Note: This web page was automatically created from a PalmOS "pedit32" memo.

backup notes

Networker stuff:

These -q's, ISTR, make things faster: recover -q, add -q

/\/\/\/\/\/\/\/\/\/\

unchanged changetime gets most recent

/\/\/\/\/\/\/\/\/\/\

Verifying host backup status (for the host you run the command on), but be careful if a host has been renamed recently:

for host in $(cat /dcslib/allsys/packages/nwclient/servers); do echo $host; maxtime 10 /dcs/packages/nwclient/sbin/mminfo -s $host -c $(uname -n); done

/\/\/\/\/\/\/\/\/\/\

Using networker to extract data from a different host than the one you're running networker on:

recover -q -s -c

...of course, the host you are on needs to be granted permission to do this for the other host by a networker admin.

/\/\/\/\/\/\/\/\/\/\

Dave Mussoff on troubleshooting nsrexecd backups, 2005-09-08: ...backups tend to stress a system in many different ways, so "backup problems" are often systemic problems. Backups cause stress on things that include, but are not limited to: CPU, memory, disk drives, network bandwidth, etc. Typical diagnostic things that I suggest people do are:
- log what files are open during the backup periodically (something like "lsof -p <nsrexecd PID> > somefile" in a cronjob every 15 mins)
- load monitoring
- fsck if it hasn't happened recently
- memory module test (either by using bootdisk-type software or swapping modules and waiting to see)
- stress test the drives
- bandwidth monitoring

/\/\/\/\/\/\/\/\/\/\

Adding a host to CCS Networker backups:

/\/\/\/\/\/\/\/\/\/\

Sniffing for activity from any of the NACS networker servers:

tethereal -t ad host $(python -c 'import string; import sys; list=sys.stdin.readlines(); print string.joinfields(list," or ")' < /dcslib/allsys/packages/nwclient/servers)

(Be careful to get the right interface, if you're on a host with more than one)
Date: Wed, 25 Jan 2006 10:12:13 -0800 FYI, I've pushed out Networker 7.1.4 for Solaris and Linux in DCSLIB; the default /dcs/packages/nwclient has been updated to point to this new version. Since we only have 3 Tru64 and 3 IRIX machines being backed up, I suggest we move to a local install of Networker for these machines. The Tru64 version requires V5.0.5; HP only officially supports V5.1B as of September 30, 2005. If you can't update to the latest client, then you'll have to wrap the Networker client such that only our backup servers can access it. -Francisco
Other backup programs of interest:

Encrypts with GPG, based on python and librsync. Transmits only changed bits of a tar archive. Considered unready for production use by its web page maintainer (and more?), 2005-10-29. Requires a POSIX OS, not sure about Cygwin.

/\/\/\/\/\/\/\/\/\/\

It stores only one copy of identical files, uses the rsync protocol, and has a web interface. No client-side software needed/used. Written in perl, considered by its authors to be enterprise grade, can perform backups of Linux and windows, can use SMB, ssh or rsync for transmission. "powerful web interface".

/\/\/\/\/\/\/\/\/\/\

Hdup also looks interesting - compression, encryption, via ssh... Author lost his mailing list's subscriber list in a disk crash?? Can encrypt with GPG or mcrypt, can chunk up backups, can do remote backups via ssh, designed to run from cron, implementation language not clear yet.

/\/\/\/\/\/\/\/\/\/\

Written in bash, based on rsync, only uses standard tools, intended to be run from cron.

/\/\/\/\/\/\/\/\/\/\

Rumor has it bacula is modeled after networker: It looks like major overkill for a home backup scenario. Works with *ix and windows.

/\/\/\/\/\/\/\/\/\/\

Includes a nice discussion of why disk backups are better than tape backups - but they're wrong about tapes being impossible to RAID together - not for capacity, but you could for redundancy easily. Done via ssh. Maintains multiple complete trees that correspond to backups, using hard links. Doesn't support windows? It's a shell script around rsync.

/\/\/\/\/\/\/\/\/\/\

Based on rsync.

/\/\/\/\/\/\/\/\/\/\

rdiff-backup is in python, and shares lineage with duplicity. However, duplicity uses forward deltas, and rdiff-backup uses reverse deltas. Both use librsync. rdiff-backup, rather than using rsync, uses rdiff, which is a binary- and text-capable diff tool that uses the rsync algorithm.
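The forward- vs reverse-delta distinction above can be sketched in a few lines. This is a toy illustration only (SequenceMatcher on strings, not librsync on binary data, and the function names are made up): the newest version is kept whole, and an older revision is kept only as a recipe of "copy this slice of the newer version" and "insert these literal bytes" steps.

```python
import difflib

# Toy sketch of reverse deltas (rdiff-backup style): keep the newest version
# whole, and store each older revision as a delta against the version that
# replaced it. Restoring the latest is free; restoring an old revision means
# applying deltas backwards.
def make_delta(base, target):
    """Recipe that rebuilds `target` out of slices of `base` plus literals."""
    sm = difflib.SequenceMatcher(None, base, target)
    ops = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))         # reuse bytes from base
        else:
            ops.append(("data", target[j1:j2]))  # literal bytes
    return ops

def apply_delta(base, ops):
    out = []
    for op in ops:
        out.append(base[op[1]:op[2]] if op[0] == "copy" else op[1])
    return "".join(out)
```

A forward-delta scheme (duplicity style) is the mirror image: the oldest full copy is kept whole, and each newer revision is a delta forward from its predecessor, so restoring the latest requires replaying every delta.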
Locating the 10 directories with the largest number of files in them:

find / /boot /export/bill{,1,2,4} -xdev -type d -print | while read dir; do echo "$(ls -f $dir | wc -l) $dir"; done | sort -nr | head -10

Be sure to tailor the filesystem list at the beginning of the command for the host you're running it on. This is relevant for backups, because many backup systems will slow way down on directories that have a huge number of files - but in most cases this is more an aspect of the filesystem type the directory is in than of the backup software.
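The same idea can be done portably in Python; a minimal sketch (hypothetical helper name, and note that unlike find's -xdev, os.walk does not stop at filesystem boundaries):

```python
import os

# Count the entries in each directory under root and return the `top`
# directories with the most entries - the Python analogue of the
# find/ls/sort/head pipeline above.
def largest_dirs(root, top=10):
    counts = []
    for dirpath, dirnames, filenames in os.walk(root):
        counts.append((len(dirnames) + len(filenames), dirpath))
    counts.sort(reverse=True)
    return counts[:top]

# e.g.: for count, path in largest_dirs("/export"): print(count, path)
```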
Backups and encryption:

1) The most straightforward way would be to ufsdump to file "-" instead of a local file, remote host:file, or device file, and then pipe that through some sort of encryption program.
2) Generally speaking, low entropy files will be relatively easy for a cryptographer to decrypt, even if they're encrypted with a strong cipher, because of "known plaintext attacks".
3) You can turn a low entropy file into a high entropy file by compressing it.
4) You can turn a filesystem backup with low entropy files into a filesystem backup with high entropy by piping the entire save through a compression program like gzip, -before- piping it into your encryption program.
5) By compressing your data, you make recovery from a partially destroyed backup much, much, much harder. Possibly impossible, even.
6) If you're backing up to a tape drive remotely, and need to use a pipe, then you might want to look at my "reblock" program, which seems to handle the needed reblocking on the far side of a socket better than dd, from what I've been able to discern.
7) If you use gpg for encryption, then with a good choice of encryption algorithm, you can use public key cryptography. This should obviate the need to have a copy of your encryption key on both the host being backed up and the host you're backing up to.
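Points 2-4 can be demonstrated with nothing but the standard library; a small sketch (the helper name is made up) showing that compressing a redundant stream raises its per-byte entropy before it ever reaches the cipher:

```python
import gzip
import math
from collections import Counter

def entropy_per_byte(data):
    """Shannon entropy in bits per byte (8.0 is the maximum)."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

low = b"all zeros, over and over " * 4000   # highly redundant "plaintext"
high = gzip.compress(low)                   # same information, far fewer bytes

# The compressed stream is both smaller and much closer to random-looking,
# which gives a known-plaintext attacker much less to work with.
print(entropy_per_byte(low), entropy_per_byte(high), len(low), len(high))
```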
Counting the number of files on a tape:

mt -f /dev/rmt/0n eom
let COUNT=`mt -f /dev/rmt/0n status | awk '/file/ {print $3}'`
Christopher McCrory on oclug: single machine? farm? single OS? multiple machines? special apps? media? media format? single location? multiple locations? multiple countries? bandwidth? time restrictions? retention policy? encryption policy? access policy?
As with routers, there are many variations on this theme. There are backup systems that:
* use standard formats or proprietary formats
* are for a single host to another single host, for a bunch of hosts to a single host, or for a bunch of hosts to a series of backup servers
* emphasize data security, data integrity, data compression, or network performance
* emphasize recovering a specific file or homedir, or target restoring all files on a given host to recover from a disk failure
* don't compress at all, compress via common file reduction, common file segment reduction on block boundaries, or common file segment reduction without any particular boundaries, or use traditional compression tools like gzip/bzip2/rzip
* get good homedir recovery and disk failure recovery, but at the expense of a huge inode count
* are disk to disk, disk to tape, or disk to disk to tape
...and so on.
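One of those strategies, common file reduction, amounts to storing each distinct file content exactly once, keyed by a cryptographic hash. A minimal sketch (hypothetical pool layout and function name; real systems add indexing, locking, and chunking):

```python
import hashlib
import os
import shutil

# Common file reduction: identical file contents land in a content-addressed
# pool exactly once; a backup entry is then just a reference (e.g. a hard
# link) to the pooled copy.
def store(path, pool):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    dest = os.path.join(pool, h.hexdigest())
    if not os.path.exists(dest):    # first time we've seen this content
        shutil.copy2(path, dest)
    return dest
```

The block-boundary and boundary-free variants mentioned above apply the same idea to file segments rather than whole files, which is why they catch duplication that whole-file hashing misses.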
A design for a modern backup system:
2006-01-24 Francisco tells me that John Ward and Phil Orwig are handling NACS' networker backups
