• To keep things simple, you might try drsbackup.
  • I usually use something like the tar-over-rsh pipeline sketched below.  You can do basically the same thing with ssh too.  It's slower, but sometimes easier, and always more secure.
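    A minimal sketch of that kind of pipeline, patterned after the same-system tar pipeline further down; desthost and the directory names are placeholders:

      # build a tar stream locally, extract it on the remote host
      (cd /src/dir && gtar cflS - .) | rsh desthost '(cd /dst/dir && gtar xfp -)'

    The ssh variant is the same command with ssh in place of rsh.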
  • The &&'s above are preferred over semicolons, because with the &&'s, if you typo on a directory, the command will fail quickly, without doing any actual copying, and won't accidentally copy from the wrong place, or worse, to the wrong place.
  • On Linux, of course, tar is really gtar, so you can drop the g; on other systems you may need an alternate path to GNU tar.
  • The impact of compression
  • If you want a running update on how things are progressing, you can add reblock to your pipeline like this:
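    A sketch of what that can look like; the 1048576 is a 1 MiB blocksize, and the second argument (seconds between reports, 300 here) is my assumption, so check reblock's usage message before relying on this.  desthost is a placeholder:

      # reblock -t reports throughput to stderr as data flows through the pipe
      (cd /src/dir && gtar cflS - .) | reblock -t 1048576 300 | rsh desthost '(cd /dst/dir && gtar xfp -)'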
  • rsync can be really helpful if most of the data in the /dest/dir is identical to the data in the /src/dir.  This opens the door to doing a full copy via tar|rsh 'tar' or something, with users still active in the filesystem, and then, with the users kicked out of the filesystem, doing an rsync to copy over just what's changed since the tar pipeline ran.
    Usage looks like the sketch below.  I should add that native rsync (i.e., hanging an rsync off of inetd/xinetd and connecting to that) turns out to be a good choice for fast data transfers on a gigabit network with jumbo frames enabled.  Despite using NFS v3 over TCP with 8k rsize/wsize, native rsync was still 344% faster transferring data from a Red Hat 9 system to an RHEL 3 system.
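    A sketch of typical rsync-over-ssh usage; the host and directories are placeholders, and the trailing slashes matter (rsync treats /src/dir/ as "the contents of /src/dir"):

      # -a preserves permissions, times, symlinks etc.; --delete removes
      # files from the destination that no longer exist in the source
      rsync -a --numeric-ids --progress --delete -e ssh /src/dir/ desthost:/dest/dir/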
  • Compression helps on slow links, but makes things worse on slow CPUs, same as for the rsh pipelines above.  The progress info from rsync is very different from that of reblock -t, but both are useful.  Also, rsync's --rsh option defaults to ssh these days.
  • Copying data over remote filesystems like NFS, GFS, AFS, InterMezzo, Lustre or other similar filesystems is best avoided if possible; they tend to be slower than rsh or ssh in most cases.  Also, if you're copying over a remote filesystem, compression (apart from that done by the filesystem itself) is unlikely to help speed up the copy.  However, some filesystems, including GFS and Lustre, do not allow copying into a disk-based filesystem directly; you must go through the remote filesystem.  Also, when copying from one remote filesystem to another, rsync may do excessive reading for files that have already been transferred.  At least in the case of NFS, writing tends to be slower than reading, so if you have a choice, choose NFS reads over NFS writes.
  • Copying from one disk to another on the same system can be achieved with a tar pipeline or rsync command:
    1. (cd /src/dir && gtar cflS - .) | (cd /dst/dir && gtar xfp -)
    2. cd /src/dir && rsync -a --numeric-ids --progress --delete . /dest/dir
  • But be careful with your pathnames!  The one sketched below copies from /spare to /export/home/spare; swap the two and you'd copy in the wrong direction.
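    A sketch of the sort of command meant here, in the same form as the tar pipeline above:

      (cd /spare && gtar cflS - .) | (cd /export/home/spare && gtar xfp -)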
  • Copying from one partition to another partition on the same disk can be accomplished like the sketch below.  You may even want to use an even larger blocksize on the reblock (the 1048576 is the blocksize).
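    A sketch, assuming reblock accepts just a blocksize here; the idea is that large blocks let the disk do long reads and long writes instead of seeking back and forth between the two partitions constantly:

      (cd /src/dir && gtar cflS - .) | reblock 1048576 | (cd /dst/dir && gtar xfp -)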

    Note: at one time I had hypothesized that:

  • Copying from a single partition on a disk to another directory in the same partition of the same disk... If you don't want to just mv :), then you have a couple of options.  You can do the same thing as above on copying to a different partition on the same disk, or you can use something like the sketch below, which creates hard links (one file, two names) for each file you are "copying".  (BTW, this one is untested, and I don't use it that often either, so use with caution.)
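    One way to do that, using cpio's pass-through mode; this is my sketch rather than a tested recipe, in keeping with the caution above:

      # -p is pass-through, -d creates directories as needed, -l makes
      # hard links instead of copying data, -m preserves mtimes, -v is verbose
      cd /src/dir && find . -depth -print | cpio -pdlmv /dest/dir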
  • OS-specific Notes
  • It's extremely important to check your work
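    One simple sanity check, assuming both trees are still mounted (the note linked here presumably goes deeper than this):

      # recursively compare the two trees; no output means they match
      diff -r /src/dir /dest/dir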
  • If you find that doing a large data transfer is making a machine uncomfortable to use, then you may want to read this

    Thanks.

    -- 
    Dan Stromberg DCS/NACS/UCI <strombrg@dcs.nac.uci.edu>
    


