This message lists some ways of keeping backups (or, more generally, large
data transfers) from rendering a system very difficult to use -- the problem
being that buffer cache and prefetch behavior starve every process other
than the backup/data transfer itself. Expect all of these techniques, if
used, to slow down the backup anywhere from a teensy bit to a lot, depending
on how you use them and which one(s) you choose.

1) One of the easier things you can do, at least on linux kernel 2.6.x
systems (RHEL 4, Fedora Core 3), is switch the I/O scheduler to deadline:

   seki-root> echo deadline > scheduler
   Wed Mar 09 17:56:58
   seki-root> cat scheduler
   noop anticipatory [deadline] cfq
   Wed Mar 09 17:57:00
   seki-root> pwd
   /sys/block/sdb/queue
   Wed Mar 09 17:58:31

This should greatly reduce the I/O scheduler's tendency to prefetch.
Things'll still be painful during a backup, but less so. The deadline
scheduler is really intended for database-like applications, but I don't
think it'll mind if we use it for another purpose. :)

2) If you're using a pipeline, I've added an option to "reblock" that makes
it sleep a user-defined number of seconds (or fractions of a second) between
blocks, e.g.:

   reblock -s 0.5 $(expr 1024 \* 1024) 300 < /dev/dsk/c0t0d0s2 |
      ssh strombrg@seki.nac.uci.edu 'cat > ~/disk.image'

In this example, we're sleeping for 0.5 seconds between each 1 megabyte
block, and we'll time out after 300 seconds of inactivity. (There's a sketch
of this read/sleep/write technique at the end of this message.) My
experience so far: this, in combination with #1 above, changed a system I
was creating a disk image of from next to useless to pretty comfortable. It
should work fine with tar, cpio, dump backups &c too.

3) https://stromberg.dnsalias.org/~strombrg/slowdown/

This program should be able to make a process pause for a while each time it
reads or writes some data. That's helpful with binary-only programs that
don't work well in a pipeline. (A sketch of one way of getting this effect,
via LD_PRELOAD, also appears at the end of this message.)

4) If you have source code to the application doing the data transfer, and
you're on a platform that supports O_DIRECT (recent Linux, FreeBSD; not sure
what else), O_DIRECT may make things much faster by not toasting your
machine's buffer cache. For this, you may be interested in
https://stromberg.dnsalias.org/~strombrg/libodirect/ (and see the O_DIRECT
sketch at the end of this message).

5) If you're on a linux system:

Jeff V. Merkey wrote:
> Jens Axboe wrote:
> > On Mon, Jan 16 2006, Jeff V. Merkey wrote:
> > > Max Waterman wrote:
> > > > I've noticed that I consistently get better (read) numbers from
> > > > kernel 2.6.8 than from later kernels.
> > >
> > > To open the bottlenecks, the following works well. Jens will shoot me.
> > >
> > > -#define BLKDEV_MIN_RQ 4
> > > -#define BLKDEV_MAX_RQ 128  /* Default maximum */
> > > +#define BLKDEV_MIN_RQ 4096
> > > +#define BLKDEV_MAX_RQ 8192 /* Default maximum */
> >
> > Yeah I could shoot you. However I'm more interested in why this is
> > necessary, eg I'd like to see some numbers from you comparing:
> >
> > - Doing
> >     # echo 8192 > /sys/block/<dev>/queue/nr_requests
> >   for each drive you are accessing.
> >
> > The BLKDEV_MIN_RQ increase is just silly and wastes a huge amount of
> > memory for no good reason.
>
> Yep. I build it into the kernel to save the trouble of sending it to proc.
> Jens' recommendation will work just fine. It has the same effect of
> increasing the max requests outstanding.

Your suggestion doesn't do anything here on 2.6.15, but

   echo 192 > /sys/block/<dev>/queue/max_sectors_kb
   echo 192 > /sys/block/<dev>/queue/read_ahead_kb

works wonders!
I don't know why, but anything less than 64 or more than 256 makes the queue
collapse miserably, causing some strange __copy_to_user calls?!? Also, it
seems that changing the kernel HZ has some drastic effects on the queues: a
simple lilo run gets delayed 400% and 200% at 100 HZ and 250 HZ
respectively.
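
For the curious, here's roughly what the read/sleep/write loop from #2 looks
like in C. This is a minimal sketch of the technique, not reblock's actual
source; the 1 megabyte block and 0.5 second sleep match the example above,
and the 300-second inactivity timeout is omitted for brevity:

/* throttled_copy.c: copy stdin to stdout in fixed-size blocks, sleeping
 * between blocks so other processes get some I/O room.  Sketch of what
 * "reblock -s" does; illustrative only.
 */
#include <errno.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define BLOCK_SIZE (1024 * 1024)              /* 1 megabyte, as in #2 */

int main(void)
{
    static char buf[BLOCK_SIZE];
    struct timespec pause = { 0, 500000000L };  /* 0.5 seconds */
    ssize_t n;

    while ((n = read(STDIN_FILENO, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;
            perror("read");
            return 1;
        }
        ssize_t off = 0;
        while (off < n) {                      /* write() may be partial */
            ssize_t w = write(STDOUT_FILENO, buf + off, n - off);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                perror("write");
                return 1;
            }
            off += w;
        }
        nanosleep(&pause, NULL);   /* the throttle: nap between blocks */
    }
    return 0;
}

You'd drop this into a pipeline the same way as the reblock example, e.g.
"... | throttled_copy | ssh ...".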
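I won't swear this is how slowdown (#3) does it internally, but one common
way to get that pause-on-every-I/O effect out of a binary-only program on
Linux is an LD_PRELOAD shim that wraps read() and write(). A minimal sketch
of that technique (the 0.1 second nap and the file name slow_io.c are just
illustrative choices):

/* slow_io.c: LD_PRELOAD shim that naps after every read() and write().
 * A sketch of the general technique only; slowdown itself may differ.
 * Build: gcc -shared -fPIC -o slow_io.so slow_io.c -ldl
 * Use:   LD_PRELOAD=./slow_io.so tar cf - /home | ...
 */
#define _GNU_SOURCE                /* for RTLD_NEXT */
#include <dlfcn.h>
#include <time.h>
#include <unistd.h>

static const struct timespec nap = { 0, 100000000L };  /* 0.1 seconds */

ssize_t read(int fd, void *buf, size_t count)
{
    /* Look up and call the real read(), then pause. */
    ssize_t (*real_read)(int, void *, size_t) =
        (ssize_t (*)(int, void *, size_t))dlsym(RTLD_NEXT, "read");
    ssize_t n = real_read(fd, buf, count);
    nanosleep(&nap, NULL);
    return n;
}

ssize_t write(int fd, const void *buf, size_t count)
{
    /* Same idea for write(). */
    ssize_t (*real_write)(int, const void *, size_t) =
        (ssize_t (*)(int, const void *, size_t))dlsym(RTLD_NEXT, "write");
    ssize_t n = real_write(fd, buf, count);
    nanosleep(&nap, NULL);
    return n;
}

Note this slows *all* of the program's read/write traffic, not just the
disk-heavy file descriptors; good enough for a backup process, though.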
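And here's a bare-bones sketch of using O_DIRECT itself (#4), in case you
want to see what libodirect is presumably saving you from: the main gotcha
is that O_DIRECT wants aligned buffers (and often aligned lengths/offsets
too). The 512-byte alignment below is an assumption; your kernel or
filesystem may demand more:

/* direct_read.c: read a file via O_DIRECT, bypassing the buffer cache.
 * Sketch only; alignment requirements vary by kernel and filesystem.
 * Build: gcc -o direct_read direct_read.c
 */
#define _GNU_SOURCE            /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define ALIGNMENT 512          /* assumed; often the logical block size */
#define CHUNK     (1024 * 1024)

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 2;
    }
    int fd = open(argv[1], O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    void *buf;
    if (posix_memalign(&buf, ALIGNMENT, CHUNK) != 0) {  /* aligned buffer */
        fprintf(stderr, "posix_memalign failed\n");
        return 1;
    }
    ssize_t n;
    long long total = 0;
    while ((n = read(fd, buf, CHUNK)) > 0)  /* reads skip the page cache */
        total += n;
    if (n < 0)
        perror("read");
    printf("read %lld bytes without touching the buffer cache\n", total);
    free(buf);
    close(fd);
    return 0;
}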