Note: This web page was automatically created from a PalmOS "pedit32" memo.

EXT3 filesystem recovery in LVM2


This is the bug I filed in the Fedora Bugzilla: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=142737
It's a very good idea to do something like the following, so that you have a copy of the partition you're trying to recover in case something bad happens:
    dd if=/dev/hda2 bs=1024k conv=noerror,sync,notrunc | reblock -t 65536 30 | ssh remote.host.uci.edu 'cat > /recovery/damaged-lvm2-ext3'
e2salvage died with "Terminated". I assume it OOM'd.
e2extract gave a huge list of 0 length files. Doesn't seem right, and it was taking forever, so I decided to move on to other methods. But does anyone know if this is normal behavior for e2extract on an ext3?
I wrote a small program that searches for ext3 magic numbers. It's finding many, e.g. at hex offsets 438, 30438, 63e438 and so on. The question is, how do I convert from such an offset to an fsck -b number?
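For illustration, here is a minimal sketch of such a scanner. It is not the program mentioned above, whose source isn't shown. It assumes the ext2/ext3 magic 0xEF53 is stored little-endian at offset 0x38 within a superblock, and that superblocks start on 1 KiB boundaries, so it only checks one position per KiB. The function name and the 1 KiB step are choices made for this sketch.

```python
import io

EXT_MAGIC = b"\x53\xef"   # s_magic 0xEF53 as stored on disk (little-endian)
MAGIC_OFFSET = 0x38       # position of s_magic inside a superblock
STEP = 1024               # assumption: superblocks start on 1 KiB boundaries


def find_magic_offsets(stream):
    """Yield byte offsets where the magic sits at a superblock-shaped position.

    `stream` should be a buffered binary file object (e.g. open(path, "rb")),
    so that read() only returns short data at end of file.
    """
    offset = 0
    while True:
        chunk = stream.read(STEP)
        if len(chunk) < MAGIC_OFFSET + 2:
            break
        if chunk[MAGIC_OFFSET:MAGIC_OFFSET + 2] == EXT_MAGIC:
            yield offset + MAGIC_OFFSET
        offset += STEP


if __name__ == "__main__":
    # Tiny self-check on synthetic data: a fake superblock at byte 1024.
    image = bytearray(4096)
    image[1024 + MAGIC_OFFSET:1024 + MAGIC_OFFSET + 2] = EXT_MAGIC
    print([hex(o) for o in find_magic_offsets(io.BytesIO(bytes(image)))])
```

A hit only means the two magic bytes are in the right place; ordinary data can match by chance, so each hit still has to be tried with fsck.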
Running the same program on a known-good ext3, the first offset was the same, but others were different. However, they all ended in hex 38...
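The trailing 38 is consistent with the magic number living at offset 0x38 inside each superblock. Under that assumption, a magic offset can be turned into candidate fsck -b values by subtracting 0x38 and dividing by the filesystem block size. The sketch below is only a sketch: the function name is made up, the block size has to be guessed (1024, 2048 and 4096 are tried), and the offsets must be relative to the start of the filesystem, i.e. the logical volume, not the enclosing partition.

```python
MAGIC_OFFSET = 0x38  # position of the ext2/ext3 magic inside a superblock


def fsck_b_candidates(magic_offset, block_sizes=(1024, 2048, 4096)):
    """Map each plausible block size to the block number holding the superblock.

    Block sizes that do not divide the superblock's byte offset are skipped,
    because a backup superblock starts on a block boundary.
    """
    start = magic_offset - MAGIC_OFFSET
    if start < 0:
        return {}
    return {bs: start // bs for bs in block_sizes if start % bs == 0}


if __name__ == "__main__":
    for hit in (0x438, 0x30438, 0x63E438):
        print(hex(hit), fsck_b_candidates(hit))
```

Each candidate could then be tried read-only with something like "e2fsck -n -b BLOCK -B BLOCKSIZE device". Note that 0x438 is just the primary superblock, which sits 1024 bytes into the filesystem.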
I'm now running an "fsck -vn -b" with the -b argument ranging from 0 to 999999. I'm hoping this will locate a suitable -b for me via brute force.
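A narrower search than 0 to 999999 may be possible. The sketch below lists where backup superblocks would normally be, assuming the filesystem was made with mke2fs defaults: 8 * blocksize blocks per group and the sparse_super feature, which keeps backups only in group 1 and in groups that are powers of 3, 5 and 7. These are assumptions about the damaged filesystem, not something verified here, and the function names are made up.

```python
def is_power_of(n, base):
    """True if n is base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def backup_superblocks(block_size, n_groups):
    """Candidate -b values for a filesystem with default mke2fs geometry."""
    blocks_per_group = 8 * block_size          # assumption: mke2fs default
    first_data_block = 1 if block_size == 1024 else 0
    groups = [g for g in range(1, n_groups)
              if any(is_power_of(g, b) for b in (3, 5, 7))]
    return [g * blocks_per_group + first_data_block for g in groups]


if __name__ == "__main__":
    for bs in (1024, 2048, 4096):
        print(bs, backup_superblocks(bs, 30))
```

Trying these few values for each block size first is much quicker than the brute-force loop, which remains the fallback if the geometry was not the default.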
Sent a post to gmane.linux.kernel 2004-12-16
Robin Green <greenrd@greenrd.org> very helpfully provided the following instructions, which appear to be getting somewhere:
1) Note down what the root= device is that appears on the kernel command line (this can be found by going to boot from hard drive and then examining the kernel command line in grub, or by looking in /boot/grub/grub.conf )
2) Be booted from rescue disk
3) Sanity check: ensure that the nodes /dev/hda, /dev/hda2 etc. exist
4) Start up LVM2 (assuming it is not already started by the rescue disk!) by typing:
    lvm vgchange --ignorelockingfailure -P -a y
Looking at my initrd script, it doesn't seem necessary to run any other commands to get LVM2 volumes activated - that's it.
5) Find out which major/minor number the root device is. This is the slightly tricky bit. You may have to use trial-and-error. In my case, I guessed right first time: (no comments about my odd hardware setup please ;)
    [root@localhost t]# ls /sys/block
    dm-0  dm-2  hdd    loop1  loop3  loop5  loop7  ram0  ram10  ram12  ram14  ram2  ram4  ram6  ram8
    dm-1  hdc   loop0  loop2  loop4  loop6  md0    ram1  ram11  ram13  ram15  ram3  ram5  ram7  ram9
    [root@localhost t]# cat /sys/block/dm-0/dev
    253:0
    [root@localhost t]# devmap_name 253 0
    Volume01-LogVol02
In the first command, I listed the block devices known to the kernel. dm-* are the LVM devices (on my 2.6.9 kernel, anyway). In the second command, I found out the major:minor numbers of /dev/dm-0. In the third command, I used devmap_name to check that the device mapper name of node with major 253 and minor 0, is the same as the name of the root device from my kernel command line (cf. step 1). Apart from a slight punctuation difference, it is the same, therefore I have found the root device. I'm not sure if FC3 includes the devmap_name command. According to fr2.rpmfind.net, it doesn't. But you don't really need it, you can just try all the LVM devices in turn until you find your root device. Or, I can email you a statically-linked binary of it if you want.
6) Create the /dev node for the root filesystem if it doesn't already exist, e.g.:
    mknod /dev/dm-0 b 253 0
using the major-minor numbers found in step 5. Please note that for the purpose of _rescue_, the node doesn't actually have to be under /dev (so /dev doesn't have to be writeable) and its name does not matter. It just needs to exist somewhere on a filesystem, and you have to refer to it in the next command.
7) Do what you want to the root filesystem, e.g.:
    fsck /dev/dm-0
    mount /dev/dm-0 /where/ever
As you probably know, the fsck might actually work, because a fsck can sometimes correct filesystem errors that the kernel filesystem modules cannot.
8) If the fsck doesn't work, look in the output of fsck and in dmesg for signs of physical drive errors. If you find them, (a) think about calling a data recovery specialist, (b) do NOT use the drive!
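Step 5 above can be done for all block devices at once. The following is a rough sketch, not part of Robin's instructions: it reads the dev file under each /sys/block entry, as in the transcript, and returns the major and minor numbers. The function name is made up, and it assumes a Python interpreter is available, which may not be true on a rescue disk.

```python
import os


def block_device_numbers(sys_block="/sys/block"):
    """Return {device name: (major, minor)} read from <sys_block>/<name>/dev."""
    result = {}
    for name in sorted(os.listdir(sys_block)):
        dev_file = os.path.join(sys_block, name, "dev")
        try:
            with open(dev_file) as f:
                major, minor = f.read().strip().split(":")
            result[name] = (int(major), int(minor))
        except (OSError, ValueError):
            continue  # no dev file, or not in major:minor form
    return result


if __name__ == "__main__":
    for name, (major, minor) in block_device_numbers().items():
        if name.startswith("dm-"):
            # The numbers are the arguments step 6 passes to mknod.
            print("mknod /dev/%s b %d %d" % (name, major, minor))
```

It only prints the mknod commands rather than running them, so nothing is changed until you pick the device you believe is the root filesystem.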
On FC3's rescue disk, what I actually did was:
1) Do start up the network interfaces
2) Don't try to automatically mount the filesystems - not even readonly
3) lvm vgchange --ignorelockingfailure -P -a y
4) fdisk -l, and guess which partition is which based on size: the small one was /boot, and the large one was /
5) mkdir /mnt/boot
6) mount /dev/hda1 /mnt/boot
7) Look up the device node for the root filesystem in /mnt/boot/grub/grub.conf
8) A first tentative step, to see if things are working: fsck -n /dev/VolGroup00/LogVol00
9) Dive in: fsck -f -y /dev/VolGroup00/LogVol00
10) Wait a while... Be patient. Don't interrupt it
11) Reboot
Are these lvm1 or lvm2?
    lvmdiskscan -v
    vgchange -ay
    vgscan -P
    vgchange -ay -P
    jeeves:~# lvm version
    LVM version: 2.01.04 (2005-02-09)
    Library version: 1.01.00-ioctl (2005-01-17)
    Driver version: 4.1.0
I think you are making a potentially very dangerous mistake! Type 8e is a partition type. You don't want to use resize2fs on the PARTITION, which is not an ext2 partition, but an lvm partition. You want to resize the filesystem on the logical VOLUME. And yes, resize2fs is appropriate for logical volumes. But resize the VOLUME (e.g. /dev/VolGroup00/LogVol00), not the partition or volume group.
On Fri, Mar 04, 2005 at 06:35:31PM +0000, Robert Buick wrote:
> I'm using type 8e, does anyone happen to know if resize2fs is
> appropriate for this type; the man page only mentions type2.
A method of hunting for two text strings in a raw disk, after files have been deleted. The data blocks of the disk are read once, but grep'd twice.
    seki-root> reblock -e 75216016 $(expr 1024 \* 1024) 300 < /dev/mapper/VolGroup00-LogVol00 | mtee 'egrep --binary-files=text -i -B 1000 -A 1000 dptutil > dptutil-hits' 'egrep --binary-files=text -i -B 1000 -A 1000 dptmgr > dptmgr-hits'
    stdin seems seekable, but file length is 0 - no exact percentages
    Estimated filetransfer size is 77021200384 bytes
    Estimated percentages will only be as accurate as your size estimate
    Creating 2 pipes
    popening egrep --binary-files=text -i -B 1000 -A 1000 dptutil > dptutil-hits
    popening egrep --binary-files=text -i -B 1000 -A 1000 dptmgr > dptmgr-hits
    (estimate: 0.1% 0s 56m 11h) Kbytes: 106496.0 Mbits/s: 13.6 Gbytes/hr: 6.0 min: 1.0
    (estimate: 0.2% 9s 12m 12h) Kbytes: 214016.0 Mbits/s: 13.3 Gbytes/hr: 5.8 min: 2.0
    (estimate: 0.3% 58s 58m 11h) Kbytes: 257024.0 Mbits/s: 13.5 Gbytes/hr: 5.9 min: 2.4
    ...
references:
    https://stromberg.dnsalias.org/~strombrg/reblock.html
    https://stromberg.dnsalias.org/~strombrg/mtee.html
    egrep --help
Performing the above reblock | mtee, my Fedora Core 3 system got -very- slow. If I suspended the pipeline above, performance would be great. If I resumed it, performance would very quickly be bad again. The following command seems to have left my system a little bit jerky, but it's -far- more usable now, despite the pipeline above still pounding the SATA drive my home directory is on.
    seki-root> echo deadline > scheduler
    Wed Mar 09 17:56:58
    seki-root> cat scheduler
    noop anticipatory [deadline] cfq
    Wed Mar 09 17:57:00
    seki-root> pwd
    /sys/block/sdb/queue
    Wed Mar 09 17:58:31
BTW, I looked into tagged command queuing for this system as well, but apparently VIA SATA doesn't support TCQ on linux 2.6.x.
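The scheduler file lists every available I/O scheduler and marks the active one with square brackets, as the cat output in the transcript shows. A small sketch for checking that from a script follows. The function names are made up, and the path layout is taken from the pwd output in the transcript.

```python
import re


def active_scheduler(text):
    """Return the bracketed (active) scheduler name, or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_active_scheduler(device):
    """Read /sys/block/<device>/queue/scheduler and return the active scheduler."""
    with open("/sys/block/%s/queue/scheduler" % device) as f:
        return active_scheduler(f.read())


if __name__ == "__main__":
    print(active_scheduler("noop anticipatory [deadline] cfq\n"))
```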
Eventually the reblock | mtee egrep egrep gave:
    egrep: memory exhausted
...using GNU egrep 2.5.1.
...so now I'm trying something closer to my classical method:
    seki-root> reblock -e 75216016 $(expr 1024 \* 1024) 300 < /dev/mapper/VolGroup00-LogVol00 | mtee './bgrep dptutil | ./ranges > dptutil-ranges' './bgrep dptmgr | ./ranges > dptmgr-ranges'
    Creating 2 pipes
    popening ./bgrep dptutil | ./ranges > dptutil-ranges
    popening ./bgrep dptmgr | ./ranges > dptmgr-ranges
    stdin seems seekable, but file length is 0 - no exact percentages
    Estimated filetransfer size is 77021200384 bytes
    Estimated percentages will only be as accurate as your size estimate
    (estimate: 1.3% 16s 12m 1h) Kbytes: 1027072.0 Mbits/s: 133.6 Gbytes/hr: 58.7 min: 1.0
    (estimate: 2.5% 36s 16m 1h) Kbytes: 1913856.0 Mbits/s: 124.5 Gbytes/hr: 54.7 min: 2.0
    (estimate: 3.7% 10s 17m 1h) Kbytes: 2814976.0 Mbits/s: 122.1 Gbytes/hr: 53.6 min: 3.0
    (estimate: 4.9% 10s 17m 1h) Kbytes: 3706880.0 Mbits/s: 120.6 Gbytes/hr: 53.0 min: 4.0
    ...
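The bgrep and ranges programs used above aren't reproduced in these notes. As a rough idea of what a binary grep of this kind might do, here is a sketch that reports the byte offset of every occurrence of a string in a stream, in constant memory. Everything about it (name, chunk size, reporting offsets rather than ranges) is an assumption for illustration, not a description of the real tools.

```python
import io


def find_all(stream, needle, chunk_size=1 << 20):
    """Yield the absolute byte offset of each occurrence of needle in stream.

    Only chunk_size + len(needle) - 1 bytes are held at a time; the last
    len(needle) - 1 bytes of each buffer are carried over so that matches
    straddling a chunk boundary are still found.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    tail = b""
    base = 0  # absolute offset of the first byte of tail
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        pos = buf.find(needle)
        while pos != -1:
            yield base + pos
            pos = buf.find(needle, pos + 1)
        keep = min(overlap, len(buf))
        tail = buf[len(buf) - keep:]
        base += len(buf) - keep


if __name__ == "__main__":
    data = b"x" * 10 + b"dptutil" + b"y" * 5 + b"dptutil"
    print(list(find_all(io.BytesIO(data), b"dptutil", chunk_size=4)))
```

Unlike the egrep run, this search is case-sensitive and keeps no surrounding context; the offsets would be used afterwards to read the neighbouring blocks back from the device.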
I've added a -s option to reblock, which makes it sleep for an arbitrary number of (fractions of) seconds between blocks. Between this and the I/O scheduler change, seki has become very pleasant to work on again, despite the hunt for my missing palm memo. :)
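The idea behind such a sleep option can be sketched as a copy loop that pauses after every block. This is not reblock's actual code; the function name and parameters are made up for illustration.

```python
import io
import time


def throttled_copy(src, dst, block_size=1 << 20, pause=0.0, sleep=time.sleep):
    """Copy src to dst block by block, sleeping `pause` seconds after each block.

    Returns the number of bytes copied. `sleep` is injectable so the loop can
    be exercised without actually waiting.
    """
    total = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        dst.write(block)
        total += len(block)
        if pause > 0:
            sleep(pause)  # give other I/O on the same disk a chance to run
    return total


if __name__ == "__main__":
    out = io.BytesIO()
    print(throttled_copy(io.BytesIO(b"0123456789"), out, block_size=4, pause=0.01))
```

A fractional pause trades total run time for responsiveness: the longer the sleep, the less the disk is saturated by the scan.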
From Bryan Ragon <bragon at zapeng dot com>
Here is a detailed list of steps that worked:
;; first backed up the first 512 bytes of /dev/hdb
    # dd if=/dev/hdb of=~/hdb.first512 count=1 bs=512
    1+0 records in
    1+0 records out
;; zero them out, per Alasdair
    # dd if=/dev/zero of=/dev/hdb count=1 bs=512
    1+0 records in
    1+0 records out
;; verified
    # blockdev --rereadpt /dev/hdb
    BLKRRPART: Input/output error
;; find the volumes
    # vgscan
    Reading all physical volumes. This may take a while...
    Found volume group "media_vg" using metadata type lvm2
    # pvscan
    PV /dev/hdb VG media_vg lvm2 [111.79 GB / 0 free]
    Total: 1 [111.79 GB] / in use: 1 [111.79 GB] / in no VG: 0 [0 ]
    # lvmdiskscan
    /dev/hda1 [ 494.16 MB]
    /dev/hda2 [ 1.92 GB]
    /dev/hda3 [ 18.65 GB]
    /dev/hdb [ 111.79 GB] LVM physical volume
    /dev/hdd1 [ 71.59 GB]
    0 disks
    4 partitions
    1 LVM physical volume whole disk
    0 LVM physical volumes
    # vgchange -a y
    1 logical volume(s) in volume group "media_vg" now active
;; /media is a defined mount point in fstab, listed below for future archive searches
    # mount /media
    # ls /media
    graphics lost+found movies music
Success!! Thank you, Alasdair!!!!
/etc/fstab
    <snip>
    /dev/media_vg/media_lv /media ext3 noatime 0 0
    <snip>
home blee has:
    hdc1   ext3  /big      wdc
    sda5   xfs   /backups
    00/00  ext3  hda  ibm     fc3: too hot?
    00/01  swap  hda  ibm
    01/00  ext3  hdd  maxtor  fc4
    01/01  swap  hdd  maxtor
    hdb    that samsung dvd drive that overheats

