Note: This web page was automatically created from a PalmOS "pedit32" memo.
EXT3 filesystem recovery in LVM2


This is the bug I filed in the Fedora Bugzilla:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=142737

It's a very good idea to do something like the following first, so that
you have a copy of the partition you're trying to recover, in case
something bad happens:
dd if=/dev/hda2 bs=1024k conv=noerror,sync,notrunc | reblock -t 65536 30 |
ssh remote.host.uci.edu 'cat > /recovery/damaged-lvm2-ext3'
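
The idea behind conv=noerror,sync is to keep going past unreadable blocks
and pad them with zeros, so that everything after a bad spot stays at the
right offset in the copy.  A minimal Python sketch of that idea follows.
The function name and the explicit size argument are made up for
illustration; this is not how dd or reblock are implemented.

```python
def image_device(src, dst, size, block_size=1024 * 1024):
    """Copy `size` bytes from src to dst; unreadable blocks become zeros.

    Returns the byte offsets of the blocks that could not be read.
    """
    bad_offsets = []
    offset = 0
    while offset < size:
        want = min(block_size, size - offset)
        try:
            src.seek(offset)
            block = src.read(want)
        except OSError:
            # Note the failure and carry on, rather than aborting the copy.
            bad_offsets.append(offset)
            block = b""
        # Pad failed or short reads so later data keeps its offset.
        dst.write(block.ljust(want, b"\0"))
        offset += want
    return bad_offsets
```

In this sketch a smaller block_size loses less data around a bad spot, at
the cost of more read calls.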

e2salvage died with "Terminated".  I assume it OOM'd.

e2extract gave a huge list of 0 length files.  Doesn't seem right,
and it was taking forever, so I decided to move on to other methods.
But does anyone know if this is normal behavior for e2extract on an ext3?

I wrote a small program that searches for ext3 magic numbers.  It's
finding many, e.g. 438, 30438, 63e438 and so on (hex).  The question is,
how do I convert from that to an fsck -b number?

Running the same program on a known-good ext3, the first offset was the
same, but others were different.  However, they all ended in hex 38...
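
The trailing hex 38 fits the ext2/ext3 superblock layout: the 16-bit magic
(0xEF53, little-endian) sits 0x38 bytes into the superblock, and the primary
superblock starts 1024 bytes into the filesystem, hence 0x438.  Subtracting
0x38 from a hit gives a candidate superblock offset, and since fsck -b takes
a block number, dividing by the block size gives a candidate -b value.  The
sketch below makes that conversion.  It assumes the offsets are relative to
the start of the device that fsck will be run on (for a logical volume, not
the partition holding it), and the function name is made up.  A two-byte
magic turns up by chance, so many hits will be false positives.

```python
import struct

EXT_MAGIC = 0xEF53
MAGIC_OFFSET = 0x38      # s_magic: 56 bytes into the superblock
LOG_BLOCK_SIZE_OFFSET = 24  # s_log_block_size: block size is 1024 << value

def candidate_from_magic(data, magic_offset):
    """Return (superblock_offset, block_size, fsck_b), or None if implausible."""
    sb = magic_offset - MAGIC_OFFSET
    if sb < 0 or sb + MAGIC_OFFSET + 2 > len(data):
        return None
    (magic,) = struct.unpack_from("<H", data, sb + MAGIC_OFFSET)
    (log_block_size,) = struct.unpack_from("<I", data, sb + LOG_BLOCK_SIZE_OFFSET)
    # Reject hits whose block-size field is nonsense (a cheap sanity check).
    if magic != EXT_MAGIC or log_block_size > 6:
        return None
    block_size = 1024 << log_block_size
    return sb, block_size, sb // block_size
```

For a hit at 0x438 on a filesystem with 4096-byte blocks this gives block 0,
and with 1024-byte blocks it gives block 1.  e2fsck also accepts -B to state
the block size that goes with -b.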

I'm now running an "fsck -vn -b" with the -b argument ranging from 0 to
999999.  I'm hoping this will locate a suitable -b for me via brute force.
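
A narrower search than 0 to 999999 is to try only the blocks where backup
superblocks normally live.  The sketch below assumes a filesystem made with
default mke2fs settings: 8 * blocksize blocks per group, and the sparse_super
feature, which keeps backups only in groups 1 and powers of 3, 5 and 7.  If
the filesystem was created with other options, the list will be wrong.  The
function name is made up.

```python
def backup_superblocks(block_size, total_blocks):
    """List likely backup superblock block numbers for fsck -b."""
    blocks_per_group = 8 * block_size          # assumed mke2fs default
    first_data_block = 1 if block_size == 1024 else 0
    usable = total_blocks - first_data_block
    ngroups = (usable + blocks_per_group - 1) // blocks_per_group
    groups = {1}
    for base in (3, 5, 7):
        g = base
        while g < ngroups:
            groups.add(g)
            g *= base
    return [first_data_block + g * blocks_per_group
            for g in sorted(groups) if g < ngroups]
```

For 4096-byte blocks the list starts 32768, 98304, 163840; for 1024-byte
blocks it starts 8193, 24577, 40961.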

Sent a post to gmane.linux.kernel 2004-12-16 

Robin Green <greenrd@greenrd.org> very helpfully provided the
following instructions, which appear to be getting somewhere:

1) Note down what the root= device is that appears on the kernel command
line (this can be found by going to boot from hard drive and then examining
the kernel command line in grub, or by looking in /boot/grub/grub.conf )

2) Be booted from rescue disk

3) Sanity check: ensure that the nodes /dev/hda, /dev/hda2 etc. exist

4) Start up LVM2 (assuming it is not already started by the rescue disk!) by
typing:

  lvm vgchange --ignorelockingfailure -P -a y

Looking at my initrd script, it doesn't seem necessary to run any other
commands to get LVM2 volumes activated - that's it.

5) Find out which major/minor number the root device is. This is the
slightly tricky bit. You may have to use trial-and-error. In my case, I
guessed right first time:
(no comments about my odd hardware setup please ;)

[root@localhost t]# ls /sys/block
dm-0  dm-2  hdd    loop1  loop3  loop5  loop7  ram0  ram10  ram12  ram14  ram2  ram4  ram6  ram8
dm-1  hdc   loop0  loop2  loop4  loop6  md0    ram1  ram11  ram13  ram15  ram3  ram5  ram7  ram9
[root@localhost t]# cat /sys/block/dm-0/dev
253:0
[root@localhost t]# devmap_name 253 0
Volume01-LogVol02

In the first command, I listed the block devices known to the kernel. dm-*
are the LVM devices (on my 2.6.9 kernel, anyway). In the second command, I
found out the major:minor numbers of /dev/dm-0. In the third command, I used
devmap_name to check that the device mapper name of node with major 253 and
minor 0, is the same as the name of the root device from my kernel command
line (cf. step 1). Apart from a slight punctuation difference, it is the
same, therefore I have found the root device.
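
The first two commands above can be combined into a small script that lists
every dm-* device with its major:minor pair.  This is only a sketch: the
function name is made up, and it assumes each /sys/block/dm-N/dev file holds
a single "major:minor" line as in the transcript.

```python
import os

def list_dm_devices(sys_block="/sys/block"):
    """Map each dm-* name under sys_block to its (major, minor) pair."""
    devices = {}
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("dm-"):
            continue
        with open(os.path.join(sys_block, name, "dev")) as f:
            major, minor = f.read().strip().split(":")
        devices[name] = (int(major), int(minor))
    return devices
```

Each pair can then be fed to devmap_name, or tried in turn as described
below.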

I'm not sure if FC3 includes the devmap_name command. According to
fr2.rpmfind.net, it doesn't. But you don't really need it, you can just try
all the LVM devices in turn until you find your root device. Or, I can email
you a statically-linked binary of it if you want.

6) Create the /dev node for the root filesystem if it doesn't already
exist, e.g.:

  mknod /dev/dm-0 b 253 0

using the major-minor numbers found in step 5.

Please note that for the purpose of _rescue_, the node doesn't actually have
to be under /dev (so /dev doesn't have to be writeable) and its name does not
matter. It just needs to exist somewhere on a filesystem, and you have to
refer to it in the next command.

7) Do what you want to the root filesystem, e.g.:

  fsck /dev/dm-0
  mount /dev/dm-0 /where/ever

As you probably know, the fsck might actually work, because a fsck can
sometimes correct filesystem errors that the kernel filesystem modules cannot.

8) If the fsck doesn't work, look in the output of fsck and in dmesg for signs
of physical drive errors. If you find them, (a) think about calling a data
recovery specialist, (b) do NOT use the drive!

On FC3's rescue disk, what I actually did was:

1) Do start up the network interfaces
2) Don't try to automatically mount the filesystems - not even read-only
3) lvm vgchange --ignorelockingfailure -P -a y
4) fdisk -l, and guess which partition is which based on size: the small
one was /boot, and the large one was /
5) mkdir /mnt/boot
6) mount /dev/hda1 /mnt/boot
7) Look up the device node for the root filesystem in /mnt/boot/grub/grub.conf
8) A first tentative step, to see if things are working: fsck -n /dev/VolGroup00/LogVol00
9) Dive in: fsck -f -y /dev/VolGroup00/LogVol00
10) Wait a while...  Be patient.  Don't interrupt it.
11) Reboot
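
Step 7 above (looking up the root device in grub.conf) can be done with a
few lines of Python when the file is long.  This is a sketch that assumes
GRUB legacy syntax, where each boot entry has a "kernel" line carrying a
root= argument.  The function name is made up.

```python
def root_devices(grub_conf_text):
    """Return the root= values found on kernel lines, in file order."""
    roots = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        # Commented-out entries start with '#', so they are skipped here.
        if not words or words[0] != "kernel":
            continue
        for word in words[1:]:
            if word.startswith("root="):
                roots.append(word[len("root="):])
    return roots
```

Usage would be something like
root_devices(open("/mnt/boot/grub/grub.conf").read()).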

Are these lvm1 or lvm2?

lvmdiskscan -v
vgchange -ay
vgscan -P
vgchange -ay -P

jeeves:~# lvm version
  LVM version:     2.01.04 (2005-02-09)
  Library version: 1.01.00-ioctl (2005-01-17)
  Driver version:  4.1.0

I think you are making a potentially very dangerous mistake!

Type 8e is a partition type. You don't want to use resize2fs on the PARTITION,
which is not an ext2 partition, but an lvm partition. You want
to resize the filesystem on the logical VOLUME.

And yes, resize2fs is appropriate for logical volumes. But resize the VOLUME
(e.g. /dev/VolGroup00/LogVol00), not the partition or volume group.

On Fri, Mar 04, 2005 at 06:35:31PM +0000, Robert Buick wrote:
> I'm using type 8e, does anyone happen to know if resize2fs is
> appropriate for this type; the man page only mentions type2.

A method of hunting for two text strings in a raw disk, after files
have been deleted.  The data blocks of the disk are read once, but
grep'd twice.

seki-root> reblock -e 75216016 $(expr 1024 \* 1024) 300 < /dev/mapper/VolGroup00-LogVol00 | \
    mtee 'egrep --binary-files=text -i -B 1000 -A 1000 dptutil > dptutil-hits' \
         'egrep --binary-files=text -i -B 1000 -A 1000 dptmgr > dptmgr-hits'
stdin seems seekable, but file length is 0 - no exact percentages
Estimated filetransfer size is 77021200384 bytes
Estimated percentages will only be as accurate as your size estimate
Creating 2 pipes
popening egrep --binary-files=text -i -B 1000 -A 1000 dptutil > dptutil-hits
popening egrep --binary-files=text -i -B 1000 -A 1000 dptmgr > dptmgr-hits
(estimate: 0.1%  0s 56m 11h) Kbytes: 106496.0  Mbits/s: 13.6  Gbytes/hr: 6.0  min: 1.0
(estimate: 0.2%  9s 12m 12h) Kbytes: 214016.0  Mbits/s: 13.3  Gbytes/hr: 5.8  min: 2.0
(estimate: 0.3%  58s 58m 11h) Kbytes: 257024.0  Mbits/s: 13.5  Gbytes/hr: 5.9  min: 2.4
...
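
The "read once, search twice" idea can also be sketched in a single Python
process, without mtee or egrep.  This is not how reblock or mtee work
internally; it is just an illustration with a made-up function name.  It
reads fixed-size blocks, lowercases them for a case-insensitive match, and
carries a short tail from one block to the next so that a string straddling
a block boundary is still found.

```python
def find_patterns(stream, patterns, block_size=1024 * 1024):
    """Yield (byte_offset, pattern) for each case-insensitive hit."""
    needles = [p.lower() for p in patterns]
    overlap = max(len(n) for n in needles) - 1
    tail = b""
    tail_offset = 0  # absolute offset of the first byte of `tail`
    while True:
        block = stream.read(block_size)
        if not block:
            return
        window = tail + block.lower()
        for needle in needles:
            i = window.find(needle)
            while i >= 0:
                # Hits lying wholly inside the tail were reported last time.
                if i + len(needle) > len(tail):
                    yield tail_offset + i, needle
                i = window.find(needle, i + 1)
        keep = min(overlap, len(window))
        tail = window[len(window) - keep:]
        tail_offset += len(window) - keep
```

Hits come out grouped per block rather than globally sorted, and memory use
stays at roughly one block however large the device is.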

references:
https://stromberg.dnsalias.org/~strombrg/reblock.html
https://stromberg.dnsalias.org/~strombrg/mtee.html
egrep --help

While running the reblock | mtee pipeline above, my Fedora Core 3 system
got -very- slow.  If I suspended the pipeline, performance would be
great.  If I resumed it, performance would very quickly be bad again.
Switching the I/O scheduler to deadline, as shown below, has left my
system a little bit jerky, but it's -far- more usable now, despite the
pipeline still pounding the SATA drive my home directory is on.

seki-root> echo deadline > scheduler 
Wed Mar 09 17:56:58

seki-root> cat scheduler 
noop anticipatory [deadline] cfq 
Wed Mar 09 17:57:00

seki-root> pwd
/sys/block/sdb/queue
Wed Mar 09 17:58:31
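
The scheduler file lists every available I/O scheduler on one line, with
the active one in square brackets.  A small sketch for picking that apart,
for instance to record the old setting before changing it (the function
name is made up):

```python
def parse_schedulers(text):
    """Return (active, available) from the contents of a scheduler file."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available
```

Usage would be something like
parse_schedulers(open("/sys/block/sdb/queue/scheduler").read()).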

BTW, I looked into tagged command queuing for this system as well,
but apparently VIA SATA doesn't support TCQ on Linux 2.6.x.

Eventually the reblock | mtee egrep egrep gave:
egrep: memory exhausted
...using GNU egrep 2.5.1.
...so now I'm trying something closer to my classical method:
seki-root> reblock -e 75216016 $(expr 1024 \* 1024) 300 < /dev/mapper/VolGroup00-LogVol00 | \
    mtee './bgrep dptutil | ./ranges > dptutil-ranges' \
         './bgrep dptmgr | ./ranges > dptmgr-ranges'
Creating 2 pipes
popening ./bgrep dptutil | ./ranges > dptutil-ranges
popening ./bgrep dptmgr | ./ranges > dptmgr-ranges
stdin seems seekable, but file length is 0 - no exact percentages
Estimated filetransfer size is 77021200384 bytes
Estimated percentages will only be as accurate as your size estimate
(estimate: 1.3%  16s 12m 1h) Kbytes: 1027072.0  Mbits/s: 133.6  Gbytes/hr: 58.7  min: 1.0
(estimate: 2.5%  36s 16m 1h) Kbytes: 1913856.0  Mbits/s: 124.5  Gbytes/hr: 54.7  min: 2.0
(estimate: 3.7%  10s 17m 1h) Kbytes: 2814976.0  Mbits/s: 122.1  Gbytes/hr: 53.6  min: 3.0
(estimate: 4.9%  10s 17m 1h) Kbytes: 3706880.0  Mbits/s: 120.6  Gbytes/hr: 53.0  min: 4.0
...
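
The output formats of bgrep and ranges are not shown in this memo.
Assuming bgrep prints one number per hit (say, the block a match was found
in) and ranges collapses runs of consecutive numbers, the collapsing step
could look roughly like the sketch below.  The function name is made up and
this is not the actual ranges program.

```python
def collapse_ranges(numbers):
    """Collapse integers into sorted (first, last) runs of consecutive values."""
    runs = []
    for n in sorted(set(numbers)):
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n      # extend the current run
        else:
            runs.append([n, n])  # start a new run
    return [tuple(run) for run in runs]
```

A handful of ranges is much easier to follow up with dd than thousands of
individual hit numbers.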

I've added a -s option to reblock, which makes it sleep for an arbitrary
number of (fractions of) seconds between blocks.  Between this and the
I/O scheduler change, seki has become very pleasant to work on again,
despite the hunt for my missing palm memo.  :)
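
The throttling idea is simple enough to sketch: read a block, hand it on,
sleep, repeat.  This is not reblock's actual code, just an illustration
with made-up names.  The sleep function is a parameter only so that the
sketch can be exercised without really waiting.

```python
import time

def throttled_blocks(stream, block_size, pause_seconds, sleep=time.sleep):
    """Yield blocks from stream, pausing after each one."""
    while True:
        block = stream.read(block_size)
        if not block:
            return
        yield block
        if pause_seconds > 0:
            # Give other processes a turn at the disk between blocks.
            sleep(pause_seconds)
```

A fractional pause such as 0.05 works, since time.sleep accepts floats.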

From Bryan Ragon <bragon at zapeng dot com>

Here is a detailed list of steps that worked:

;; first backed up the first 512 bytes of /dev/hdb
# dd if=/dev/hdb of=~/hdb.first512 count=1 bs=512
1+0 records in
1+0 records out
 

;; zero them out, per Alasdair
# dd if=/dev/zero of=/dev/hdb count=1 bs=512
1+0 records in
1+0 records out

;; verified
# blockdev --rereadpt /dev/hdb
BLKRRPART: Input/output error

;; find the volumes
# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "media_vg" using metadata type lvm2

# pvscan
  PV /dev/hdb   VG media_vg   lvm2 [111.79 GB / 0    free]
  Total: 1 [111.79 GB] / in use: 1 [111.79 GB] / in no VG: 0 [0   ]

# lvmdiskscan
  /dev/hda1 [      494.16 MB]
  /dev/hda2 [        1.92 GB]
  /dev/hda3 [       18.65 GB]
  /dev/hdb  [      111.79 GB] LVM physical volume
  /dev/hdd1 [       71.59 GB]
  0 disks
  4 partitions
  1 LVM physical volume whole disk
  0 LVM physical volumes

# vgchange -a y
  1 logical volume(s) in volume group "media_vg" now active

;; /media is a defined mount point in fstab, listed below for future archive searches
# mount /media
# ls /media
graphics  lost+found  movies  music


Success!!  Thank you, Alasdair!!!!

/etc/fstab
<snip>
/dev/media_vg/media_lv  /media          ext3            noatime         0 0
<snip>

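The first step in Bryan's list, saving the first 512 bytes before
overwriting them, is the one that makes the rest reversible.  A sketch of
that step with a read-back check and a checksum, so that the saved copy can
be trusted later.  The function name and the choice of SHA-256 are
assumptions for illustration.

```python
import hashlib

SECTOR = 512

def backup_first_sector(device_path, backup_path):
    """Save the first 512 bytes of device_path and return their SHA-256."""
    with open(device_path, "rb") as dev:
        sector = dev.read(SECTOR)
    if len(sector) != SECTOR:
        raise IOError("short read: got %d bytes" % len(sector))
    with open(backup_path, "wb") as out:
        out.write(sector)
    # Read the copy back before anything gets overwritten on the device.
    with open(backup_path, "rb") as check:
        if check.read() != sector:
            raise IOError("backup does not match what was read")
    return hashlib.sha256(sector).hexdigest()
```

Restoring would be the reverse dd: the backup file as if=, the disk as of=,
with count=1 bs=512.
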
home blee has:
hdc1 ext3 /big wdc
sda5 xfs /backups
00/00 ext3 hda ibm fc3: too hot?
00/01 swap hda ibm
01/00 ext3 hdd maxtor fc4
01/01 swap hdd maxtor
hdb that samsung dvd drive that overheats