Solaris[TM] Volume Manager: Replacing SCSI Disks Under SVM (Solaris[TM] 9 and Above)

Keyword(s): metadevadm, replace disk, svm

Description:

Beginning with Solaris 9, SVM uses a feature called Device ID (DevID), which identifies each disk not only by its c#t#d# name but also by a unique ID generated from the disk's WWN or serial number. SVM relies on the Solaris[TM] Operating Environment to supply each disk's correct DevID.

When a disk fails and is replaced, a specific procedure is required for SCSI disks to make sure that Solaris is updated with the new disk's DevID.

If this procedure is not followed, Solaris will not update the DevID until the next reboot. Although a NEW disk is in the system, the DevID that Solaris reports to SVM is still the OLD disk's DevID. For example, if the DevID of c0t1d0 was "SSEAGATE_ST318203_LR7943" and that disk is replaced with a new disk (whose DevID would be "SFUJITSU_MAG3182_005268"), Solaris will continue to report "SSEAGATE_ST318203_LR7943" as the DevID of c0t1d0 until the host is rebooted.

Although it is possible to replace the disk without running through this procedure, a subsequent system reboot will cause SVM to fail the new disk because the DevID of the disk will have changed and SVM will not have any knowledge of that new DevID.

In order to replace a disk, the cfgadm(1M) command must be used to unconfigure the disk to be replaced and to configure the new disk. This will cause an update of the Solaris device framework such that the new disk's DevID will be inserted and the old one removed.
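
As a small illustration of the naming involved, the sketch below derives the cfgadm attachment point ID (Ap_Id) from a c#t#d# disk name. The function name 'ap_id_for_disk' is hypothetical and not part of SVM or cfgadm, and the sketch assumes the c#::dsk/c#t#d# form that 'cfgadm -al' shows for SCSI disks. Confirm the real Ap_Id with 'cfgadm -al' before using it.

```shell
#!/bin/sh
# Hypothetical helper: derive the cfgadm attachment point ID from a disk name.
# Assumes the c#::dsk/c#t#d# form that 'cfgadm -al' shows for SCSI disks.
ap_id_for_disk() {
    disk=${1%s[0-9]*}             # drop a trailing slice, e.g. c0t2d0s7 -> c0t2d0
    case $disk in
        c[0-9]*t[0-9]*d[0-9]*) ;;
        *) echo "not a c#t#d# name: $1" >&2; return 1 ;;
    esac
    ctrl=${disk%%t*}              # strip from the first 't' onward: c0t2d0 -> c0
    echo "${ctrl}::dsk/${disk}"
}

ap_id_for_disk c0t2d0             # prints c0::dsk/c0t2d0
```

The result can be pasted into 'cfgadm -c unconfigure' and 'cfgadm -c configure'. The helper only builds a string and does not touch any device.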

Document Body:

PROCEDURE FOR REPLACING MIRRORED DISKS

Given all of the above, the following set of commands should work in all cases (although, depending on the system configuration, some of the commands may not be necessary):

To replace an SVM-controlled disk which is part of a mirror, the following steps must be followed:

1. Run 'metadetach' to detach all submirrors on the failing disk from their respective mirrors, and then run 'metaclear' on those submirror devices. If you do not use the -f option with 'metadetach', you will get the following message: "Attempt an operation on a submirror that has erred component"

    metadetach -f <mirror> <submirror>
    metaclear <submirror>

  You can verify there are no existing metadevices left on the disk by
  running 'metastat -p | grep c#t#d#'.

2. If there are any replicas on this disk, remove them using

    metadb -d c#t#d#s#

  You can verify there are no existing replicas left on the disk by
  running 'metadb | grep c#t#d#'.

3. If there are any open filesystems on this disk (not under SVM control), unmount them.

4. Run the 'cfgadm' command to remove the failed disk.

    cfgadm -c unconfigure c#::dsk/c#t#d#

  NOTE: if the message "Hardware specific failure: failed to
  unconfigure SCSI device: I/O error" appears, check to make sure that
  you cleared all replicas and metadevices from the disk, and that the
  disk is not being accessed.

5. Insert the new disk and configure it.

    cfgadm -c configure c#::dsk/c#t#d#
    cfgadm -al (just to confirm that disk is configured properly)

6. Run 'format', or 'prtvtoc' together with 'fmthard', to put the desired partition table on the new disk.

7. Run 'metadevadm' on the disk to update the SVM database with the new DevID.

    metadevadm -u c#t#d#

  NOTE: If you get the message "Open of /dev/dsk/c#t#d#s0 failed", you
  can safely ignore the message (this is a known bug pending a fix).

8. If necessary, recreate any replicas on the new disk:

    metadb -a c#t#d#s#

9. Recreate each metadevice that was used as a submirror, and use 'metattach' to attach those submirrors to their mirrors, which starts the resync. Note: If the submirror was something other than a simple one-slice concat device, the metainit command will differ from the one shown here.

    metainit <submirror> 1 1 <c#t#d#s#>
    metattach <mirror> <submirror>
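
The steps above can be sketched as a dry-run script that only prints the commands in order, so the sequence can be reviewed before anything is run. The function name 'print_mirror_plan' and its argument layout are assumptions made for this illustration, and it assumes each submirror is a simple one-slice concat.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the mirror replacement commands.
#   $1 disk (c#t#d#)
#   $2 replica slice number, or "" if the disk holds no replica
#   remaining arguments: mirror:submirror:slice triples, e.g. d0:d20:0
print_mirror_plan() {
    disk=$1; replica=$2; shift 2
    ctrl=${disk%%t*}                       # c0t2d0 -> c0
    for triple in "$@"; do
        mirror=${triple%%:*}; rest=${triple#*:}
        echo "metadetach -f $mirror ${rest%%:*}"
    done
    for triple in "$@"; do
        rest=${triple#*:}
        echo "metaclear ${rest%%:*}"
    done
    if [ -n "$replica" ]; then echo "metadb -d ${disk}s${replica}"; fi
    echo "cfgadm -c unconfigure ${ctrl}::dsk/${disk}"
    echo "# physically replace the disk, then:"
    echo "cfgadm -c configure ${ctrl}::dsk/${disk}"
    echo "# partition the new disk before continuing"
    echo "metadevadm -u $disk"
    if [ -n "$replica" ]; then echo "metadb -a ${disk}s${replica}"; fi
    for triple in "$@"; do
        rest=${triple#*:}
        echo "metainit ${rest%%:*} 1 1 ${disk}s${rest#*:}"
    done
    for triple in "$@"; do
        mirror=${triple%%:*}; rest=${triple#*:}
        echo "metattach $mirror ${rest%%:*}"
    done
}

print_mirror_plan c0t2d0 7 d0:d20:0 d1:d21:1
```

Printing rather than executing is deliberate: the physical disk swap happens in the middle of the sequence, and unmounting any non-SVM filesystems (step 3) still has to be done by hand.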

PROCEDURE FOR REPLACING DISKS IN A RAID-5 METADEVICE

To replace an SVM-controlled disk which is part of a RAID5 metadevice, the following steps must be followed. If a disk is used in BOTH a mirror and a RAID5, follow the instructions for the MIRRORED devices (above).

1. If there are any open filesystems on this disk (not under SVM control), unmount them.

2. If there are any replicas on this disk, remove them using:

    metadb -d c#t#d#s#

  You can verify there are no existing replicas left on the disk by
  running 'metadb | grep c#t#d#'.

3. Run the 'cfgadm' command to remove the failed disk.

    cfgadm -c unconfigure c#::dsk/c#t#d#

4. Insert the new disk and configure it.

    cfgadm -c configure c#::dsk/c#t#d#
    cfgadm -al (just to confirm that disk is configured properly)

5. Run 'format', or 'prtvtoc' together with 'fmthard', to put the desired partition table on the new disk.

6. Run 'metadevadm' on the disk to update the SVM database with the new DevID.

    metadevadm -u c#t#d#

7. If necessary, recreate any replicas on the new disk:

    metadb -a c#t#d#s#

8. Run metareplace to enable and resync the new disk.

    metareplace -e <raid5-md> c#t#d#s#
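
Choosing between the two procedures depends on how the disk is used. The sketch below classifies a disk from 'metastat -p' text read on standard input and reports "mirror" when any of its slices backs a submirror, which also covers the case where the disk is used in both a mirror and a RAID5. The function name 'procedure_for_disk' is hypothetical, and the md.tab-style line formats shown in the comments are assumptions about the 'metastat -p' output; check them against your own system.

```shell
#!/bin/sh
# Sketch: decide which procedure applies to a disk, given 'metastat -p' text
# on standard input. Assumed line formats:
#   d0 -m d10 d20 1                              mirror and its submirrors
#   d20 1 1 c0t2d0s0                             simple concat/stripe
#   d3 -r c0t1d0s5 c0t2d0s5 c0t3d0s5 -k -i 32b   RAID5
procedure_for_disk() {
    awk -v disk="$1" '
        $2 == "-m" { for (i = 3; i <= NF; i++) submirror[$i] = 1; next }
        $2 == "-r" { for (i = 3; i <= NF; i++)
                         if (index($i, disk "s") == 1) raid = 1
                     next }
        { for (i = 2; i <= NF; i++)
              if (index($i, disk "s") == 1) holder[$1] = 1 }
        END {
            # A metadevice built on this disk that is also a submirror
            # means the mirror procedure applies.
            for (d in holder) if (d in submirror) { print "mirror"; exit }
            if (raid) { print "raid5" } else { print "none" }
        }
    '
}

procedure_for_disk c0t2d0 <<'EOF'
d0 -m d10 d20 1
d10 1 1 c0t0d0s0
d20 1 1 c0t2d0s0
d3 -r c0t1d0s5 c0t2d0s5 c0t3d0s5 -k -i 32b
EOF
```

On a live system the input would come from 'metastat -p | procedure_for_disk c0t2d0'. Other metadevice types (soft partitions, trans devices) are not handled by this sketch.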

EXAMPLES

The following two examples illustrate the commands and sample outputs of the above procedures.

Example 1: Replacing a Mirrored Disk

In this example, a Netra t1400 has only one SCSI controller with 4 disks. SVM is used to mirror both the root and the swap devices between c0t0d0 and c0t2d0. The disk c0t2d0 is failing and needs to be replaced.

Here is the 'format' display before the submirror disk replacement:

  format

  AVAILABLE DISK SELECTIONS:
         0. c0t0d0 <SUN18G cyl 7506 alt 2 hd 19 sec 248>
            /pci@1f,4000/scsi@3/sd@0,0
         1. c0t1d0 <SUN18G cyl 7506 alt 2 hd 19 sec 248>
            /pci@1f,4000/scsi@3/sd@1,0
         2. c0t2d0 <SUN18G cyl 7506 alt 2 hd 19 sec 248>
            /pci@1f,4000/scsi@3/sd@2,0
         3. c0t3d0 <SUN18G cyl 7506 alt 2 hd 19 sec 248>
            /pci@1f,4000/scsi@3/sd@3,0

Here is the 'cfgadm' display for controller c0:

  cfgadm -al
  Ap_Id                   Type         Receptacle   Occupant     Condition
  c0                      scsi-bus     connected    configured   unknown
  c0::dsk/c0t0d0          disk         connected    configured   unknown
  c0::dsk/c0t1d0          disk         connected    configured   unknown
  c0::dsk/c0t2d0          disk         connected    configured   unknown
  c0::dsk/c0t3d0          disk         connected    configured   unknown

Here is the output of the 'metadb' command, showing the locations of the SVM database replicas. There is one on each disk.

  metadb
       flags        first blk     block count
    a        u      16            8192          /dev/dsk/c0t0d0s7
    a        u      16            8192          /dev/dsk/c0t1d0s7
    a        u      16            8192          /dev/dsk/c0t2d0s7
    a        u      16            8192          /dev/dsk/c0t3d0s7
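
To find the replicas that must be deleted before the disk is unconfigured, the 'metadb' output can be filtered by disk name. This is a minimal sketch; the function name 'replicas_on_disk' is hypothetical, and it assumes the device path is the last column of each replica line, as in the output above.

```shell
#!/bin/sh
# Sketch: print the replica slices that live on one disk.
# $1 = disk (c#t#d#); 'metadb' output is read from standard input.
replicas_on_disk() {
    awk -v disk="$1" '
        index($NF, "/dev/dsk/" disk "s") == 1 {
            sub("^/dev/dsk/", "", $NF)    # keep only the c#t#d#s# part
            print $NF
        }
    '
}

replicas_on_disk c0t2d0 <<'EOF'
       flags        first blk     block count
    a        u      16            8192          /dev/dsk/c0t0d0s7
    a        u      16            8192          /dev/dsk/c0t2d0s7
EOF
```

With the sample input this prints c0t2d0s7, the slice to pass to 'metadb -d'. On a live system the input would come from 'metadb | replicas_on_disk c0t2d0'.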

Here is the SVM configuration before the submirror disk replacement. Note the DevID information at the bottom.

  metastat
  d0: Mirror
      Submirror 0: d10
        State: Okay
      Submirror 1: d20
        State: Needs maintenance
      Pass: 1
      Read option: roundrobin (default)
      Write option: parallel (default)
      Size: 6295232 blocks (3.0 GB)

  d10: Submirror of d0
      State: Okay
      Size: 6295232 blocks (3.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t0d0s0          0     No            Okay   Yes

  d20: Submirror of d0
      State: Needs maintenance
      Invoke: metareplace d20 c0t2d0s0 <new device>
      Size: 6295232 blocks (3.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t2d0s0          0     No     Maintenance   Yes

  d1: Mirror
      Submirror 0: d11
        State: Okay
      Submirror 1: d21
        State: Needs maintenance
      Pass: 1
      Read option: roundrobin (default)
      Write option: parallel (default)
      Size: 2101552 blocks (1.0 GB)

  d11: Submirror of d1
      State: Okay
      Size: 2101552 blocks (1.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t0d0s1          0     No            Okay   Yes

  d21: Submirror of d1
      State: Needs maintenance
      Invoke: metareplace d21 c0t2d0s1 <new device>
      Size: 2101552 blocks (1.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t2d0s1          0     No     Maintenance   Yes

  Device Relocation Information:
  Device   Reloc  Device ID
  c0t2d0   Yes    id1,sd@SFUJITSU_MAG3182L_SUN18G_00526202____
  c0t0d0   Yes    id1,sd@SSEAGATE_ST318203LSUN18G_LR795377000010210UN3
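
The slices that need attention can be pulled out of 'metastat' output mechanically. The sketch below is illustrative only: 'failed_components' is a hypothetical name, and it assumes component lines have the column layout shown above (device, start block, dbase, state, reloc).

```shell
#!/bin/sh
# Sketch: list slices that 'metastat' reports in the Maintenance state.
# 'metastat' output is read from standard input.
failed_components() {
    awk '$1 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ && $4 == "Maintenance" { print $1 }'
}

failed_components <<'EOF'
d10: Submirror of d0
    State: Okay
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c0t0d0s0          0     No            Okay   Yes
d20: Submirror of d0
    State: Needs maintenance
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c0t2d0s0          0     No     Maintenance   Yes
EOF
```

With the sample input this prints c0t2d0s0. On a live system the input would come from 'metastat | failed_components'.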

Since c0t2d0 is the drive that needs to be replaced, use 'metadetach' and 'metaclear' to detach and remove the bad submirrors from that disk.

  metadetach -f d0 d20
  d0: submirror d20 is detached

  metadetach -f d1 d21
  d1: submirror d21 is detached

  metaclear d20
  d20: Concat/Stripe is cleared

  metaclear d21
  d21: Concat/Stripe is cleared

Here is the 'metastat' output after detaching and removing d20 and d21:

  d0: Mirror
      Submirror 0: d10
        State: Okay
      Pass: 1
      Read option: roundrobin (default)
      Write option: parallel (default)
      Size: 6295232 blocks (3.0 GB)

  d10: Submirror of d0
      State: Okay
      Size: 6295232 blocks (3.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t0d0s0          0     No            Okay   Yes

  d1: Mirror
      Submirror 0: d11
        State: Okay
      Pass: 1
      Read option: roundrobin (default)
      Write option: parallel (default)
      Size: 2101552 blocks (1.0 GB)

  d11: Submirror of d1
      State: Okay
      Size: 2101552 blocks (1.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t0d0s1          0     No            Okay   Yes

  Device Relocation Information:
  Device   Reloc  Device ID
  c0t2d0   Yes    id1,sd@SFUJITSU_MAG3182L_SUN18G_00526202____
  c0t0d0   Yes    id1,sd@SSEAGATE_ST318203LSUN18G_LR795377000010210UN3

Since we have a replica on the disk to be removed, we remove it using:

  metadb -d c0t2d0s7

and then remove the failed disk from the system using:

  cfgadm -c unconfigure c0::dsk/c0t2d0

After the disk has been physically replaced, we use the 'cfgadm' command to configure the new disk:

  cfgadm -c configure c0::dsk/c0t2d0

and then confirm that the new disk has been configured:

  cfgadm -al
  Ap_Id              Type         Receptacle   Occupant     Condition
  c0                 scsi-bus     connected    configured   unknown
  c0::dsk/c0t0d0     disk         connected    configured   unknown
  c0::dsk/c0t1d0     disk         connected    configured   unknown
  c0::dsk/c0t2d0     disk         connected    configured   unknown
  c0::dsk/c0t3d0     disk         connected    configured   unknown

We then run 'format' to put the appropriate partition table onto the disk.

  format

  [ the steps to create a valid partition table have been left
    out for brevity ]

We run 'metadevadm' to update the SVM database with the new DevID information. Here we can see the old DevID and the new DevID.

   metadevadm -u c0t2d0
   Updating Solaris Volume Manager device relocation information for c0t2d0
   Old device reloc information:
           id1,sd@SFUJITSU_MAG3182L_SUN18G_00526202____
   New device reloc information:
           id1,sd@SSEAGATE_ST318203LSUN18G_LR7943000000W70708e0
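
A quick way to confirm that the DevID really changed is to compare the two values that 'metadevadm -u' prints. The sketch below is an illustration only: 'devid_changed' is a hypothetical name, and it assumes the "Old/New device reloc information:" headings are each followed by one line holding the DevID, as in the output above.

```shell
#!/bin/sh
# Sketch: read 'metadevadm -u' output on standard input and report the change.
# Exit status: 0 if the DevID changed, 1 if it is unchanged, 2 if not parsed.
devid_changed() {
    awk '
        /Old device reloc information:/ { want = "old"; next }
        /New device reloc information:/ { want = "new"; next }
        want != "" && NF > 0            { id[want] = $1; want = "" }
        END {
            if (id["old"] == "" || id["new"] == "") exit 2
            print id["old"] " -> " id["new"]
            exit (id["old"] == id["new"])
        }
    '
}

devid_changed <<'EOF'
Updating Solaris Volume Manager device relocation information for c0t2d0
Old device reloc information:
        id1,sd@SFUJITSU_MAG3182L_SUN18G_00526202____
New device reloc information:
        id1,sd@SSEAGATE_ST318203LSUN18G_LR7943000000W70708e0
EOF
```

An exit status of 1 after a physical disk swap suggests that Solaris is still reporting the old DevID, which is the situation the cfgadm steps are meant to prevent.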

We run 'metadb' to recreate the replica that we removed from the disk:

  metadb -a c0t2d0s7

and run 'metainit' to recreate the metadevices that were previously removed and 'metattach' to reattach them to their respective mirrors.

  metainit d20 1 1 c0t2d0s0
  d20: Concat/Stripe is setup

  metainit d21 1 1 c0t2d0s1
  d21: Concat/Stripe is setup

  metattach d0 d20
  d0: submirror d20 is attached

  metattach d1 d21
  d1: submirror d21 is attached

Running a 'metastat' command now will show the NEW DeviceID for disk c0t2d0:

  metastat
  d0: Mirror
      Submirror 0: d10
        State: Okay
      Submirror 1: d20
        State: Okay
      Pass: 1
      Read option: roundrobin (default)
      Write option: parallel (default)
      Size: 6295232 blocks (3.0 GB)

  d10: Submirror of d0
      State: Okay
      Size: 6295232 blocks (3.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t0d0s0          0     No            Okay   Yes

  d20: Submirror of d0
      State: Okay
      Size: 6295232 blocks (3.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t2d0s0          0     No            Okay   Yes

  d1: Mirror
      Submirror 0: d11
        State: Okay
      Submirror 1: d21
        State: Okay
      Pass: 1
      Read option: roundrobin (default)
      Write option: parallel (default)
      Size: 2101552 blocks (1.0 GB)

  d11: Submirror of d1
      State: Okay
      Size: 2101552 blocks (1.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t0d0s1          0     No            Okay   Yes

  d21: Submirror of d1
      State: Okay
      Size: 2101552 blocks (1.0 GB)
      Stripe 0:
          Device     Start Block  Dbase        State Reloc Hot Spare
          c0t2d0s1          0     No            Okay   Yes

  Device Relocation Information:
  Device   Reloc  Device ID
  c0t0d0   Yes    id1,sd@SSEAGATE_ST318203LSUN18G_LR795377000010210UN3
  c0t2d0   Yes    id1,sd@SSEAGATE_ST318203LSUN18G_LR7943000000W70708e0

After the new submirrors are attached, they are resynchronized from the surviving submirrors. Once the resynchronization completes, the mirrors are fully redundant again.

Example 2: Replacing a Disk Used Only in RAID5 Metadevice(s)

In this example, a Netra t1400 has only one SCSI controller with 4 disks. A RAID5 SVM configuration is set up across three disks: c0t1d0, c0t2d0 and c0t3d0. The disk c0t2d0 is failing and needs to be replaced.

Here is the 'format' display before the disk replacement:

  format
  AVAILABLE DISK SELECTIONS:
         0. c0t0d0 <SUN18G cyl 7506 alt 2 hd 19 sec 248>
            /pci@1f,4000/scsi@3/sd@0,0
         1. c0t1d0 <SUN18G cyl 7506 alt 2 hd 19 sec 248>
            /pci@1f,4000/scsi@3/sd@1,0
         2. c0t2d0 <SUN18G cyl 7506 alt 2 hd 19 sec 248>
            /pci@1f,4000/scsi@3/sd@2,0
         3. c0t3d0 <SUN18G cyl 7506 alt 2 hd 19 sec 248>
            /pci@1f,4000/scsi@3/sd@3,0

Here is the 'cfgadm' display for controller c0:

  cfgadm -al
  Ap_Id                   Type         Receptacle   Occupant     Condition
  c0                      scsi-bus     connected    configured   unknown
  c0::dsk/c0t0d0          disk         connected    configured   unknown
  c0::dsk/c0t1d0          disk         connected    configured   unknown
  c0::dsk/c0t2d0          disk         connected    configured   unknown
  c0::dsk/c0t3d0          disk         connected    configured   unknown

Here is the output of the 'metadb' command, showing the locations of the SVM database replicas. There is one on each disk.

  metadb
       flags        first blk     block count
    a        u      16            8192          /dev/dsk/c0t0d0s7
    a        u      16            8192          /dev/dsk/c0t1d0s7
    a        u      16            8192          /dev/dsk/c0t2d0s7
    a        u      16            8192          /dev/dsk/c0t3d0s7

Here is the SVM configuration before the disk replacement. Note the DevID information at the bottom.

  metastat
  d3: RAID
      State: Needs Maintenance
      Invoke: metareplace d3 c0t2d0s5 <new device>
      Interlace: 32 blocks
      Size: 2077992 blocks (1014 MB)
  Original device:
      Size: 2081984 blocks (1016 MB)
          Device     Start Block  Dbase        State Reloc  Hot Spare
          c0t1d0s5       9754        No         Okay   Yes
          c0t2d0s5       9754        No  Maintenance   Yes
          c0t3d0s5       9754        No         Okay   Yes

  Device Relocation Information:
  Device   Reloc  Device ID
  c0t0d0   Yes    id1,sd@SSEAGATE_ST318203LSUN18G_LR795377000010210UN3
  c0t1d0   Yes    id1,sd@SFUJITSU_MAG3182L_SUN18G_00526873____
  c0t2d0   Yes    id1,sd@SFUJITSU_MAG3182L_SUN18G_00526202____
  c0t3d0   Yes    id1,sd@SFUJITSU_MAG3182L_SUN18G_00526842____

Since c0t2d0 is the drive that needs to be replaced, and since the only other thing on this disk is the SVM replica, we remove the existing replica on disk c0t2d0 using:

  metadb -d c0t2d0s7

and use the 'cfgadm' command to remove the failed disk from the system:

  cfgadm -c unconfigure c0::dsk/c0t2d0

After the disk has been physically replaced, we use 'cfgadm' to configure the new disk:

  cfgadm -c configure c0::dsk/c0t2d0

and then confirm that the new disk has been configured:

  cfgadm -al
  Ap_Id              Type         Receptacle   Occupant     Condition
  c0                 scsi-bus     connected    configured   unknown
  c0::dsk/c0t0d0     disk         connected    configured   unknown
  c0::dsk/c0t1d0     disk         connected    configured   unknown
  c0::dsk/c0t2d0     disk         connected    configured   unknown
  c0::dsk/c0t3d0     disk         connected    configured   unknown

We then run 'format' to put the appropriate partition table onto the disk.

  format

  [ the steps to create a valid partition table have been left
    out for brevity ]

We run 'metadevadm' to update the SVM database with the new DevID information. Here we can see the old DevID and the new DevID.

  metadevadm -u c0t2d0
  Old device reloc information:
          id1,sd@SFUJITSU_MAG3182L_SUN18G_00526202____
  New device reloc information:
          id1,sd@SSEAGATE_ST318203LSUN18G_LR7943000000W70708e0

We run 'metadb' to recreate the replica that we removed from the disk:

  metadb -a c0t2d0s7

and run 'metareplace' to enable the new disk in the RAID5 metadevice and start the resync.

  metareplace -e d3 c0t2d0s5

Running a 'metastat' command now will show the NEW DeviceID for disk c0t2d0:

  metastat
  d3: RAID
      State: Okay
      Interlace: 32 blocks
      Size: 2077992 blocks (1014 MB)
  Original device:
      Size: 2081984 blocks (1016 MB)
          Device     Start Block  Dbase        State Reloc  Hot Spare
          c0t1d0s5       9754        No         Okay   Yes
          c0t2d0s5       9754        No         Okay   Yes
          c0t3d0s5       9754        No         Okay   Yes

  Device Relocation Information:
  Device   Reloc  Device ID
  c0t1d0   Yes    id1,sd@SFUJITSU_MAG3182L_SUN18G_00526873____
  c0t2d0   Yes    id1,sd@SSEAGATE_ST318203LSUN18G_LR7943000000W70708e0
  c0t3d0   Yes    id1,sd@SFUJITSU_MAG3182L_SUN18G_00526842____

If the above procedure is not followed, an error such as the following may be seen:

Jun 22 18:22:57 host1 metadevadm: [ID 209699 daemon.error] Invalid device relocation information detected in Solaris Volume Manager

In order to recover from this error, run the metadevadm -u command and then proceed with any metadb and/or metareplace commands necessary:

  metadevadm -u c0t2d0
   Updating Solaris Volume Manager device relocation information for c0t2d0
   Old device reloc information:
           id1,sd@SFUJITSU_MAG3182L_SUN18G_00526202____
   New device reloc information:
           id1,sd@SSEAGATE_ST318203LSUN18G_LR7943000000W70708e0