• Table of contents
    1. Dan on RHEL 3 + md + striping
    2. Dan on Ubuntu Gutsy + md + mirroring
    3. Harry on Ubuntu Hoary preview + md + RAID 5
    4. Neil Brown on what "non-fresh" means in an mdadm context
    5. Monitoring
    6. Some external links
    7. Attributes of Linux software RAID
  • The methods
    1. striping
      • This was done on an RHEL 3 system.
      • Edit /etc/mdadm.conf with lines similar to the following:
          [root@esmfsn05 root]# egrep '^DEVICE|^MAILADDR' /etc/mdadm.conf
          DEVICE /dev/sd[ab]1
          MAILADDR root@esmft1
      • Create the array:
          mdadm --create /dev/md0 --level=stripe --chunk=4096 --raid-devices=2 /dev/sda1 /dev/sdb1
      • Finish setting up /etc/mdadm.conf
          mdadm --examine --scan >> /etc/mdadm.conf
      • Next, verify that your /etc/mdadm.conf has DEVICE, ARRAY, and MAILADDR lines (a quick verification sketch appears at the end of this list)
      • Edit /etc/fstab with something like this:
          [root@esmfsn05 root]# grep md /etc/fstab
          /dev/md0                /metadata               ext3    defaults        1 3
      • Create a filesystem:
          mkfs -t ext3 /dev/md0
      • Mount the filesystem
          mount /metadata
      • Add a script like /etc/rc5.d/S76md-assemble that contains something like:
          #!/bin/sh
          # Assemble the striped array by UUID (the UUID comes from "mdadm --examine --scan")
          mdadm --assemble --uuid=9cb23d0b:ca6af799:778dca49:a0a9019c /dev/md0
        • Note: It seems like there should be some Red Hat-supplied startup script for this, but mdmpd and mdmonitor don't appear to do it.
        • A promising tip from Dan Pritts: "You need to set the partition ID to 'fd' ('linux raid autodetect') instead of 83 ('linux')."
        • Another note: The --uuid option gets its value from the --scan above, which was appended to /etc/mdadm.conf. These UUIDs identify drives as being part of a particular RAID array, so you can't accidentally add the wrong disk and lose data.
        • Make sure you chmod 755 this script, as Red Hat ignores rc scripts that aren't marked executable.
      • reboot to see if it comes up OK (assuming reboot isn't a hardship - some production systems shouldn't be rebooted!).
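      • To verify the array once it's up (a minimal sketch, using the device, mount point, and config file from the steps above):
          # Confirm the striped array is assembled and active
          cat /proc/mdstat
          mdadm --detail /dev/md0

          # Confirm the appended config has the three expected line types
          egrep '^DEVICE|^ARRAY|^MAILADDR' /etc/mdadm.conf

          # Confirm the filesystem is mounted where fstab says it should be
          df /metadata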


        Thanks.

        -- 
        Dan Stromberg DCS/NACS/UCI <strombrg@dcs.nac.uci.edu>
    2. Dan on mirroring with md
      • umount the partitions (this assumes they have filesystems on them whose contents you don't care about anymore - it's unclear whether either one is preserved)
      • /sbin/mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb5 /dev/sdc5
      • fdisk "t" option to set /dev/sdb5 and /dev/sdc5 to type "fd"
      • the array doesn't have to be done syncing (check cat /proc/mdstat) before you can use it - if you'd rather wait for the initial sync, see the sketch after this list
      • mkfs -t ext3 /dev/md0
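      • If you do want to wait for the initial sync to finish (say, before trusting the mirror with important data), a minimal sketch that polls /proc/mdstat - assuming /dev/md0 as above:
          #!/bin/sh
          # Loop until /proc/mdstat no longer shows a resync/recovery in progress for md0
          while grep -A 2 '^md0' /proc/mdstat | grep -qE 'resync|recovery'; do
              sleep 60
          done
          echo "md0 initial sync complete"
          mdadm --detail /dev/md0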
    3. Harry Mangalam's letter about md and RAID 5
        Hi All,
        
        FYI, the machine platform is a 2xOpteron, running ubuntu hoary preview 
        (64bit), 4GB RAM, system running off a single IDE drive.
        
        The raid drives are running on the on-board 4way Silicon Image SATA 
        controller.  The drives are identical 250GB WD SATAs Model: WDC WD2500JD-00G, 
        each partitioned for 232GB on /dev/sdx1 and 1.8GB on /dev/sdx2 (for 
        parallel swap partitions).
        
        I'm using the mdadm suite to set them up and control the raid:
        1 - create the raid:
        $  mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 \
           --spare-devices=0 -c128 /dev/sd{a,b,c,d}1
        mdadm: layout defaults to left-symmetric
        mdadm: /dev/sda1 appears to contain a reiserfs file system
        	 size = 242220004K
        mdadm: /dev/sdb1 appears to contain a reiserfs file system
        	 size = 242220004K
        mdadm: /dev/sdc1 appears to contain a reiserfs file system
        	 size = 242220004K
        mdadm: /dev/sdd1 appears to contain a reiserfs file system
        	 size = 242220004K
        mdadm: size set to 242219904K
        Continue creating array? y
        mdadm: array /dev/md0 started.
        
        2 - make sure we monitor it.
         $ nohup mdadm --monitor --mail='hjm@tacgi.com' --delay=300 /dev/md0 &
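
        (A minimal alternative sketch: mdadm can also put itself in the background
        and monitor all of its arrays, rather than relying on nohup:
         $ mdadm --monitor --scan --daemonise --mail='hjm@tacgi.com' --delay=300
        )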
        
        3 - make the reiserfs on md0 (it was made on the individual partitions before, 
        but apparently it needs to be made on the virtual device)
        $ mkreiserfs /dev/md0 
        
        4 - # then mount it
        $ mount -t reiserfs /dev/md0 /r
        
        
        5- #then admire it
         $ df
        Filesystem           1K-blocks      Used Available Use% Mounted on
        ...
        /dev/md0             726637532     32840 726604692   1% /r
        
        so for a raid5 array, we end up with about 75% of the input space (more than I 
        expected) - the rest is lost to the parity info, which is striped across all 
        the disks, giving the redundancy.
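
        (A quick check of that figure: RAID 5 across n drives keeps (n-1)/n of the raw
        space, so with these 4 drives that's 3 x 242219904K = 726659712K, which lines
        up with the 726637532 1K-blocks df reports above once filesystem overhead is
        subtracted:
         $ awk 'BEGIN { printf "%.1f%%\n", 100 * 726637532 / (4 * 242219904) }'
         75.0%
        )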
        
        When the raid initialized, mdadm immediately sent me an email warning of a 
        degraded array - this was not welcome news, but it turns out that this is 
        normal - in building the parity checksums, it essentially fakes a dead disk 
        and rebuilds all the parity info.  This took about 8 hrs to do for 1 TB; 
        however, the array was available and pretty peppy without waiting for it to 
        finish.  And the message did confirm that mdadm was actually monitoring the 
        array.
        
        I immediately tried a few cp's to and from it - and on the 'degraded' array, 
        got ~40MB/s to and from the IDE drive on some 100-600MB files.  There was not 
        much difference after it finished doing the parity rebuild - possibly it was 
        deferring the parity calculations until afterwards?  If anything, it's 
        slightly slower now that the parity info is complete - maybe 38-40MB/s.  
        (This measure includes the sync time - with 4GB of RAM, GB-sized files can be 
        buffered in RAM and so appear to be copied in a few seconds.)
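
        (If you want a figure the page cache can't inflate, one rough approach - just
        a sketch, with made-up file names - is to keep the flush inside the timing:
         $ time ( cp /tmp/bigfile /r/ && sync )
        or to let dd force the data out itself:
         $ dd if=/dev/zero of=/r/ddtest bs=1M count=1000 conv=fsync
        )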
        
        On my home 2xPIII system with IDE drives, I only get ~7-8MB/s between drives, 
        so 40MB/s sounds pretty good.  Bonnie++ reports a bunch of confusing numbers, 
        but seems to indicate that depending on CPU utilization, type of io, and size 
        of file, disk io will range from ~24MB/s to ~80MB/s on the SATA raid.
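
        (For reference, a plausible bonnie++ invocation for this kind of box - sized
        so the test files exceed the 4GB of RAM and the cache can't hide the disks;
        the directory and user are whatever applies locally:
         $ bonnie++ -d /r -s 8192 -n 0 -u nobody
        )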
        
        On my old IDE laptop (but with a newer disk), bonnie returns numbers that are 
        surprisingly good - about 1/3 to 1/4 the RAID speed.
        
        On the 2xPIII home IDE system, bonnie returns numbers that are not much better 
        than the laptop.
        
        So there you have it - linux SW SATA raid is pretty easy to set up, can be 
        configured to be reasonably informative via email, is pretty cheap (relative 
        to the true HW raid cards that go for $300-$400 each) and seems to be pretty 
        fast.  Long term, I can't say yet.
        
        Also note that this is using an md device without any further wrapping with 
        lvm - we just need a huge data space, not much needed in the way of 
        administering different group allocations etc.
        
        Would like to hear others' experiences.
        
  • Neil Brown on what "non-fresh" means in an mdadm context:
  • Monitoring linux software RAID for problems
  • Some links about linux RAID on other sites
  • Some information about Linux software RAID - good points and bad


