RAID6 using Linux Logical Volume Manager

  • Posted on: 15 November 2019
  • By: hapebe

Build a RAID6 LVM volume, then test failure and replacement of one (or two) of the drives

While learning about different RAID levels and their configurations, I wanted to mess with the RAID abilities of the Linux Logical Volume Manager (LVM), in my case using the Debian lvm2 package in SparkyLinux, on a virtual machine set up using VirtualBox.

In real-life scenarios you should also (or perhaps rather) consider making use of hardware-based RAID, MD-based RAID, or even DRBD setups!

Before you jump into following the steps described here, please make sure you understand the LVM basics (PVs, VGs, LVs, and the related basic commands).

Most commands in this experiment need root privileges.

Build the logical RAID6


Prepare all physical disks

  1. cfdisk {device}, e.g.
    cfdisk /dev/sdb
    • create a partition filling the whole drive,
    • define its type as “Linux LVM”
  2. pvcreate {partition/device}, e.g.
    pvcreate /dev/sdb1
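
If you prefer a non-interactive alternative to the cfdisk steps above (handy when preparing several drives at once), something along these lines should work, assuming the five example drives /dev/sdb through /dev/sdf and MBR/DOS partition tables:

for d in /dev/sd{b,c,d,e,f}; do
    echo 'type=8e' | sfdisk "$d"   # one partition filling the drive, type 8e = Linux LVM
    pvcreate "${d}1"               # register the new partition as an LVM physical volume
done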


Create a volume group and the logical volume

  1. vgcreate {volume group, e.g. vg1} {list of pvs}, e.g.
    vgcreate vg1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
  2. lvcreate --type raid6 -l100%vg -n{name} {volume group}, e.g.
    lvcreate --type raid6 -l100%vg -n data6 vg1
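
With five physical volumes, raid6 stripes each chunk of data over three of them and writes parity to the other two. To double-check how the new logical volume has been laid out, you can list its (hidden) sub-LVs and their backing devices, for example:

lvs -a -o name,segtype,stripes,devices vg1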


Create a filesystem on the logical volume

  1. mke2fs -t ext4 -L {FS label, e.g. data} {device, e.g. /dev/vg1/data6}, e.g.
    mke2fs -t ext4 -L data /dev/vg1/data6
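
Optionally, you can pass the RAID geometry to mke2fs so that ext4 aligns its allocations to full stripes. Assuming the LVM default stripe size of 64 KiB, 4 KiB filesystem blocks and three data stripes (five drives minus two for parity), that gives a stride of 16 blocks and a stripe width of 48 blocks:

mke2fs -t ext4 -L data -E stride=16,stripe_width=48 /dev/vg1/data6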


Mount & use the logical volume:

mkdir -p /mnt/data
mount /dev/mapper/vg1-data6 /mnt/data
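
If the volume should be mounted automatically at boot, an /etc/fstab entry along these lines would do (using the device and mount point from above):

/dev/mapper/vg1-data6  /mnt/data  ext4  defaults  0  2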

Optional: Change ownership and privileges inside the new file system:

chown -R root:users /mnt/data
chmod -R 770 /mnt/data

Check the perfect health of your new system:

lvs -o name,lv_health_status,sync_percent

Start some write activity on your file system:

dd if=/dev/urandom of=/mnt/data/random.dat bs=1M

(You can cancel this process at any point, unless you really want to fill the whole file system.)
In parallel, you can observe I/O performance (in another shell), if you like:

iotop
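
If you would rather not fill the whole file system, you can also limit the amount written and let dd report its own progress (GNU dd), e.g. for 4 GiB of random data:

dd if=/dev/urandom of=/mnt/data/random.dat bs=1M count=4096 status=progress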


Take down one of the physical disks

You should consider doing this while random data writing is going on!

echo 1 > /sys/block/{device}/device/delete, e.g.
echo 1 > /sys/block/sde/device/delete
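
Before deleting a device, it can be reassuring to double-check which physical volumes actually back the RAID images, for example:

pvs
lvs -a -o name,devices vg1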

Check the not-so-perfect system health at this point:

lvs -o name,lv_health_status

The health of the logical volume should now be “partial”, as described in the lvmraid man page.

At this time, kill the dd process and shut down the system.

Replace the disk you had taken down with an empty / fresh / new one.


Rebuild the logical RAID6 system

Re-activate the logical volume manually after the reboot (it may not be activated automatically while in the “partial” state):

  1. lvchange -ay {logical volume}, e.g.
    lvchange -ay /dev/vg1/data6
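
If plain activation refuses because of the missing device, newer LVM versions also accept an explicit activation mode (see lvchange(8) for the exact behaviour); a possible variant:

lvchange -ay --activationmode degraded /dev/vg1/data6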


Prepare the replacement drive

  1. cfdisk /dev/sdf (see above)
    pvcreate /dev/sdf1
  2. Add it to the volume group:
    vgextend {volume group} {physical volume}, e.g.
    vgextend vg1 /dev/sdf1

Show the current configuration:

pvscan / vgscan / vgdisplay {volume group}


Repair the RAID6 system

  1. lvconvert --repair {logical volume, e.g. /dev/vg1/data6} {physical volume, e.g. /dev/sdf1}, e.g.
    lvconvert --repair /dev/vg1/data6 /dev/sdf1
    • Confirm: Attempt to replace failed RAID images (requires full drive sync)?

This starts a background rebuild of the RAID6, which you can watch using:

lvs -o name,lv_health_status,sync_percent
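
To follow the resynchronisation continuously, you can wrap that in watch:

watch -n 5 'lvs -o name,lv_health_status,sync_percent'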


  2. vgreduce --removemissing {volume group}, e.g.
    vgreduce --removemissing vg1
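
Afterwards, the missing PV should be gone from the volume group and the logical volume should report a clean state again; to verify, for example:

vgdisplay vg1
lvs -a -o name,lv_health_status,devices vg1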

Congrats, after the rebuild you are back to a fully functioning RAID6 system.