
Overview
In two previous posts it was demonstrated how to create a RAID 5 device with mdadm and format it with ext4. Using RAID 5 and ext4 together can increase the fault tolerance of your storage, but do not forget that good administrative practice is to test the failure process and become familiar with the recovery steps.

The steps that follow will show how to verify an array is healthy, simulate a failure, identify that a failure has occurred, and recover from it.

This guide was tested on a Debian 6.0 "Squeeze" system (Linux debian 2.6.32-5-amd64) with a RAID 5 array created out of five 5GB disks, one as a hot spare, with the ext4 filesystem as the primary partition. root access is required for the steps in this guide.

The steps outlined below increase the risk of data loss. Do not run them on a production system without fully understanding the process and testing in a development environment.

These instructions are not meant to be exhaustive and may not be appropriate for your environment. Always check with your hardware and software vendors for the appropriate steps to manage your infrastructure.

Formatting:
Instructions and information are detailed in black font with no decoration.
Code and shell text are in black font, gray background, and a dashed border.
Input is green.
Literal keys are enclosed in brackets such as [enter], [shift], and [ctrl+c].
Warnings are in red font.


Steps to [test] fail a RAID array and fix it
  1. Log in to your system and open a command prompt.

  2. Change your shell to run as root:
    user@debian~$: su -
    Password:
    root@debian~#:

  3. Check current configuration and status of the array:
    root@debian~#: cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid5 sde[5] sdf[4](S) sdd[2] sdc[1] sdb[0]
        15724032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

    unused devices: <none>
    root@debian~#: mdadm --detail /dev/md0 | grep dev
    /dev/md0:
    0    8    16    0    active sync  /dev/sdb
    1    8    32    1    active sync  /dev/sdc
    2    8    48    2    active sync  /dev/sdd
    5    8    64    3    active sync  /dev/sde
    4    8    80    -    spare  /dev/sdf
    The /proc/mdstat file shows that md0 is the active RAID5 device composed of 5 devices: sdb, sdc, sdd, sde and sdf. Device sdf is a hot spare. The end of the third line states that all 4 data disks are online and up with the [4/4] and [UUUU].

    Output from the mdadm command provides similar details to mdstat. The first column is the device number, the second and third are the device's major and minor numbers, the fourth is the RAID device number, then the status and the device name. The first four disks are all healthy because they are active and sync'd. The last disk is a hot spare and currently does not hold any data from the array.

    Everything gets a green checkmark, and with this information the array could be rebuilt in the event of a failure.
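    Rather than eyeballing the output every time, the health check can be scripted. Below is a minimal sketch that parses the [UUUU]-style status field from an mdstat-format line; the sample line is hard-coded for illustration, while on a live system you would read /proc/mdstat directly.

```shell
# A sample status line in the style of /proc/mdstat; on a live system you
# would read the real file, e.g.: grep -o '\[U*_*U*\]' /proc/mdstat
mdstat_line='15724032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]'

# Extract the per-disk status field: U = up, _ = down.
status=$(printf '%s\n' "$mdstat_line" | grep -o '\[U*_*U*\]' | tail -n 1)

case "$status" in
  *_*) echo "DEGRADED: $status" ;;   # at least one member disk is down
  *)   echo "healthy: $status" ;;
esac
```

    On a real system, `mdadm --detail --test /dev/md0` sets a nonzero exit status when the array is degraded, which is handy in cron jobs.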

  4. Mark a data disk as faulty:
    root@debian~#: mdadm --manage --set-faulty /dev/md0 /dev/sdc
    mdadm: set /dev/sdc faulty in /dev/md0
    root@debian~#:
    This mdadm command runs the application in manage mode, specifies /dev/md0 as the md device, and marks the target device /dev/sdc as faulty.
    The second line confirms the command has been processed.

  5. Check the status of the array and filesystem:
    root@debian~#: cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid5 sde[5] sdf[4] sdd[2] sdc[1](F) sdb[0]
        15724032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [U_UU]
        [>....................]  recovery = 2.6% (140032/5241344) finish=4.2min speed=20004K/sec

    unused devices: <none>
    root@debian~#:
    root@debian~#: mdadm --detail /dev/md0 | grep dev
    /dev/md0:
    0    8    16    0    active sync  /dev/sdb
    4    8    80    1    active rebuilding  /dev/sdf
    2    8    48    2    active sync  /dev/sdd
    5    8    64    3    active sync  /dev/sde
    1    8    32    -    faulty spare  /dev/sdc
    root@debian~#:
    root@debian~#: tail -n 21 /var/log/messages
    RAID5 conf printout:
    - - - rd:4 wd:3
    disk 0, o:1, dev:sdb
    disk 1, o:0, dev:sdc
    disk 2, o:1, dev:sdd
    disk 3, o:1, dev:sde
    RAID5 conf printout:
    - - - rd:4 wd:3
    disk 0, o:1, dev:sdb
    disk 2, o:1, dev:sdd
    disk 3, o:1, dev:sde
    RAID5 conf printout:
    - - - rd:4 wd:3
    disk 0, o:1, dev:sdb
    disk 1, o:1, dev:sdf
    disk 2, o:1, dev:sdd
    disk 3, o:1, dev:sde
    md: recovery of RAID array md0
    root@debian~#: df -h | grep md0
    /dev/md0p1    15G    176M    15G    2%    /data
    root@debian~#: ls /data
    lost+found  testfile1  testfile2
    The mdstat file shows that the sdc disk has failed and sdf is no longer marked as a spare. The [4/3] and the [U_UU] on the next line confirm that one of the four disks is not fully sync'd, and that it is the second disk that is down. The next line is new and shows the overall recovery progress: currently 2.6% complete, 140032 of 5241344 blocks rebuilt, an estimated completion time of 4.2 minutes, and the current I/O speed.

    mdadm --detail displays that the RAID disk 1 has been changed to /dev/sdf (it was /dev/sdc), and is currently rebuilding. The spare, which was /dev/sdf, has been changed to /dev/sdc and it is marked as faulty.

    The kernel has reported to syslog that changes have occurred with md0. The first grouping shows that in the original configuration a device is no longer online, dev:sdc. The second grouping shows the disk has been removed from the array while the last group shows the disk dev:sdf has been added and that it is online. The final line shows that md is currently recovering the RAID array md0.
    Not shown are the timestamps (they make everything ugly). It is interesting to note that on my machine, the time from when the kernel detected the error to when recovery started was about one-hundredth of a second: 0.011 seconds.

    Output from df and ls show that the filesystem is still mounted and usable.
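    The rebuild can be watched live with `watch -n 5 cat /proc/mdstat`. If you want the numbers alone (for logging, say), the recovery line can be parsed; this sketch uses a hard-coded sample of the line shown above rather than a live /proc/mdstat:

```shell
# Sample recovery line as it appears in /proc/mdstat during a rebuild.
recovery_line='[>....................]  recovery = 2.6% (140032/5241344) finish=4.2min speed=20004K/sec'

# Pull out the percent complete and the estimated minutes remaining.
pct=$(printf '%s\n' "$recovery_line" | sed -n 's/.*recovery = *\([0-9.]*\)%.*/\1/p')
eta=$(printf '%s\n' "$recovery_line" | sed -n 's/.*finish=\([0-9.]*\)min.*/\1/p')

echo "rebuild ${pct}% complete, about ${eta} minutes remaining"
```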

  6. Wait for the array to fully rebuild and verify the status:
    root@debian~#: cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid5 sde[5] sdf[4] sdd[2] sdc[1](F) sdb[0]
        15724032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

    unused devices: <none>
    root@debian~#:
    root@debian~#: mdadm --detail /dev/md0 | grep dev
    /dev/md0:
    0    8    16    0    active sync  /dev/sdb
    4    8    80    1    active sync  /dev/sdf
    2    8    48    2    active sync  /dev/sdd
    5    8    64    3    active sync  /dev/sde
    1    8    32    -    faulty spare  /dev/sdc
    root@debian~#:
    root@debian~#: tail -n 7 /var/log/messages
    md: md0 recovery done.
    RAID5 conf printout:
    - - - rd:4 wd:3
    disk 0, o:1, dev:sdb
    disk 1, o:1, dev:sdf
    disk 2, o:1, dev:sdd
    disk 3, o:1, dev:sde
    /proc/mdstat shows sdc is faulty, but the array itself is fully active and up, [4/4] and [UUUU].

    mdadm --detail reports that sdb, sdf, sdd, and sde are all active and sync'd while /dev/sdc is marked faulty and placed as a spare.

    The system messages report that md0 recovery is done, and the array is configured with 4 disks, sdb, sdf, sdd, and sde.

    At this time the array is fully functional and ready for use. It would be a good idea to replace the failed drive in case another disk in the array fails.
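    In production you do not want to discover a failed disk by accident. mdadm ships with a monitor mode that can email alerts when a device fails; a minimal configuration sketch follows (the mail address is a placeholder):

```shell
# In /etc/mdadm/mdadm.conf, set where alert mail is sent (placeholder address):
#   MAILADDR admin@example.com
# Then run the monitor as a daemon, checking the arrays every 60 seconds:
mdadm --monitor --scan --daemonise --delay=60
# On Debian the mdadm package normally starts this daemon for you; running
# "mdadm --monitor --scan --test" sends one test alert per array to verify
# that mail delivery works.
```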

  7. Remove the faulty drive:
    root@debian~#: mdadm --manage /dev/md0 --remove /dev/sdc
    mdadm: hot removed /dev/sdc from /dev/md0
    root@debian~#:
    mdadm was placed in manage mode, the target RAID device is /dev/md0, and it is told to remove /dev/sdc.

  8. Confirm the disk is removed:
    root@debian~#: cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid5 sde[5] sdf[4] sdd[2] sdb[0]
        15724032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

    unused devices: <none>
    root@debian~#:
    root@debian~#: mdadm --detail /dev/md0 | grep dev
    /dev/md0:
    0    8    16    0    active sync  /dev/sdb
    4    8    80    1    active sync  /dev/sdf
    2    8    48    2    active sync  /dev/sdd
    5    8    64    3    active sync  /dev/sde
    root@debian~#:
    root@debian~#: tail -n 2 /var/log/messages
    md: unbind<sdc>
    md: export_rdev(sdc)
    mdstat shows only four devices comprising the md0 device; none of them is /dev/sdc.

    mdadm also shows four devices for md0, none of them /dev/sdc, and there are no spare devices.

    /var/log/messages has two new entries from md: it unbound /dev/sdc and then removed the device.

    At this time it is safe to physically remove the faulty (or "faulty") hard drive if it is bare metal, or to detach it if it is a virtual machine.

  9. Attach a new disk:
    root@debian~#: mdadm --manage /dev/md0 --add /dev/sdc
    mdadm: re-added /dev/sdc
    root@debian~#:
    mdadm is run in manage mode with /dev/md0 as the target RAID device and is told to add /dev/sdc. The second line confirms the task is completed.

  10. Verify the new disk is available:
    root@debian~#: cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid5 sdc[1](S) sde[5] sdf[4] sdd[2] sdb[0]
        15724032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

    unused devices: <none>
    root@debian~#:
    root@debian~#: mdadm --detail /dev/md0 | grep dev
    /dev/md0:
    0    8    16    0    active sync  /dev/sdb
    4    8    80    1    active sync  /dev/sdf
    2    8    48    2    active sync  /dev/sdd
    5    8    64    3    active sync  /dev/sde
    1    8    32    3    spare  /dev/sdc
    root@debian~#: tail -n 1 /var/log/messages
    md: bind<sdc>
    root@debian~#:
    mdstat now lists /dev/sdc as a spare drive.

    The details of /dev/md0 with mdadm lists /dev/sdc as a spare drive.

    /var/log/messages has an entry from md that it binded /dev/sdc.

    The array itself remains untouched in this step, but mdadm now has /dev/sdc designated as a hot spare drive in the event another disk does fail.

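    To confirm the array is back to its original shape without reading the table by eye, the device states can be counted. This sketch counts over a hard-coded copy of the table above; on a live system you would pipe the output of `mdadm --detail /dev/md0` in instead:

```shell
# Device table in the style of `mdadm --detail /dev/md0` output (sample data).
detail='0 8 16 0 active sync /dev/sdb
4 8 80 1 active sync /dev/sdf
2 8 48 2 active sync /dev/sdd
5 8 64 3 active sync /dev/sde
1 8 32 3 spare /dev/sdc'

# Count healthy members and spares; a healthy 4-disk array with one hot
# spare should report "4 active, 1 spare".
active=$(printf '%s\n' "$detail" | grep -c 'active sync')
spares=$(printf '%s\n' "$detail" | grep -c ' spare ')
echo "${active} active, ${spares} spare"
```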
Conclusion
If you performed the steps in this guide you will have verified that a RAID array is healthy, broken it by manually failing a drive, monitored the rebuild process, and re-added a drive to the hot spare pool.

Testing and becoming familiar with the process to recover from failures can help you respond in a level-headed manner and increase the likelihood of successful problem resolution. You will also see how long the process takes, so you will know if something might be wrong when you have to perform these actions in a production environment.


Written by Eric Wamsley
Posted: July 28th, 2011 4:15pm.
Topic: breaking RAID5
Tags: testing RAID5 Debian Linux mdadm

