Linux software RAID

Posted by: RobotCaleb

Linux software RAID - 25/01/2012 14:56

I have 5 drives in a RAID 5 array. I am swapping them all out for new drives. Rather than waiting for the long rebuild time incurred by adding each drive, I'm curious as to whether I can just clone each drive using dd. I don't know exactly how mdadm knows which drives belong in the array and whether simply cloning a drive will allow it to exist in the array without a rebuild.

I suspect cloning each drive will be faster than rebuilding each drive. Faster means less uptime for the drives being replaced, which is ideal.
Posted by: TigerJimmy

Re: Linux software RAID - 25/01/2012 18:30

Why not create the new array with the 5 new drives, leaving the existing array running, then copy the filesystem over? Then mount the new array at the old mount point.
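A sketch of that approach, with hypothetical device names (/dev/sd[g-k] for the new drives, /dev/md1 for the new array, /storage for the mount point) and an assumed ext4 filesystem:

```shell
# Build the new array on the new disks while the old one keeps running.
# All device names and paths here are placeholders -- substitute your own.
mdadm --create /dev/md1 --level=5 --raid-devices=5 /dev/sd[g-k]
mkfs.ext4 /dev/md1

# Copy the data across, preserving hardlinks, ACLs and xattrs
mount /dev/md1 /mnt/newarray
rsync -aHAX /storage/ /mnt/newarray/

# Once the copy is verified, swap the mount points
umount /storage
mount /dev/md1 /storage
```

The catch is that this needs enough ports to run both arrays at once.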
Posted by: RobotCaleb

Re: Linux software RAID - 25/01/2012 18:46

I only have one box in which to do this work. It only has six SATA slots, one of which is currently taken by the OS drive.
Posted by: Roger

Re: Linux software RAID - 26/01/2012 05:33

Originally Posted By: RobotCaleb
I only have one box in which to do this work. It only has six SATA slots, one of which is currently taken by the OS drive.


You can just restore from backups, right?
Posted by: LittleBlueThing

Re: Linux software RAID - 26/01/2012 06:55

You can clone the drives with dd - I've done it a lot and it's the preferred solution. Instead of dd, use GNU ddrescue (not dd_rescue), as it's just as fast, reports progress and, should errors occur, will both retry them and bisect the error areas.

If you *do* get errors, you need to know which blocks were not copied, and you can tell md to resync just those blocks.
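A minimal sketch of the cloning step, assuming hypothetical names (/dev/sdb is the old disk, /dev/sdg its replacement, /dev/md0 the array); the ddrescue map file is what tells you which blocks, if any, failed to copy:

```shell
# Stop the array so the source disk is quiescent during the copy
mdadm --stop /dev/md0

# -f is required to overwrite a block device; the map file records
# progress and any unreadable areas
ddrescue -f /dev/sdb /dev/sdg /root/sdb.map

# If the map shows bad areas, md's sysfs interface can resync just that
# window once the array is reassembled (sector numbers below are
# placeholders -- take the real ones from the map file):
echo 1000000 > /sys/block/md0/md/sync_min
echo 1002048 > /sys/block/md0/md/sync_max
echo repair  > /sys/block/md0/md/sync_action
```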

Why are you replacing them? Do you have SMART errors, or have you been having problems? I ask because you *may* find that newer disks are less reliable than older ones - I get the impression that consumer-grade disks run closer to tolerance nowadays as the margins get thinner. No, I have no stats to reference :)

Having said that, I'd think about getting larger disks and finding a way to use RAID6.

When a drive fails in a RAID5, you remove it and add a spare. The rebuild then relies on a full read of every bit on the remaining 4 disks - if even one block fails you are SOL and relying on fsck :)
Posted by: Roger

Re: Linux software RAID - 26/01/2012 08:00

Originally Posted By: LittleBlueThing
Having said that, I'd think about getting larger disks and finding a way to use RAID6.


If you've got a machine with only 5 (or, particularly in my case, 4) disks, RAID6 starts to look a lot like RAID10...
Posted by: LittleBlueThing

Re: Linux software RAID - 26/01/2012 08:35

Yeah, I nearly said that :)

And I'd use the system slot for an extra big disk, partition it up, and run the OS from an initrd-assembled RAID1.

If you have 6 slots then a 2-way mirror of the rootfs, a 2-way mirror of rootfs-backup and a 2-way mirror for swap (just because you can).
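As a sketch of that layout, with two large disks (/dev/sda and /dev/sdb, hypothetical) each carved into three partitions and paired into 2-way mirrors:

```shell
# Three RAID1 pairs from matching partitions on two disks (names assumed)
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1  # rootfs
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2  # rootfs-backup
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3  # swap
mkswap /dev/md2 && swapon /dev/md2
```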
Posted by: andy

Re: Linux software RAID - 26/01/2012 09:04

Is there any actual benefit with RAID10 over just running a RAID1 array with four devices in it?
Posted by: wfaulk

Re: Linux software RAID - 26/01/2012 14:34

If you have four disks and you're mirroring them, unless you're talking about a 4-way mirror (triply redundant), you have to have some RAID0 in there somewhere, so it's either going to be RAID 0+1 or RAID 1+0. With 4 disks, it probably doesn't make a lot of difference.

Well, I suppose you could just have a mirrored set of pairs of concatenated disks, but concatenation isn't common any more. If you are talking about that, then, yes, adding the stripe in there is going to help performance.
Posted by: andy

Re: Linux software RAID - 26/01/2012 15:07

I was talking about a RAID1 array with 4 discs, as you say triply redundant. I was wondering why you wouldn't just do that rather than RAID10.

With a plain RAID1 array you gain the ability to take out one disc and rebuild the machine elsewhere at any time with zero hassle.
Posted by: RobotCaleb

Re: Linux software RAID - 26/01/2012 15:35

Originally Posted By: Roger
You can just restore from backups, right?


The data that I care about backing up is backed up, yes. However, that only represents about 3% of the total storage used. Sure, I could do that, but it would be nice to keep the less important data too.
Posted by: RobotCaleb

Re: Linux software RAID - 26/01/2012 15:42

Originally Posted By: LittleBlueThing
You can clone the drives with dd - I've done it a lot and it's the preferred solution. Instead of dd, use GNU ddrescue (not dd_rescue), as it's just as fast, reports progress and, should errors occur, will both retry them and bisect the error areas.

If you *do* get errors, you need to know which blocks were not copied, and you can tell md to resync just those blocks.

Do I need to do anything special after cloning one drive to a new drive to get it recognized as being part of the array?

Originally Posted By: LittleBlueThing
Why are you replacing them? Do you have SMART errors, or have you been having problems? I ask because you *may* find that newer disks are less reliable than older ones - I get the impression that consumer-grade disks run closer to tolerance nowadays as the margins get thinner. No, I have no stats to reference :)

They are all throwing the same SMART error, which appears to indicate that a SMART log write was aborted. I have RMAed the drives and will have the replacements at my door this afternoon. The error being thrown by all of the disks is below.

Code:
Error 3 occurred at disk power-on lifetime: 10238 hours (426 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 51 01 00 00 00 a0  Error: ABRT

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ---------------  --------------------
b0 d6 01 e0 4f c2 a0 02     00:03:21.785  SMART WRITE LOG
80 45 01 01 44 57 a0 02     00:03:21.761  [VENDOR SPECIFIC]
ec ff 01 01 00 00 a0 02     00:03:20.742  IDENTIFY DEVICE


Originally Posted By: LittleBlueThing
Having said that, I'd think about getting larger disks and finding a way to use RAID6.

When a drive fails in a RAID5, you remove it and add a spare. The rebuild then relies on a full read of every bit on the remaining 4 disks - if even one block fails you are SOL and relying on fsck :)

The drives are about as big as I want them to get. Any bigger and a failure starts to hurt more, both in lost data and in recovery time. I have five 2TB drives that I want to get duplicated out. I have rebuilt one and would like to clone the rest so as to tax the existing drives less and finish sooner.

I do have a sixth drive available and would like to expand the array to RAID 6 after the swap has finished.
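For that step, mdadm can reshape a running RAID5 into RAID6 once the sixth disk is added; a sketch, assuming the array is /dev/md0 and the new disk /dev/sdg:

```shell
# Add the sixth disk as a spare, then reshape to RAID6 across all six.
# A backup file on a separate device protects the critical section.
mdadm --add /dev/md0 /dev/sdg
mdadm --grow /dev/md0 --level=6 --raid-devices=6 --backup-file=/root/md0-grow.bak

# The reshape runs in the background; monitor with
cat /proc/mdstat
```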
Posted by: wfaulk

Re: Linux software RAID - 26/01/2012 17:52

Originally Posted By: andy
I was talking about a RAID1 array with 4 discs, as you say triply redundant. I was wondering why you wouldn't just do that rather than RAID10.

To have 2n storage space instead of n.
Posted by: andy

Re: Linux software RAID - 26/01/2012 21:17

Ah yes, good point. I completely missed that.

Luckily my storage needs don't exceed 1.5TB, so I'll stick to 3 drives in a RAID1 array.
Posted by: LittleBlueThing

Re: Linux software RAID - 27/01/2012 09:32

Originally Posted By: RobotCaleb

Do I need to do anything special after cloning one drive to a new drive to get it recognized as being part of the array?

No. As long as you shut down, clone, replace, and reboot, you'll be fine. The idea is to avoid entering a degraded state.
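The per-disk cycle might look like this sketch (device names are placeholders). md recognizes array members by the superblock written on each drive, and a bit-for-bit clone carries that superblock across, so the new disk slots in without a rebuild:

```shell
# With the array stopped (or the box shut down entirely), clone old -> new:
mdadm --stop /dev/md0
ddrescue -f /dev/sd_old /dev/sd_new /root/clone.map

# After physically swapping in the new disk and rebooting, check the
# superblock and confirm the array assembles clean rather than degraded:
mdadm --examine /dev/sd_new
cat /proc/mdstat
```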