I am using Apple’s software RAID in my Mac Pro, with a pair of 3TB Barracuda drives. Until now there had been no problems. Then I got this.
What “Failed” means is a mystery to me. Could it be some kind of data corruption, or a physical failure of the drive? It seems unlikely that a hard drive mechanism would just die suddenly; in my experience those things are very reliable these days. The SMART status showed no issues.
The first thing I did was verify which physical disk had failed. Disk Utility told me that the failed one was in Bay 4.
So I shut down the machine, pulled the drive out, and connected it to my MacBook Pro via my USB3 interface. The disk worked perfectly. I was even able to run Disk Utility on it and check for logical errors. There were none whatsoever.
Actually, I now think the proper way to do this would have been to demote the failed disk before pulling it out. That way there would be absolutely no confusion about which drive is which. When I accessed the drive from my MacBook, the RAID array showed up there as well, which was a bit scary. But I figured that since the disk was already marked “Failed”, it wouldn’t matter to the actual system; a mirror has to cope with a drive going totally dead anyway.
So I tried formatting the supposedly failed disk. If the disk really were bad, it might show up during formatting. Formatting went totally fine, just as you would expect with a normal working drive.
So I put it back in my Mac Pro and booted up. Disk Utility (DU) said that the drive was now missing, which is to be expected. I demoted the missing drive, dragged the freshly formatted one back into the RAID set, and simply clicked “Rebuild”. The rebuild takes about 7 hours. What’s really cool is that the diskutil appleraid list command in Terminal actually gives you the progress as a percentage, along with other useful information. Some people recommend doing all of this entirely from Terminal, which is fine and doesn’t look too complicated, but I wanted to test DU’s demote / rebuild buttons. They seem to work without problems.
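If you would rather not re-type the command for seven hours, the progress check can be scripted. This is only a minimal sketch: it assumes that a rebuilding member shows a percentage such as "14%" somewhere in the output of diskutil appleraid list, which I have not verified across OS X versions. The function names and the ten-minute interval are made up for illustration.

```python
import re
import subprocess
import time

# Assumption: a member that is rebuilding shows a percentage such as "14%"
# somewhere in the text printed by `diskutil appleraid list`.
PERCENT = re.compile(r"(\d{1,3})\s*%")


def rebuild_percentages(listing: str) -> list[int]:
    """Return every percentage found in the listing text, in order."""
    return [int(match.group(1)) for match in PERCENT.finditer(listing)]


def read_listing() -> str:
    """Run `diskutil appleraid list` (macOS only) and return its text output."""
    result = subprocess.run(
        ["diskutil", "appleraid", "list"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def watch(interval_seconds: int = 600) -> None:
    """Print the rebuild progress until no percentage is reported any more."""
    while True:
        found = rebuild_percentages(read_listing())
        if not found:
            print("No rebuild in progress (or the output format differs).")
            return
        print(f"Rebuild progress: {found[0]}%")
        time.sleep(interval_seconds)
```

Calling watch() from a Python prompt on the Mac Pro would print one line every ten minutes and stop once the percentage disappears from the listing.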
The good thing is that I can access and work with the data while the set is rebuilding. There’s no downtime. I have my Time Machine backup anyway, so I’m feeling quite carefree and gay. Still, it’s worth noting this date, December 22, in my calendar: should the data turn out to be corrupted somehow because I am a total amateur with RAID setups, I would know to recover from a backup made before this moment. My most important treasures, my present-day Lightroom catalog and image data, are also on another external drive which is normally disconnected. (I wrote a little shell script which uses rsync to update the disk whenever I plug it in, so it’s super convenient.)
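For anyone curious about the rsync part, here is roughly the idea, sketched in Python rather than as my actual shell script. The source folder and the volume name are hypothetical placeholders, and the only rsync option used is -a (archive mode). How the sync gets triggered on plug-in is left out; a launchd job using the StartOnMount key is one possible approach.

```python
import subprocess
from pathlib import Path

# Hypothetical paths, for illustration only.
SOURCE = Path.home() / "Pictures" / "Lightroom"
DEST_VOLUME = Path("/Volumes/PhotoBackup")


def rsync_command(source: Path, dest: Path) -> list[str]:
    """Build an rsync call that copies the contents of source into dest."""
    # -a preserves permissions, timestamps and symlinks. The trailing slash
    # on the source means "the contents of this directory".
    return ["rsync", "-a", f"{source}/", str(dest)]


def sync_if_mounted(source: Path = SOURCE, dest_volume: Path = DEST_VOLUME) -> bool:
    """Run rsync only when the external drive is mounted; return True if it ran."""
    if not dest_volume.is_dir():
        return False
    subprocess.run(rsync_command(source, dest_volume / source.name), check=True)
    return True
```

Checking for the mount point first matters: without it, rsync would happily copy everything into an ordinary folder on the boot disk when the external drive is not attached.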
A couple of things worth noting:
- OS X gave me no warning whatsoever of the failed RAID slice. I was lucky to discover the failed drive by accident while using DU.
- Compared to a non-RAID setup, the obvious benefit is that I can keep working with the data, non-stop, even while the RAID is rebuilding. If I were relying solely on the Time Machine backup, I would first need to restore the data in order to work with it. Restoring 2TB+ takes a very long time, especially since my backup drive is connected via FW800.
- I could add one more disk to the pack to act as a hot spare for extra safety. I am actually considering this, since Bay 3 is empty at the moment. If one disk failed, I would still have two good ones.
- One more obvious benefit of RAID mirroring is that the repair can wait; as long as I have an extra backup, there’s no immediate need to do anything. I can just order a replacement drive from Amazon and, when it arrives, swap out the bad one. Of course, this is only because I have Time Machine to rely on in a total catastrophe. This flexibility is a definite benefit of RAID.
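Since OS X stayed silent about the failed slice, a small periodic check would help. The sketch below simply scans the output of diskutil appleraid list for words that suggest trouble. "Failed" is the status I actually saw; the other words in the list are my guesses at what a problem state might be called, so treat them as assumptions.

```python
import subprocess

# "Failed" is the status reported in this post. The remaining words are
# assumptions about how other problem states might be labelled.
PROBLEM_WORDS = ("Failed", "Degraded", "Missing", "Offline")


def problem_lines(listing: str) -> list[str]:
    """Return the lines of the listing that mention a problem state."""
    return [
        line.strip()
        for line in listing.splitlines()
        if any(word.lower() in line.lower() for word in PROBLEM_WORDS)
    ]


def check_raid() -> list[str]:
    """Run `diskutil appleraid list` (macOS only) and return suspicious lines."""
    result = subprocess.run(
        ["diskutil", "appleraid", "list"],
        capture_output=True, text=True, check=True,
    )
    return problem_lines(result.stdout)
```

Run from a scheduled job, check_raid() returning a non-empty list would be the cue to send yourself a mail or notification instead of finding out by accident.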
It is an interesting question why the disk in Bay 4 was marked as failed. I suspect it’s just something that happens every now and then with things like Apple’s software RAID. If the same disk fails again, then it’s likely to be a hardware issue.