RAID Advice
Posted by Andy Neil
Hey Everyone,
Last year (around Sept), I set up a RAID 5 for someone. I used an ATTO ExpressSAS RAID card and a 4-drive chassis with four 2TB Hitachi 7200 RPM drives, RAIDed together for about 5.6TB. It worked great at first and seemed very solid. Then in December, the RAID unmounted and wouldn't re-mount. Using the ATTO RAID config software, I was able to determine that there didn't seem to be any real problem with the drives. They showed up in the config tool, but the group wouldn't mount. After talking with tech support, I was able to force-mount the drives, copy the media over to a backup drive and rebuild the group.
But then only a couple of months later, the RAID spontaneously unmounted again. Again I got into it with ATTO tech support. They confirmed via logs that there's nothing really wrong with the drives. According to them, the problem lies with the nature of using desktop drives in a RAID configuration. Apparently, when a desktop drive comes across a sector it has trouble reading, it'll make a "heroic" attempt to re-read the sector, tying up access to the drive until it recovers the information. This can take as much as a couple of minutes depending on the severity, but most times it's just seconds. Their RAID controller, however, has an extremely low tolerance for a drive that doesn't respond to it, and after about 14 seconds it will unmount the volume, causing the group to go offline. This doesn't happen with Enterprise drives because, apparently, their recovery tactics for hard-to-read sectors are different and they are never out of contact with the controller for more than 7-10 seconds.
But there are a LOT of RAID configs on the market that are using desktop drives in place of Enterprise drives, seemingly without the troubles I'm having. My question (finally) is: Is there a RAID controller out there that I should replace my ExpressSAS with that is more forgiving to desktop drives? Those of you with RAIDs, what configs are working for you?
When I initially put this RAID together, I had intended to get a CalDigit HDOne or HDPro2 RAID, but they were back-ordered and there was a time element in setting up the system. It looks like CalDigit RAIDs use desktop drives, so I feel like there is a solution out there that won't require scrapping the entire set-up. Thoughts?
Andy
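To make the timeout mismatch described above concrete, here is a minimal Python sketch. It is only a toy model, not anything from ATTO: the function name `drives_that_time_out` and the recovery times are made up for illustration, and the 14-second figure is simply the number quoted in the post.

```python
# Toy model (hypothetical): a controller drops a drive, and with it the RAID
# group, when one command goes unanswered for longer than its command timeout.

def drives_that_time_out(recovery_seconds, controller_timeout_s):
    """Return the indices of drives whose error recovery outlasts the timeout."""
    return [i for i, t in enumerate(recovery_seconds) if t > controller_timeout_s]

# Illustrative numbers only: three drives answer quickly, while one desktop
# drive spends 90 s on a "heroic" re-read of a weak sector.
recovery = [0.02, 0.02, 90.0, 0.02]

# With a 14 s timeout the slow drive is dropped; with a 120 s timeout it is not.
print(drives_that_time_out(recovery, controller_timeout_s=14))
print(drives_that_time_out(recovery, controller_timeout_s=120))
```

The point of the sketch is that nothing has to be physically wrong with the drive: the group goes offline purely because the drive's recovery time and the controller's patience don't match, which is why either a longer controller timeout or drives with shorter recovery could plausibly help.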
Are you using an internal or external RAID setup?
Shane had his popsicle RAID running on Hitachis. I've been working off largely Seagate Barracuda drives on external RAID configurations, and I haven't had any issues with it, aside from a firewire chip issue, but I got that replaced. Many of us use desktop class drives in our RAIDs, mainly because enterprise level drives are so expensive. You may have to swap a RAID controller. Someone like Jon Shilling may be able to share more about it.
www.strypesinpost.com
Yeah, I don't think buying Enterprise level drives is possible due to their expense. I'm using an external RAID setup.
So is your RAID a firewire RAID? I don't think I could build anything slower than a SATA RAID because this was created for a RED film. We're offlining it in 1080p ProRes, but I need it to be able to handle 2K files in the coloring stage. I'm not familiar with the term popsicle RAID. I was going to ask what it was, but decided to google it. ROFL! That was awesome. I'm waiting to hear back from ATTO regarding some setting changes to the card that MAY help (adjusting the amount of time the controller will retry an unresponsive drive before unmounting). But if their solution doesn't work, I need to know what I can go to after. I'd like to stay SAS if possible, but if all the RAID SAS controllers are as finicky as this one, I might need to change to an eSATA RAID. A friend of mine uses a RocketRAID and says he has no problems either.
Andy
That RAID served me well. But I am VERY glad I didn't rely on it for long. I have since taken the better route of spending the extra cash to get a more secure RAID: a CalDigit HDOne. Although now I need it to be larger than the 2TB I have.
But building that Popsicle RAID was fun...
www.shanerosseditor.com Listen to THE EDIT BAY Podcast on iTunes [itunes.apple.com]
Popsicle RAID:
Nobody knows how important it is to choose the right popsicle sticks. You need to make sure you get the right tension on the stick when you bite into the ice cream. Too loose and the RAID will fall apart, literally. So here's Shane testing the hardware: [lfhd.net]
www.strypesinpost.com
>So is your RAID a firewire RAID?
The one I'm on at home is a FireWire RAID 1. I was working off a FW800 RAID 5 on a show, which got whittled down to USB RAID 5 when the FireWire chip short-circuited, but that was sorted out in the end. I'm not sure if FW800 will be able to handle ProRes HQ at 2K, but I think it actually can. Ideally you'll be on eSATA or Fibre. I've had Seagates on FC RAID 5, as well as Hitachis on another FC RAID. No issues on them.
www.strypesinpost.com
I've been testing a HighPoint 3522 and a 4322 alongside an Areca 1680x. They all work well (if you get the correct firmware) with all of the drives I've tested from all manufacturers (never Maxtor), both desktop and enterprise class. I've not tested SSDs though.
The RAID 5 I have with Samsung Spinpoint F1 drives on the 3522 is about 2 years old and rock solid, except for the odd directory fix with DiskWarrior when I've accidentally turned off the RAID without unmounting and using the RAID management software to remove the RAID. My suggestion would be to run the Disk-tester from Lloyd Chambers [macperformanceguide.com] on each of the disks and thoroughly test them all - it's likely one of them has a fault causing all of the RAID to fail. It's also worth checking all the HDDs have the same firmware installed - check which is the latest from Hitachi. If the RAID worked well and the problem is intermittent, maybe it's a setting in the management software? However, if you want to avoid the hassle of testing and testing, get the new HDPro2 - at 800MBps over 8 HDDs it's pretty damn good and with less to worry about.
For instant answers to more than one hundred common FCP questions, check out the LAFCPUG FAQ Wiki here : [www.lafcpug.org]
UPDATE from ATTO:
Tech support suggests increasing the command timeout for the controller to accommodate the desktop drives better. The logs suggest that nothing is wrong with the drives themselves (though I think I'll double check that with your suggestion, Ben), but the controller is too sensitive to the call/response times of desktop drives versus enterprise drives. They've even said that they're working on a firmware update that will "better accommodate a wider range of drive classes and compensate better for command timeouts and retries." Thanks, Ben, for the controller info. I'll keep them in mind if these suggestions from ATTO don't pan out.
Andy
I think eating all those popsicles is what made me fat. Finally lost most of that popsicle weight.
www.shanerosseditor.com Listen to THE EDIT BAY Podcast on iTunes [itunes.apple.com]
Was it Wall's? Lol. Someday I'll make my own popsicle RAID running off a Firewire chip.
www.strypesinpost.com
I don't think so. I test all my drives in the RAIDs as JBODs first to check each individual HDD, or I do it in an external case via eSATA or FW800, or occasionally internally in the Mac - especially if I need to update firmware using something like FreeDOS on a CD-ROM. Once that is done I RAID them as a single volume - usually RAID 5 or 6, but if it's a critical project I might opt for RAID 10. I've written an article for the SuperMag about BYO RAID, but it's not really for people who already build their own. It will be out hopefully in time for the NAB SuperMeet.
For instant answers to more than one hundred common FCP questions, check out the LAFCPUG FAQ Wiki here : [www.lafcpug.org]
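For anyone weighing up the RAID 5, 6 and 10 options mentioned here, the usable-capacity arithmetic can be sketched roughly as below. This is a simplified sketch that assumes equal-sized drives and ignores controller metadata and filesystem overhead; the helper names `usable_tb` and `tb_to_tib` are made up for the example.

```python
def usable_tb(level, drives, size_tb):
    """Usable capacity in decimal TB for equal-sized drives (overheads ignored)."""
    if level == 5 and drives >= 3:
        return (drives - 1) * size_tb      # one drive's worth of parity
    if level == 6 and drives >= 4:
        return (drives - 2) * size_tb      # two drives' worth of parity
    if level == 10 and drives >= 4 and drives % 2 == 0:
        return (drives // 2) * size_tb     # every drive is mirrored
    raise ValueError("unsupported level or drive count")

def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40

# Example: a four-bay chassis with 2TB drives, like the one discussed in this thread.
for level in (5, 6, 10):
    tb = usable_tb(level, drives=4, size_tb=2.0)
    print(f"RAID {level}: {tb:.1f} TB ({tb_to_tib(tb):.2f} TiB)")
```

The second figure is there because drive makers count in powers of ten while many operating systems report in powers of two, so the mounted volume usually shows a smaller number than the label arithmetic suggests.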
I think not - but let me know what Lloyd says as that would be useful - however due to the way many Hardware RAIDs are formed it is highly unlikely.
You would need to break the RAID to test individual disks properly. It was more aimed at Andy who would need to make a new RAID or reformat his old one anyway.
For instant answers to more than one hundred common FCP questions, check out the LAFCPUG FAQ Wiki here : [www.lafcpug.org]
Like Ben, I have been running an 8-drive RAID (as a RAID 5) for a couple of years and it has been rock solid. I use the HighPoint 3522 RAID controller card. You have to get a couple of miniSAS cables and a box that has a SAS controller in it. The protocol this HighPoint card uses is about as fast as it gets short of going to a fiber link solution. The price jump to fiber is about double the cost (or more), last time I checked.
The ATTO R380 is a SAS controller. I've already got all the components necessary, and when the RAID is up and running, I've been very happy with its speed and performance. I'm glad to hear from another proponent of the HighPoint 3522 card. I believe I looked into that card when doing my initial due diligence. If the fixes don't work with the ATTO, I may purchase the HighPoint and swap it out.
Thanks, Andy
Hi Andy! I dunno if you solved this, as I WAS just having the same "command timeout" issue from the R380 for the past 2 weeks now with enterprise-level Hitachi Ultrastar A7K2000 drives (HUA722020ALA330) and the ATTO ExpressSAS R380 with the Sept 2010 firmware and driver. (Driving me mad.) My config may be similar to yours.
These "Command Timeouts" appear only on WRITE activity, followed by ATTO retries, then failure... on random RBAs, btw (can rule out a bad patch or area on the disk(s)). Some info that might help diagnose it a bit.
My Config
History:
Issues with WD20EARS greenie disks (this was enlightening!)
The Findings (easy to find via google)
As fate would have it, I bumped into the ATTO guys at IBC. Had a chat and asked the lads on the booth in Hall 7 if they had a magic CLI config command to alter the command timeout on the R380 so it would wait around much longer for the interrupts for an I/O... and yes, there is. You need to contact them to get it, as they will have to expunge me from the earth if I say so (although the SONNET manual has this command documented; you can look in that PDF online). I believe the default for the R380 was 5000 ms (5 secs). You can use this magic CLI command in the configcli to increase it to as much as 10 minutes! Anyway, I had some limited success with this for a few days, but in the end the constant offlining of the RAID set was too much to bear and I swapped out the drives for Hitachis (2.5 times the cost of the WD cheapies) as above.
FIXED?
Summary:
HTH someone. Will report back. (Hmm... still working OK on dev system...)