QRecall Community Forum

Repair failed with POSIX error 5 - should I give up on this archive?
Kurt Liebezeit


Joined: Nov 28, 2014
Messages: 5
Hi James,

My tale of woe (small woe, really):

I had been doing weekly backups with QRecall to a ZFS dataset on a single 4 TB drive in an OWC USB 3 enclosure. Everything seemed to be working for some months. Then I decided to make some changes to my system to eliminate spinning drives in the Mac Pro. Since I was losing redundancy in the data source, I wanted to add hardware redundancy in the backup to compensate (sorta). I bought another 4 TB drive and an OWC dual enclosure (eSATA/USB 3), put the new and old backup disks in it, and set it up as independent disks. After some fooling around with ZFS I was able to convert the single drive to a ZFS mirror. During the resilver process (which copies data from the original backup drive to the redundant backup drive, creating the mirror) I got a message saying that one file had an error: /Volumes/ELITE/QRECALL_BACKUP/QRecall eRaid archive.quanta/repository.data.

That didn't sound good, so I ran a QRecall verify, which failed, and a repair, which also failed. It looked like there were a lot of POSIX error 5 problems, which probably means the disk or the enclosure is going south. The odd thing is that I have DriveDX, which checks SMART data, and the drive's SMART data shows no hint of a problem: not a single re-allocated sector or unrecoverable read failure, even though plenty of parameters are reported. Unfortunately, for some reason a self test is not an option for this disk; I might try moving it to another enclosure to see if that option shows up in DriveDX.

I've sent a diagnostic report to Technical Support. I'm wondering if it is worth more effort to try to recover this archive (it's all backup data, after all, and the source is currently still intact). If the worst comes to pass I just pull the bad disk, buy another, resilver the ZFS mirror again, and start fresh.

Thanks for your advice.

Kurt
Kurt Liebezeit


Joined: Nov 28, 2014
Messages: 5
After thinking a bit more, I wonder if this was a ZFS filesystem error. Seems unlikely given that the log seemed to have had a lot of entries that looked like errors that recovered on retry, but I'm still bothered by the pristine SMART data. I wonder if it would be possible to surface the hardware error by doing a Unix dd command read, sending data to /dev/null?
James Bucanek


Joined: Feb 14, 2007
Messages: 1568
Kurt Liebezeit wrote: After thinking a bit more, I wonder if this was a ZFS filesystem error.

It could be. I don't have much experience with ZFS, but on the Mac native filesystems, a corrupt volume structure can present itself as I/O errors, even when there's nothing physically wrong with the drive. (That's why the Repair command prompts you to repair the volume directory structure on the archive volume before proceeding.)

Kurt Liebezeit wrote: Seems unlikely given that the log seemed to have had a lot of entries that looked like errors that recovered on retry, but I'm still bothered by the pristine SMART data.

SMART drives have never been very smart. SMART really is little more than a guess about the drive's health, and I've had drives declared to be in perfect condition by SMART only to die completely the next day. So a SMART warning should be taken as a sign that something is wrong, but a clean SMART status doesn't mean your drive has years of error-free service ahead of it.

Kurt Liebezeit wrote: I wonder if it would be possible to surface the hardware error by doing a Unix dd command read, sending data to /dev/null?

Yes, that will at least perform a read on the entire surface. Note that this is effectively what a QRecall verify action does; ditto for repair.
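For anyone who wants to try the same kind of read-and-discard pass on a single file (such as the archive's repository.data) and see roughly where the failures land, here is a minimal Python sketch. It is not part of QRecall and is not how verify is implemented; the function name scan_file and the 1 MiB chunk size are just choices made for this illustration. It assumes a regular file on a Unix-like system, and it treats only errno 5 (EIO, the "POSIX error 5" in the logs) as a bad chunk. Keep in mind that some reads may be satisfied from a cache rather than the disk, so a clean pass is not proof of a healthy drive.

```python
import errno
import os
import sys


def scan_file(path, chunk_size=1024 * 1024):
    """Read a regular file front to back, discarding the data.

    Returns (bytes_read, bad_offsets), where bad_offsets lists the
    starting offset of every chunk whose read failed with EIO
    (POSIX error 5). Any other error is re-raised.
    """
    bad_offsets = []
    bytes_read = 0
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            try:
                data = os.pread(fd, chunk_size, offset)
            except OSError as exc:
                if exc.errno != errno.EIO:
                    raise
                # Note the failing chunk and skip past it instead of stopping.
                bad_offsets.append(offset)
                offset += chunk_size
                continue
            if not data:
                break  # file shrank while we were reading it
            bytes_read += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_read, bad_offsets


if __name__ == "__main__":
    total, bad = scan_file(sys.argv[1])
    print(f"read {total} bytes, {len(bad)} chunk(s) failed with EIO")
    for off in bad:
        print(f"  I/O error in chunk starting at byte {off}")
```

Run it with the path to the file as the only argument. Because it skips a failed chunk rather than aborting, it keeps going after the first error and gives a rough map of which regions of the file are unreadable.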

There are a (precious) few disk utilities that will perform a read/write surface test of your drive. The last one I used is DiskTester from the diglloyd Tools suite. There are so few disk utilities that will actually perform a surface test that I've considered adding one as a utility to QRecall.

I did get your logs, but they're basically 40MB worth of "I/O error reading repository.data file" messages, which you already knew.

- QRecall Development -