QRecall Community Forum

Diagnostic help needed
Forum Index » Problems and Bugs
Ralph Strauch


Joined: Oct 24, 2007
Messages: 194
I've been backing up both an MBP and an iMac to an alternating set of archives, keeping one offsite. My primary backup drive was getting old, so I decided to replace it in October with a new one. The new drive failed in December so I sent it back, and received a replacement on Monday, 2/2. I then copied my remaining archive (call it #2) over to the new drive (call it #3) to continue with my alternating backups. Copying was done using USB3 on the MBP. The copy looked fine, so I backed up the MBP to archive #3. That backup completed successfully. I then moved the drive over to the iMac (where it is generally mounted) and went to bed.

At 2015-02-03 01:00 the iMac attempted a scheduled backup, which failed. In the morning I saw that it had failed but didn't look closely at the log and reran it manually (starting at 2015-02-03 09:18 in the log). That capture also failed, apparently disastrously, so at 10:32 I ran a repair. The repair found large amounts of invalid data, and after it was over, all layers in the archive containing data from the iMac show up as damaged. I'm sending you a report from the iMac.

The most obvious potential source of this problem is my replacement drive, which Disk Utility shows as good. I don't know how much diagnostic work the manufacturer can do if I send it back, since I encrypt my backup drives and would rather not give them the key. Can you tell if this problem was caused by my replacement drive or by something else, and if not the drive, what I should do next? If it was caused by the drive, can you suggest anything I can pass on to the manufacturer?

Other possibly relevant info. The MBP backed up successfully to the replacement drive, and I assumed that took care of the potential identity problem I asked about in <http://forums.qrecall.com/posts/list/495.page>. The MBP is USB3 while the iMac is USB2, if that matters for any reason. Both machines have been routinely backed up to my other backup drive without issue, and the replacement should have been an exact copy of that drive.

Any light you can shed on this would be helpful.

Thanks, Ralph
James Bucanek


Joined: Feb 14, 2007
Messages: 1567
Ralph,

It's hard to tell what's going on, but here are a few thoughts.

Both failed captures failed in exactly the same way: the checksum of the record at file offset 992,333,119,488 was inconsistent with the data. The position was the same and the (bad) data was the same both times. This would indicate a permanent media failure: the data stored on that portion of the drive is incorrect, and remains incorrect after being repeatedly re-read.

Another type of error is a transient error, where the data is stored on the media correctly but gets randomly scrambled on its way to QRecall. This kind of error isn't repeatable.
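To illustrate the distinction between the two kinds of error, here is a minimal Python sketch that reads the same region of a file several times and compares the results. It is not how QRecall checks its records; the function names, the use of SHA-256, and the idea of supplying a known-good digest are all assumptions made for the example. Note also that the operating system may serve repeated reads from its cache, so a re-read does not necessarily hit the media again.

```python
import hashlib

def read_digest(path, offset, length):
    """Return the SHA-256 hex digest of `length` bytes starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return hashlib.sha256(f.read(length)).hexdigest()

def classify_region(path, offset, length, expected_digest, attempts=3):
    """Classify a region as 'ok', 'permanent', or 'transient' (hypothetical helper).

    'permanent': every read returns the same data, and it is wrong.
    'transient': the reads disagree with each other, or only some are wrong.
    """
    digests = [read_digest(path, offset, length) for _ in range(attempts)]
    if all(d == expected_digest for d in digests):
        return "ok"
    if len(set(digests)) == 1:
        # Identical bad data on every read points at what is stored on the media.
        return "permanent"
    # Results vary between reads, so the data is being altered in transit.
    return "transient"
```

A region that comes back wrong in the same way every time, as in the two failed captures, would be classified as permanent.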

When you ran the repair, you got a massive number of data corruption errors. Since you ran disk diagnostics on the volume, we can assume these are not the result of cross-linked files.

I conclude that either the drive is experiencing rampant media failures, or the archive data was scrambled while it was being duplicated. (This latter theory could be explained by transient errors, but you'd have to get a lot of them.) I would bet on the former, since you successfully captured data to the new archive immediately after the copy. It seems highly unlikely that the copy wouldn't result in any damaged records that belonged to the first computer, but munged thousands of records belonging to the second.

Regular disk diagnostics (i.e. Disk Utility) will only tell you if the volume directory structure is correct. There are extremely few utilities that will perform a surface test. A surface test writes a pattern of data to every sector on the drive, and then reads it back to make sure it's still correct. These tests can take hours, if not days, to complete.
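The write-then-read-back idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it exercises a scratch file on the mounted volume rather than every sector of the raw device, the names `pattern_block` and `write_and_verify` are made up for the example, and the read-back may be satisfied from the OS cache unless the drive is unmounted and remounted in between. A real surface-test utility works at a lower level.

```python
import os

def pattern_block(index, size):
    """Return a deterministic block whose contents depend on its position."""
    seed = index.to_bytes(8, "big")
    return (seed * (size // 8 + 1))[:size]

def write_and_verify(path, total_bytes, block_size=1 << 20):
    """Write a known pattern to `path`, read it back, and return the
    byte offsets of any blocks that did not survive the round trip."""
    blocks = total_bytes // block_size
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(i, block_size))
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the device
    bad_offsets = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(block_size) != pattern_block(i, block_size):
                bad_offsets.append(i * block_size)
    return bad_offsets
```

An empty list means every block read back as written; any offsets returned mark blocks worth re-testing.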

It's immaterial whether the data on your drive is encrypted or not. It only matters that it's written and read correctly. I don't need to know anything about the data to write it and look for any discrepancies when it's read back.

Having said that, you can easily perform a surface test yourself, since your archive is so large and would cover a significant portion of the volume. Erase the new drive and start over. Copy archive #2 to it again. After writing it, verify the archive or use the command-line cmp tool to perform a byte-by-byte comparison of the original and the copy. (Make sure you've placed your QRecall actions on hold so the original doesn't get modified during the copy.)
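For anyone who would rather script the comparison, a byte-by-byte check along the lines of what cmp does can be sketched in Python. The function name `first_difference` is hypothetical, and it compares two individual files; since an archive is a package containing several files, you would run it on each pair of corresponding files inside the original and the copy.

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the 0-based offset of the first differing byte,
    or None if the two files are identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                # Locate the exact byte within this chunk.
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file is shorter.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None  # both files ended at the same point
            offset += len(chunk_a)
```

A result of None for every file in the package means the copy matches the original; an offset tells you where the first bad byte sits.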

If this is successful, move on to trying to use the new archive again and note at what point (using it on the laptop, for example) the problems reappear.

If the comparison test fails, you've narrowed down the problem to either the drive or the busses (USB) you're using to transfer the data. Which is to say, you haven't really narrowed it down at all. The next step is to use a different interface. If the drive supports both FireWire and USB, switch to FireWire, or eSATA, or whatever you've got. It's a little geeky, but I keep a spare external drive enclosure around just for testing drives in an enclosure with an interface I trust.

If, after performing the copy and compare again using a different interface, you get the same kind of data corruption, my money would be on a bad drive. If switching interfaces cures the problem, then that's where you need to look next. It could be a motherboard issue with the computer or (more likely) the interface controller in the drive's enclosure.


- QRecall Development -
Ralph Strauch


Thanks for the suggestions. I'm back up and running now, but had one questionable event in the process. I wiped drive 3 (the new drive) and copied and renamed the archive from drive 2, this time deleting the status.plist to deal with the identity issue. I then mounted drive 3 on the iMac and went to bed, allowing the scheduled overnight backup to run. When I got up this morning it had run successfully.

I then attempted to back up the MBP over wifi (my normal backup procedure). This failed three times in a row, reporting "cannot open negative hash map." I dismounted the drive from the iMac and remounted it directly on the MBP, and the backup ran successfully. I then moved the drive back to the iMac and ran another MBP backup over the network, which was also successful.

I'm running a surface scan now using TechTool Pro, which is reporting no problems. I'm sending you a report from the MBP FYI in case the "negative hash map" reports are of value to you. At this point I'm back to normal and happy, so I don't need any more of your time on this.

Thanks again for the superb support you provide. It's one of the best features of this great app.
James Bucanek


Ralph,

The error I'm seeing in the log for the negative hash map is 13 (Permission denied). You might want to check the permissions and ownership of the archive package and files to make sure the MacBook Pro user has sufficient rights. Also, did you previously set the "Ignore ownership" option on that volume when mounted over the network? If so, you might want to check that as reformatting the drive can cause the OS to forget that setting.
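One way to check the permissions in bulk is a short Python sketch that walks the archive package and lists anything the current user cannot both read and write. The function name `find_inaccessible` is hypothetical, and the check reflects only the user running the script, so it should be run as the MacBook Pro user against the volume as that user sees it mounted.

```python
import os

def find_inaccessible(package_path):
    """Return paths in the package (including the package itself) that the
    current user cannot both read and write."""
    problems = []
    if not os.access(package_path, os.R_OK | os.W_OK):
        problems.append(package_path)
    for root, dirs, files in os.walk(package_path):
        for name in dirs + files:
            full = os.path.join(root, name)
            if not os.access(full, os.R_OK | os.W_OK):
                problems.append(full)
    return problems
```

An empty list suggests the permissions are adequate for that user; any paths returned are candidates for the error 13 seen in the log.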

I'm glad to hear it's not your drive.

- QRecall Development -
Ralph Strauch


I did forget to check "ignore ownership" when I mounted the new drive, so that probably was the problem. It all seems to be working OK now, though. Thanks.
 