QRecall Community Forum

Frequent reindexing and rebuilding RSS feed
Forum Index » Problems and Bugs
Steven Arnold


Joined: May 15, 2009
Messages: 4
Offline
First of all, I love QRecall as a product. It's very sophisticated and allows me to back up to encrypted disk images, which is in fact what I do. (Bear in mind that this may be part of the issue: my archives reside on large encrypted disk images.) I have version 1.1 (1.1.0.42). If I should be trying a later version, let me know. The updater says I have the latest version.

My problem comes because of one particular archive, my music and video archive. I have about 50,000 music files and 4500 video files. The archive I use for this tends to work correctly for a couple weeks, but then the index becomes corrupted. QRecall attempts to reindex the archive, but this inevitably fails, and it then has to rebuild the archive. I am not sure what the difference is, but both processes take about equally long -- and a very long time.

I could separate this into two separate archives, one for video and one for music. But in terms of pure number of files, I probably add the same number of files to both sets per unit time, so I am concerned that, e.g. if the problem is related to the sheer number of files in the archive, maybe the video archive would be OK, but I'd encounter the same problem with the music archive after a couple weeks of updates.

I don't really think it's about number of files, though, because I have a home directory archive that includes a source code directory and has nearly 300,000 files, and my documents directory, a separate archive, contains nearly that many again. So maybe it's about the size of the archives. Music and video is 781 GB; documents is 345 GB; home directory is 127.6 GB.

Any suggestions on what I can do to fix, work around, or help the debugging process for this issue?
James Bucanek


Joined: Feb 14, 2007
Messages: 1572
Offline
Steven,

Thanks for posting. I'm glad that you've found QRecall useful, despite the occasional problem.

I have two questions. First, what exactly is the problem that's causing your archiving to require repair? The answer should be in your log files, which I'd like to look at. You can send a diagnostic report or just attach/e-mail your recent log files (~/Library/Logs/QRecall). The failures that would require a repair are typically data corruption problems.

Which leads me to my second question. Is this archive on the same volume and/or physical drive as your other archives?

- QRecall Development -
Steven Arnold


Joined: May 15, 2009
Messages: 4
Offline
Hi James, I was not notified of your reply on the forum, but I did receive your private message, and I have successfully submitted my report to the QRecall Report Server using the "Send Report" mechanism.

Also, in answer to your question, yes, all the archives are on a single encrypted disk image.

steven
James Bucanek


Joined: Feb 14, 2007
Messages: 1572
Offline
Steven,

Thanks for sending a diagnostic report.

I've looked at your log files, and you're encountering a consistent problem. Your archive occasionally encounters an I/O error when QRecall tries to close the archive. This happened on 4-30, and again on 5-14:

2009-04-30 07:19:38.672 -0400 Failure Problem closing archive

2009-04-30 07:19:38.672 -0400 Details could not set file size
2009-04-30 07:19:38.672 -0400 #debug# IO exception
2009-04-30 07:19:38.672 -0400 Details Path: /Volumes/encrypted_backups 2/music_and_video.quanta/hash.index
2009-04-30 07:19:38.672 -0400 #debug# OSErr: -36
2009-04-30 07:19:38.672 -0400 Details Pos: 3221225528


2009-05-14 07:19:49.005 -0400 Failure Problem closing archive

2009-05-14 07:19:49.005 -0400 Details could not set file size
2009-05-14 07:19:49.005 -0400 #debug# IO exception
2009-05-14 07:19:49.005 -0400 Details Path: /Volumes/encrypted_backups 2/music_and_video.quanta/hash.index
2009-05-14 07:19:49.005 -0400 #debug# OSErr: -36
2009-05-14 07:19:49.005 -0400 Details Pos: 3221225528

Once an error like this occurs, the archive is considered suspect until it can be reindexed or repaired. Until that happens, QRecall will refuse to use it.

The problem (i.e. "curse") of QRecall is that it's absolutely fastidious about data integrity and it checks its work afterwards. The "invalid header length" errors you encounter afterwards are QRecall's way of saying that the file size it expected to find doesn't agree with the actual file size. This makes sense, since both errors occurred when trying to set the file size before closing the archive.

Steven Arnold wrote: QRecall attempts to reindex the archive, but this inevitably fails, and it then has to rebuild the archive. I am not sure what the difference is, but both processes take about equally long -- and a very long time.

All the important data in an archive is stored in a single file. Most of the remaining files in an archive are "index" files that provide rapid access to archive data. If the data file is undamaged, a reindex reconstructs the index files by scanning the entire data file. A reindex assumes the single data file is completely valid; any problems or inconsistencies will cause it to fail.

A repair is very similar to a reindex, except that it makes no assumptions about the information in the data file. It too scans the entire data file and rebuilds the indexes, but automatically "fixes" any problems that it finds in the data file.

So as long as you don't have any problems with your master data file, a reindex will fix your archive. However, there's no way of knowing that in advance. If reindexing your archive takes a very long time, just repair it instead. If the data file is OK, repair and reindex are essentially identical. If the data file isn't OK, the repair will fix it.
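The difference between the two operations can be illustrated with a toy model. This is only a sketch: the record layout, the CRC check, and the choice to "fix" damage by dropping bad records are all assumptions made for illustration, not a description of QRecall's actual archive format or repair logic.

```python
import zlib

def make_record(key, payload):
    """Hypothetical data-file record: a key, its bytes, and a CRC of the bytes."""
    return {"key": key, "payload": payload, "crc": zlib.crc32(payload)}

def record_ok(rec):
    """A record is valid when its stored CRC matches its payload."""
    return zlib.crc32(rec["payload"]) == rec["crc"]

def reindex(data_file):
    """Rebuild the index by scanning every record, assuming all are valid.

    Any inconsistency aborts the scan, mirroring a reindex that fails.
    """
    index = {}
    for pos, rec in enumerate(data_file):
        if not record_ok(rec):
            raise ValueError(f"record {pos} is damaged; reindex cannot continue")
        index[rec["key"]] = pos
    return index

def repair(data_file):
    """Scan every record as well, but make no assumptions about validity.

    Damaged records are dropped (an assumed stand-in for "fixing"), then the
    index is rebuilt from what remains.
    """
    good = [rec for rec in data_file if record_ok(rec)]
    return good, reindex(good)
```

With an undamaged data file both functions do the same scan and produce the same index, which is why they take about equally long.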

Now, back to the real question of why this is happening. Alas, I don't know. An I/O error (-36) is the generic "something went horribly wrong and I can't perform this file operation" error. It really doesn't say what the problem is or what could be done about it.

Two I/O errors performing the same operation at the same point in processing are unlikely to be a coincidence. "Real" I/O errors are typically random events caused by hardware glitches or by accidentally pulling out a FireWire cable.
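The "same operation, same point" observation can be checked mechanically against the two log excerpts quoted above. The following is a rough sketch that assumes every log line has the shape shown in those excerpts (date, time, zone, then `Failure`, `Details`, or `#debug#`); the function name and the regular expression are made up for illustration and are not part of QRecall.

```python
import re

# The two failures exactly as quoted from the log files above.
LOG = """\
2009-04-30 07:19:38.672 -0400 Failure Problem closing archive
2009-04-30 07:19:38.672 -0400 Details could not set file size
2009-04-30 07:19:38.672 -0400 #debug# IO exception
2009-04-30 07:19:38.672 -0400 Details Path: /Volumes/encrypted_backups 2/music_and_video.quanta/hash.index
2009-04-30 07:19:38.672 -0400 #debug# OSErr: -36
2009-04-30 07:19:38.672 -0400 Details Pos: 3221225528
2009-05-14 07:19:49.005 -0400 Failure Problem closing archive
2009-05-14 07:19:49.005 -0400 Details could not set file size
2009-05-14 07:19:49.005 -0400 #debug# IO exception
2009-05-14 07:19:49.005 -0400 Details Path: /Volumes/encrypted_backups 2/music_and_video.quanta/hash.index
2009-05-14 07:19:49.005 -0400 #debug# OSErr: -36
2009-05-14 07:19:49.005 -0400 Details Pos: 3221225528
"""

# Assumed line shape: "<date> <time> <zone> <kind> <text>".
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2}) \S+ \S+ (Failure|Details|#debug#) (.*)$")

def failure_signatures(log_text):
    """Return one (date, message, details) tuple per Failure line."""
    failures = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        date, kind, text = m.groups()
        if kind == "Failure":
            failures.append((date, text, []))
        elif failures:
            # Details and #debug# lines belong to the most recent failure.
            failures[-1][2].append(text)
    return [(date, msg, tuple(details)) for date, msg, details in failures]

signatures = failure_signatures(LOG)
# Ignoring the date, both failures collapse to a single signature.
distinct = {(msg, details) for _, msg, details in signatures}
```

Here `distinct` holds one entry: both failures report the same message, path, error code, and file position.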

I've never seen this problem in testing, so I suspect it's something specific to your environment. You say that the archive is on an encrypted volume? Are your other archives also encrypted? Are they as large? Have you tried, or could you try, moving it to a non-encrypted volume to see if you have the same problem?

- QRecall Development -
Steven Arnold


Joined: May 15, 2009
Messages: 4
Offline
Hi James, thanks for the reply. I will move the music and video archive to a non-encrypted volume and see how that works for some time. It may take 2-4 weeks before you hear back, since the period over which the process breaks is usually about that long.

Here's a theory: maybe iTunes changes something about one of the files being backed up during the backup process. There are two cases of potential importance, I think. One is that the file is changed after QRecall backs it up. This might cause QRecall to look at the file and die because it sees it's different.

The second, probably more serious case, is where the actual file itself changes partway through the process of QRecall backing it up. In other words, the file contains n bytes. QRecall backs up x bytes, where x < n. Prior to QRecall finishing its backup, some program changes one of the bytes less than x, or worse, inserts multiple bytes or removes bytes (thereby making the file smaller or larger). QRecall completes the backup process and checks the file, and determines that the backup is inconsistent.

Just some thoughts...

steven
James Bucanek


Joined: Feb 14, 2007
Messages: 1572
Offline
Steven,

Steven Arnold wrote: Here's a theory: maybe iTunes changes something about one of the files being backed up during the backup process. There are two cases of potential importance, I think. One is that the file is changed after QRecall backs it up. This might cause QRecall to look at the file and die because it sees it's different.

QRecall does not compare the data it captures with the original files in order to verify its validity. It's not a definitive test, for exactly the reason you mentioned. Instead, QRecall reads the original file and adds additional checksum and data consistency information to the archive. This allows QRecall to determine if the file data in the archive is OK, even after the original has changed or been deleted.
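The general idea of storing a checksum with the captured data can be sketched as follows. This is a minimal illustration only: SHA-256 and the dictionary record are assumptions, since the thread does not say which checksum or record format QRecall uses.

```python
import hashlib

def capture(data: bytes) -> dict:
    """Store the captured bytes together with a digest computed at capture time."""
    return {"data": data, "sha256": hashlib.sha256(data).hexdigest()}

def is_intact(record: dict) -> bool:
    """Verify the archived bytes against their stored digest.

    Only the archive record is consulted; the original file may have
    changed or been deleted since the capture.
    """
    return hashlib.sha256(record["data"]).hexdigest() == record["sha256"]
```

Because the check needs nothing but the archive, a later edit to the original file cannot make a good capture look bad.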

Steven Arnold wrote: The second, probably more serious case, is where the actual file itself changes partway through the process of QRecall backing it up.

Another very valid concern, but that's not your problem. The error you got was an I/O error trying to finalize the archive. This happens long after all of the files have been captured, so it can't have anything to do with the source data files.

QRecall tries to deal with mutating data files in three ways:
- QRecall reads files in large chunks, very quickly. For most files (<12MB) this gets a "snapshot" of the file as quickly as possible, so any future modifications don't influence what's captured.
- If the file being captured is replaced, QRecall will stop capturing the file and start over. This gets logged as minutia.
- If the file is extended or truncated during the capture, QRecall will record that as a warning in the log. You'll want to recapture the file.
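
A rough sketch of the second and third measures, under stated assumptions: a replaced file is detected by a changed inode number, a size change by comparing `os.stat` results before and after the read, and the chunk size and retry limit are arbitrary. None of this is taken from QRecall's source; `capture_file` is a hypothetical name.

```python
import os

def capture_file(path, chunk_size=8 * 1024 * 1024, max_retries=3):
    """Read a file in large chunks, starting over if it is replaced mid-read.

    Returns (data, warning). The warning is set when the file was extended
    or truncated while it was being read.
    """
    for _ in range(max_retries):
        before = os.stat(path)
        chunks = []
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                chunks.append(chunk)
        after = os.stat(path)
        if after.st_ino != before.st_ino:
            # A different file now lives at this path: discard and start over.
            continue
        warning = None
        if after.st_size != before.st_size:
            warning = "file size changed during capture; recapture recommended"
        return b"".join(chunks), warning
    raise RuntimeError(f"{path} kept being replaced during capture")
```

A check like this only narrows the window; it cannot detect a byte rewritten in place without changing the size, which is the limitation described below.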

These are, admittedly, stop-gap measures that try to make as accurate a capture as possible. They are all part of the larger issue of trying to copy files that are open and being modified, for which there is no complete solution. This is just a consequence of the way the operating system works and there are no backup solutions available that can completely eliminate the possibility of capturing a partially modified file.

If you have groups of files that are being modified regularly, and you're running OS X 10.5 or later, you might consider adding a "pick-up" capture that immediately follows your regular capture. Starting with Leopard, QRecall will very quickly recapture any files that were modified after the first capture started.

- QRecall Development -
Steven Arnold


Joined: May 15, 2009
Messages: 4
Offline
To follow up on this issue, I have now had an archive on a regular disk -- not a DMG, encrypted or otherwise -- since June 18th. That's about three weeks. Normally the problem would have come up by now, so I would conclude that QRecall has a problem with large archives on DMGs. Note that the problem could be an artifact of encryption, or it could just be an issue with DMGs. This particular archive is 830 GB and it contains about 55,000 files. As I mentioned before, I have other archives with many more files -- ballpark of 300,000 -- but they are much smaller, the bigger being 350 GB.

To reproduce this on your end, try writing a script to create a set of 55,000-ish files, spread throughout multiple directories. Place the archive on an encrypted DMG. About 5000 files, emulating video, will vary between 500 MB and 1.2 GB. 50,000 files will emulate MP3 or AAC files, and will average between 5 and 10 MB each. Make the complete archive about 800 GB in size. Then start adding a few files every day to the archive, and after a couple weeks, you should see the problem. Maybe you can accelerate it by running the backup, adding files, then running the backup again, over and over.

If you need some help writing a script to do this, I can probably knock something out in Ruby pretty fast to create the original files to be backed up, and then add a few files at random locations.
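
A scaled-down sketch of such a generator, written in Python rather than Ruby. The function names, directory and file naming scheme, and random contents are all made up for illustration; sizes are passed in bytes, so the full-scale figures above would need to be supplied by the caller.

```python
import os
import random
from pathlib import Path

def write_random_file(path, size, chunk=1 << 20):
    """Write `size` random bytes to `path` in chunks, so large files fit in memory."""
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n

def make_test_tree(root, n_files, n_dirs, min_size, max_size,
                   start_index=0, seed=0):
    """Create n_files files of random size, scattered across n_dirs directories.

    Calling it again with a higher start_index adds new files at random
    locations without overwriting earlier ones.
    """
    rng = random.Random(seed)
    root = Path(root)
    dirs = [root / f"dir_{i:04d}" for i in range(n_dirs)]
    for d in dirs:
        d.mkdir(parents=True, exist_ok=True)
    created = []
    for i in range(start_index, start_index + n_files):
        path = rng.choice(dirs) / f"file_{i:06d}.bin"
        write_random_file(path, rng.randint(min_size, max_size))
        created.append(path)
    return created
```

For example, `make_test_tree("/tmp/qr_test", 50, 5, 1024, 4096)` builds a small initial set, and a later call with `start_index=50, n_files=3` emulates the few files added each day.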
James Bucanek


Joined: Feb 14, 2007
Messages: 1572
Offline
Steven Arnold wrote: To follow up on this issue, I have now had an archive on a regular disk -- not a DMG, encrypted or otherwise -- since June 18th. That's about three weeks. Normally the problem would have come up by now, so I would conclude that QRecall has a problem with large archives on DMGs. Note that the problem could be an artifact of encryption, or it could just be an issue with DMGs.


Thanks for following up on this.

Steven Arnold wrote: If you need some help writing a script to do this, I can probably knock something out in Ruby pretty fast to create the original files to be backed up, and then add a few files at random locations.


Thanks for the offer, but I've got plenty of QRecall test cases already set up. I'll put this on my list of things to test and see if I can't get it to happen here.

- QRecall Development -
 