Author |
Message |
9 years ago
|
#1
|
James Bucanek
Joined: Feb 14, 2007
Messages: 1572
Offline
|
Please be aware that there's still a bug in the current QRecall 2.0 beta that may cause the verify command to report that the archive's catalog tree is invalid or that there are "unreachable" directories. This issue is still being investigated, but early investigation suggests that the error is spurious; the archive is not damaged and does not need to be repaired. For the time being, please ignore this error. That is, you don't need to repair your archive if the verify is the only action complaining and it is reporting one of these two failures. But please do send a diagnostic report should it happen to you. In the comments, please indicate whether the archive was created with QRecall version 2.0 or is an older archive that you converted to 2.0.
|
- QRecall Development - |
|
|
9 years ago
|
#2
|
Ming-Li Wang
Joined: Jan 12, 2015
Messages: 82
Offline
|
As I mentioned earlier, only the archive in charge of backing up my system partition would fail to verify, and it is so even after I started over with a brand new archive. After some digging just now I found my user profile folder was the culprit. Now, remember I said I had a dedicated partition for my documents which is mounted at boot at ~/Documents? I unmounted the partition and tried again and sure enough the test archive verified without error. Note that I've been using the same arrangement for a long time and neither QRecall v1.2.x nor early v2 betas had any trouble with it. They only started since beta 16, or 15 at the earliest. A diag report has been filed.
|
|
|
9 years ago
|
#3
|
James Bucanek
Joined: Feb 14, 2007
Messages: 1572
Offline
|
Ming-Li Wang wrote:After some digging just now I found my user profile folder was the culprit. Now, remember I said I had a dedicated partition for my documents which is mounted at boot at ~/Documents? I unmounted the partition and tried again and sure enough the test archive verified without error.
I can't imagine any reason why your environment would make any difference to the verify. In theory, the verify simply compares the structure defined by the individual directory records with the index of the directory structure. It shouldn't matter what your computer environment is, and should succeed even if run on a completely different machine. Having said that, I should also say that I should never say "never".
They only started since beta 16, or 15 at the earliest.
This I agree with. I strongly suspect that a subtle change in how volume identifiers and mount points are dealt with is the root cause of this issue. I've been poring over the code change logs looking for a cause, but have yet to identify it. I have a new beta that I'll probably release today. It includes some code to log more information about these verify errors, in the hopes that I can get more examples to look at. If you encounter this issue with 2.0.0b18, please send a diagnostic report and (if practical) stop using that archive. I may have some diagnostic commands for you to run on it. The search continues in earnest.
|
- QRecall Development - |
|
|
9 years ago
|
#4
|
James Bucanek
Joined: Feb 14, 2007
Messages: 1572
Offline
|
James Bucanek wrote:
Ming-Li Wang wrote:After some digging just now I found my user profile folder was the culprit. Now, remember I said I had a dedicated partition for my documents which is mounted at boot at ~/Documents? I unmounted the partition and tried again and sure enough the test archive verified without error.
I can't imagine any reason why your environment would make any difference to the verify. In theory, the verify simply compares the structure defined by the individual directory records with the index of the directory structure. It shouldn't matter what your computer environment is, and should succeed even if run on a completely different machine. Having said that, I should also say that I should never say "never".
(It's lucky I added that caveat.) I do have one more question: did you unmount the volume (mount point) at ~/Documents before you made the capture, or before you performed the verify? Because I'm now beginning to suspect that the former could make a difference...
|
- QRecall Development - |
|
|
9 years ago
|
#5
|
Ming-Li Wang
Joined: Jan 12, 2015
Messages: 82
Offline
|
James Bucanek wrote:I do have one more question: did you unmount the volume (mount point) at ~/Documents before you made the capture, or before you performed the verify?
Sorry for not making it clear earlier. By "trying again after unmounting the ~/Documents partition" I meant the whole thing: creating a new test archive, capturing my user profile directory tree, and then verifying the archive.
|
|
|
9 years ago
|
#6
|
James Bucanek
Joined: Feb 14, 2007
Messages: 1572
Offline
|
Thank you, everyone, for patiently reporting this problem. I think I finally figured it out. QRecall 2.0.0b18 is out and contains the fix. I kept thinking that it was a problem with the logic that finds and compares volume identifiers. As it turns out, I was going in the wrong direction. The issue was that the volume identifier was doing too good of a job, identifying mountpoint directories as volumes when they should have been treated as sub-directories. The details are explained in the release notes. Happy capturing!
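To make the volume versus sub-directory distinction concrete, here is a minimal Python sketch. It is not QRecall's code; the function name `record_kind` and the returned labels are hypothetical, and the rule it applies (only the root of the captured volume counts as a volume, while a mount point nested below it is recorded as a sub-directory) is an assumption drawn from the description above.

```python
import os

def record_kind(path, volume_root, is_mount=os.path.ismount):
    """Hypothetical helper: decide how a captured directory is recorded.

    Only the root of the captured volume is recorded as a volume. A mount
    point nested below that root (for example a partition mounted at
    ~/Documents) is recorded as a sub-directory of its parent.
    """
    path = os.path.normpath(path)
    volume_root = os.path.normpath(volume_root)
    if path == volume_root:
        return "volume"
    if is_mount(path):
        # The over-eager behaviour described above would return "volume" here.
        return "subdirectory (mount point)"
    return "subdirectory"


# Usage with a fake mount table, so the result does not depend on this machine.
fake_mounts = {"/", "/Users/me/Documents"}
for p in ("/", "/Users/me", "/Users/me/Documents"):
    print(p, "->", record_kind(p, "/", is_mount=lambda q: q in fake_mounts))
```

The `is_mount` parameter defaults to the real `os.path.ismount`, but passing a fake makes the rule easy to check without mounting anything.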
|
- QRecall Development - |
|
|
9 years ago
|
#7
|
Ming-Li Wang
Joined: Jan 12, 2015
Messages: 82
Offline
|
Thank you. The old, problematic system archive of mine is now back in service.
|
|
|
9 years ago
|
#8
|
Bruce Giles
Joined: Dec 5, 2007
Messages: 95
Offline
|
I had the same problem with the Verify command. I updated to Beta 18, did a repair, and as far as I know, the problem is fixed. As mentioned in the release notes, I opened the archive, turned on the option to display invisible items, and noted that I had four "dev"s, a "home", and a "net". I command-clicked them to select them all, and attempted to delete them. For the next ten minutes or so, nothing happened, except that the window said something about waiting for permission to open the archive. I left the room for a while, and when I came back, it had finally deleted them. Here's what the log file says:

Action 2015-10-21 22:51:06 ------- Delete items in Macintosh HD.quanta
Action 2015-10-21 22:51:06 archive: /Volumes/Seagate 2GB Backup/QRecall Backups/Macintosh HD.quanta
Action 2015-10-21 22:51:06 Minutia Waiting for permission to open archive
Action 2015-10-21 23:00:06 Minutia Broke stale lock file
Action 2015-10-21 23:00:06 died: 2015-10-22 02:51:06 +0000
Action 2015-10-21 23:00:06 Minutia Acquired permission to open archive
Action 2015-10-21 23:00:08 Delete 6 items
Action 2015-10-21 23:00:08 Deleting dev
Action 2015-10-21 23:00:08 Deleting dev
Action 2015-10-21 23:00:08 Deleting dev
Action 2015-10-21 23:00:08 Deleting dev
Action 2015-10-21 23:00:08 Deleting home
Action 2015-10-21 23:00:08 Deleting net
Action 2015-10-21 23:00:10 ------- Delete finished (00:03)

This is actually at least the second time this has happened. Shortly before, I had attempted to delete an incomplete layer. When nothing happened after a few minutes, I got impatient, canceled the delete, and rebooted the computer. I haven't yet tried again to delete that incomplete layer. At the moment, I'm running another verify. When that completes, I'll try again to delete the layer and I'll report back here with the details.

-- Bruce
|
|
|
9 years ago
|
#9
|
Bruce Giles
Joined: Dec 5, 2007
Messages: 95
Offline
|
Bruce Giles wrote:At the moment, I'm running another verify. When that completes, I'll try again to delete the layer and I'll report back here with the details.
The verify completed and reported no problems of any sort. Also, I was able to delete the last incomplete layer, and it deleted almost immediately. But now there's something odd going on with the Merge command. I have a number of incomplete layers going back several weeks. I attempted to merge them with a subsequent complete layer, but after doing so, I end up with a layer marked as "unknown". For example, before the merge:

Layer 26: Monday, October 19, 2015 at 1:00 AM 34.4 MB 121 items
Layer 27: Monday, October 19, 2015 at 2:48 AM 36.9 GB Incomplete
Layer 28: Monday, October 19, 2015 at 7:00 AM 102.5 MB 144 items
Layer 29: Monday, October 19, 2015 at 1:00 PM 40.8 MB 138 items
Layer 30: Monday, October 19, 2015 at 7:02 PM 105.3 MB 132 items

Then I selected both Layer 27 and 28, and merged them. After the merge:

Layer 26: Monday, October 19, 2015 at 1:00 AM 34.4 MB 121 items
Layer 27: Unknown Unknown Repaired
Layer 28: Monday, October 19, 2015 at 1:00 PM 40.8 MB 138 items
Layer 29: Monday, October 19, 2015 at 7:02 PM 105.3 MB 132 items

So now I select the new Layer 27 (Unknown) and Layer 28 (1:00 PM) and try to merge these. After the merge:

Layer 26: Monday, October 19, 2015 at 1:00 AM 34.4 MB 121 items
Layer 27: Unknown Unknown Repaired
Layer 29: Monday, October 19, 2015 at 7:02 PM 105.3 MB 132 items

So, it seems that an Unknown layer can't be eliminated by merging, but if you try, it appears to "eat" the subsequent good layer. After trying this a few times, I closed and re-opened the archive, and did another Verify. The verify found no problems, and subsequent scheduled captures are proceeding without error.
|
|
|
9 years ago
|
#10
|
James Bucanek
Joined: Feb 14, 2007
Messages: 1572
Offline
|
Bruce Giles wrote:For the next ten minutes or so, nothing happened, except that the window said something about waiting for permission to open the archive.
QRecall 2.0 uses a new method for arbitrating concurrent access to an archive. Inside the archive package are some invisible semaphore files (.lock, .share and .semiphore). These allow QRecall to play nice on filesystems that don't support the read-shared/write-exclusive access modes, and on networked volumes, NAS devices, and so on. An outstanding bug, which I'm still investigating, is that the .lock file doesn't get "unlocked" when it should. This causes QRecall to think that a remote process is using, or updating, the archive. QRecall performs a series of tests, which take about 9-10 minutes, to ensure that no other process is modifying the archive. Once it's convinced that the semaphore files are stale, it resets them, acquires the access it needs, and proceeds to perform the action. If you haven't done so, please send a diagnostic report. I'm still looking for combinations of actions that leave the .lock file locked.
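As a rough illustration of lock-file arbitration with stale-lock recovery, here is a minimal Python sketch. It is not how QRecall implements its semaphore files: the function names, the use of the file's modification time as a liveness signal, and the ten-minute `STALE_AFTER` threshold are all assumptions chosen to mirror the behaviour described above.

```python
import os
import time

# Assumption: a lock untouched for this many seconds is treated as abandoned.
STALE_AFTER = 10 * 60

def _create_exclusive(lock_path):
    # O_EXCL makes creation fail if the file already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)

def acquire_lock(lock_path, now=time.time, stale_after=STALE_AFTER):
    """Hypothetical helper: returns "acquired", "busy", or "broke stale lock"."""
    try:
        _create_exclusive(lock_path)
        return "acquired"
    except FileExistsError:
        pass
    age = now() - os.path.getmtime(lock_path)
    if age < stale_after:
        return "busy"  # someone may still be using the archive
    # Simplification: a real implementation must guard against two processes
    # breaking the same stale lock at the same moment.
    os.remove(lock_path)
    _create_exclusive(lock_path)
    return "broke stale lock"

def release_lock(lock_path):
    """Remove the lock file; skipping this step leaves a lock that looks held."""
    os.remove(lock_path)
```

If `release_lock` is never called, the next `acquire_lock` reports "busy" until the lock has aged past the threshold, which is the kind of delay that ends with the "Broke stale lock file" line in the log quoted earlier in this thread.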
|
- QRecall Development - |
|
|
9 years ago
|
#11
|
James Bucanek
Joined: Feb 14, 2007
Messages: 1572
Offline
|
Bruce Giles wrote:So, it seems that an Unknown layer can't be eliminated by merging, but if you try, it appears to "eat" the subsequent good layer.
You can "fix" a layer by merging, but only under specific circumstances. When merging with a damaged or incomplete layer, the later (good, complete) layer must have captured anything marked as damaged or incomplete in the earlier layer for the merged layer to be considered complete again.

For example, on Monday you capture your Documents, Music, and Pictures folders. On Tuesday you recapture your Music and Documents folders. On Wednesday, a media failure damages your archive, requiring it to be repaired. During the repair, the Pictures and Music folders in Monday's layer are marked as damaged because data from one or more files in those folders was lost. You now have two layers. Monday's layer is marked as "damaged". If you merge these two layers, the merged layer will still be marked as "damaged" because Tuesday's layer did not recapture the Pictures folder from Monday. It did recapture the Music folder, so that folder is no longer damaged in the layer, but it inherits the damaged Pictures folder, so the layer is still "damaged". If you recapture the Pictures folder in a new layer, and merge that with the previously merged layer, the new layer will be complete.

(Note that when a capture action sees that the latest layer(s) are damaged or incomplete, it automatically ignores filesystem change history and forces the recapture of all files and folders marked as damaged or incomplete. This guarantees that if you then merge the newly captured layer with the previous one(s), the resulting layer will be complete.)

And this actually gives me an idea for a new feature. I think we need a "damage" report or search mode that will show you just those directories or files that have been damaged.
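The Monday/Tuesday example above can be sketched in a few lines of Python. This is only a model of the rule as described, not QRecall's data structures: representing a layer as a dict with `captured` and `damaged` sets, and the names `merge_layers` and `is_complete`, are assumptions made for illustration.

```python
def merge_layers(earlier, later):
    """Hypothetical model: merge two layers given as
    {"captured": set_of_folders, "damaged": set_of_folders}.

    Damage in the earlier layer is cleared only for folders that the
    later layer recaptured; everything else is inherited.
    """
    still_damaged = (earlier["damaged"] - later["captured"]) | later["damaged"]
    return {
        "captured": earlier["captured"] | later["captured"],
        "damaged": still_damaged,
    }

def is_complete(layer):
    """A layer counts as complete when nothing in it is marked damaged."""
    return not layer["damaged"]


monday = {"captured": {"Documents", "Music", "Pictures"},
          "damaged": {"Pictures", "Music"}}   # marked during the repair
tuesday = {"captured": {"Music", "Documents"}, "damaged": set()}

merged = merge_layers(monday, tuesday)
print(sorted(merged["damaged"]))   # ['Pictures']
print(is_complete(merged))         # False

# A later capture that includes Pictures finally clears the damage.
recapture = {"captured": {"Pictures"}, "damaged": set()}
print(is_complete(merge_layers(merged, recapture)))   # True
```

The set difference is the whole rule: merging never repairs anything by itself, it only lets a later recapture stand in for the damaged data.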
|
- QRecall Development - |
|
|
9 years ago
|
#12
|
Bruce Giles
Joined: Dec 5, 2007
Messages: 95
Offline
|
James Bucanek wrote:If you haven't done so, please send a diagnostic report. I'm still looking for combinations of actions that leave the .lock file locked.
Just now sent a report. -- Bruce
|
|
|
9 years ago
|
#13
|
Bruce Giles
Joined: Dec 5, 2007
Messages: 95
Offline
|
OK, so if I want to get rid of those damaged ("Unknown Unknown Repaired") layers I need to keep merging them with subsequent layers until I eventually hit a layer that recaptures everything that was still missing, and at that point, the damaged layer disappears. I probably won't do that, as I'd rather keep the archive history, in case I need to recover something else from back then, but the temptation is strong. (Something in my nature just doesn't like to see error messages in the browse window, even when I understand what they mean and why they're there. You REALLY don't want to see what the System Log file in the Console app does to me.)

By the way, two more things. With regard to the verify errors discussed in this topic, when the Verify command reports an error and you choose to repair it, you get the option of verifying or repairing your volume first, which is good. But no matter whether you choose verify volume or repair volume, the top of the sheet says "Repairing volume". That confused me a little, when I was sure I had clicked the Verify button, not the Repair button.

Second, after I got past the volume verify/repair, when the Repair Archive sheet comes up, the only option not grayed-out was "Reindex only". I had already done enough of these that I was pretty confident that "Reindex only" wasn't going to work, but I had to go through that first, and wait for it to fail, before I could go on with the "Use auto-repair information" box checked, and that one actually did fix the archive. Is that the way it's supposed to work?
|
|
|
9 years ago
|
#14
|
James Bucanek
Joined: Feb 14, 2007
Messages: 1572
Offline
|
Bruce Giles wrote:OK, so if I want to get rid of those damaged ("Unknown Unknown Repaired") layers I need to keep merging them with subsequent layers until I eventually hit a layer that recaptures everything that was still missing, and at that point, the damaged layer disappears.
That is correct.
You REALLY don't want to see what the System Log file in the Console app does to me.
I feel your pain ... especially when I look at my server.
With regard to the verify errors discussed in this topic, when the Verify command reports an error and you choose to repair it, you get the option of verifying or repairing your volume first, which is good. But no matter whether you choose verify volume or repair volume, the top of the sheet says "Repairing volume". That confused me a little, when I was sure I had clicked the Verify button, not the Repair button.
That's static text (it doesn't change, except for the name of the archive). I've changed it to "Check volume containing ...".
Second, after I got past the volume verify/repair, when the Repair Archive sheet comes up, the only option not grayed-out was "Reindex only".
When you click on the handy "repair" button in an error message or dialog, QRecall pre-configures the repair options based on what it thinks the problem is. For problems with the index files, it assumes a reindex is all that's necessary. If the error is in the primary data file, the options are pre-configured for a full repair. In your case, the first problem discovered was in an index file and QRecall assumed a reindex would fix it. It did not. The reindex discovered a problem in the primary data file, so the second time the repair options were pre-configured for a full repair. But you could have changed it to a repair on the first go.

Once in the dialog, you can reconfigure the options to whatever you want. Just be aware that the options are highly interdependent. The "Reindex only" option is incompatible with all of the other options, so as long as "Reindex only" is checked, all of the other options are disabled. Uncheck "Reindex only" and you'll have the opportunity to select a different set of options. The "Use auto-repair information" and "Copy recovered content to new archive" options are also mutually exclusive; selecting one disables the other. And once you've picked any other option, the "Reindex only" option will be disabled, because, you know, mutually exclusive.
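The interdependencies described above can be written down as a small rule table. The following Python sketch is only an illustration of those rules, not the dialog's actual logic: the function `enabled_options` is hypothetical, and it covers only the three options named in this post, leaving out any other checkboxes in the dialog.

```python
REINDEX_ONLY = "Reindex only"
AUTO_REPAIR = "Use auto-repair information"
COPY_RECOVERED = "Copy recovered content to new archive"

# Assumption: only the three options named above are modelled here.
ALL_OPTIONS = (REINDEX_ONLY, AUTO_REPAIR, COPY_RECOVERED)

def enabled_options(selected, all_options=ALL_OPTIONS):
    """Hypothetical helper: given the checked options, return the set of
    options that remain enabled (clickable)."""
    selected = set(selected)
    if REINDEX_ONLY in selected:
        # "Reindex only" is incompatible with everything else.
        return {REINDEX_ONLY}
    enabled = set(all_options)
    if selected:
        # Any other choice rules out "Reindex only".
        enabled.discard(REINDEX_ONLY)
    # These two exclude each other.
    if AUTO_REPAIR in selected:
        enabled.discard(COPY_RECOVERED)
    if COPY_RECOVERED in selected:
        enabled.discard(AUTO_REPAIR)
    return enabled


for choice in (set(), {REINDEX_ONLY}, {AUTO_REPAIR}, {COPY_RECOVERED}):
    print(sorted(choice), "->", sorted(enabled_options(choice)))
```

With nothing checked every option is available, which is why unchecking "Reindex only" is the step that opens up the rest of the dialog.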
|
- QRecall Development - |
|
|
9 years ago
|
#15
|
Bruce Giles
Joined: Dec 5, 2007
Messages: 95
Offline
|
James Bucanek wrote:Once in the dialog, you can reconfigure the options to whatever you want. Just be aware that the options are highly interdependent. The "Reindex only" option is incompatible with all of the other options, so as long as "Reindex only" is checked, all of the other options are disabled. Uncheck "Reindex only" and you'll have the opportunity to select a different set of options. The "Use auto-repair information" and "Copy recovered content to new archive" options are also mutually exclusive; selecting one disables the other. And once you've picked any other option, the "Reindex only" option will be disabled, because, you know, mutually exclusive.
OK, now I understand, after I played with it a bit. My first impression was that this is a dialog that's crying out for radio buttons instead of checkboxes, but the "Recover lost files" and "Recover incomplete files" options don't fit that mold either. They can be a primary choice, if you don't pick any of the top three, or a secondary choice, if you pick either of the first two. So I have to say I don't really like the way those options are laid out, because it's not inherently clear what you can and can't do. You have to turn off the checked "Reindex only" option before it becomes clear that you can do something else. But honestly, I can't come up with a better way to do it.
|
|
|