QRecall

These are pretty minor issues, and certainly don't warrant the reinstallation of any software.

The CrashPlan.app error is a curious one, but probably inconsequential. The error indicates that during the recall QRecall was trying to restore the last modified date, creation date, last attributes change date, and the permissions (access mask) for the app bundle folder. For some reason, the operation was prohibited by the operating system. It might be the case that the CrashPlan app has a special ACL that prohibits changing one or more of these attributes.

But since we're talking about the app bundle folder, as long as it's readable that's just about all that matters.

The second error is equally mysterious, but also doesn't really matter. During the restore, QRecall determined that there were ACLs attached to the Documents folder, but the last time it captured the Documents folder there were none, so it tried to delete the ACLs. Again, this was prohibited by the OS for some reason. This might be a new behavior in 10.11 due to SIP. I'll have to look into it.

Overall, pretty trivial issues for a full restore of El Capitan.

Bob Tyson wrote:James, the capture of a new archive from the 1tb external HD is complete and I have sent the report.

Thanks!

Here is an oddity. I just now tried to copy a folder from my Mac to that 1tb external disk and could not do so without entering my admin password for the Mac's system. I did so, made the copy to the external HD, and then checked the latter's permissions. They were read-only for 'everyone' and read-write for 'system'. I changed the 'everyone' permission to read-write. But I am noticing various disks and folders seem to have had their permissions changed (?) to 'read-only ' for 'everyone'.

A Finder copy should not (normally) change the ownership of items, unless you're copying to/from a volume that has the "Ignore ownership and permissions" property set. But even then, the ownership of items on those volumes will always be "you" (the logged in user).

Another thing to be aware of is that creating, moving, renaming, or deleting an item also involves the ownership of the folder that contains (or will contain) that item. In other words, creating a file is considered a "change" to the folder that contains the new file, and you must have write permission for that folder to accomplish that.

And in the "really obscure" department, there are special attributes (called Access Control Lists, or ACLs) that can be set on a folder to do things like automatically set the ownership of items added to that folder, but it's unlikely that's the problem here.
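To make the point about the containing folder concrete, here is a minimal Python sketch (not anything QRecall does itself). The helper names `can_create_in` and `describe` are made up for illustration. It checks the folder you want to copy into, since that is the item whose permissions decide whether a new file can be created there.

```python
import os
import stat


def can_create_in(folder):
    """Return True if the current user may add entries to `folder`.

    Creating, renaming, or deleting an item changes the folder's own
    listing, so write and search permission on the folder is what
    matters, not the permissions of the item being copied.
    """
    return os.path.isdir(folder) and os.access(folder, os.W_OK | os.X_OK)


def describe(path):
    """Summarise the owner id and permission bits, e.g. 'uid=501 drwxr-xr-x'."""
    info = os.stat(path)
    return f"uid={info.st_uid} {stat.filemode(info.st_mode)}"
```

Calling `describe` on the top level of the external volume and then `can_create_in` on the same path should show whether the folder, rather than the file being copied, is what prompts for the admin password.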

Bob,

I'd still appreciate a diagnostic report. I'm interested in all of the details that don't appear in the log window.

Bob Tyson wrote:James, here are notes I added to the report. And **sigh** it appears to be just that one new archive. At least the other, which I had captured successfully afterwards, is now being captured again. (Sorry -- I get befuddled with terminology here: 'Capture' meaning in this case, capturing to a pre-existing archive, from an HD previously backed up to that archive; distinct from 'Recapture'?? Or what am I missing, still after 'x' years?

It doesn't matter what you call it—capture, recapture, backup, vacuuming, ...—under the hood it's all the same thing.

At any rate it does appear the large, new archive is the culprit.

Yes and no, at least from the information I have. Most of your QRecall app crashes appear to be related to a race condition in the browser code. One thread is trying to load a large folder in the background at the same moment that you're closing the window/archive. That's what's causing the crash. So it's related to the new archive only in the fact that the browser was open to a particularly large folder when you closed the window. These are all very similar to crashes other users have reported, and all are under investigation.

EDIT - I attempted 'REPAIR' from the QR FILE menu. After confirming the volume OK I chose REPAIR without checking any other box. As soon as the repair began it failed - here are the log entries:

Action 2015-11-26 09:02:30 ------- Repair 1TB_FOTO_ARCHIVIO_PORTABLE_HD
Action 2015-11-26 09:02:30 archive: /Volumes/TO BACK 03 3T/1TB_FOTO_ARCHIVIO_PORTABLE_HD.quanta
Action 2015-11-26 09:02:30 Failure Failed
Action 2015-11-26 09:02:30 problem setting EOF
Action 2015-11-26 09:02:30 Failure Repair failed
Action 2015-11-26 09:02:30 A network or disk error was encountered.
Action 2015-11-26 09:02:30 ------- Repair incomplete (00:00)

Now that sounds serious.

But it got worse. As a double-check I tried to repair the second archive, the one I successfully updated and that seems ok. The repair started ok, and after a couple of minutes I clicked 'STOP'. After perhaps another minute a message came forth saying the archive index was damaged needed to be reindexed. This seems odd to me, but I am presently reindexing that archive, following the prompts.

It's not surprising to me at all.

When you start a repair (or reindex), QRecall blows away the existing index files and starts rebuilding them from scratch. The archive will be unusable until the repair has finished. Canceling it pretty much guarantees that your archive will remain "damaged".

During the backup from the external HD to the new archive I trashed several large files from that HD. QR completed the capture, then went into a verify and reported problems with both the backup and the verify.

I see from the log included in the diagnostic report where the new archive was created, you captured new files to it, deleted a few, and then performed a capture on another, existing, archive.

From the logs, all of that went very well. There were some warnings in the first capture about missing items because (I suspect) you were either moving or trashing some OS installer apps while the capture was in progress. It's a non-issue unless you were planning on recalling those, partially captured, items.

But that's all I've got. I assume the repair and some of the other issues you mention occurred after you sent the report. Please send another so I can review what is/was going on with the repair actions.

Bob,

Yikes! I thought most of those issues had been dealt with in b11.

Send a diagnostic report and I'll take a look. In the report, let me know if it's just the one archive or all archives that cause the crash.

Bryan Derman has just released rev C of his optical media backup scripts for QRecall 1.x.

Rev C adds some important new features, including support for multiple backup sets (for rotating off-site storage) and the ability to reconstruct a set following changes to the archive (e.g. a repair), intelligently rewriting the minimum set of discs required.

You can read more about the changes and download the latest scripts from his website: Off-Site and Archival Storage for QRecall Backups

Bob Tyson wrote:(They are on media stored in California; I am presently in Italy.)

That is about as unreachable as it gets.

There should be very little to do, beyond just making sure you launch the QRecall application again once you and your California-bound archives are reunited.

The upgrade process is primarily concerned with updating your actions and archive settings. The items to exclude have been moved from the capture action to the archive's settings in 2.0. When you launch QRecall, it will transfer the settings you had in your capture action to the archive's settings, which is why it needs your archive to be reachable. Also, QRecall will update the alias in each action (the structure that tells the action where the archive is located) to a modern URL Bookmark.

Changes added in QRecall 2.0.0b20 improve how these updates are performed and make them incremental. Now, QRecall can detect which actions have had their settings and bookmarks updated. If your archive is reachable when you launch QRecall 2.0, it will update those actions and your archive's settings. If the archive isn't available, it does nothing to those actions and tries again the next time you launch QRecall.

When you return to California, connect your archives and run QRecall again. The actions it couldn't update before will be updated then. Once all of your actions have been updated, it will mark the upgrade process complete and stop trying. To confirm the update was successful, review the excluded items settings for your archive.
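The incremental behavior described above can be sketched roughly as follows. This is an illustration only, not QRecall's actual code: the `Action` class, the `upgrade_pass` function, and the archive names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    archive: str            # hypothetical identifier for the action's archive
    upgraded: bool = False  # True once settings and bookmark are migrated


def upgrade_pass(actions, reachable_archives):
    """Run one launch-time pass; return True once every action is upgraded."""
    for action in actions:
        if action.upgraded:
            continue  # already handled on an earlier launch
        if action.archive in reachable_archives:
            # Stand-in for moving the exclusions into the archive's
            # settings and replacing the alias with a URL bookmark.
            action.upgraded = True
        # Archive unreachable: leave the action alone and retry next launch.
    return all(a.upgraded for a in actions)


actions = [Action("Capture Mac", "local"), Action("Capture photos", "california")]
first_launch = upgrade_pass(actions, {"local"})
second_launch = upgrade_pass(actions, {"local", "california"})
```

In this sketch `first_launch` is false because one archive was out of reach, and `second_launch` is true once both archives are connected, which is the point at which the upgrade would be marked complete.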

The archives themselves don't require any kind of upgrade; archives created with 1.x can still be used with 2.0 as is. If you modify an archive with 2.0, however, it will be unreadable by QRecall 1.x because 2.0 writes new record formats that 1.x doesn't understand.

Ralph Strauch wrote:I do notice that closing the archive seems to be taking longer, which I assume is related to the error correction?

Closing the app at the end of an action, or closing an archive browser window in the QRecall app?

If you're talking about at the end of an action, the answer is "probably". Error correction (and encryption) generally make every modification a little slower.

If you're talking about closing a browser window, it's likely due to the resource leak I fixed in b23. In b21, closing a browser window failed to properly release all of the memory and resources for the archive. This would result in a memory leak of thousands, or even hundreds of thousands, of objects. But fixing the bug has its own downside; it takes a moment to destroy and cleanup after all of those objects.

If it's annoying, let me know. There may be a way to dispose of these objects on a background thread so it doesn't hang up the app.

Ralph,

Thanks, as always, for the diagnostic reports. When you have a chance, update to the 2.0.0b23 beta that was just released and see if that resolves any of these issues.

Ralph Strauch wrote:The repair that I just did had one probably insignificant glitch you should know about. I shut the lid about 1/3 of the way through, then opened it again about an hour later and it seemed to pick up OK. I'm in the habit of shutting the lid when I'm done working, and it's hard to remember not to do that when a long process I haven't been paying any attention to is running in the background.

Ideally, this shouldn't ever be a problem. OS X is pretty good about recovering everything when it wakes up and picking up right where it left off. Problems can occur if something happens while the laptop is asleep that it can't recover from, such as shutting down the file server it was using, unplugging an external drive, losing a network connection, or having another system modify the archive. But if any of those things do occur, QRecall should fail with an appropriate message.

I use power management on the iMac for a scheduled 1am backup and it works fine. I somehow never thought of that on the MBP. One question that occurs to me there, though, is that backing up the MBP involves 2 computers, not one. I'm running the backup from the MBP, but the drive is mounted to the iMac. Does that cause any problems?

It could, but QRecall should deal with it. There are basically two scenarios: the iMac's file server is responding or it isn't. If it is, the laptop will connect and perform the capture. If it isn't, the laptop will either fail to mount the volume or fail to open the archive. Either way, there's no harm done.

Also, if I command the MBP backup to wake the computer before the backup but not to sleep it after the backup will it then go to sleep according to the Energy Saver inactivity settings?

Yes. When QRecall starts a time consuming activity (like a capture), it registers with the power management service to tell it not to put the filesystem to sleep until it's finished. Once finished, it releases the power manager. If there are no other services that want to prevent sleep, your system will go to sleep on its normal schedule.

I ask that because I'm hesitant to tell Qrecall to sleep the computer when I might sometimes run that backup command during the day when I'm working on other things. I'll probably keep running the MBP backups as I've been doing, but it will be nice to have a clearer understanding of my options.

Also not a problem. The sleep request following an action is just that—a request. When the action is finished, it will post a request for your system to go to sleep in about 10 minutes. A dialog will appear notifying you that a request to put your system to sleep has been made. If you ignore that dialog, your system will go to sleep (or shutdown, or restart, or whatever the request was). Dismiss the dialog to cancel the request and continue using your computer.

Ralph Strauch wrote:I've now repaired my archive and it completed successfully, though the log still shows Data Problems -- Invalid data at the same two locations that have been showing up. Is this now a permanent feature of this archive?

Not any more. One of the last things the repair action does is erase any invalid data it found. This wasn't happening before, because the repair actions weren't finishing. But now that one has, your archive should be free of invalid data.

After the repair completed I ran a Merge Layers action followed by a backup, which seems to have run fine with one minor glitch. I clicked on the backup while the Merge was in process, figuring it would run after the Merge completed, which it did. But the Monitor window only showed the running Merge and did not show the waiting backup, as it has done in the past. When I came back to the computer after the backup had started, the Monitor window showed "Idle" and the only indication I could find that the backup was actually running was the "Archive busy" window that came up when I tried to open the archive.

That's a mystery. Send a diagnostic report and I'll look into it. It might be another scheduler issue (*sigh*).

Ralph,

Here's a link to the 2.0.0a22 alpha release of QRecall. If you have the time, please download, install, and use it to repair your "2nd backup" archive again. It should die exactly the same way the earlier repairs did, but when it does it will log some information about the failure that I'm missing. After it fails, please send another diagnostic report so I can pick through the wreckage.

Either way, to quickly get your archive back up and running, do this. In the Finder, Right/Control+click on the archive and choose Show Package Contents. Inside the archive you'll see a repository.data file and a slew of "companion" files with names like these:

repository_8k.checksum32

repository_p8w8k16m2.0.anvin_reed_sol

repository_p8w8k16m2.0_8k.checksum32

repository_p8w8k16m2.1.anvin_reed_sol

repository_p8w8k16m2.1_8k.checksum32

Trash all of these "companion" files, just don't delete the repository.data file (that's where all of the good stuff is). Repair the archive and go about your business. If you want to go back to using correction codes again, use the archive settings to generate new codes.
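If you'd like to double-check which files would be trashed before touching anything, a small Python sketch like this one lists them without deleting anything. It assumes the companion files all follow the naming pattern shown above (a `repository_` prefix and a `.checksum32` or `.anvin_reed_sol` suffix); that pattern is inferred from the example names and is not an official list, and `companion_files` is a made-up helper name.

```python
from pathlib import Path

# Assumed suffixes, taken from the example file names above.
COMPANION_SUFFIXES = (".checksum32", ".anvin_reed_sol")


def companion_files(archive_package):
    """List files in the archive package that look like companion files.

    repository.data never matches, because it lacks the 'repository_'
    prefix and the companion suffixes. Nothing is deleted here.
    """
    package = Path(archive_package)
    return sorted(
        p for p in package.iterdir()
        if p.is_file()
        and p.name.startswith("repository_")
        and p.name.endswith(COMPANION_SUFFIXES)
    )
```

Point `companion_files` at the archive package (the `.quanta` item) and compare the names it returns with what you see in the Finder before moving anything to the Trash.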

Ralph Strauch wrote:I ran the drive through Disk Utility, Diskwarrior, and TechTool Pro (full surface scan) and everything looked fine.

Since you did a full surface scan, and given your other observations, let's assume the hard drive is not the source of the I/O errors.

looking more carefully at when it happens suggests that the problem a byproduct of a scheduled backup trying to run when both computers involved are normally asleep.
...
I have a scheduled backup on the MBP for 3am, and the MBP is usually asleep with the lid closed at that time. Until I upgraded to El Cap that backup usually didn't run and I would just start a manual backup when I turned the computer on in the morning. More recently, though, that scheduled backup is trying to run, and that's when my "problem closing archive" occurs. See the backup failures at 2015-11-16 03:04:33 and 2015-11-19 03:37:19.

I think you've nailed it. It appears that QRecall is running while one or both systems are asleep. Of course, they're not completely asleep. This is undoubtedly a byproduct of OS X's "Power Nap" mode. During a power nap, certain (Apple) processes, like Mail and Time Machine, are allowed to perform limited maintenance tasks. Other processes (like QRecall) aren't supposed to run at all, but apparently they are.

Since both of your systems are asleep, it's difficult to ascertain which one is responsible for the I/O errors (although I suspect the MBP, for reasons I'll detail in a moment).

In both cases, the backup seem to run for a while, but not actually complete, and then fail about 4 hours later.

Looking carefully at your capture log from 11-19 I see that the capture started at 03:37:58 and ran for about a half minute before being suspended again at 03:38:37. (I know it was suspended, because the beta version of QRecall dumps status messages to the log almost continuously, so if there are any big gaps in the log it's because absolutely nothing was happening.)

An hour later (04:39:06) QRecall starts logging again and immediately encounters an I/O error. This is undoubtedly because hardware and/or network connections are not fully functional during a power nap. I suspect that either the last I/O before it was put back to sleep, or the first I/O when it started the next power nap, was interrupted and caused the error.

QRecall retries the I/O, recovers from the error, and runs a few more minutes (last log entry was 04:40:16). At this point it's suspended again until 05:43:04, and the cycle repeats: I/O error, retry, recover, run a minute or two, get suspended, wash, rinse, repeat.

This pattern continues until 07:51:12, when the system was legitimately woken up again, which I can tell because the scheduler logged receipt of a "system woke up" event. QRecall continues with the capture, and almost finishes when ...

The system apparently goes back to sleep around 07:54:42 while closing the archive. The system wakes back up again at 08:47:22 (another "woke up" event was logged), and at that point is when the timeout failure that caused the "Problem closing archive" occurred. I can't tell if this was because the laptop's network interface hadn't fully woken up again, the forced sleep interrupted the request, or if the server was now asleep and not responding. Regardless, that was the sequence of events.
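The "look for big gaps in the log" reasoning above can be sketched in a few lines of Python. This is only an illustration: the 10-minute threshold and the `find_gaps` name are assumptions, and the timestamps are the ones quoted in this post for the 11-19 capture.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, threshold=timedelta(minutes=10)):
    """Return (last entry before gap, first entry after gap) pairs.

    A process that logs almost continuously should never be silent
    for longer than `threshold` unless it was suspended.
    """
    times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > threshold]


entries = [
    "2015-11-19 03:37:58",  # capture starts
    "2015-11-19 03:38:37",  # last entry before the first suspension
    "2015-11-19 04:39:06",  # logging resumes, I/O error follows
    "2015-11-19 04:40:16",  # last entry before the next suspension
    "2015-11-19 05:43:04",  # logging resumes again
]
gaps = find_gaps(entries)
```

With these five entries the sketch reports two gaps, each roughly an hour long, which matches the suspend and resume cycle described above.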

There are several things that can be done to address this. If you're willing to change your habits, you could add power management requests to the 3 AM capture action so it wakes the system up when it starts and puts it back to sleep again when it's finished. This would require you to leave the laptop lid open (the system won't wake up if the screen is closed). This would ensure the system is completely awake, allowing the capture to complete.

Approach number two would be to stop the laptop from power napping by turning off the Power Nap feature in the Energy Saver preference pane (under Schedule). That would probably restore the behavior you saw under Yosemite.

Additionally, I'm going to explore the possibility of modifying QRecall so it can suspend itself during a "power nap" and also talk to Apple as to why it's running in the first place.

Since yesterday's failure I don't seem to be able to repair the archive. Two consecutive Repair cycles both show "Invalid data" at the same locations, and the archive will not open because the index is incorrect. Is this archive totally lost, or is there a way of getting rid of the invalid data and repairing the rest of the archive?

The archive isn't totally lost, but you are stuck. The logs reveal a problem/situation/bug in the code that regenerates the error correction codes which is preventing the repair from completing. I'm looking into that today.

You can wait until I have a fix, or I can tell you how to manually strip the error correction codes from the archive so you can repair it and resume using it.

Ralph Strauch wrote:I've thought from time to time that it would sometimes be convenient to be able to attach notes to layers, to keep track of significant events attached to the layer -- major system upgrades, perhaps, or a significant reshuffling of my filling system.

Great suggestion.

This has been suggested before, and is fairly high on the to-do list, possibly as early as the next feature release. Still working out implementation and interface details...

Ralph Strauch wrote:My scheduled backup last night looks from the log like it completed the backup then quit with a "problem closing archive." When I tried to rerun it, the archive needed repair. After the repair the archive includes last night's layer, showing the same number of files and size the log for last night showed, so apparently the backup had completed and then something screwed up as it was closing. The repair log showed two locations of invalid data.

Your problems started a little before that.

Earlier in this capture, and in previous captures, QRecall has been warning of failed I/O. Most of these it successfully recovered from. Here's an example:

2015-11-16 04:05:55.842 Recovered from archive data read failures

2015-11-16 04:05:55.843 cannot read envelope content length

2015-11-16 04:05:55.843 POSIXErr: 6

2015-11-16 04:05:55.843 ErrDescription: Device not configured

2015-11-16 04:05:55.843 Path: /Volumes/BUD2/2nd backup.quanta/repository.data

2015-11-16 04:05:55.843 Position: 1230587166720

2015-11-16 04:05:55.843 Length: 131072

This indicates that QRecall encountered a fatal error (Device not configured) reading from your primary data file. It tried again, and was successful, so it logged this as a transient error.

If you start to see a lot of transient errors, it's an indication that something is failing. It could be a degraded cable or network connection. More likely, it's a hard drive on its last legs.
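One rough way to tell whether transient errors are becoming frequent is to count the recovery messages per day. The Python sketch below assumes the log has been saved as plain text with lines in the same shape as the excerpt above (date, time, message); the real log format may differ, and `recovered_errors_per_day` is a made-up helper name.

```python
from collections import Counter

# Message text copied from the log excerpt above.
RECOVERED = "Recovered from archive data read failures"


def recovered_errors_per_day(log_lines, marker=RECOVERED):
    """Count recovered (transient) read failures, keyed by date string."""
    counts = Counter()
    for line in log_lines:
        if marker in line:
            day = line.split()[0]  # leading YYYY-MM-DD field
            counts[day] += 1
    return counts


sample = [
    "2015-11-16 04:05:55.842 Recovered from archive data read failures",
    "2015-11-16 04:05:55.843 cannot read envelope content length",
    "2015-11-16 04:05:55.843 POSIXErr: 6",
]
per_day = recovered_errors_per_day(sample)
```

A count that climbs from one capture to the next would be the kind of trend worth investigating.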

At the end of the capture, QRecall tried to update the index files and failed with another I/O error. This one it couldn't recover from:

2015-11-16 07:42:48.430 Problem closing archive

2015-11-16 07:42:48.430 POSIXErr: 60

2015-11-16 07:42:48.430 ErrDescription: Operation timed out

2015-11-16 07:42:48.430 Path: /Volumes/BUD2/2nd backup.quanta/repository_p8w8k16m2.1_8k.checksum32

This failure corrupted the structure of your archive, requiring you to repair it. The data errors you encountered during the repair indicate permanent media failures. This is another indication that your drive may be knocking on death's door.

I would highly recommend that you investigate the health of your backup drive and its connection with your system. You might need a new drive.

Bruce Giles wrote:It actually works surprisingly well, although it does tend to bog down a bit when QRecall is doing things while I'm running Windows 10 in a virtual machine, since each of them only gets 2 GB of the total 4 GB installed RAM. I've replaced the spinning hard drive with an SSD, so I assume contention for drive resources probably isn't much of a factor. Because of the slowness, sometimes I put QRecall actions on hold for a while, using the QRecall system menu on the right side of the menubar.

You also might consider adding either "Ignore if Application <Your VM Software> is open" or "Hold if Application <Your VM Software> is open" conditions to the scheduler of your more demanding actions.