Bob Tyson wrote:(They are on media stored in California; I am presently in Italy.)
That is about as unreachable as it gets. There should be very little to do, beyond just making sure you launch the QRecall application again once you and your California-bound archives are reunited. The upgrade process is primarily concerned with updating your actions and archive settings. The items to exclude have been moved from the capture action to the archive's settings in 2.0. When you launch QRecall, it will transfer the settings you had in your capture action to the archive's settings, which is why it needs your archive to be reachable. Also, QRecall will update the alias in each action (the structure that tells the action where the archive is located) to a modern URL Bookmark. Changes added in QRecall 2.0.0b20 improve how these updates are performed and make them incremental. Now, QRecall can detect which actions have had their settings and bookmarks updated. If your archive is reachable when you launch QRecall 2.0, it will update those actions and your archive's settings. If the archive isn't available, it does nothing to those actions and tries again the next time you launch QRecall. When you return to California, connect your archives and run QRecall again. The actions it couldn't update before will be updated then. Once all of your actions have been updated, it will mark the upgrade process complete and stop trying. To confirm the update was successful, review the excluded items settings for your archive. The archives themselves don't require any kind of upgrade; archives created with 1.x can still be used with 2.0 as is. If you modify an archive with 2.0, however, it will be unreadable by QRecall 1.x because 2.0 writes new record formats that 1.x doesn't understand.
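For readers who like to see the logic spelled out, the incremental upgrade described above can be sketched roughly like this in Python. This is only an illustration: the `Action` class, its fields, and `run_upgrade` are hypothetical names made up for the example, not QRecall's actual code.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    archive_reachable: bool   # hypothetical: can the archive be opened right now?
    upgraded: bool = False    # hypothetical: settings and bookmark already migrated?

def run_upgrade(actions):
    """Upgrade whatever can be upgraded now; return True once nothing is left."""
    for action in actions:
        if action.upgraded:
            continue          # already handled on an earlier launch
        if not action.archive_reachable:
            continue          # leave it untouched and try again next launch
        # The real migration (exclusions to archive settings, alias to
        # bookmark) would happen here.
        action.upgraded = True
    return all(a.upgraded for a in actions)

actions = [Action("Capture Home", True), Action("Capture to California", False)]
first_launch = run_upgrade(actions)    # one archive is still offline
actions[1].archive_reachable = True    # archives reconnected
second_launch = run_upgrade(actions)   # the remaining action is updated now
```

The point of the sketch is that an unreachable archive costs nothing: its action is simply skipped until a later launch finds the archive online.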
|
|
|
Ralph Strauch wrote:I do notice that closing the archive seems to be taking longer, which I assume is related to the error correction?
Closing the app at the end of an action, or closing an archive browser window in the QRecall app? If you're talking about at the end of an action, the answer is "probably". Error correction (and encryption) generally make every modification a little slower. If you're talking about closing a browser window, it's likely due to the resource leak I fixed in b23. In b21, closing a browser window failed to properly release all of the memory and resources for the archive. This would result in a memory leak of thousands, or even hundreds of thousands, of objects. But fixing the bug has its own downside; it takes a moment to destroy and clean up after all of those objects. If it's annoying, let me know. There may be a way to dispose of these objects on a background thread so it doesn't hang up the app.
|
|
|
Ralph, Thanks, as always, for the diagnostic reports. When you have a chance, update to the 2.0.0b23 beta that was just released and see if that resolves any of these issues.
|
|
|
Ralph Strauch wrote:The repair that I just did had one probably insignificant glitch you should know about. I shut the lid about 1/3 of the way through, then opened it again about an hour later and it seemed to pick up OK. I'm in the habit of shutting the lid when I'm done working, and it's hard to remember not to do that when a long process I haven't been paying any attention to is running in the background.
Ideally, this shouldn't ever be a problem. OS X is pretty good about recovering everything when it wakes up and picking up right where it left off. Problems can occur if something happens while the laptop is asleep that it can't recover from, such as shutting down the file server it was using, unplugging an external drive, losing a network connection, or having another system modify the archive. But if any of those things do occur, QRecall should fail with an appropriate message.
I use power management on the iMac for a scheduled 1am backup and it works fine. I somehow never thought of that on the MBP. One question that occurs to me there, though, is that backing up the MBP involves 2 computers, not one. I'm running the backup from the MBP, but the drive is mounted to the iMac. Does that cause any problems?
It could, but QRecall should deal with it. There are basically two scenarios: the iMac's file server is responding or it isn't. If it is, the laptop will connect and perform the capture. If it isn't, the laptop will either fail to mount the volume or fail to open the archive. Either way, there's no harm done.
Also, if I command the MBP backup to wake the computer before the backup but not to sleep it after the backup will it then go to sleep according to the Energy Saver inactivity settings?
Yes. When QRecall starts a time-consuming activity (like a capture), it registers with the power management service to tell it not to put the system to sleep until it's finished. Once finished, it releases the power manager. If there are no other services that want to prevent sleep, your system will go to sleep on its normal schedule.
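As a toy model of that register-and-release pattern, here is a short Python sketch. The `PowerManager` class is a hypothetical stand-in for the OS power management service, written only to show the idea that the system may sleep again once every holder has released its request.

```python
from contextlib import contextmanager

class PowerManager:
    """Toy stand-in for the OS power management service (hypothetical)."""

    def __init__(self):
        self._holders = set()

    @contextmanager
    def prevent_sleep(self, who):
        self._holders.add(who)           # register: "don't sleep until I'm done"
        try:
            yield
        finally:
            self._holders.discard(who)   # release, even if the work failed

    def may_sleep(self):
        # The system follows its normal sleep schedule only when nobody objects.
        return not self._holders

power = PowerManager()
with power.prevent_sleep("capture"):
    during = power.may_sleep()   # a capture is running
after = power.may_sleep()        # the capture has finished
```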
I ask that because I'm hesitant to tell Qrecall to sleep the computer when I might sometimes run that backup command during the day when I'm working on other things. I'll probably keep running the MBP backups as I've been doing, but it will be nice to have a clearer understanding of my options.
Also not a problem. The sleep request following an action is just that: a request. When the action is finished, it will post a request for your system to go to sleep in about 10 minutes. A dialog will appear notifying you that a request to put your system to sleep has been made. If you ignore that dialog, your system will go to sleep (or shut down, or restart, or whatever the request was). Dismiss the dialog to cancel the request and continue using your computer.
Ralph Strauch wrote:I've now repaired my archive and it completed successfully, though the log still shows Data Problems -- Invalid data at the same two locations that have been showing up. Is this now a permanent feature of this archive?
Not anymore. One of the last things the repair action does is erase any invalid data it found. This wasn't happening before, because the repair actions weren't finishing. But now that one has, your archive should be free of invalid data.
After the repair completed I ran a Merge Layers action followed by a backup, which seems to have run fine with one minor glitch. I clicked on the backup while the Merge was in process, figuring it would run after the Merge completed, which it did. But the Monitor window only showed the running Merge and did not show the waiting backup, as it has done in the past. When I came back to the computer after the backup had started, the Monitor window showed "Idle" and the only indication I could find that the backup was actually running was the "Archive busy" window that came up when I tried to open the archive.
That's a mystery. Send a diagnostic report and I'll look into it. It might be another scheduler issue (*sigh*).
|
|
|
Ralph, Here's a link to the 2.0.0a22 alpha release of QRecall. If you have the time, please download, install, and use it to repair your "2nd backup" archive again. It should die exactly the same way the earlier repairs did, but when it does it will log some information about the failure that I'm missing. After it fails, please send another diagnostic report so I can pick through the wreckage. Either way, to quickly get your archive back up and running, do this. In the Finder, Right/Control+click on the archive and choose Show Package Contents. Inside the archive you'll see a repository.data file and a slew of "companion" files with names like these:
repository_8k.checksum32
repository_p8w8k16m2.0.anvin_reed_sol
repository_p8w8k16m2.0_8k.checksum32
repository_p8w8k16m2.1.anvin_reed_sol
repository_p8w8k16m2.1_8k.checksum32
Trash all of these "companion" files; just don't delete the repository.data file (that's where all of the good stuff is). Repair the archive and go about your business. If you want to go back to using correction codes, use the archive settings to generate new codes.
|
|
|
Ralph Strauch wrote:I ran the drive through Disk Utility, Diskwarrior, and TechTool Pro (full surface scan) and everything looked fine.
Since you did a full surface scan, and given your other observations, let's assume the hard drive is not the source of the I/O errors.
looking more carefully at when it happens suggests that the problem is a byproduct of a scheduled backup trying to run when both computers involved are normally asleep. ... I have a scheduled backup on the MBP for 3am, and the MBP is usually asleep with the lid closed at that time. Until I upgraded to El Cap that backup usually didn't run and I would just start a manual backup when I turned the computer on in the morning. More recently, though, that scheduled backup is trying to run, and that's when my "problem closing archive" occurs. See the backup failures at 2015-11-16 03:04:33 and 2015-11-19 03:37:19.
I think you've nailed it. It appears that QRecall is running while one or both systems are asleep. Of course, they're not completely asleep. This is undoubtedly a byproduct of OS X's "Power Nap" mode. During a power nap, certain (Apple) processes, like Mail and Time Machine, are allowed to perform limited maintenance tasks. Other processes (like QRecall) aren't supposed to run at all, but apparently they are. Since both of your systems are asleep, it's difficult to ascertain which one is responsible for the I/O errors (although I suspect the MBP, for reasons I'll detail in a moment).
In both cases, the backup seems to run for a while, but not actually complete, and then fail about 4 hours later.
Looking carefully at your capture log from 11-19 I see that the capture started at 03:37:58 and ran for about a half minute before being suspended again at 03:38:37. (I know it was suspended, because the beta version of QRecall dumps status messages to the log almost continuously, so if there are any big gaps in the log it's because absolutely nothing was happening.) An hour later (04:39:06) QRecall starts logging again and immediately encounters an I/O error. This is undoubtedly because hardware and/or network connections are not fully functional during a power nap. I suspect that either the last I/O before it was put back to sleep, or the first I/O when it started the next power nap, was interrupted and caused the error. QRecall retries the I/O, recovers from the error, and runs a few more minutes (last log entry was 04:40:16). At this point it's suspended again until 05:43:04, and the cycle repeats: I/O error, retry, recover, run a minute or two, get suspended, wash, rinse, repeat. This pattern continues until 07:51:12, when the system was legitimately woken up again, which I can tell because the scheduler logged receipt of a "system woke up" event. QRecall continues with the capture, and almost finishes when ... The system apparently goes back to sleep around 07:54:42 while closing the archive. The system wakes back up again at 08:47:22 (another "woke up" event was logged), and that is when the timeout failure that caused the "Problem closing archive" occurred. I can't tell if this was because the laptop's network interface hadn't fully woken up again, the forced sleep interrupted the request, or if the server was now asleep and not responding. Regardless, that was the sequence of events.
There are several things that can be done to address this. If you're willing to change your habits, you could add power management requests to the 3 AM capture action so it wakes the system up when it starts and puts it back to sleep again when it's finished. This would require you to leave the laptop lid open (the system won't wake up if the screen is closed). This would ensure the system is completely awake, allowing the capture to complete. Approach number two would be to stop the laptop from power napping by turning off the Power Nap feature in the Energy Saver preference pane (under Schedule). That would probably restore the behavior you saw under Yosemite. Additionally, I'm going to explore the possibility of modifying QRecall so it can suspend itself during a "power nap" and also talk to Apple as to why it's running in the first place.
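If you want to look for this kind of suspension in your own log, the "big gaps mean nothing was happening" reasoning above can be approximated with a few lines of Python. This is a sketch under assumptions: it takes the timestamp layout from the log excerpts in this thread, the 10-minute threshold is an arbitrary choice, and the sample lines below are made up (only their times come from the 11-19 analysis above).

```python
from datetime import datetime, timedelta

def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (last_seen, next_seen) pairs where the log went quiet."""
    stamps = []
    for line in lines:
        try:
            # The first 23 characters look like "2015-11-16 04:05:55.842".
            stamps.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f"))
        except ValueError:
            continue  # skip lines that don't start with a timestamp
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > threshold]

# Placeholder messages; the times are the ones quoted in the post.
sample = [
    "2015-11-19 03:37:58.000 (status message)",
    "2015-11-19 03:38:37.000 (status message)",
    "2015-11-19 04:39:06.000 (status message)",
    "2015-11-19 04:40:16.000 (status message)",
    "2015-11-19 05:43:04.000 (status message)",
]
gaps = find_gaps(sample)
```

Each pair in `gaps` marks the last entry before a quiet period and the first entry after it, which is where a suspended process would resume.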
Since yesterday's failure I don't seem to be able to repair the archive. Two consecutive Repair cycles both show "Invalid data" at the same locations, and the archive will not open because the index is incorrect. Is this archive totally lost, or is there a way of getting rid of the invalid data and repairing the rest of the archive?
The archive isn't totally lost, but you are stuck. The logs reveal a problem/situation/bug in the code that regenerates the error correction codes which is preventing the repair from completing. I'm looking into that today. You can wait until I have a fix, or I can tell you how to manually strip the error correction codes from the archive so you can repair it and resume using it.
|
|
|
Ralph Strauch wrote:I've thought from time to time that it would sometimes be convenient to be able to attach notes to layers, to keep track of significant events attached to the layer -- major system upgrades, perhaps, or a significant reshuffling of my filing system.
Great suggestion. This has been suggested before, and is fairly high on the to-do list, possibly as early as the next feature release. Still working out implementation and interface details...
|
|
|
Ralph Strauch wrote:My scheduled backup last night looks from the log like it completed the backup then quit with a "problem closing archive." When I tried to rerun it, the archive needed repair. After the repair the archive includes last night's layer, showing the same number of files and size the log for last night showed, so apparently the backup had completed and then something screwed up as it was closing. The repair log showed two locations of invalid data.
Your problems started a little before that. Earlier in this capture, and in previous captures, QRecall has been warning of failed I/O. Most of these it successfully recovered from. Here's an example:
2015-11-16 04:05:55.842 Recovered from archive data read failures
2015-11-16 04:05:55.843 cannot read envelope content length
2015-11-16 04:05:55.843 POSIXErr: 6
2015-11-16 04:05:55.843 ErrDescription: Device not configured
2015-11-16 04:05:55.843 Path: /Volumes/BUD2/2nd backup.quanta/repository.data
2015-11-16 04:05:55.843 Position: 1230587166720
2015-11-16 04:05:55.843 Length: 131072
This indicates that QRecall encountered a fatal error (Device not configured) reading from your primary data file. It tried again, and was successful, so it logged this as a transient error. If you start to see a lot of transient errors, it's an indication that something is failing. It could be a degraded cable or network connection. More likely, it's a hard drive on its last legs. At the end of the capture, QRecall tried to update the index files and failed with another I/O error. This one it couldn't recover from:
2015-11-16 07:42:48.430 Problem closing archive
2015-11-16 07:42:48.430 POSIXErr: 60
2015-11-16 07:42:48.430 ErrDescription: Operation timed out
2015-11-16 07:42:48.430 Path: /Volumes/BUD2/2nd backup.quanta/repository_p8w8k16m2.1_8k.checksum32
This failure corrupted the structure of your archive, requiring you to repair it. The data errors you encountered during the repair indicate permanent media failures. This is another indication that your drive may be knocking on death's door. I would highly recommend that you investigate the health of your backup drive and its connection with your system. You might need a new drive.
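The difference between the two log excerpts above (a failure that was retried and recovered versus one that wasn't) can be illustrated with a small Python sketch. It is not QRecall's code: `read_with_retry`, the attempt count, and the log wording are assumptions chosen for the example. The simulated error uses errno 6 (ENXIO), the same POSIX error number shown in the first excerpt.

```python
import errno

def read_with_retry(read, attempts=3, log=print):
    """Call read(); retry on OSError and report recovered failures as transient."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    failures = []
    for _ in range(attempts):
        try:
            data = read()
        except OSError as err:
            failures.append(err)   # remember the failure and try again
            continue
        if failures:
            log(f"Recovered from {len(failures)} read failure(s): {failures[-1]}")
        return data
    raise failures[-1]             # every attempt failed: not transient

calls = {"n": 0}

def flaky_read():
    """Simulated read that fails once, then succeeds (for demonstration only)."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise OSError(errno.ENXIO, "Device not configured")
    return b"\0" * 131072

messages = []
data = read_with_retry(flaky_read, log=messages.append)
```

A read that keeps failing exhausts its attempts and the last error propagates, which corresponds to the unrecoverable "Problem closing archive" case.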
|
|
|
Bruce Giles wrote:It actually works surprisingly well, although it does tend to bog down a bit when QRecall is doing things while I'm running Windows 10 in a virtual machine, since each of them only gets 2 GB of the total 4 GB installed RAM. I've replaced the spinning hard drive with an SSD, so I assume contention for drive resources probably isn't much of a factor. Because of the slowness, sometimes I put QRecall actions on hold for a while, using the QRecall system menu on the right side of the menubar.
You also might consider adding either "Ignore if Application <Your VM Software> is open" or "Hold if Application <Your VM Software> is open" conditions to the scheduler of your more demanding actions.
|
|
|
Bruce, The problem is that OS X isn't starting the scheduler. You have the scheduler set to run in the background (the "Start and run actions while logged out" scheduler option is checked). Currently, OS X is not starting the scheduler the way it should when you start up. When the monitor can't connect with the scheduler, most of the commands related to the scheduler will be disabled, and the special menu item that displays the next scheduled action is broken. When you ran QRecall, it saw that the scheduler wasn't running, (redundantly) reinstalled it, and told OS X to immediately start the service. So now the scheduler is running, and the monitor menus all work. The short-term solution is to uncheck the "Start and run actions while logged out" option; the scheduler will then get installed as a user agent, which works reliably (knock wood). I've spent the better part of this week running this problem to ground, and I think I'm really close to a (partial) solution. Details in the release notes of 2.0.0b21...
|
|
|
Charles Watts-Jones wrote:Running QRecall b20, I tried to run a back-up Action this morning after my attempts to run one to a NAS when logged out, failed. The Action refused to run because it was seeing an 'invalid identity key'. When I looked at Preferences > Identity Key, the Key Status reported 'Valid permanent key'.
That's very odd. The only thing I can think of is that OS X was having problems accessing one of your QRecall preference files (which is where your identity key is stored). This can happen occasionally in OS X 10.7 and later, but it's usually a temporary situation.
I tried re-entering my key only to be warned that doing so would change ownership of the archive. Odd?
That's a new warning in QRecall 2.0. If you change your identity key, it's warning you that items captured with the new key will belong to a different owner inside the archive—so don't be surprised if you don't see the new items inside the existing owner.
|
|
|
Ralph Strauch wrote:I'm seeing log entries for both recaptured files and capture issues attributed to "file length changed during capture" for various log files. Is this normal? Should I just exclude these logs from the capture?
This is a new message in QRecall 2.0. It lets you know that a file was being written to at the same time QRecall was trying to capture it. For things like log files, it's not an issue. Log files are always appended to, so what QRecall captures is always valid, even if it didn't get everything. If it's something more complicated, like a database or document file, it may be a concern when restoring it. It might mean QRecall did not capture a stable copy of the file, and you should carefully evaluate such files after restoring them. You might need to restore an earlier, or later, version in order to obtain a usable document.
I also see that my qrecall logs for September and October are both under 2MB, while the November log is already over 2.6GB. Has the amount of stuff being logged shot way up for some reason? Should it be set back to a lesser level?
I can't say for sure without looking at the log file, but the beta writes a lot more in the log than the release version.
|
|
|
Bruce Giles wrote:I have two separate archive files: one called "iMac 24.quanta" which is my main backup file. The second one, called "Virtual Machines.quanta" is used to back up my Virtual Machines folder. Usually, that's for VMware Fusion, but on occasion, I have VirtualBox files in there too. The settings for the "iMac 24" archive have all the checkboxes in the Exclude section checked, but nothing listed in the box where custom locations can be added. The settings for the "Virtual Machines" folder are exactly the same in that respect. Now, the "Virtual Machines" folder on my hard drive has QRecall capture preferences -- the "Do not capture" button is the only one selected.
From this description, I can tell you that your "Virtual Machines" folder will never be captured. The crucial setting you set in both of your archives is the "Exclude: Individually Excluded Items" option. This setting honors the "Do Not Capture" preference you set on individual filesystem items. With the "Do Not Capture" preference set on your "Virtual Machines" folder, that folder won't ever be captured. You have a couple of choices. Probably the simplest is to uncheck the "Exclude: Individually Excluded Items" option of your Virtual Machines.quanta archive. The iMac 24 archive will exclude everything you've individually set to exclude, while Virtual Machines will ignore that capture preference and capture them all.
It seems pretty clear from the log that it's not capturing the folder because it's excluded (by the capture preference). But I thought b20 (which is what I'm running) had a bug fix that was supposed to allow that when I was specifically trying to capture a folder that was ordinarily excluded. If so, that's not working. Or did I misunderstand?
That exception only applies to the "Ignore Changes" preference. Excluded items are always excluded, and should be logged as such.
I suspect the better solution, given my archive setup, is to delete the capture preference, and add an exclusion for the Virtual Machines folder to my "iMac 24" archive only. Then my regular backups should skip the folder, but my Virtual Machines capture action will get it, because it's capturing to a different archive which doesn't have an exclusion for the folder in the archive settings. Do you agree?
Dealer's choice, but I'm inclined to agree. My philosophy on this is:
Add items to the exclusion list of the archive when those exclusions are for the benefit of that archive, and that archive alone. For example, an archive that is focused on a particular subset of files (say, working projects) and you want to hone that focus by excluding extraneous items.
Add items to the exclusion list of the archive when the filesystem the items are on doesn't support extended attributes or there's a chance that the item could be deleted and recreated (which might discard the capture preferences attached to the original item).
Check the "Individually Excluded Items" option and use the capture preferences to conveniently exclude items from multiple archives. It's easy to change what items are excluded (no need to open the archive) and it applies to all archives.
If you have one or more "hot" archives that capture only important files on a regular basis (2 minutes after documents changed, for example) and a second "comprehensive" archive that captures everything (say, once a day), then use capture preferences to exclude the less interesting files from the "hot" archives and uncheck the "Exclude: Individually Excluded Items" from the "comprehensive" archive. The quick captures during the day will exclude items marked "Do Not Capture" and the daily capture will capture them anyway.
In this last scenario, you may find that there are items you never want to capture, ever. That's what the "Exclude from all archives" capture preference option is for. Even if an archive is set to capture "Do Not Capture" items, this option overrides that setting and excludes the item anyway. Or, to paraphrase Uncle Ben from the Amazing Spiderman: "With great flexibility comes great confusion."
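To make the interaction of these settings concrete, here is one way to write the decision down in Python. Treat it as a reading of the rules described in this post, not as QRecall's implementation: the class names, field names, and the folder path are invented for the example, and the archive exclusion list is simplified to an exact path match.

```python
from dataclasses import dataclass, field

@dataclass
class ItemPrefs:
    """Capture preferences attached to one file or folder (hypothetical model)."""
    do_not_capture: bool = False
    exclude_from_all_archives: bool = False

@dataclass
class ArchiveSettings:
    """Per-archive exclusion settings (hypothetical model)."""
    exclude_individually_excluded: bool = True   # "Exclude: Individually Excluded Items"
    excluded_paths: set = field(default_factory=set)

def should_capture(path, prefs, archive):
    if prefs.exclude_from_all_archives:
        return False   # overrides every archive setting
    if path in archive.excluded_paths:
        return False   # excluded for the benefit of this archive alone
    if prefs.do_not_capture and archive.exclude_individually_excluded:
        return False   # this archive honors the per-item preference
    return True

vm_folder = "/Users/bruce/Virtual Machines"   # made-up path
vm_prefs = ItemPrefs(do_not_capture=True)
imac_24 = ArchiveSettings(exclude_individually_excluded=True)
virtual_machines = ArchiveSettings(exclude_individually_excluded=False)

in_imac_24 = should_capture(vm_folder, vm_prefs, imac_24)
in_virtual_machines = should_capture(vm_folder, vm_prefs, virtual_machines)
```

With the option unchecked on the second archive, the same "Do Not Capture" folder is skipped by one archive and captured by the other, which is the first choice suggested earlier in this reply.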
|
|
|
Charles Watts-Jones wrote:Operating Systems have moved on since this thread started and perhaps the conditions have changed, so I'm resuscitating the question.
Not much has changed since this feature was first introduced. When QRecall starts an action, and the archive for that action isn't online (it isn't reachable, in the language of the filesystem), it requests that OS X make it available. OS X then looks to see if the archive was originally on a volume that's physically connected to the system and can be mounted, or if it was on a network volume that the OS can reconnect to. Encrypted and networked volumes might require supplying a password from the keychain. If any of these things are not true (the drive isn't connected, the server isn't online, the device's driver doesn't understand filesystem mount requests, or the keychain can't be accessed), then the volume won't be mounted and the action can't run.
I'm trying to use ControlPlane (a freebie available at http://www.controlplaneapp.com/) to handle the mounting of a NAS drive. It uses Rules to trigger Actions. The Rule that I've tried is to name the application which will trigger the drive mount. So far I've tried QRecallMonitor and then QRecallScheduler. Neither appears to work; ... I'd appreciate knowing what application should be named.
I doubt this rule will be useful. QRecall actions are performed by an executable. This is a binary executable file, but it is not a Cocoa application bundle, which is what is generally referred to as an "application" in OS X. The OS X workspace manager posts notifications when a Cocoa application (like Mail) launches or terminates, and this is probably what ControlPlane is listening for. I highly doubt it's capable of determining when an executable is spawned, but if it is, the executable is named QRecallHelper.
I say 'appears' because I'm getting a 'NetAuthSysAgent wants to use the "login" keychain' message when I login.
No clue. Can ControlPlane be set up to mount your volume when you execute a shell script? If so, you might look into Growl 2.0. Growl is a notification manager that long preceded (and was probably the inspiration for) the built-in notification manager added in OS X 10.8. Growl 2.0 (available from the Mac App Store) adds the ability to perform various actions when it receives specific notifications. This includes turning the notification into an email or SMS message, or even turning it into an iOS push notification via Prowl (available in the iOS App Store). That last one is the feature I use the most. I have all of my servers push their "Action Failed" notifications to my iPhone. QRecall sends four different notifications to Growl (and OS X's built-in notification center): Action Started, Action Complete, Action Failed, and Archive Needs Repair. You can set up Growl to run an AppleScript when it receives any of these notifications. The script can be passed the notification's name (Action Started, etc.) which it could use to determine if the volume should be mounted or unmounted. You'll still want to keep the "Hold While No Archive" condition, to avoid the race condition between the action (that will immediately try to open the archive) and the script (that's simultaneously trying to mount the volume that archive is on).
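The decision such a script has to make is small. Sketched in Python purely for illustration (the language and the function name are arbitrary, and the mapping from notification to mount or unmount is an assumption, not something QRecall or Growl prescribes), it might look like this. The four notification names are the ones listed above.

```python
def volume_command(notification):
    """Map a QRecall notification name to 'mount', 'unmount', or None."""
    if notification == "Action Started":
        return "mount"      # the action is about to need the archive's volume
    if notification in ("Action Complete", "Action Failed"):
        return "unmount"    # the action is done with the volume either way
    return None             # e.g. "Archive Needs Repair": leave the volume alone

commands = [
    volume_command(name)
    for name in ("Action Started", "Action Complete",
                 "Action Failed", "Archive Needs Repair")
]
```

A real script would also have to cope with overlapping actions that share a volume, so unmounting on every completion is likely too eager; the sketch ignores that.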
|
|
|
Thanks for the feedback. I tend to agree with you on all counts, but I wanted to throw out the idea to see if an alternative made more sense for some people. I'll leave the logic the way it is now.
|
|
|
|
|