QRecall

Ralph Strauch wrote:After the repair I ended up with two layers marked "damaged" and one marked "incomplete,"

An "incomplete" layer is one where the capture was interrupted (stopped or canceled), leaving some items uncaptured. This is mostly a warning that the items in that layer don't include all of the items that had changed at that time. If you subsequently merge this layer with a more recent (complete) layer, the warning goes away because the more recent layer will contain the most recent items, and all of them.

A "damaged" layer occurs when the repair action finds missing or inconsistent information about files or folders in that layer. When you merge it with a more recent layer, QRecall looks for successfully captured items in the more recent layer that supersede the damaged ones in the earlier layer. If all of the damaged folders and items have been successfully recaptured, then the "damaged" status should go away. If there are still any files or folders that QRecall is not satisfied have been completely recaptured, the merged layer retains the "damaged" status.

Now I say "should" because I think there's a bug in this logic. QRecall is very conservative in how it determines that items have been successfully recaptured. If there's any doubt, it retains the "damaged" status on those items, and that layer. In my testing here, I've never been able to trick QRecall into thinking that a layer is still "damaged" after all of the suspect items have been recaptured, but occasionally users report that this is the case. (This has been a difficult issue to verify and debug because the information I need has to be gathered before the merge is performed, and no one reports this issue until after the merge.)

If you have a persistent "damaged" layer, review the log for the repair action. It will identify exactly which folders and files it found problems with. After your next capture, simply browse the archive and verify that those items were successfully recaptured in the latest layer. If they were, you're in fine shape.

After my uncompleted Compact action, the unused space shown in the Status window jumped from less than 10MB to 106GB. If I understand Compact correctly, that was space that had been occupied by previously deleted files which the compaction process had erased, leaving scattered free space throughout the archive, and if the process had completed properly, those spaces would have been closed up by shifting everything to fill them in. Currently, though, QRecall simply uses that as available space within the archive. If this is correct, is there anything to be gained by running Compact again to close up those spaces, or is it just as efficient to simply let QRecall fill them in over time?

That's all true. QRecall will reuse that scattered free space for new captures. But just like volume fragmentation, the process eventually results in thousands, even millions, of tiny little blocks of space that can't be effectively reused. These eventually become a drag on almost every action, since QRecall has to keep track of them all but can't do much with them. Once you get your drive-unmounting issue resolved, I'd suggest taking another shot at compacting the archive, especially with 106GB free.

Ralph,

You can confirm the integrity of the archive by performing a verify. If the verify doesn't complain, then everything in the archive is valid.

During the repair, however, you might have lost specific files or folders. If your drive is spontaneously disconnecting, then the archive probably has missing or incomplete data records that weren't written before the drive disconnected. The repair will detect this. Look in the log file for items like "Damaged directories," "Damaged files," and "Incomplete files." These will list all of the files and folders the repair process couldn't reassemble. These are the items you should be concerned about.
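For a long log, a short script can pull out just those entries. This is a minimal sketch, assuming the log is plain text and that the three phrases appear verbatim on the lines of interest; the actual layout of QRecall log entries isn't described in this thread, so the function only matches substrings. The default path is the ~/Library/Logs/QRecall.log location mentioned in this reply.

```python
from pathlib import Path

# Phrases the repair action is said to log; matched case-insensitively.
PROBLEM_MARKERS = ("Damaged directories", "Damaged files", "Incomplete files")


def find_repair_problems(lines):
    """Return (line_number, text) pairs for lines that mention a problem marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(marker.lower() in lowered for marker in PROBLEM_MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path=Path.home() / "Library/Logs/QRecall.log"):
    """Scan a log file on disk, tolerating any undecodable bytes."""
    with open(path, errors="replace") as handle:
        return find_repair_problems(handle)
```

Calling scan_log() returns a list you can print or save; the "Duplicate record" noise is skipped because it doesn't match any marker.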

You can also look in the archive browser for layers marked as "-Damaged-". If the repair found any inconsistencies or missing data in a layer, it will mark that layer as "damaged". It will remain "damaged" until it's merged with a later layer that contains fresh captures of all of the suspect items. (The "damaged" state also directs QRecall to force a recapture of those items during the next capture action.)

Since your repair followed an unsuccessful compact action, the repair will also report a bazillion "Duplicate record" errors. That's because the compact works by copying records from higher positions to lower positions in the archive file. If you interrupt the compact, it can leave a copy of thousands, even millions, of records at two positions in the archive. These can be ignored. The repair action simply logs the inconsistency and then ignores the duplicate.
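To see why an interrupted compact leaves duplicates, here is a toy model only: the "archive" is a Python list, None stands for free space, and each move is copy-then-erase. The names compact, duplicates, and interrupt_after are invented for illustration and say nothing about QRecall's actual file format. A real compact presumably moves records in large batches, which would explain many duplicates rather than the single one this model produces.

```python
def compact(slots, interrupt_after=None):
    """Slide records toward the front of a toy archive (None = free space).

    Each move copies a record to a lower position and then erases the original.
    If interrupt_after is given, stop right after that many copies, before the
    matching erase, to mimic an interrupted compact.
    """
    copies = 0
    free = 0  # lowest position that might be free
    for pos in range(len(slots)):
        if slots[pos] is None:
            continue
        while free < pos and slots[free] is not None:
            free += 1
        if free < pos:
            slots[free] = slots[pos]  # copy the record to the lower position
            copies += 1
            if interrupt_after is not None and copies == interrupt_after:
                return slots  # interrupted: the record now exists twice
            slots[pos] = None  # erase the original
    return slots


def duplicates(slots):
    """Return the set of records stored at more than one position."""
    seen, dupes = set(), set()
    for record in slots:
        if record is None:
            continue
        if record in seen:
            dupes.add(record)
        seen.add(record)
    return dupes
```

In this model, ignoring a duplicate is safe because both copies hold the same record.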

In the logs you sent, I do see a handful of missing or damaged items, mostly in your iTunes media backup directories. These should be recaptured, but I'm guessing the rest of your archive is intact. I can't tell for sure because the log records uploaded by the diagnostic report are incomplete. (The diagnostic report only uploads the last few megabytes of log records, and the repair action is so full of "duplicate record" entries that it doesn't go back further than the last repair you performed.) If you want me to investigate this further, please email me your complete ~/Library/Logs/QRecall.log file.

Kurt,

All great questions. QRecall is primarily focused on detecting and protecting you from damage to the archive. Currently, all records in an archive are checksummed and QRecall tests those checksums whenever those records are used. This allows QRecall to detect if archive data has been corrupted, preventing you from recalling corrupted data or compounding the corruption with new data.
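As a rough illustration of checksummed records, the sketch below stores a checksum alongside each payload and re-tests it on every read. The checksum algorithm QRecall uses isn't stated in this thread; CRC-32 from Python's zlib is only a stand-in, and Record, make_record, and read_record are hypothetical names.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Record:
    payload: bytes
    checksum: int  # CRC-32 of payload, computed when the record was written


def make_record(payload: bytes) -> Record:
    """Create a record together with the checksum of its payload."""
    return Record(payload, zlib.crc32(payload))


def read_record(record: Record) -> bytes:
    """Return the payload only if it still matches its stored checksum."""
    if zlib.crc32(record.payload) != record.checksum:
        raise ValueError("record checksum mismatch; refusing to use corrupted data")
    return record.payload
```

Testing on every use is what keeps corrupted data from being recalled or built upon; detection alone cannot restore the lost bytes.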

The next (as-yet-unreleased) version of QRecall adds data-redundancy. This allows QRecall to detect and correct limited amounts of data corruption in the archive.

QRecall cannot, however, do much about data corruption in the source filesystem. If you need that kind of protection, you should be using a RAID or other filesystem that supports data redundancy and correction.

It's not practical to re-read every byte of data of every source file during every capture. QRecall is, at its heart, an incremental backup utility. Doing that efficiently requires monitoring filesystem change events and comparing file metadata to determine which items need to be recaptured. Recapturing all of the source data every time would turn every capture into an initial capture, requiring many (many!) hours to complete.
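The metadata comparison can be pictured with a small sketch. Which attributes QRecall actually compares isn't stated here; size and modification time are assumed purely for illustration, and Snapshot, snapshot, and needs_recapture are hypothetical names.

```python
import os
from typing import NamedTuple, Optional


class Snapshot(NamedTuple):
    size: int
    mtime_ns: int


def snapshot(path) -> Snapshot:
    """Record the metadata used to decide whether a file has changed."""
    st = os.stat(path)
    return Snapshot(st.st_size, st.st_mtime_ns)


def needs_recapture(previous: Optional[Snapshot], current: Snapshot) -> bool:
    """A file is recaptured if it is new or its metadata differs."""
    return previous is None or previous != current
```

The trade-off is the one described above: the check costs one stat call per file instead of a full read, so corruption that leaves the metadata untouched goes unnoticed.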

If, however, you still want to do that sort of thing, I've had a "force" capture option on the drawing board for a while. This would force QRecall to recapture all of the source items in their entirety, as if they had never been captured before. If that's important to you, let me know and I'll give it a +1.

There have also been requests for a verify/compare action that would perform a byte-by-byte comparison of data on the filesystem with the data in the archive and note any differences.

Alex wrote:I assume if I install QRecall on the server with its "Start and run actions while logged out" enabled, the actions will run (I understand I'll need to have appropriate filesystem permissions for QRecall on the server).

Correct. You also don't need to install an identity key on the server if all you'll be running are maintenance actions (verify, merge, compact, repair, and so on).

If you have a fast server and a slow network, it's recommended that you perform everything but captures directly on the server.

However, it doesn't appear that on a headless server a verification failure has any place to warn me. Given my experience so far with handling fileserver disconnects I'm guessing I'll see verification errors with some regularity.

Possibly. For most actions (capture, merge, and so on) a spontaneous interruption due to a network communications break, process termination, or loss of power, will leave the archive in a state where it will be auto-repaired on the next regular action. That includes the verify action. So even if you regularly lose connection with the file server, or crash your laptop, the next regularly scheduled action will/should restore the archive to its previous state and pick up where it left off.

Caveat: There are critical moments in an archive's life where a terminal interruption will leave the archive in a state that can't be auto-repaired. For those (rare) situations, a repair will be necessary.

The discussion of autorepair in the manual focuses primarily on the client repairing during capture.

All actions will first attempt to auto-repair the archive before proceeding. The one exception is the Repair command when you explicitly tell it to not perform an auto-repair before repairing.

Tip: If all you want to do is auto-repair an archive, run a verify action. If it manages to get started, then any auto-repair that needed to be performed was successful, and you can now cancel the verify.

However, in my logs I see messages like:

Action 2014-11-14 18:00:01 Failure Capture failed
Action 2014-11-14 18:00:01 The index of the archive is incorrect and needs to be recreated.

That's a more serious issue that would require a repair (or reindex) to correct.

- This old thread http://forums.qrecall.com/posts/list/118.page#488 mentions email notifications, was that ever implemented?

QRecall supports Growl, and the latest Growl supports several email notification extensions. (Note that this probably won't help you on your server if you don't leave a user logged in, but it would likely work on your clients.)

- Is it possible to chain verify to repair on the server action?

That's on the to-do list for the next version of QRecall.

- If the server verification fails is this guaranteed to trigger a warning and eventual autorepair on the status on the client? Will the client's status window show the verification state produced by the server (i.e. is the verification failure persisted to the archive for other readers)?

An action that fails (but does not terminate) will update the status of the archive, and that status will (eventually) appear in all status windows that are displaying the status of that archive. So yes, a fatal verify error will eventually percolate back to the client's status window.

Note that this doesn't apply to auto-repairable conditions. The conditions that leave the archive in an incomplete state are unexpected terminations that (obviously) don't finish and don't update the archive's status. The next action will attempt to auto-repair the archive. If successful, no harm no foul. If not, it will update the status to "needs repair" and stop.

- Will the server repaired archive trigger recapture of corrupted files in the archive from the client?

Yes. If the repair process finds anything out of place in a layer, that layer is marked as "damaged". A capture that references a damaged layer ignores all file change history, exhaustively recurses through the entire directory tree, and fully recaptures all suspect files.

Alex,

Here's what the Send Diagnostic report includes:

The comments and email address you enter into the Send Diagnostic dialog

An anonymous system profile

The last 20MB of QRecall log entries

Your preference settings

The actions you have defined

Any crash reports stored in ~/Library/Logs/CrashReporter whose names start with QRecall.

That information is then ZIP'd and uploaded to the diagnostic report server. The shell script that prepares this is named "buildreport.sh" and is part of the QRecall application package.

Since most of that information is useful for diagnostics, it helps to send it all. If you're still uncomfortable using the built-in report upload feature, do this:

Create a folder (it can be anywhere, even on the desktop).

Save a plain ASCII text file inside that folder named "comments.txt" containing whatever comments you have about the report.

Open the Terminal.

Use the terminal to navigate to the buildreport.sh script inside the QRecall.app bundle.

In Terminal, execute the buildreport.sh script. The first argument must be the folder you created in the first step.

When the script is done, the folder will contain a report.zip file containing the diagnostic information. If you're satisfied with its content, send that zip file to support (mailto:support).

Or, just send whatever you want to support and we'll figure it out.
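The manual steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the location of buildreport.sh inside the bundle isn't given in this thread, so the code searches for it; running it through /bin/sh is a guess about how the script expects to be invoked; and find_buildreport and build_report are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def find_buildreport(app_bundle: Path) -> Path:
    """Locate buildreport.sh somewhere inside the application bundle."""
    matches = sorted(app_bundle.rglob("buildreport.sh"))
    if not matches:
        raise FileNotFoundError(f"buildreport.sh not found under {app_bundle}")
    return matches[0]


def build_report(app_bundle: Path, report_dir: Path, comments: str) -> Path:
    """Create the folder, write comments.txt, run the script, return report.zip."""
    report_dir.mkdir(parents=True, exist_ok=True)
    # Plain ASCII, as the steps require; non-ASCII comments raise an error.
    (report_dir / "comments.txt").write_text(comments, encoding="ascii")
    script = find_buildreport(app_bundle)
    # The folder is passed as the script's first argument.
    subprocess.run(["/bin/sh", str(script), str(report_dir)], check=True)
    return report_dir / "report.zip"
```

You would call build_report with the path to your QRecall.app and a new folder, then inspect the resulting report.zip before sending it.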

A gray layer (or a missing layer when Show All Layers is off) means that the layer doesn't contain any of the items shown in the browser window. It's designed to help you ignore layers that don't apply to the items you're browsing.

If all recent layers are gray, it's most likely that you're browsing an older set of files.

Everything in an archive has an owner, determined by your identity key. Within each owner are the volumes that have been captured, and inside those are the individual files and folders.

Use the navigation bar at the bottom of the browser window to navigate to the top of the archive. Here you will see all of the individual owners who have captured files to this archive. My guess is that on February 11, 2014 you either changed identity keys or migrated to a new hard drive. From that moment on, everything captured to the archive was added to the new identity key or volume. The items belonging to the old owner or volume are still there, just in a different branch of the archive. This is the one you are probably browsing.

Find the owner for the current identity key and the volume that contains recently captured items. You'll find your recently captured items there.

Also see the Help > QRecall Help > Guide > Advanced > Combine Items topic for how to join two sets of items captured with different identity keys or from different volumes into a single owner/volume.

Rosco Mahoney wrote:So QRecall 2.0 is still on the horizon? Maybe not too long?
Either way I am really looking forward to it.

Rosco,

Like the Great Pumpkin, QRecall 2.0 will rise again.

I'm happy to report that I delivered the final chapter of my new book, Learn iOS 8 App Development, to Apress yesterday. I expect near-full-time development of QRecall to resume in a week or two. I'll post something on the forums when the next beta looks close.

Ralph Strauch wrote:... and got a status window message that the archive required repair. Since that's a long process with a terabyte archive, I decided to rerun the backup first. It completed successfully and has been working since. The Archive Status window, though, refuses to reset and continues to show an "archive needs repair" message and a red circle, as does the status window on my other computer.

That's correct. When a capture, merge, compact, or other action that is modifying the archive encounters a fatal problem reading or writing the archive data, it sets a "needs repair" flag in the archive's status file. This is what triggers the message you're seeing in the status window.

The only actions that will clear that flag are verify and repair. Another successful capture, merge, or whatever does not mean the archive is fixed. None of these actions will repair anything that's broken in the archive (except those things that can be fixed by auto-repair). If the next capture was successful, it just means you dodged the bullet that got your last capture, but it doesn't mean the archive is in perfect working order.

I even told the Status window to forget the archive, thinking that would clear it, and the repair needed message still came back.

The Ignore command just makes the status window forget about that archive's status. The archive, however, still maintains its status internally and the next action (capture, compact, verify, ...) will restore the archive's status in the window in its entirety, which includes the remembered "needs repair" flag.

Is the Status window indicating an underlying problem with the archive that needs repair, even though it seems to be working OK, or is this just a problem with the status report?

It may, or may not, be a problem with your archive. Since your failure was due to a network communications error, it's possible there's nothing actually wrong with your archive. But QRecall doesn't know that, and it won't reset that "needs repair" status until it can verify that. And it only verifies the integrity of the entire archive during the verify and repair actions.

If it really bugs you, you could cheat. The archive's status is maintained in the status.plist file in the archive's package. Use the Ignore command in the status window to cause the QRecall application to discard its copy of the status, and then open the archive package and delete the status.plist file. QRecall will forget everything it knows about the status of that archive.

Or, you could just wait and run a repair.
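For anyone who prefers to script the cheat, here is a minimal sketch. It assumes status.plist sits at the top level of the archive package, which this thread doesn't confirm, and forget_status is a hypothetical helper. It only removes the remembered status; it repairs nothing.

```python
from pathlib import Path


def forget_status(archive_package: Path) -> bool:
    """Delete status.plist from an archive package; return True if one was removed."""
    status = archive_package / "status.plist"
    if status.is_file():
        status.unlink()
        return True
    return False
```

Use the Ignore command in the status window first, as described above, so the application doesn't still hold its own copy of the status.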

Since updating to Yosemite, I'm seeing a couple of other anomalies in the log. The first is repeated "Cannot connect with scheduler" messages, followed by reinstallation of com.qrecall.monitor.plist

Interprocess communication is even more problematic in Yosemite. I've completely rewritten this logic in QRecall 2.0.

and the second is an "Unable to determine path to VM swap files; assuming /var/vm" message within each backup log. (I don't see either of the log entries in my other computer.)

I'm still doing research, but it appears that Yosemite has eliminated the vm_swap process. It's now probably baked into the core OS. As such, there's probably no way of customizing the location of your swap files in 10.10. This has little or no impact on QRecall. If it can't find the vm_swap process it simply uses the standard VM swap location. In the next version of QRecall, I'll remove this test for Yosemite users, but for now just ignore that message.

And as long as I'm writing, I'm curious about what's reflected in the "Ignored some changes" entries I see frequently?

OS X's filesystem events history is a fantastic feature, but has known flaws. If QRecall relied solely on filesystem events to check for modified files, there are files it could permanently miss. (And yes, there are situations where Time Machine will neglect to back up files indefinitely.) QRecall combats this with a limit (which defaults to about a week) on how long it will trust the filesystem history information. Once that time period expires, it ignores the filesystem history and performs an exhaustive scan of the entire filesystem. It also logs the message you see so you know when that happened.
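The trust limit amounts to a simple time check. The sketch below assumes the period is measured from the last exhaustive scan and uses exactly seven days for "about a week"; both are assumptions, and choose_scan is a hypothetical name.

```python
from datetime import datetime, timedelta

# "About a week" per the explanation above; the exact default is assumed.
TRUST_LIMIT = timedelta(days=7)


def choose_scan(last_full_scan: datetime, now: datetime,
                trust_limit: timedelta = TRUST_LIMIT) -> str:
    """Return "exhaustive" once the event history is too old to trust."""
    if now - last_full_scan >= trust_limit:
        return "exhaustive"  # ignore filesystem events, walk the whole tree
    return "events"          # trust the filesystem event history
```

The periodic exhaustive pass bounds how long a file missed by the event history can stay uncaptured.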

Prion wrote:does QRecall run under Yosemite?

All testing so far indicates that the currently released version of QRecall (1.2.3[8]) is fully functional in Yosemite. I've been running QRecall on Yosemite since the first developer release, and have only seen a couple of small cosmetic issues. Testing has included normal day-to-day capture operations, as well as full restores of a bootable Yosemite system.

If anyone reports any Yosemite-related bugs, I'll release a patch. But so far, no one has.

Something is seriously wrong, but I'm not sure what it is.

If I had to guess, I'd say you had volume structure corruption. You said you checked the drive, but I want to make sure you checked the volume the archive is on (not the volume you were capturing).

The initial error was MacOS error -47 (file busy). This makes no sense on a file that's on a local drive and is already open. Then, your repair actions fail because the stale .lock file can't be deleted. The error returned was 22. This is either a dsNoPk5 (no package 5) error or EINVAL (invalid argument). Again, neither of these errors makes any sense unless something internal to OS X is botched up.

My suggestions would be (a) restart the system, (b) run disk repair on the volume containing the archive, and (c) make sure you are NOT running any anti-virus, disk optimization, or disk virtualization software that might be interfering with QRecall's ability to read, write, or delete files on that volume. If your external drive is using ownership and permissions, make sure the files belong to your QRecall user and have read and write access. If they don't, either fix that or turn ownership and permissions off for that drive.

Then try to repair the archive. Repair should delete the .lock file and verify/restore the integrity of the data in the archive.

Mats,

It's difficult to guess what's going on here. If you haven't already, send a diagnostic report (Help > Send Report).

The archive I/O error would indicate a (possibly transient) drive failure. The fact that Disk Utility gives the volume a clean bill of health is excellent news, and your read test of the archive.quanta file is also excellent. While it doesn't test the writability of the file, these are all good signs and would indicate (to me) that this was a transient error, probably a bus failure.

The .lock file is very strange. Is this a local (directly connected) volume? If so, I can't see anything that would cause a sudo rm "/Volumes/Safe House/Dagligen/dagligdags.quanta/.lock" command to fail, beyond something seriously wrong with the operating system or the volume structure. (Note the quotes: the path contains a space.) If this is a networked volume, then sudo will have no effect and you'll need to delete the .lock file (or just repair the archive, which will delete the .lock file) on the computer that's directly connected to the volume.
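Because the volume name contains a space, the path has to be quoted when it is typed into a shell, or rm receives two separate arguments. A small sketch using Python's shlex module shows the difference; it only splits the command strings the way a POSIX shell would and deletes nothing.

```python
import shlex

LOCK_PATH = "/Volumes/Safe House/Dagligen/dagligdags.quanta/.lock"

# Unquoted: the space in "Safe House" splits the path into two arguments.
unquoted = shlex.split("sudo rm " + LOCK_PATH)

# Quoted: shlex.quote wraps the path so it survives as a single argument.
quoted = shlex.split("sudo rm " + shlex.quote(LOCK_PATH))

print(unquoted)
print(quoted)
```

In the unquoted case rm would be handed "/Volumes/Safe" and "House/Dagligen/dagligdags.quanta/.lock", neither of which exists, so the command fails for reasons unrelated to the volume.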

I'd be able to look into this further with a diagnostic report.

Norbert Karls wrote:... why is NAS over WiFi (even 800MBit/s) so excruciatingly slow in comparison with that very same NAS over CAT-5, FW-800 or USB-3.0?

This is the nature of WiFi. While the raw transmission speeds of WiFi and Ethernet have gotten remarkably close, the reality is that the two operate in completely different environments. WiFi has to deal with interference, signal strength, collision avoidance (CSMA/CA vs. CSMA/CD), competing WiFi transmitters, and the latency of radio signals. While there are often things you can do to improve your WiFi performance, like switching channels, most of the performance issues are physical limitations of trying to conduct TCP/IP transactions using a radio.

Can we do anything to improve QRecall performance when backing up via WiFi? Maybe downloading some index data just once and using it locally?

I've considered local caching of some index files, but most of the indexes are already read once and cached in memory (which is quite fast, even using WiFi). The biggest problem is latency and the random accesses needed to match the data being captured with the data already in the archive. Random accesses really take a beating (performance-wise) when latency is high.

For equipment that doesn't wander around, Ethernet is much preferred over WiFi.
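A back-of-the-envelope calculation shows why latency dominates. When each random access has to wait for the previous one, the time spent waiting is roughly the number of accesses times the round-trip time, whatever the link's raw bandwidth. The figures below are purely illustrative, not measurements of QRecall or of any particular network.

```python
def latency_seconds(accesses: int, round_trip_ms: float) -> float:
    """Time spent waiting on latency alone for sequential, dependent accesses."""
    return accesses * round_trip_ms / 1000.0


# Hypothetical numbers: 100,000 random accesses over two links.
wired = latency_seconds(100_000, 0.5)     # 0.5 ms round trip
wireless = latency_seconds(100_000, 5.0)  # 5 ms round trip
print(wired, wireless)
```

With these made-up figures, the same work goes from under a minute of waiting to more than eight minutes, before any data transfer time is counted.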

As for your merge action and schedule, this is entirely a matter of taste, need, and resources. Your archive isn't that big, so you're not constrained by resources, either by your disk space limits or computer time. So the only question is how much information do you want to capture and for how long.

The questions I like to ask are "How often do my files change?" and "How long would it be before I notice a problem I'd want to recover from?" Answering those two questions will guide you on setting up the rolling merge. Here are two additional tips:

- I like to set the "Ignore" period in the rolling merge to at least 3 to 7 days. This is my fine-grained incremental backup period. By never merging any layers in the past few days, I have access to hourly versions of my files should I discover I've done something stupid. And that happens more often than I care to admit.

- If you've got plenty of disk space, your merge and compact actions don't need to be performed every day. Schedule them to run once, maybe twice, a week. Schedule the merge to run before the compact.

Overall, it sounds like you've got things running pretty smoothly.

Help > QRecall Help > Guide: Automation > Actions > Capture > Adding special items to list

Like so many things about computers, it's obvious ... once you know where to find it.

Alexandre Takacs wrote:How do I specify a hidden file or folder to be excluded from my backup selection? The UI allows me to specify Finder-visible files, but what about the (more and more prevalent) hidden locations?

I've got to start an FAQ.

When clicking on any + button to add items to an action:

hold down the Option key to pick invisible items

hold down the Shift key to pick items inside packages

Alternatively, you can use the Finder to expose the content of packages or use the Go to Folder command to navigate to invisible folders. Once there, drag anything into an action item list.