QRecall

Ralph Strauch wrote:I've now mounted the backup drive directly on the MBP and run a backup there, which took almost 11 hours and created a new layer containing 204gb, which is about 10x what I would have expected.
<clip>
The drive being backed up contains 360GB and the archive now shows in Finder as 713GB. This is about what it was before the backup, so I'm not sure where the new layer went. It's somewhere, though, because I can see it when I open the archive, and can restore files from it, (I should have been making notes during this process, but wasn't.) The new layer contains files that haven't changed in years, so I don't know what criteria Qrecall was using for changed files.

What's happening is QRecall is recapturing every file on your hard drive because the identity key is different. An archive is organized by owners (defined by identity keys), which contain volumes, which contain files and folders. If your identity key changes, QRecall treats your files as if they were from a completely different computer and recaptures them all. These new files will be contained within the new owner and volume at the root of the archive.

Use the browser bar at the bottom to navigate to the root of the archive to see the new owner and volumes. Use the Combine Items command to stitch two owners, or two volumes, that are actually the same into a single history.

Alternatively, you can reinstall the original identity key and QRecall will resume updating the history you already have. If you're wondering which key you have installed, go to QRecall > Preferences > Identity key and hold down the Option key while clicking on the Enter Key button. In the "key" field, you'll see a light grey number. That is your key's serial number. Now use this Magic Account URL on the QRecall website, and it will list the serial number of each of your identity keys.

The reason the archive isn't (much) bigger is because it's all duplicate data. (Duplicates of that "other" computer's files already in the archive.)

The Qrecall log did not record the completion of that backup.

That's because the capture isn't finishing. The capture action is crashing trying to encrypt data. Try setting your Concurrent Encryption Limit setting to 1 and run the capture again. Meanwhile, I'll fire off another blistering bug report to Apple.

At this point my best solution may be to return my new router and go back to my old system, so my only questions now are where the 204gb backup went and if the fact that it doesn't show as completed in the log is anything I should worry about?

Well, we should be worried about the encryption framework crashing on you. That's not productive at all.

If you didn't mean to change identity keys, that's something that you should look into. Aside from the annoyance of recapturing your entire hard drive and having separate history, or taking the trouble to combine them, it doesn't bother QRecall, but it might bother you.

Honestly, I would hope that you can get this working on your router. If it's possible for all of the QRecall clients to connect to the router with the same authentication, they should be able to peacefully share the single archive.

Ralph,

Version support is a red herring. This is only supported on macOS formatted volumes because it's a really complex feature. But it won't interfere with QRecall. QRecall tries to use only the very basic filesystem features so it can be compatible with a wide range of filesystems, servers, and volume formats. When it does use a fancy filesystem feature (like file range locking or atomic swaps), it implements fallback methods to accomplish the same thing when it finds itself using a filesystem that doesn't support it.

I'll assert that your problems are entirely ones of ownership and permissions, so here's the short course.

On a volume that honors ownership and permissions, you can access the files you own or have been granted access to by their owner. (There are a lot of exceptions and caveats to that rule, but this is the simple explanation.) When you create a file, you own it by default.

So Amy creates a file, but Bob can't read or write it because Amy didn't grant Bob access. This is the default/typical file ownership rule at work, and what keeps users of your "Guest" account from reading your email. The same rule applies to QRecall archives.

So Amy creates an archive, but Bob can't modify it.

Now if both Amy and Bob need to capture to the same archive, it's typically because the archive is on an external drive or a shared volume. An external drive is easy to fix by setting the drive to "Ignore ownership and permissions" on both systems. Now both computers can freely access the archive modified by the other.

On shared volumes things get a little more complicated. If Amy and Bob both have their own accounts on the server, they have the same problem as before; Amy's archive will belong to Amy. Permissions, however, are managed by the server and there will be no option to ignore them.

The simplest solution is for Bob to sign onto the server as Amy, vice versa, or create a neutral account (Pat) that both users can use to connect to the server. When connected to a file server, the files you create belong to the account you authenticated as, not your user account. This is easy to forget as most people use the same account name on the server as they do on their computer. The complication here is if both Amy and Bob need to connect to the server using their own accounts for other purposes. Most file servers either do not support, or make it difficult to open, multiple connections to the same server using different accounts.

Another possible solution is to change the umask of your account. The umask adjusts the default permissions granted to new files. You can set it up so that other users in your group (or everyone) can share files you create by default. Unfortunately, this is a global setting that also affects the files you create locally on your hard drive, so it might not be what you want and has obvious security implications.
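
As a small illustration of what the umask does (a sketch only, not a recommendation to change yours), the following creates a file in a throwaway directory under two different umask values and prints the resulting permission bits. The perms_with_umask helper name is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: create a file under the given umask inside a scratch
# directory and print its permission string (the first 10 columns of ls -l).
perms_with_umask() {
    mask=$1
    dir=$(mktemp -d) || return 1
    # Run in a subshell so the caller's umask is left untouched.
    ( umask "$mask" && touch "$dir/example" )
    ls -l "$dir/example" | cut -c1-10
    rm -rf "$dir"
}

perms_with_umask 022   # common default: group and others may only read
perms_with_umask 002   # group members may also write
```

The second call shows the group-writable default that would let another account in the same group update files you create.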

If you're only using your router/server for QRecall backups, the most manageable solution is to have all clients connect to the server using the same account. (You'll want to save that account name and password in your keychain too.) Now change the ownership of the archive to the account you all share, and you should have trouble-free access to that archive from all of your systems. How you accomplish that with your particular NAS/Server is an exercise left to the reader.

Ralph Strauch wrote:OK, this time I got a "permission denied."

Then that's the problem. From the perspective of the router's file server, you do not have permission to modify files inside the archive package directory. This pretty much rules out capturing data to it.

Soooooo

If you are the only computer & users capturing to this archive, then there might be a simple solution. Let's try this:

chown -R $(id -u) '/Volumes/volume2/3rd backup.quanta'

This will change ownership of the archive package directory, and all of the files it contains, to the user account you are currently logged in as. After that, try accessing/capturing to the archive.

Again, if you're the only user/system accessing this archive that might fix the problem. If you share this archive with other users, then that complicates things a bit.
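
To confirm that the ownership change took effect, you can compare the numeric owner of the package directory with your own user ID. This is a minimal sketch: owned_by_me is a name invented here, the path is the one from the command above, and on a network volume the ID the client reports may be mapped by the server, so treat the result as a hint.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given file or directory is owned by the
# current user. Uses numeric IDs (ls -n) so account names don't matter.
owned_by_me() {
    owner=$(ls -ldn "$1" | awk '{print $3}')
    [ "$owner" = "$(id -u)" ]
}

archive='/Volumes/volume2/3rd backup.quanta'
if [ -e "$archive" ]; then
    if owned_by_me "$archive"; then
        echo "archive is owned by you"
    else
        echo "archive is owned by someone else"
    fi
fi
```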

Ralph,

Close (but no cigar)

You need to test the access to the directory inside the archive package. Try these commands:

cd '/Volumes/volume2/3rd backup.quanta'

touch .semaphore

rm .semaphore

However, since it appears that the archive belongs to you, and I assume that you're performing the capture from the same computer using your account, it should work. Which might mean we're back to square one.

Norbert,

The problem is pretty obscure, and appears to be either legacy or corrupted data in your archive's settings file.

Specifically, the data that describes which items should always be excluded can't be decoded for some reason (the log isn't that detailed).

This should fix it:

Open the archive.

Go to Archive > Settings, select and delete all of the excluded items. Save the changes.

Open Archive > Settings again, and add back in all of the items you want excluded. Save the changes.

The capture should now run.

If any of that didn't work (the same bug might cause the settings dialog to malfunction), you can fix the problem surgically, following these instructions.

Close the archive.

Using a plain text editor (from the Terminal if you're comfortable with that, or something like BBEdit or TextWrangler), open the settings.plist file inside your archive package. In it, you'll see a definition for your excluded items that will look like this:

<key>ExcludeFilter</key>

<data>

YnBsaXN0MDDUAQIDBAUGWFlYJHZlcnNpb25YJG9iamVjdHNZJGFyY2hpdmVyVCR0b3AS

AAGGoK8QFQcIERUcICQrMDM2OTxAREVLTE1OUlUkbnVsbNQJCgsMDQ4PEFdmaWx0ZXJz

...

XA18DY8NoQ2kDakAAAAAAAACAQAAAAAAAABcAAAAAAAAAAAAAAAAAAANqw==

</data>

Delete these lines, save the file, and return to the previous instructions to add your excluded items back to the archive.
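
Before editing the file by hand, it's worth keeping a copy outside the archive and confirming that the key is actually there. A cautious sketch, assuming settings.plist is stored as XML text as shown above; backup_settings is a hypothetical helper name and both paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: copy settings.plist from the archive package ($1) into
# a backup directory ($2), then report whether the ExcludeFilter key exists.
backup_settings() {
    plist="$1/settings.plist"
    [ -f "$plist" ] || { echo "no settings.plist in $1" >&2; return 1; }
    cp -p "$plist" "$2/settings.plist" || return 1
    if grep -q '<key>ExcludeFilter</key>' "$plist"; then
        echo "ExcludeFilter present"
    else
        echo "ExcludeFilter not found"
    fi
}

# Usage (placeholder paths):
#   backup_settings '/path/to/your archive.quanta' "$HOME/Desktop"
```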

Ralph,

I don't think that the problem is how the volume is mounted. Basically, QRecall (and most software) doesn't care how the volume appears in the Finder or what its mountpoint is. Software (QRecall) gets a path to the document, and it uses that path to manipulate the files. The rest is unimportant details (to the software).

I suspect you have a permissions problem or your SMB server doesn't support the necessary file locking features.

QRecall is getting stuck trying to obtain exclusive access to the archive. It uses several techniques to do this, because not all filesystems support the same file locking features. In your case, it's getting stuck trying to obtain a "distributed lock". The exact mechanics of a distributed lock vary from one filesystem to the next, but most involve creating a "lock" file used to coordinate access from multiple clients. This is what I found in your log:

2016-12-17 21:32:43.883 breaking distLock after 150 tries

2016-12-17 21:33:51.114 breaking distLock after 150 tries

2016-12-17 21:35:00.437 breaking distLock after 150 tries

2016-12-17 21:36:11.570 breaking distLock after 150 tries

This should never happen, and if it does it should only happen once; once a stale distLock is broken, it should start working again immediately, but clearly this one still isn't.
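
If you want to check your own log for the same symptom, counting the matching lines is enough. The sketch below uses the four lines quoted above as sample input; count_lock_breaks is a name invented for this example, and you would pass it the path to wherever your log file actually lives.

```shell
#!/bin/sh
# Hypothetical helper: count how many lines of a log file report that a stale
# distributed lock was broken.
count_lock_breaks() {
    grep -c 'breaking distLock' "$1"
}

# Sample input copied from the log excerpt above.
log=$(mktemp)
cat > "$log" <<'EOF'
2016-12-17 21:32:43.883 breaking distLock after 150 tries
2016-12-17 21:33:51.114 breaking distLock after 150 tries
2016-12-17 21:35:00.437 breaking distLock after 150 tries
2016-12-17 21:36:11.570 breaking distLock after 150 tries
EOF
count_lock_breaks "$log"
rm -f "$log"
```

Any count above one in a short span is the pattern described here: the lock is being broken repeatedly instead of recovering.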

When a distributed lock is implemented as a file, QRecall must have read and write access to the directory that contains it. You can test this in the Terminal. With the archive mounted on your SMB volume, issue these commands:

user$ cd <Drag and drop your archive icon here>

user$ touch .semaphore

user$ rm .semaphore

The .semaphore file is the filesystem object used to coordinate the lock. If you can modify it and delete it as a user, then it should work. If any of these commands report errors, then I suspect you have a permissions or access issue. QRecall documents are just like any other file; your user account must have read, write, and search permission on the contents of the archive package in order to update it.
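
The same test can be wrapped in a small function that reports the result instead of leaving you to read error messages. This is only a sketch: can_modify_dir and the probe file name are made up here, and the probe deliberately uses a different name from .semaphore so it can't collide with a real lock file.

```shell
#!/bin/sh
# Hypothetical helper: check that the current user can create and then delete
# a file in the given directory.
can_modify_dir() {
    probe="$1/.access_probe.$$"
    if touch "$probe" 2>/dev/null && rm "$probe" 2>/dev/null; then
        echo "ok: can create and delete files in $1"
    else
        echo "failed: cannot create or delete files in $1" >&2
        return 1
    fi
}

# Usage (placeholder path):
#   can_modify_dir '/Volumes/YourShare/your archive.quanta'
```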

Now if these commands all work just fine, and the rest of the archive package has the correct ownership and permissions, then I'm stumped. I would suspect that something about the file locking features implemented by your SMB server doesn't line up with what macOS is expecting.

Ralph Strauch wrote:What's my best course forward at this point? I assume that the archive is as big as it is because it now contains both unencrypted and encrypted data.

Ralph,

Thanks for sending a diagnostic report. From the logs, I can see that your encryption action is failing because the macOS encryption services are crashing. Some systems (and I don't yet have a sense for which ones) have a great deal of difficulty executing multiple encryption/decryption operations simultaneously. Sometimes the operations fail (which QRecall can deal with), but for some users the operations crash.

So what's happening is that your encryption action is crashing. For safety, QRecall first makes an encrypted copy of the archive data, so if there is a problem, you won't be left with a half-encrypted archive. That intermediate copy is what's using up your disk space.

First, let's fix the disk space problem

Control/Right+click the archive in the Finder and choose Show Package Contents

Trash any files that begin with the name repository_scribble

Close the window and empty the trash
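
If you'd rather see how much space those files are using before you trash them, a listing from the Terminal will do. A sketch that only lists and deletes nothing; list_scribble_files is a hypothetical name, and it assumes the files sit at the top level of the package, as they appear in Show Package Contents.

```shell
#!/bin/sh
# Hypothetical helper: show the size and path of each top-level file in the
# archive package whose name begins with "repository_scribble".
list_scribble_files() {
    find "$1" -maxdepth 1 -type f -name 'repository_scribble*' -exec du -h {} +
}

# Usage (placeholder path):
#   list_scribble_files '/path/to/your archive.quanta'
```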

Now you should be able to start a new encryption task. But to keep that from crashing again, you'll want to limit the number of concurrent encryption operations QRecall will perform.

Make sure you've upgraded to QRecall 2.0.7

Go to QRecall > Preferences > Advanced

Find the Concurrent Encryption Limit setting and set it to 1 or 2

Now try the encrypt command again.

You can try adjusting the concurrent limit to a higher value, just realize that if it's too high some QRecall actions will randomly crash. One of the best ways to test a higher setting is using a verify command; verify performs a lot of concurrent encryption operations and there won't be any data loss if it crashes.

Alexandre,

I suspect something else is going on.

QRecall can certainly be configured to "just work," but in QRecall that's a choice.

It doesn't make sense to me that you can keep 6 months of Time Machine data on a volume, but QRecall runs out of space in a couple of weeks. QRecall is much more space efficient than Time Machine. You should be able to keep anywhere from 2 to 4 times as much history using QRecall.

For example, my main development system has a 960GB SSD. I capture my home directory every 3 hours and the entire volume daily. There's a rolling merge that keeps every capture for the past three days, consolidates daily captures for 30 days, then consolidates those into weekly layers for 26 weeks, and then finally merges those into monthly layers going back 18 months.

This means I have captured data, at various levels of granularity, stretching back over 2 years. The archive size is 1.2TB (60% of a 2TB volume). I have never, once, had to "manually" manage anything about this archive.
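
As a rough check of that figure, assuming the four tiers stack end to end (which the description suggests but doesn't state outright) and using 30-day months:

```shell
#!/bin/sh
# Back-of-the-envelope arithmetic under the stated assumptions:
# 3 days of every capture + 30 daily + 26 weekly + 18 monthly layers.
retention_days() {
    echo $(( 3 + 30 + 26 * 7 + 18 * 30 ))
}

days=$(retention_days)
echo "$days days, or about $(( days / 365 )) years and $(( days % 365 )) days"
```

That works out to 755 days, a little over two years.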

I'd be happy to help you find out what's going on. There are a number of possibilities. QRecall might be capturing a lot more data than Time Machine backs up, something else is using the space on your archive volume, or you need a different set of merge and compact actions to automatically manage its size.

Taking the last one first, I would recommend deleting all of the actions you have for this archive and using the Capture Assistant (in the Help menu) to create a fresh set of actions. For you, I suggest choosing the "Keep 5 days, then less frequently for 1 month" and "Yes, discard the oldest items to make more room" options in the assistant. When the assistant is done, I invite you to review the rolling merge action and adjust the time scale/granularity to your liking. You might also want to create an additional capture action that captures your home folder every few hours during the day. This combination will most closely emulate Time Machine's default schedule.

One of the important differences between QRecall and Time Machine is that when you run out of disk space, QRecall stops and lets you know. Time Machine just starts deleting things, without warning, until it has enough room.

Mike M wrote:Or ask them to pretend they are teaching it to me.

There's nothing that focuses one's learning like having to teach it.

Let me know if you need to teach me anything else.

Mike,

I can appreciate that some of these concepts give you headaches. Filesystems are hard.

I'll address some of your specific questions, but for the most part you can think of layers as "checkpoints" or "snapshots" or whatever concept you find easiest to deal with.

In the case where you have a single capture action, so that you capture the exact same set of files each time, and nothing else, most of these concepts can be equated. Specifically:

Here is how that is different from a layer. Your documentations describes layers as containing (1) files, (2) deltas from previous layers, and/or (3) some indication that a file has been deleted.

Start by ignoring the implementation details. How QRecall represents files and folders in a layer is really immaterial. Conceptually, each layer contains a complete copy of every item captured. This is your "checkpoint."

(In reality, a layer doesn't contain anything but references to unique database records, which contain the data and metadata of those items; the reason it's done that way is because it allows QRecall to store all of this data in the minimal amount of space possible.)

Your layers also have numbers rather than time tags--at least that's how you always describe it in the documentation.

Layer numbers are simply convenient labels that make referring to them in the interface, on the command line, or in actions, simple and easy to understand. Every layer has a date, the date it was captured. This date appears in the layer pane of the archive browser.

It's also useful to know that each item has a capture date (which you can see in the inspector panel). This is the exact moment in time that specific item was captured.

So what about the idea of checkpoints? Well, let's say there's a checkpoint at time T. I can then know, simply, that enough information is in the archive to recreate the state of all captured items at time T.

That's a layer. If an item exists in a layer, then that item can be recalled by rewinding the archive back to that layer.

This is easily visualized using the browser timelines. Select a captured item in the browser, and QRecall will draw its timeline back through the layers where that item was captured, recaptured, or simply existed in. Conceptually, if a timeline intersects a layer, that item "exists" in that layer.

I don't have to concern myself with how that data is represented. It might be deltas, it might need to refer to prior layers, it might look at other markers in the layer... but I don't need to know any of that stuff.

Now you're getting the idea.

Now let's consider the concept of "merging layers." This gives me a headache. I think it's actually a quite involved algorithm, and your examples have lots of parts in their diagrams.

What if, instead, we say that QRecall "deletes checkpoints"? No longer is there the concept of "merging." Instead, I can think about a "checkpoint deletion" as removing information from the archive ... and it's easy to think about that. The information lost, simply, is the state of the file system at the deleted checkpoints.

I don't have to care whether this is done via merging layers or any other algorithm. I also don't get a headache trying to think about the behavior of the system.

Again, if you limit the example to a single, uniform capture action that captures the same set of files every time, then these concepts are equivalent. And if that makes your life easier, then use that.

Most of the rest of your description is accurate. Basically, if an item exists in a layer then you can recall that item at some future date. Rolling merges eliminate intermediate layers/checkpoints so that only the last captured version in any particular timespan is retained. Whether you imagine layers being merged or checkpoints being deleted, the results are the same.

Now let me stop here and ask, is this actually an accurate way of thinking about it?

It is, as long as your layers remain simple. But layers can get complicated.

Consider capturing your whole startup volume at 3:00, then repeatedly capturing just your home folder every hour during the day. At the end of the day you merge all of those layers together. What do you have?

You have an interesting mixture of items captured at 3:00 (your applications) and newer items captured as late as 23:00 (in your Documents folder). That's because merging is just that; you can't think of it as "deleting" all of the earlier layers, because there's data in the very first layer that's not superseded by subsequent layers.

Now take another example of two volumes, or even two separate computer systems. Volume "A" is captured to layer 1. Later, volume "B" gets captured to layer 2. You then merge those two layers. What information was deleted?

The answer is nothing. The new layer contains a complete copy of both volumes "A" and "B" because none of the items in those layer sets intersect.
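
Both examples can be pictured with a toy model. This is emphatically not QRecall's algorithm or file format, just an illustration of the behavior described above: a "layer" is a text file whose first line is the captured scope (a path prefix) and whose remaining lines are "path version" pairs, and merge_layers is a made-up function. Within the newer layer's scope the newer state wins; anything outside it survives from the older layer.

```shell
#!/bin/sh
# Toy model only. Paths in this toy must not contain spaces.
merge_layers() {
    older=$1
    newer=$2
    scope=$(head -n 1 "$newer")
    {
        # Older items outside the newer capture's scope are not superseded.
        tail -n +2 "$older" | awk -v s="$scope" 'index($1, s) != 1'
        # Inside that scope the newer layer is the most recent state.
        tail -n +2 "$newer"
    } | sort
}

tmp=$(mktemp -d)
# Whole volume captured at 03:00, home folder recaptured at 23:00.
printf '%s\n' '/' '/Applications/Mail.app 03:00' '/Users/amy/notes.txt 03:00' > "$tmp/layer1"
printf '%s\n' '/Users/amy/' '/Users/amy/notes.txt 23:00' > "$tmp/layer2"
merge_layers "$tmp/layer1" "$tmp/layer2"
rm -rf "$tmp"
```

The merged list keeps the 03:00 application alongside the 23:00 document. With two layers whose scopes don't overlap (volumes "A" and "B"), every item from both comes through.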

Sorry if that makes your head hurt.

Alexandre,

Scheduled actions are the way you "set and forget" in QRecall. The actions described in the earlier post will manage your archive's overall size automatically, every day. Add a routine verify action and maybe an automated repair action, and you're good to go.

If you want help setting up a comprehensive set of actions, use the QRecall Assistant (under the Help menu). It will ask you a series of questions and create a set of actions that implement your answers. You can then review the actions the assistant created, edit them to refine your solution, or throw them away and start over.

The notable difference is that Time Machine has one management algorithm which you have no control over. QRecall gives you broad discretion on how you manage your archive. And that management can be completely automated, manual, or some combination you choose.

Alexandre Takacs wrote:Is it possible to set a fixed limit for the archive size - say 1TB - and let QRecall manage it so that it doesn't grow past that ?

No such option currently exists, for a variety of reasons, which I'll touch on in a moment.

If you really want to have a "hard" limit on how much your archive grows, I suggest placing the archive in its own drive partition of the desired size. QRecall detects when the disk is nearly full and will abort captures and other actions that would fail if they ran out of disk space.

If you do this, you'll probably want to adopt the actions created by the capture assistant when you choose "keep items as long as possible" and "use all available space":

Merge the first three (0 through 2) layers whenever the free space on the volume is less than 10%, run daily.

Compact the archive whenever the free space on the volume is less than 10%, run daily, after the merge.

These two actions mimic Time Machine's logic. They monitor the free space on the volume and start discarding the oldest items in the archive whenever the volume starts to get too full. The rest of the time, they do nothing.
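
If you're curious where your volume stands relative to that 10% threshold, you can approximate the same test from the Terminal. This is a sketch of the condition only, not how QRecall measures it: free_percent is an invented name, the figure is derived from the rounded capacity column of df, and "/" stands in for your archive volume.

```shell
#!/bin/sh
# Hypothetical helper: print the approximate percentage of free space on the
# volume holding the given path, using the POSIX output format of df.
free_percent() {
    used=$(df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    # Bail out if df did not report a numeric capacity.
    case $used in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo $(( 100 - used ))
}

volume=/
free=$(free_percent "$volume") || exit 1
if [ "$free" -lt 10 ]; then
    echo "less than 10% free on $volume"
else
    echo "at least 10% free on $volume"
fi
```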

A requested feature, which is on the wish list for the next major version of QRecall, is to add a new "archive size more than X" schedule condition. That would allow actions to be scheduled based directly on the archive's size, rather than indirectly based on the free space available on its volume. If a feature like that would satisfy your needs, let me know.

It would be technically very easy to add an option that strictly limits the size of an archive. But doing so just sets up QRecall for failure (rather than success). The logic that prevents the archive from getting too big when the disk space is low works by aborting capture actions, leaving items un-captured. In this situation, QRecall is choosing to protect the integrity of the archive over capturing all of the new items. But this is still a capture failure.

If I implemented a user-configurable archive size, it would just create a situation where QRecall arbitrarily stops working, and that seems like a bad idea.

Mike,

There are two reasons an item will exist in an archive: it was captured in a layer or it has been preserved as a deleted item.

tl;dr: captured items in older layers are preserved until they are merged with more recent layers. The "Keep deleted items..." option will ensure that deleted items are kept in the archive for a minimum period of time.

Here's the long-winded explanation...

Items in layers

The first should be obvious. Each capture creates a "layer" in the archive that represents a snapshot of all of the items at that moment in time. Let's take this example:

Create a document on Monday. Perform a capture.

Delete the document on Tuesday. Perform a capture.

Your archive now has two layers. The first layer contains the document, the second layer does not. As long as both layers exist, you can always rewind the archive back to the first layer (using the layer shade) and recall the document as it was captured on Monday.

The document in the first layer will exist in the archive until that layer is merged with a subsequent layer. Merging the two layers combines them and keeps only the most recent state. Since the document didn't exist in the second layer, the document will not exist in the merged layer.

To answer your first question, merging layers is the natural way in which old documents, or old versions of documents, get deleted from the archive. As layers are merged, older items are discarded, and their space in the archive reclaimed. You can automate this using a merge action. How often you merge layers, and the granularity of those merged layers, determines how long captured items will remain in the archive.

Keep Deleted Items

Merging layers can sometimes discard items in a seemingly haphazard fashion, particularly for short-lived items. For example, let's say that you capture every day, periodically merge all of the layers captured during the week into a single layer (using a rolling merge action), and you keep the past 52 such layers in your archive.

A document that was created on Friday and deleted the following Monday will persist in the archive for at least a year. (The document existed at the end of the week, so it will exist in the merged layer for that week, and you keep 52 such layers, so the document will be kept for at least 52 weeks.) On the other hand, a document created on Monday and deleted on Friday won't exist in the archive for more than a couple of weeks. (When the layers for that week get merged, the document is purged because it didn't exist at the end of the week.)

To combat this capricious behavior, the archive has a "Keep deleted items for at least:" setting (see Archive > Settings....). If you set this period then a merge action will preserve the last version of a deleted item that would normally be discarded, if that item existed within the period you set.

Going back to the Monday-Friday example, if your "Keep" setting was set to 2 months, the file created on Monday and deleted on Friday would be preserved when the layers for that week were merged. The file exists as a special "deleted" item in the layer. Use the View > Show Deleted Items command to see and recall deleted items.

Deleted items are purged naturally as part of a subsequent merge or by a compact action. The compact action seeks out deleted items that are now past their "keep" date (two months, in the previous example) and removes them, without merging any layers.
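
The "past their keep date" test can be pictured as a simple age comparison. This is only a sketch of the idea, not QRecall's code: the function name, the use of epoch seconds, and the choice of "when the item last existed" as the reference point are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical rule: a deleted item is past its keep date once more than
# keep_days have elapsed since it was last known to exist.
past_keep_date() {
    last_existed=$1   # epoch seconds
    keep_days=$2
    now=$3            # epoch seconds
    [ $(( now - last_existed )) -gt $(( keep_days * 86400 )) ]
}

now=$(date +%s)
if past_keep_date $(( now - 90 * 86400 )) 60 "$now"; then
    echo "last existed 90 days ago, keep period 60 days: eligible for removal"
fi
```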

To sort-of answer your second question, use the "Keep deleted items..." setting to enforce a grace period in which you can still recall the last captured version of a deleted item.

Other Reasons

The special "Delete Item(s)" command will delete any arbitrary item (file, folder, volume) from all layers of an archive, as if that item had never existed. This is mostly for cases where you have a very large file occupying space in the archive and you have no interest in keeping it, but don't want to sacrifice any other items or layers.

For completeness, I should mention that all of the file deletion logic only applies to captured items. If you create a file, immediately delete it, and then perform a capture, the file will never be captured and its fate is outside QRecall's purview.

Jeffrey,

Send a diagnostic report (QRecall > Help > Send Report...) and we'll investigate further.

Jeffrey Fort wrote:I have an "encryption key password" and a "recovery key."

An "encryption key" is the cryptographic key used to encrypt, and decrypt, the data in your archive. It is stored in a "key file" in your home directory.

An "encryption key password" is a way of protecting that key file from unwanted agents by encrypting the file with a password.

If you've encrypted your key file with a password, QRecall will need you to supply that password every time it opens the archive. You can enter it manually when browsing the archive. For actions to run automatically, it will require that you store the password on your keychain.

When I try to restore I get a notice that I need a password.

That's a tough one. If you get this dialog when you open the archive, it's probably asking for the encryption key password (see above). Or it might be asking for your recovery key passphrase (see below). But if it's telling you that it needs to perform privileged operations, then it's asking for your administration account password. To avoid that in the future, go to QRecall > Preferences > Authorization and pre-authorize QRecall to use administrative privileges.

A "recovery key" is a backup of your key file stored in the archive itself, and protected with a passphrase. This is independent of your encryption key password (if any). It's basically a protected backup of your encryption key file and is only needed if you've lost your key file. (Without your encryption key file, your archive is unusable.)

For example, if you lose your startup volume and need to restore from scratch, you would start by installing a fresh copy of macOS. But that fresh copy of macOS doesn't have your encryption key file, so QRecall can't open up your archive and restore your hard drive.

That's where the "recovery key" comes into play. When you open the archive, QRecall will prompt you for the recovery key passphrase. Enter it, and it will restore the encryption key file from the secure backup copy stored in the archive. Once the encryption key file has been recovered, QRecall can then open the archive and retrieve your files.

For an explanation of how all of this works, see QRecall > QRecall Help > Guide > Advanced > Encryption. The section "Do not lose your encryption key!" is highly recommended reading.