Norbert Karls wrote:Is it possible that the same upgrade that thrashed the exclude filters in the settings.plist's also clobbered the correction code level?
Not likely. While it's true that the redundancy preference is stored in a property file, QRecall refers to the actual redundancy companion files when determining if redundancy is available and how much. So once the redundant data files are created, it doesn't really matter what the settings file says they should be. Conversely, if the data redundancy files are missing or malformed somehow, QRecall may determine that redundant data isn't available and will run with that assumption, even if the settings disagree.
|
|
|
Norbert Karls wrote:
qrecall verify work.medium.quanta --monitor # this is going to fail
|- -|
* verify process failed: unhashed data package missing from index
OK, there's probably a bug in the compact action. I'm not sure what it is, but I need to start with the details, so please send a diagnostic report (in the QRecall app choose Help > Send Report). I might need more information later, but I need that to get started.

From what you've posted, I can at least tell you what's going wrong, even though I don't know why. The "unhashed data package missing from index" error means that you have data quanta that were captured but not yet de-duplicated (using the "defer de-duplication" capture option). For some reason, those data records are not included in the un-hashed quanta index table. That's very weird, because the repair should fix that, and any subsequent compact should then de-duplicate the quanta in that table. Clearly something is going wrong here, so the diagnostic will help me start an investigation. It probably also explains why this only happens on some archives: only some of your capture actions defer de-duplication.

For the record, this is a fairly benign problem[1]. It just means that the next compact action won't know to de-duplicate some of the captured data, so your archive might contain duplicate quanta. The only side effect is that your archive may be larger than it needs to be. The capture and merge actions still work because they don't touch un-de-duplicated data (that's been set aside for the compact action to deal with). As a rule, specific actions verify only the data records and indexes they need to accomplish their task. The verify action verifies everything. That's why capture and merge still work, even when verify fails.

[1] QRecall has an extremely low tolerance for any kind of inconsistency. It might be a curse when dealing with seemingly trivial issues like this, but I prefer the "better correct than sorry" philosophy.
|
|
|
So the mystery is that you can open and capture files to testbackup.quanta, but you can't open and capture files to 3rd backup.quanta.
I don't see any differences in the ownership or permissions for the two archives, and I don't see any stale .lock or .share files that might be blocking access to it.
So, the only thing I can think of at this point is that the file server supports file locking and/or advisory locks and is holding an orphaned lock on one of the files in 3rd backup.quanta. This can happen when a network client obtains a lock on a file and then gets disconnected from the server.
This can usually be solved by restarting the server. If this is a network device that runs 24/7, it's easy for orphaned locks to stay around for weeks. (Note that in cases like this, restarting the clients won't have any effect on the problem.)
If that doesn't work, you can try repairing the 3rd backup archive (presuming you have enough free space), choosing the "Copy recovered content to new archive" option and naming the new archive something like 4th backup. QRecall will extract all of the data in the original archive and use it to create a brand new one. When finished, you can discard the old archive.
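Since the repair-to-copy needs enough free space for a second archive, it's worth checking the destination volume first. This is just generic df/du usage (the volume path is taken from the ls commands elsewhere in this thread; substitute your own):
df -h /Volumes/volume2                        # compare the "Avail" column against the archive's size
du -sh '/Volumes/volume2/3rd backup.quanta'   # approximate size of the existing archive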
|
|
|
On deck for QRecall 2.1 is the ability to run a script (or any POSIX executable) before, and again after, an action runs. A script could, potentially, take steps to prepare either the archive (such as mounting a NAS device) or the items (like backing up a database server or shutting down a VM) for capture. Once the action finishes, a second script can perform any desired cleanup or post-processing (disconnect a NAS, resume a VM, ...). Your situation would require some tools (shell commands, AppleScript, etc.) that could control the running VMs.
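As a rough illustration of what such a pre-capture script might do, here's a minimal sketch using VMware's own vmrun tool (assuming vmrun is on your PATH; the .vmx path is a placeholder, and QRecall 2.1's exact script-hook interface isn't shown):
#!/bin/sh
# Hypothetical pre-capture script: suspend the VM so its disk image is quiescent.
VMX="$HOME/Virtual Machines.localized/Example.vmx"   # placeholder path
if vmrun list | grep -qF "$VMX"; then
    vmrun suspend "$VMX"    # flushes the VM's state to disk
fi
A matching post-capture script could simply run vmrun start "$VMX" nogui to bring the VM back up.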
|
|
|
Ralph, Thanks for the info, but it wasn't quite what I was looking for. I'm interested in seeing the insides of the archive packages. These commands should do the trick:
ls -lna@e '/Volumes/volume2/testbackup.quanta'
ls -lna@e '/Volumes/volume2/3rd backup.quanta'
I'm interested in the ownership, permissions, and existence of the various .lock and .share semaphore files inside the archive package.
|
|
|
Ralph, Looking at the logs, QRecall is still stuck trying to obtain (and later break) the shared file semaphore. I suspect a permissions problem, but it's hard to tell from the logs. I'd be very much interested in knowing the ownership and permissions of all of the files in both the archive that is stuck and the one that is working. If you have the time, open a Terminal window and issue this 'ls' command for each archive:
ls -lna@e /Volumes/Backup/PathToArchive.quanta
Email the results, or post them here.
|
|
|
Alexandre Takacs wrote:My problem is that my system - and those VM - run pretty much 24/7. I might consider suspending them for capture but I'd really like to avoid actually shutting down (if at all possible).
The big question is: does "suspending" a VM flush all of its data to disk? If it does, then you could take a manual approach: create a capture action for just your VM folder, occasionally suspend your VMs, start the capture, and then resume them when the capture is finished. (I have a much better solution for this kind of problem in the next major release of QRecall...)
|
|
|
Alexandre, You are absolutely correct. Making a backup (with any software) of a virtual machine image while that VM is running is probably going to result in an incomplete/corrupt copy. This is a general problem with software that doesn't immediately write all of its data to disk, and it affects databases, video editing software, and so on.

QRecall 2.0 has a new set of action events specifically designed to address this. You can start a capture action when an application quits, and you can ignore or stop a capture action if an application is running. This encourages the capture to run only when the application that might be modifying those files is dormant.

If you run captures regularly and your VMware usage is infrequent, you might consider just skipping backups that occur while that app is running by adding a new condition to your capture action:
- Stop If Application [ VMware Fusion ] Is Open

If you run VMware regularly, and want to capture complete, stable copies of your VM images, run the capture using an Event Schedule:
- Event: Run when [ VMware Fusion Quits ]

My solution was to split off my (Parallels) VM images into their own archive:
- My primary archive, which captures my startup volume, excludes the Parallels folder from all captures (set up in Archive > Settings...).
- I have a second archive that just captures the Parallels folder.
- The capture action for the second archive runs 1 minute after Parallels Desktop quits.
- The action also stops capturing if the application starts again.

Using this scheme, QRecall captures my VM images immediately after I quit Parallels, and never while it's running. I always get a "clean" copy of the VM images. If you want to make occasional copies during the day, just suspend your VMs and quit the app when you take a break (I do it before going to get coffee). By the time you get back, QRecall will have captured your VMs.

Another important note: QRecall depends on File System Change Events (a macOS service that tracks changes made to your filesystem) to quickly determine what items have changed and need to be considered for re-capturing. A few applications, most notably VM apps, make changes in a way that eludes File System Change Events. This means that QRecall won't notice that certain files within your VM package have changed, possibly for weeks. The capture action has a new "Deep Scan" option that ignores File System Change Events and exhaustively examines every file for changes. As you can imagine, this is much slower. Another advantage of splitting your VM captures into a separate archive and action is that you can perform a "Deep Scan" on your VM captures (a fairly shallow folder hierarchy) and continue to use File System Change Events for your regular captures.
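For anyone who wants to approximate the "stop if the application is open" condition in their own wrapper script (outside of QRecall's scheduler), a minimal sketch might look like this; the process name is just an example (it may differ from the app's display name), and the capture itself is left as a placeholder:
#!/bin/sh
# Rough stand-in for QRecall's "Stop If Application Is Open" condition.
if pgrep -xq "VMware Fusion"; then          # substitute your VM application's process name
    echo "VM application is still running; skipping capture" >&2
    exit 1
fi
# ...start the capture of the VM folder here...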
|
|
|
Ralph Strauch wrote:I think I've got that sorted out now, and am back to one key per computer. I assume I can just delete the extra layer that I created unintentionally without affecting the rest of the backup
A more surgical approach would be to find the "new" volume in the other owner and simply delete that from the archive. That way, you don't have to delete any subsequent layers.
I had set the Current Encryption Limit to 2 after you suggested 1 or 2, but I'll take it down to 1 now.
Let me know how that goes, or just send another diagnostic report when you get back.
I seldom even log onto the iMac, but it looks like Qrecall could run scheduled backups of the whole iMac from my account without my being logged in. Is that correct, and should that satisfy the router's desire for a single common user? (I'm uid 501 on both machines.)
Here's the important concept: the account on your computer is independent of the account you use on the file server. This is the concept that is most confusing when working with networked volumes. It does NOT matter what your local user account is. The files written to your file server will belong to the (server) account you authenticate with when you connect to the server. If you and your wife can get set up so that both of your local accounts connect to the file server using the same server account, then the files (archives) on the shared volume will belong to both of you, and it doesn't matter what your local account is or what UID you're using.

The reason I'm short on practical advice is that different file servers, NAS devices, and so on handle this differently. For example, Apple's AirPort Time Capsule has (basically) two different authentication modes for its shared disk: shared and per-account. The "shared" mode allows all network users to access the files on the Time Capsule as if they were all the same user. This is the effect you need, and this is the mode I use with my Time Capsules. All QRecall users can connect to the Time Capsule and use the same archives, since (from the Time Capsule's perspective) they are all the same user. If I switched my Time Capsule to the per-account mode, I'd have the same problem you're experiencing.

Other devices handle accounts and ownership differently. Some server/NAS devices, for example, deliberately extend file ownership to the shared volume using your local account ID, effectively emulating an external drive. So YMMV and you'll need to find the magic combination that works for you.
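If you want to see which account the server is actually assigning, a generic check from each Mac is to compare your local UID with the numeric owner shown for the archive on the mounted share (the volume and archive names below are placeholders):
id -u                                                  # your local UID (e.g. 501)
ls -ldn '/Volumes/SharedVolume/YourArchive.quanta'     # numeric UID recorded as the archive's owner
If both Macs show the same owner UID for the archive, they're effectively the same user as far as the server is concerned.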
|
|
|
Ralph Strauch wrote:I've now mounted the backup drive directly on the MBP and run a backup there, which took almost 11 hours and created a new layer containing 204gb, which is about 10x what I would have expected. <clip> The drive being backed up contains 360GB and the archive now shows in Finder as 713GB. This is about what it was before the backup, so I'm not sure where the new layer went. It's somewhere, though, because I can see it when I open the archive, and can restore files from it, (I should have been making notes during this process, but wasn't.) The new layer contains files that haven't changed in years, so I don't know what criteria Qrecall was using for changed files.
What's happening is QRecall is recapturing every file on your hard drive because the identity key is different. An archive is organized by owners (defined by identity keys), which contain volumes, which contain files and folders. If your identity key changes, QRecall treats your files as if they were from a completely different computer and recaptures them all. These new files will be contained within the new owner and volume at the root of the archive. Use the browser bar at the bottom to navigate to the root of the archive to see the new owner and volumes.

Use the Combine Items command to stitch two owners, or two volumes, that are actually the same into a single history. Alternatively, you can reinstall the original identity key and QRecall will resume updating the history you already have. If you're wondering which key you have installed, go to QRecall > Preferences > Identity Key and hold down the Option key while clicking the Enter Key button. In the "key" field, you'll see a light grey number; that is your key's serial number. Now use this Magic Account URL on the QRecall website, and it will list the serial number of each of your identity keys.

The reason the archive isn't (much) bigger is that it's all duplicate data (duplicates of that "other" computer's files already in the archive).
The Qrecall log did not record the completion of that backup.
That's because the capture isn't finishing. The capture action is crashing while trying to encrypt data. Try setting your Concurrent Encryption Limit to 1 and run the capture again. Meanwhile, I'll fire off another blistering bug report to Apple.
At this point my best solution may be to return my new router and go back to my old system, so my only questions now are where the 204gb backup went and if the fact that it doesn't show as completed in the log is anything I should worry about?
Well, we should be worried about the encryption framework crashing on you. That's not productive at all. If you didn't mean to change identity keys, that's something you should look into. Aside from the annoyance of recapturing your entire hard drive and having a separate history (or taking the trouble to combine them), it doesn't bother QRecall, but it might bother you. Honestly, I would hope that you can get this working on your router. If it's possible for all of the QRecall clients to connect to the router with the same authentication, they should be able to peacefully share the single archive.
|
|
|
Ralph, Version support is a red herring. It's only supported on macOS-formatted volumes because it's a really complex feature, but it won't interfere with QRecall. QRecall tries to use only the most basic filesystem features so it can be compatible with a wide range of filesystems, servers, and volume formats. When it does use a fancy filesystem feature (like file range locking or atomic swaps), it implements fallback methods to accomplish the same thing on filesystems that don't support it.

I'll assert that your problems are entirely ones of ownership and permissions, so here's the short course. On a volume that honors ownership and permissions, you can access the files you own or have been granted access to by their owner. (There are a lot of exceptions and caveats to that rule, but this is the simple explanation.) When you create a file, you own it by default. So Amy creates a file, but Bob can't read or write it because Amy didn't grant Bob access. This is the default/typical file ownership rule at work, and it's what keeps users of your "Guest" account from reading your email. The same rule applies to QRecall archives: Amy creates an archive, but Bob can't modify it.

Now if both Amy and Bob need to capture to the same archive, it's typically because the archive is on an external drive or a shared volume. An external drive is easy to fix by setting the drive to "Ignore ownership and permissions" on both systems. Then both computers can freely access the archive modified by the other.

On shared volumes, things get a little more complicated. If Amy and Bob both have their own accounts on the server, they have the same problem as before: Amy's archive will belong to Amy. Permissions, however, are managed by the server, and there's no option to ignore them. The simplest solution is for Bob to sign onto the server as Amy (or vice versa), or to create a neutral account (Pat) that both users use to connect to the server. When connected to a file server, the files you create belong to the account you authenticated as, not your local user account. This is easy to forget, as most people use the same account name on the server as they do on their computer. The complication here is if both Amy and Bob need to connect to the server using their own accounts for other purposes; most file servers either don't support opening multiple connections to the same server using different accounts, or make it difficult.

Another possible solution is to change the umask of your account. The umask adjusts the default permissions granted to new files. You can set it up so that other users in your group (or everyone) can share files you create by default. Unfortunately, this is a global setting that also affects the files you create locally on your hard drive, so it might not be what you want, and it has obvious security implications.

If you're only using your router/server for QRecall backups, the most manageable solution is to have all clients connect to the server using the same account. (You'll want to save that account name and password in your keychain too.) Then change the ownership of the archive to the account you all share, and you should have trouble-free access to that archive from all of your systems. How you accomplish that with your particular NAS/server is an exercise left to the reader.
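To make the umask idea concrete, here's a generic demonstration of its effect (nothing QRecall-specific, and the change only lasts for that shell session):
umask            # show the current mask, typically 022 (group and others get no write access)
umask 002        # new files created in this shell will now be group-writable
touch demo.txt
ls -l demo.txt   # shows -rw-rw-r--, so other members of your group could modify the file
rm demo.txt
Making that permanent would mean putting the umask line in your shell profile, which is exactly the global side effect mentioned above.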
|
|
|
Ralph Strauch wrote:OK, this time I got a "permission denied."
Then that's the problem. From the perspective of the router's file server, you do not have permission to modify files inside the archive package directory. This pretty much rules out capturing data to it. Soooooo... if you are the only computer & user capturing to this archive, then there might be a simple solution. Let's try this:
chown -R $(id -u) '/Volumes/volume2/3rd backup.quanta'
This will change ownership of the archive package directory, and all of the files it contains, to the user account you are currently logged in as. After that, try accessing/capturing to the archive. Again, if you're the only user/system accessing this archive, that might fix the problem. If you share this archive with other users, then that complicates things a bit.
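If you'd like to confirm the change took effect before trying another capture, a plain ls check works (nothing QRecall-specific):
ls -ldn '/Volumes/volume2/3rd backup.quanta'   # the owner UID shown should now match the output of id -u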
|
|
|
Ralph, Close (but no cigar). You need to test access to the directory inside the archive package. Try these commands:
cd '/Volumes/volume2/3rd backup.quanta'
touch .semaphore
rm .semaphore
However, since it appears that the archive belongs to you, and I assume that you're performing the capture from the same computer using your account, it should work. Which might mean we're back to square one.
|
|
|
Norbert, The problem is pretty obscure, and appears to be either legacy or corrupted data in your archive's settings file. Specifically, the data that describes which items should always be excluded can't be decoded for some reason (the log isn't that detailed). This should fix it:
Open the archive.
Go to Archive > Settings, select and delete all of the excluded items, and save the changes.
Open Archive > Settings again, add back all of the items you want excluded, and save the changes. The capture should now run.

If any of that doesn't work (the same bug might cause the settings dialog to malfunction), you can fix the problem surgically, following these instructions. Close the archive. Using a plain text editor (from the Terminal if you're comfortable with that, or something like BBEdit or TextWrangler), open the settings.plist file inside your archive package. In it, you'll see a definition for your excluded items that will look like this:
<key>ExcludeFilter</key>
<data>
YnBsaXN0MDDUAQIDBAUGWFlYJHZlcnNpb25YJG9iamVjdHNZJGFyY2hpdmVyVCR0b3AS
AAGGoK8QFQcIERUcICQrMDM2OTxAREVLTE1OUlUkbnVsbNQJCgsMDQ4PEFdmaWx0ZXJz
...
XA18DY8NoQ2kDakAAAAAAAACAQAAAAAAAABcAAAAAAAAAAAAAAAAAAANqw==
</data>
Delete these lines, save the file, and return to the previous instructions to add your excluded items back to the archive.
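After editing, a quick way to make sure the file is still a well-formed property list before reopening the archive is macOS's plutil tool (the path below is a placeholder; point it at wherever the settings.plist sits inside your archive package):
plutil -lint '/path/to/YourArchive.quanta/settings.plist'
# it should report: .../settings.plist: OK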
|
|
|
Ralph, I don't think the problem is how the volume is mounted. Basically, QRecall (and most software) doesn't care how the volume appears in the Finder or what its mount point is. Software (QRecall) gets a path to the document, and it uses that path to manipulate the files. The rest is unimportant details (to the software).

I suspect you have a permissions problem, or your SMB server doesn't support the necessary file locking features. QRecall is getting stuck trying to obtain exclusive access to the archive. It uses several techniques to do this, because not all filesystems support the same file locking features. In your case, it's getting stuck trying to obtain a "distributed lock". The exact mechanics of a distributed lock vary from one filesystem to the next, but most involve creating a "lock" file used to coordinate access from multiple clients. This is what I found in your log:
2016-12-17 21:32:43.883 breaking distLock after 150 tries
2016-12-17 21:33:51.114 breaking distLock after 150 tries
2016-12-17 21:35:00.437 breaking distLock after 150 tries
2016-12-17 21:36:11.570 breaking distLock after 150 tries
This should never happen, and if it does it should only happen once; once a stale distLock is broken, it should start working again immediately, but clearly this one still isn't. When a distributed lock is implemented as a file, QRecall must have read and write access to the directory that contains it. You can test this in the Terminal. With the archive mounted on your SMB volume, issue these commands:
user$ cd <Drag and drop your archive icon here>
user$ touch .semaphore
user$ rm .semaphore
The .semaphore file is the filesystem object used to coordinate the lock. If you can create and delete it as a user, then the lock should work. If any of these commands report errors, then I suspect you have a permissions or access issue. QRecall documents are just like any other file; your user account must have read, write, and search permission on the contents of the archive package in order to update it.

Now if these commands all work just fine, and the rest of the archive package has the correct ownership and permissions, then I'm stumped. I would suspect that something about the file locking features implemented by your SMB server doesn't line up with what macOS is expecting.
|
|
|