Charles Watts-Jones wrote:Having installed on a late 2007 iMac (iMac7,1) running 10.6.3 with 4 GB RAM, I turned off scheduled wake-up in Energy Saver and asked QRecall to wake and then sleep machine on completing its tasks. Worked flawlessly - great!
Excellent news!
I didn't, however, notice a faster Verify (1:07:12 for a 237.6 GB archive) than I got with QRecall 1.1.4. Don't know if I should have expected this(?).
Verify was highly optimized a couple of versions back, and will be I/O bound for most users. In other words, QRecall can verify the archive as fast as it can be read from the drive. So until you get a faster hard drive/interface, verify won't get any faster. The principal improvements in QRecall 1.2 are more efficient use of the quanta and packages indexes, expanded RAM usage (thanks to 64-bit addressing), and multi-processor load balancing. These changes improve capture performance the most, and impact compact and merge actions to a lesser degree.
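As a rough sanity check on the "I/O bound" point, the verify time quoted above can be turned into an average read rate. This is only back-of-the-envelope arithmetic, and it assumes "GB" means 10^9 bytes, which the thread does not state.

```python
# Average read rate implied by the verify time quoted in the post.
# Assumption: 1 GB = 10**9 bytes and 1 MB = 10**6 bytes.

def throughput_mb_per_s(size_gb: float, hours: int, minutes: int, seconds: int) -> float:
    """Return the average rate in MB/s for reading size_gb in the given time."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    return size_gb * 1000 / elapsed

rate = throughput_mb_per_s(237.6, 1, 7, 12)
print(f"{rate:.1f} MB/s")
```

The result is a little under 59 MB/s, which is plausibly close to what a single hard drive or external interface of that period could sustain, so a faster verify would likely need faster hardware rather than faster code.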
|
|
|
Dawn to Dusk Software is pleased to announce the beginning of the QRecall 1.2 beta test program. To get started, go to the QRecall Download page. If you already have a permanent or trial identity key, you can continue to use that. If you don't have a permanent key, or your trial key has expired, you can obtain a free Beta Identity Key that will be valid for the entire beta test period. The theme of QRecall 1.2 is "performance and interface." The performance part involves a lot of under-the-hood changes and the bulk of that work is complete. Later beta versions will begin to incorporate a new interface, but I wanted to get the mission-critical code changes done first so they can get as much testing time as possible. As always, feedback is welcome and encouraged.
|
|
|
David Cretney wrote:It seems random but the qrecall error is consistent. Is this a disk failing? I ran disk util repair permissions. Do I need to repair more? I have been repairing the qrecall archive and I have another capture script that backs up the same imac disk to a different external drive and that one has not been producing these errors.
When encountering a data corruption error, here's my suggested course of action:
(1) Repair the volume. You did that, and that's good. Repairing a volume corrects volume and directory structure problems, which can wreak havoc with other software (e.g. QRecall) and cause it to report all kinds of spurious errors.
(2) Repair the archive using the default repair settings. Unlike most software, QRecall does not trust the reliability of your computer system or your disk drive. It confirms that every bit of information in an archive is correct and unaltered before using it. Repairing the archive will test every byte, discard anything that looks suspicious, and turn all of the vetted data back into a usable archive again.
Now, why is this happening? There are a number of situations where data can get damaged or lost:
Transient memory (RAM) errors can corrupt the data before it's transported to the disk drive.
The data can be corrupted while being transported to the drive's controller (via USB, Firewire, etc.)
The drive can fail to write the data correctly to the disk surface.
The data can degrade over time and become corrupted on the drive.
Perfectly good data on the drive can be overwritten with other perfectly good data because the volume's directory structure has become damaged, corrupting the archive.
The data could be misread by the drive.
The data read could be corrupted while being transferred (USB/Firewire) back to your computer.
The data read into memory could be spontaneously corrupted by a transient memory (RAM) error.
If the data is corrupted before it's written to the archive, or the data is changed, overwritten, or corrupted on the disk drive surface, this creates a permanent data error in the archive. QRecall will complain that this particular bit of data is corrupt every time it attempts to read it. You can tell this is the problem by running a verify and looking at the log. Expand the "bad envelope whatever" message and look at the file position of the error. A permanent error will report the same file position every time.
An error that occurs while reading or transferring the data back to your system can cause transient errors. QRecall attempts to work around transient errors by re-reading any data that looks corrupt. If it's successful, QRecall will log a "Transient data error" in the log. If you find these, then your system is spontaneously damaging data as it's being read. The culprits are often the drive's controller, your USB/Firewire/etc. bus, or flaky RAM.
If the failure is transient but unrecoverable, then the problem is most likely the drive controller or data buffer. If you verify the archive multiple times and it fails in different locations, then it's a transient data problem that QRecall can't automatically recover from.
If you have one computer system that doesn't encounter any problems while a second computer system encounters data corruption errors (and you've verified that the volume's directory structure is OK), then the source of the problem is likely to be the second system's I/O or RAM.
Keep in mind that repairing a volume using Disk Utility or similar programs only corrects volume and directory structure problems. It does not verify the integrity or reliability of the drive's data storage. This is largely due to the time and effort involved; a complete surface test of a modern drive would take days to perform, and most modern drives have automatic data correction and sector sparing, making testing somewhat redundant. But this doesn't mean that data is never lost, which is why QRecall is so untrusting of data storage devices.
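The "same file position every time" test described above can be sketched in a few lines. This is only an illustration: it assumes you have noted the failing file positions from the log of each verify run by hand, and the function name and input shape are hypothetical, not anything QRecall provides. It needs at least two runs to say anything useful.

```python
from typing import Dict, Iterable, Set

def classify_failures(runs: Iterable[Set[int]]) -> Dict[str, Set[int]]:
    """Split failing file positions into 'permanent' and 'transient'.

    runs: one set of failing file positions per verify run
    (hypothetical input, copied from the log by hand).
    """
    run_list = [set(r) for r in runs]
    if not run_list:
        return {"permanent": set(), "transient": set()}
    permanent = set.intersection(*run_list)   # failed in every run
    seen = set.union(*run_list)               # failed in at least one run
    return {"permanent": permanent, "transient": seen - permanent}

# Three verify runs with made-up positions: 1024 fails every time.
result = classify_failures([{1024, 88000}, {1024, 91234}, {1024}])
print(result)
```

A position that shows up in every run points at damage stored in the archive; positions that come and go point at the read path (controller, bus, or RAM).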
|
|
|
David Cretney wrote:I'm trying to attach screenshots of a qrecall problem I have but I get this:
Apparently, the forum software is having directory permission issues. I'm looking into it.
|
|
|
Steven M. Alper wrote:The permissions entries are (I think):
2010-03-29 03:27:04.027 -0400 Details could not create lock file
2010-03-29 03:27:04.027 -0400 #debug# IO exception
2010-03-29 03:27:04.027 -0400 #debug# API: createFileAtPath:
2010-03-29 03:27:04.027 -0400 Details Path: /Volumes/WD Green (QR)/G5_whole.quanta/.lock
2010-03-29 03:27:04.028 -0400 #debug# OSErr: 22
BSD error 22 is "invalid argument"—which doesn't make any sense. This usually indicates that the volume structure is corrupt (which can cause the file system to spit out all kinds of nonsense error codes), or the operating system has simply become "confused." A restart will clear up the latter, and a disk repair will usually fix the former.
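For reference, the numeric code can be looked up with Python's standard `errno` module. This small check only translates the number from the log; it does not diagnose the volume.

```python
import errno
import os

code = 22  # the OSErr value from the log above

# errno.errorcode maps numbers to symbolic names; os.strerror gives the
# platform's message text for the same number.
print(errno.errorcode.get(code), "-", os.strerror(code))
```

On macOS and Linux, 22 corresponds to `EINVAL`.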
|
|
|
Steven M. Alper wrote:I think I've got the relevant log portion:
Nope. This log message indicates a preallocation (disk full) error. If you're encountering disk full errors, and your disk isn't full, you might want to read the threads "Leopard/Snow Leopard conflict when backing up to a network drive" and "Preallocation failed". If this describes your situation, you can disable preallocation entirely by setting QRFilePreallocateDisable (see the post "Advanced QRecall settings"), or try to work around the bug by changing the QRFilePreallocateBugWorkaroundRule setting.
James, I thank you again for your devotion to the software and support of your customers.
My pleasure.
|
|
|
Steven M. Alper wrote:I'm seeing repeated failures to multiple archives. As we've seen in the past it's probable that QRecall is not the cause, but I'm hoping we can eliminate it before I turn to other more expensive options. The reported problems range from:
icon reference does not reference a data package which leads to:
The archive data is damaged or incomplete....
Steve, This is a bug in QRecall. The repair command doesn't check the icon references of directories, which means that even after a repair the archive will fail verification. The good news is that this is entirely a cosmetic problem; the icon reference is simply so that QRecall can display the same folder image that the Finder does. It doesn't have any impact on the actual file data stored in the archive. This has been fixed in 1.2b1, which I will get out real soon.
This is, of course, my big archive at about 237gb. I have run repairs on the archive which have always completed successfully, leaving a "Damaged" layer that does not seem to be delete-able.
You can't delete a damaged layer, but you can merge it with subsequent layers. A damaged layer is missing information which subsequent layers may have recaptured. Personally, I'd leave this layer alone until 1.2 is available. I've recently made significant enhancements and fixed a number of bugs related to the repair, merging, and recapturing of items with damaged layers.
Last night I had a verify with the "icon reference" error, followed by a successful capture. (Why doesn't a verify failure prevent further unattended actions on an archive?)
This is somewhat of a semantic decision. The Verify opens the archive read-only, so (by definition) it can't modify the archive in any way, which would include changing it so that subsequent actions wouldn't run. And in your case it's a good thing, because the verify is reporting an error that repair won't fix, so you'd really be stuck if verify made your archive unusable. In general, actions like capture, compact, and merge check what they need to check as they work. If anything is amiss, they will immediately stop using the archive to avoid corrupting the data in the archive. But if they don't encounter any problems, then more than likely they haven't made anything worse than it already was.
Previous to that I had a compact fail on the same archive with:
could not create lock file
Path: /Volumes/XXX/xxx.quanta/lock
Permissions: 0x0180
That's usually a permissions problem, but not always. Without the error code reported I can only guess. The rest of the errors you report are typically caused by a corrupted volume or transient communication errors. I would suggest that you repair the volume using Disk Utility or your favorite drive repair software. If the repair reports any problems, then verify/repair the archive.
|
|
|
Ralph Strauch wrote:I just restored a folder that showed up in the Qrecall info window as 1.85gb in my archive and it turned into a 9gb folder on my hard drive. I guess that means that the size in the info window is the size of total data in the archive and the original data contained enough redundancy to cause it to balloon up like that. Is that right?
No, it's actually supposed to be the aggregate size of the original items. There are two potentially confounding factors:
The size of items in QRecall is their size in bytes, not their size on disk. If you have lots and lots of very tiny files, they may take up considerably more space on disk. This is the same discrepancy between the "size on disk" and the actual number of bytes that you see in the Get Info window.
There has been a long succession of bugs in QRecall that miscalculate the aggregate size of a folder. I thought I'd gotten them all, but I wouldn't be at all surprised if there were others lurking. I may bother you again when I get around to tackling this issue again.
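To illustrate the first point, here is a small worked example of how many tiny files can occupy far more disk space than their byte count suggests. The 4096-byte allocation block size is an assumption for the example; the actual block size depends on the volume, and this is not a claim about what happened to the folder in question.

```python
def size_on_disk(size_bytes: int, block_size: int = 4096) -> int:
    """Bytes occupied when a file is rounded up to whole allocation blocks.

    block_size is an assumed value; real volumes may differ.
    """
    blocks = -(-size_bytes // block_size)  # integer ceiling division
    return blocks * block_size

file_count = 10_000
logical = file_count * 100                 # ten thousand 100-byte files
on_disk = file_count * size_on_disk(100)   # each one occupies a full block
print(logical, on_disk)
```

Under these assumptions, 1,000,000 bytes of file data occupy 40,960,000 bytes on disk.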
|
|
|
I know from experience that system configuration problems can become a Gordian knot, from which a clean break is often the only expedient solution. I'm glad to hear that you've disentangled yourself from the problem.
|
|
|
Steven M. Alper wrote:However, at that point all the complaints that used to be about 501 were now occurring about 503.
Did you uninstall QRecall as user 503 before renumbering the account? If the kicker was installed for users 501 and 503, changing 503 to 501 would only replace one non-existent user with another. You can still follow the instructions I gave at the beginning of this thread.
I'm not sure my problems are solved, though, because QRecall is now complaining about not being able to create files:
3/17/10 1:13:19 PM QRecallScheduler[100] 2010-03-17 13:13:19.404 -0400 Failure could not open standard log; logging to stderr
3/17/10 1:13:19 PM QRecallScheduler[100] 2010-03-17 13:13:19.413 -0400 Failure Cannot create support folder
3/17/10 1:13:19 PM QRecallScheduler[100] 2010-03-17 13:13:19.415 -0400 Details Folder: /Users/trashed/Library/Logs
3/17/10 1:13:19 PM QRecallScheduler[100] 2010-03-17 13:13:19.491 -0400 Details NSFileOwnerAccountID: 507
3/17/10 1:13:19 PM QRecallScheduler[100] 2010-03-17 13:13:19.524 -0400 Details NSFilePosixPermissions: 448
I think you've got other problems. I don't know what user this is being reported from (I'm assuming 501), but the log is telling you that QRecall is trying to access its support folders (Logs, Preferences, Actions, ...) in /Users/trashed/Library and it can't because the folder is owned by user 507. That's neither 503 nor 501. This is really odd, because if you're logged in as user 501 then it should be using user 501's home directory. If its home folder really is /Users/trashed, then it contains items owned by user 507 ... whoever that is. Are you sure your old account's UID was 503 and not 507? If user 501 has a different account name, then the account's home path is incorrect. In general, an account should have a home path (e.g. /Users/james) and a UID (501), and all of the files within its home folder should belong to it. So something is still out of place. You might need to reassign 507 files to 501, or fix a home path, or something.
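The two numbers in the log can be decoded to show why the folder is off limits to anyone but user 507. The sketch below uses Python's standard `stat` module; the `can_write` helper is hypothetical and deliberately simplified (it ignores group membership, ACLs, and root).

```python
import stat

def can_write(mode: int, owner_uid: int, uid: int) -> bool:
    """Simplified write check: owner bits for the owner, 'other' bits otherwise.

    Ignores group membership, ACLs, and the root user.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    return bool(mode & stat.S_IWOTH)

mode = 448    # NSFilePosixPermissions from the log (decimal)
owner = 507   # NSFileOwnerAccountID from the log

# 448 decimal is 0o700: read, write, and execute for the owner only.
print(oct(mode), stat.filemode(stat.S_IFDIR | mode))
print(can_write(mode, owner, 501), can_write(mode, owner, 507))
```

With those values, only UID 507 can create anything inside the folder, which matches the "Cannot create support folder" failure for any other user.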
|
|
|
Steven M. Alper wrote:That may solve the QRecall problem, which I assume is only a symptom of the much larger problem I discovered shortly after I wrote you: because of the client uid validation problem, the existing users don't have proper permissions.
Yes, this would only solve the problem for QRecall. Any other software that's specifically installed for user 501 will still bump into the same issue.
Meaning no write permissions, despite appearing as if they do.
The write permissions of an account/user are not a property of that account. Every file and folder is owned by a user and a group (ignoring ACLs for the moment), which it uses to grant read, write, and execute access to that user, group, or everyone in general. The problem is likely that you have files and folders which belong to the original user (501) that do not grant your current users access. The most straightforward solution would probably be to change the UID of your principal administrative user account to 501, then migrate any files owned by the current UID to 501. The "new" account will now have a UID of 501 and own all of the existing 501 files along with any that it previously owned. The process for doing this should be pretty straightforward:
1 Identify the account you want to turn into user 501. Have a second, administrative, account ready. If you don't have a second administrative account, create a temporary one for the purposes of this migration.
2 Log into the account you want to change and uninstall QRecall (Shift+Option+QRecall > Quit and Uninstall), and any other software that might be a problem. You don't want a repeat of the problem you now have with 501.
3 Log out of all accounts. Log into the second administrative account.
4 In the Accounts pane of System Preferences, authenticate and right/control+click on the account that will become user 501. Choose Advanced Options.
5 In the dialog, write down the user's current UID.
6 Change the UID to 501. Don't change anything else.
7 Close System Preferences and open a Terminal window.
8 Enter the command: sudo -s
9 Type your second account's password
10 Enter the command: find / -user <old_UID_from_step_5> -print0 | xargs -0 chown 501
11 Wait for the command to finish; it will take a while.
12 Restart
Roughly, these steps change the user ID of an existing account to 501. The chown command is then used to change the ownership of any file or folder that currently belongs to the existing account from what it was to 501. That account, all of its files, and any old files that already belonged to owner 501 now all belong to the one account.
Note: Normally I wouldn't use this forum for general troubleshooting, but since Dan Frakes and I are the authors of the free ChangeShortName utility, I have some experience migrating, renaming, and renumbering user accounts.
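Before running the chown in step 10, it can be reassuring to preview which items would be affected. The following is a minimal read-only sketch of that idea in Python; the `owned_by` helper is hypothetical, it changes nothing, and unlike `find` it does not report the starting folder itself. The demo runs against a throwaway temporary folder rather than a real home directory.

```python
import os
import tempfile
from typing import Iterator

def owned_by(root: str, uid: int) -> Iterator[str]:
    """Yield paths under root whose owner is uid, without following symlinks."""
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_uid == uid:
                    yield path
            except OSError:
                continue  # item vanished or is unreadable; skip it

# Demo on a temporary folder; everything in it belongs to the current user.
with tempfile.TemporaryDirectory() as demo_root:
    os.mkdir(os.path.join(demo_root, "Library"))
    with open(os.path.join(demo_root, "Library", "note.txt"), "w") as handle:
        handle.write("example")
    found = sorted(owned_by(demo_root, os.getuid()))
    print(len(found), "items owned by UID", os.getuid())
```

To preview the real migration, you would pass the UID written down in step 5 and the folder you care about, then compare the list with what you expect to hand over to 501.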
|
|
|
Steven, The reinstall may be the source of the problem, and of some of the similar messages you've gotten from other software. My first question is "do you have a user account with a UID of 501?" The message would appear to indicate that you do not. Since UID 501 is assigned to the first account created by the system, this might happen if you installed a new OS, created a new account, then deleted the original one created by the system. If you have no user 501, the solution might be simple. You'll need to edit the file /Library/LaunchDaemons/com.qrecall.scheduler.kickstart.plist. This file is owned by root, so you'll need an editor that can handle it, like BBEdit, or run one of the command-line editors as root (e.g. 'sudo pico /Library/LaunchDaemons/com.qrecall.scheduler.kickstart.plist'). In this file, you'll find a list of the UIDs that need a QRecall scheduler daemon:
<key>ProgramArguments</key>
<array>
<string>501</string>
<string>502</string>
</array>
Delete the element for 501 (and any other account that might not exist). Background: The QRecallKickStart daemon is a system-level daemon that spawns user-level daemons for each QRecall user that has selected the option "Start and run actions while logged out." This was done because, while OS X provides for system-level daemons, system-level agents, and user-level agents, it doesn't provide the concept of a "user-level daemon," which is what QRecall needs. The parameters in com.qrecall.scheduler.kickstart.plist tell QRecallKickStart which users have requested a scheduler daemon.
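The same edit can be sketched with Python's standard `plistlib`, shown here against an in-memory sample rather than the live file. The sample reproduces only the fragment quoted above; the real file very likely contains other keys and entries, and the `prune_uids` helper is hypothetical. On a real system you would work on a backup copy and write the result back as root.

```python
import plistlib

# Minimal stand-in for the fragment shown above (not the full real file).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>ProgramArguments</key>
    <array>
        <string>501</string>
        <string>502</string>
    </array>
</dict>
</plist>
"""

def prune_uids(plist_bytes: bytes, existing_uids: set) -> bytes:
    """Drop numeric ProgramArguments entries whose UID is not in existing_uids.

    Non-numeric entries are left untouched.
    """
    data = plistlib.loads(plist_bytes)
    args = data.get("ProgramArguments", [])
    data["ProgramArguments"] = [
        arg for arg in args
        if not (isinstance(arg, str) and arg.isdigit() and int(arg) not in existing_uids)
    ]
    return plistlib.dumps(data)

# Pretend only UID 502 still exists on this machine.
cleaned = plistlib.loads(prune_uids(SAMPLE, existing_uids={502}))
print(cleaned["ProgramArguments"])
```

Leaving non-numeric strings alone is a cautious choice, since the real array may hold arguments other than UIDs.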
|
|
|
Prion wrote:Out of curiosity, if QRecall gets interrupted in the midst of dealing with one particular file it will resume its activity by starting over again at this particular file, right? It will not somehow miss the bits it has started to do but forgotten because it was interrupted?
Oops, almost forgot to answer that bit. Every time QRecall begins a capture, it simply compares what's on the drive to what's in the archive. If it has completely captured an item and that item hasn't changed, then it skips it and moves on. Otherwise, it captures the item again. It's simplistic logic, but it makes it almost impossible to "fool" QRecall into overlooking an item.
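The "compare, then skip or recapture" logic described above might be sketched like this. It is only an illustration of the idea, not QRecall's actual implementation: the use of size and modification time as the change test, and all the names, are assumptions.

```python
from typing import Dict, List, Tuple

# (size in bytes, modification time): an assumed way to detect changes.
Signature = Tuple[int, float]

def items_to_capture(on_disk: Dict[str, Signature],
                     archived: Dict[str, Signature]) -> List[str]:
    """Return paths that are missing from the archive or differ from it."""
    return sorted(path for path, signature in on_disk.items()
                  if archived.get(path) != signature)

on_disk = {"a.txt": (10, 1.0), "b.txt": (20, 2.0), "c.txt": (30, 3.0)}
# a.txt was fully captured, b.txt changed since, c.txt was interrupted
# and so never made it into the archive as a completed item.
archived = {"a.txt": (10, 1.0), "b.txt": (20, 1.5)}
print(items_to_capture(on_disk, archived))
```

Because an interrupted item never counts as completely captured, the next capture picks it up again along with anything that changed.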
|
|
|
Prion wrote:what I meant was why enabling TimeMachine may have conflicted with QRecall, possibly the fact that both running simultaneously and/or that the USB drive is connected to the TimeCapsule may play a role here though I have no idea *how* exactly this may have been a problem.
To the best of my knowledge, Time Machine and QRecall should coexist peacefully, with some minor caveats. I have a number of customers who run both QRecall and Time Machine and haven't reported problems. Having QRecall and Time Machine both backing up to the same drive simultaneously can result in poor performance for both. QRecall and Time Machine both use a fair amount of memory and are very I/O intensive. The result is more competition than cooperation, resulting in wasted effort and poor performance. A small number of customers have reported that having multiple processes hammering on the same USB drive can sometimes "overwhelm" the drive or the USB interface, resulting in I/O errors. These seem to be transient, but that would definitely cause QRecall to halt in its tracks. My suggestion is to make a note of when Time Machine normally does its thing and schedule QRecall to do its heavy lifting (like daily capture and merge actions) at some other time. You might want to suspend QRecall's scheduled actions until Time Machine is finished with its initial backup.
I definitely did not run out of space, on both hard drives there is more than 200 GB of free space available.
Then it sounds like an I/O error or some other event caused QRecall to stop. The log should say. If you send a diagnostic report (Help > Send Report), I'll take a look at it.
The QR Archive on the USB drive could be repaired using the auto-repair
Auto-repair occurs when something disastrous happens (crash, I/O error) to the capture process. If the archive was auto-repaired, then I suspect an I/O error.
Whatever detail it was, everything is working again now. For now, the USB drive is connected to the MBP directly and TimeMachine is disabled. I'll await your comments on how to proceed regarding 1) coexistence of TimeMachine and QRecall
Should be OK. It might be too much for your USB drive, but it shouldn't be. Also, that could change if you move the drive to your Time Capsule.
2) connecting the USB drive to the TimeCapsule.
Should also be fine. After you move your drive to the Time Capsule, you'll need to open all of your QRecall actions and change the archive in each to refer to the new archive, which will now appear on a networked volume. You will also need to be logged in for your QRecall actions to run. Mac OS X requires an account and password to mount a networked volume, and that requires you to be logged in.
Thanks for your support! The longer I work with QRecall the more I realize just how much consideration and attention to detail went into its creation.
|
|
|
|
|