Mostly Harmless wrote: It's clear that data integrity is important to you - unlike to some other developers of Mac backup software.
I'm glad that message is clear.
Yes, data integrity is very important.
What I really want is 1:1 copies of source data. No compression. I want to be able to access the data with finder and other programs for testing and restoring.
There are a bazillion programs that will make straight copies of files. There are ancient UNIX utilities that will checksum files. QRecall is not one of those.
Would this be optionally possible with QRecall (planned) or would you consider to add this feature?
I have this crazy dream, from time-to-time, of writing a file system plug-in that would mount a QRecall archive as though it was a volume containing the original files (read-only, of course). I'm not sure what benefit it would have so I keep shoving the feature off to be considered for a future version.
If you will not offer 1:1 copies you don't have to read the rest of my post because I am afraid this is a must for me.
Keep reading; you might change your mind.
Verifying data:
1.
From your site I learned that backed up files are verified.
Usually I would ask if this is done by comparing the backed up files bit-by-bit to the original or with hash files.
I noticed however that QRecall doesn't necessarily store complete copies of files if they are just new versions of files that already exist.
QRecall doesn't archive files. It archives the data in files. It breaks every file up into small blocks of data and stores each block in a massive database. Each block is uniquely identified and hashed.
When a QRecall archive verifies its data, it doesn't need the original files for comparison. It stores hashcodes and interlocking integrity checks for every single block of data, file, and directory information. This allows it to determine if any of the archive data has been altered or damaged.
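To make the block idea concrete, here is a toy sketch in Python. It is not QRecall's code or archive format: the 4096-byte block size, the use of SHA-256, and the names `BlockStore`, `add_file`, and `verify` are all assumptions made for illustration. It shows two things: identical blocks are stored only once, and the store can check itself without ever looking at the original files.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this example


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BlockStore:
    """Toy content-addressed store: each unique block is kept once, keyed by its hash."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def add_file(self, data: bytes) -> list[str]:
        """Store a file's blocks and return the list of hashes that rebuilds it."""
        recipe = []
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # identical blocks are stored once
            recipe.append(digest)
        return recipe

    def read_file(self, recipe: list[str]) -> bytes:
        """Reassemble a file from its list of block hashes."""
        return b"".join(self.blocks[d] for d in recipe)

    def verify(self) -> list[str]:
        """Re-hash every stored block; return the hashes whose data no longer matches."""
        return [d for d, block in self.blocks.items()
                if hashlib.sha256(block).hexdigest() != d]
```

Storing two versions of a file that differ only in their last block adds just one new block to the store, and `verify()` flags a block the moment its bytes stop matching its hash.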
As I wrote I am looking for a solution that does 1:1 copies and I want these to be verified either by bit-by-bit comparison to the original or by using hashes derived from the original.
QRecall does the latter. But on a block-by-block basis, not a file-by-file basis.
To have this with QRecall would first require that you use 1:1 copies as an option.
2.
Validation of source data:
Actually this is a *very* important feature though it's not implemented in any backup solution for the Mac that I know of (and I am pretty sure I know about pretty much all products available).
I've considered implementing a "compare" feature that would compare the file information in a QRecall archive with what's on the disk. Surprisingly (up to now) no one has requested this feature.
By creating and *saving* hashes for each file that is going to be backed up from the source harddrive, both the original source data and the backup copies can be verified days or weeks later.
Why?
The original source data can be corrupted.
The problem with a compare function as an integrity tool is that files change all the time. On modern computer systems, you can't blink without three files getting modified somewhere on your disk. So a compare feature can't tell you if one of your source files is damaged, only that it's different. And it will tell you that hundreds of files are different every day.
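A small Python sketch of what such a compare amounts to. This is a hypothetical illustration, not a QRecall feature; `build_manifest` and `compare` are names invented for the example, and SHA-256 is an arbitrary choice. Note what the report contains: added, removed, and changed paths. Nothing in it can distinguish an intentional edit from corruption.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            manifest[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def compare(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which paths were added, removed, or changed between two manifests."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

A file you edited yesterday and a file silently damaged by a failing disk both land in the same "changed" list.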
Until then the defect files may already have been replacing all previous copies on daily, weekly and monthly backup drives.
So chances are that you have bought 3 external harddrives for redundancy, maybe you keep one offsite. You have carefully chosen and paid for a professional software solution.
You do all your homework but in the end you still lose data.
Not because it's not backed up but because both the source and all backups are damaged and it wasn't noticed.
No software (other than the application that created it ... and even then it's not always possible) can tell you if a document has become damaged. Comparing the files on your disk with the copies in your backup only tells you what's changed, not what shouldn't have changed but did.
The more practical approach, and the one that QRecall takes, is to ensure the integrity of the copy, not the original. By keeping efficient multi-generational copies of files, should you find a corrupted copy of a source file you can go back to the archive, go through its history, and retrieve a good copy of it from before it became damaged.
http://www.taobackup.com/integrity.html
"...You will not achieve enlightenment until you control the integrity of your data, for a copy is useless if the original is corrupt."
I submit that the copy is useful for recovering from a corrupt original. That is, after all, why we take backups. But a copy is useless if it's corrupt...
http://www.taobackup.com/integrity_info.html
"The greatest danger to your data is not catastrophic failure, but subtle damage that goes undetected. The corruption of several files on your disk may cause great damage to your business in the long term, but go entirely unnoticed in the short term. If it does go undetected, the corrupt files will flow through your backups until there are no uncorrupted copies left."
I wholeheartedly agree, but so far no one has figured out how to take the "subtle" out of subtle damage. Yes, corrupted source files will percolate through your backups. And if you only make 1:1 copies each day (replacing the copy each time) you have less than 24 hours to discover that damage before it's too late. QRecall lets you keep days, weeks, months, even years' worth of incremental copies in a single archive. This gives you months to discover that something has gone wrong and retrieve a valid version from your archive.
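The difference between a single replaced copy and a multi-generational history can be sketched in a few lines of Python. This is only a model of the reasoning, with invented names (`Version`, `last_version_before`) and made-up dates; it says nothing about how QRecall actually stores or looks up its history.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Version:
    captured: date   # when this copy was captured
    label: str       # stand-in for the captured file contents


def last_version_before(history: list[Version], damaged_on: date) -> Optional[Version]:
    """Return the newest version captured strictly before the damage date, if any."""
    candidates = [v for v in history if v.captured < damaged_on]
    return max(candidates, key=lambda v: v.captured, default=None)


# Hypothetical example: the file was damaged on March 10 but noticed much later.
history = [
    Version(date(2009, 3, 1), "good"),
    Version(date(2009, 3, 9), "good, newer"),
    Version(date(2009, 3, 12), "damaged"),
]
one_to_one_copy = history[-1:]  # a replaced-each-day copy keeps only the latest version
```

With the full history, `last_version_before(history, date(2009, 3, 10))` returns the March 9 version. With the 1:1 copy, the same call returns `None`, because the only surviving copy was made after the damage.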
Due to the processing power to calculate hashes and the drive space needed to store the hash data I would probably want to use this feature only on the most important data (stuff I create myself). It would be nice to be able to browse the home directory and activate a checkbox for the feature for any folder.
QRecall considers all data important and performs its data integrity checks for every nibble of data in the archive.
A scheduled validation of the source data would be what I am dreaming of.
A Verify action can be scheduled to run on whatever schedule you choose. I verify archives daily (of course, I do a lot of experimentation that's likely to damage them).
I want to be actively informed of any errors (with a pop-up alert panel, some blinking symbol in the menu bar, alarm sound or a clearly visible statement in a summary at the end of a backup-job).
The QRecall activity window will indicate any action that didn't complete successfully. QRecall is also Growl savvy.
What I don't want is the backup-job to stop.
Maybe I start a backup before I leave. When I come back I discover that the backup stopped because the user(-account) had no read permission for one folder or file. All other files could have been backed up but the program stopped the whole process when the error occurred. Not very smart.
A transient permission or read error with a single file will be logged, but won't stop a capture. Errors with the archive itself will immediately stop a capture, in order to prevent further damage to the archive.
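In outline, that policy looks something like the following Python sketch. The names `ArchiveError`, `capture`, `read_file`, and `store` are hypothetical and chosen for the example; this illustrates the two kinds of error handling described above and is not QRecall's code.

```python
class ArchiveError(Exception):
    """Hypothetical error meaning the archive itself is in trouble."""


def capture(paths, read_file, store):
    """Capture each path; log and skip unreadable files, stop on archive errors.

    read_file(path) returns the file's bytes or raises OSError.
    store(path, data) writes to the archive or raises ArchiveError.
    Returns a list of (path, reason) pairs for the files that were skipped.
    """
    skipped = []
    for path in paths:
        try:
            data = read_file(path)
        except OSError as err:  # PermissionError is a subclass of OSError
            skipped.append((path, str(err)))  # note the problem and keep going
            continue
        store(path, data)  # an ArchiveError raised here propagates and ends the capture
    return skipped
```

A file that can't be read ends up in the returned list while every other file is still captured; a failure while writing to the archive is deliberately not caught.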
Start a backup-job when a particular volume is mounted.
QRecall has "event" schedules that can start a capture when the volume containing the items to capture is mounted, or when the volume containing the archive is mounted. The latter lets you create a mobile volume that can be plugged into multiple machines; the capture starts as soon as the volume is plugged in.
What happens if a file is saved while QRecall is backing up this particular file?
The answer is "it depends." If the application writes a monolithic file using the common "safe save" technique, QRecall will capture the old copy of the document intact. If not, then the captured copy may be inconsistent and you'll have to perform another capture of the document while it isn't being modified before you have saved a valid copy. All file copy and backup programs suffer from the same problem, even Apple's Time Machine.
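For readers unfamiliar with it, the "safe save" pattern is roughly the following, shown here as a minimal Python sketch. Assumptions: the function name `safe_save` is invented, and the behavior described relies on POSIX rename semantics with the temporary file on the same volume as the document. Real applications typically do this through their framework's own facilities, and a full implementation would also preserve permissions and other metadata.

```python
import os
import tempfile


def safe_save(path: str, data: bytes) -> None:
    """Write data to a temporary file in the same directory, then atomically swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before the swap
        os.replace(tmp_path, path)  # atomic on POSIX when both paths are on one volume
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because the old file is never modified in place, a program that already has it open keeps reading the complete old contents, while anything that opens the path after the swap sees the complete new contents. An application that instead overwrites the document in place offers no such guarantee.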
At versiontracker the info says "beta".
Is it a beta or a finished product?
VersionTracker has apparently been purchased by download.com, and the means to edit old listings is now broken. As soon as it's fixed again, I'll update the listing.
I would like to be able to pause a backup-process (if I need the resources to do something with the computer). It would be nice if the process could auto-start again after a while, if the computer is idle (screensaver on?), in case I would forget to continue the backup (by pressing pause again before I leave the computer).
In the activity window, you have two choices. Stop and Reschedule cancels the current action and schedules it to run again at some future date. Or, you can pause a running action for anywhere from 5 minutes to 4 hours. The action will automatically resume after that time, or you can manually resume it at any time.