Ralph Strauch wrote: I've had a couple of strange backups this week, with qrecall capturing what seemed like an extreme amount of data almost certainly from files that had not changed.
Ralph,
Thanks for the diagnostic report. Out of curiosity, I wrote a script to extract the total capture action time, number of folder changes, total amount analyzed (new+changed files), and the amount of new (unique) data added to your archive.
Legend:
Circles: time the capture took to complete
Squares: number of changed folders. The spikes are an artificial number indicating where QRecall ignored the folder history and performed a deep scan of the filesystem
Orange bars: The amount of data (in GB) captured (read from new or changed files)
Red bars: The amount of unique data (in GB) added to the archive
As you can see, the long capture times correlate pretty closely with either (a) a deep scan of the filesystem or (b) an inordinate amount of captured data. Some of the longest capture times naturally occurred when QRecall captured the most data (the biggest five were 8GB, 12GB, 18GB, 59GB, and 139GB).
The real question is why QRecall is capturing all of this data when you believe that not much data has changed. A few of the longer captures are clearly legitimate. The 12GB and 18GB captures, for example, captured 9GB and 8.5GB of new data (respectively). So in those instances, there was at least 8GB of new data that had to be added to the archive.
The suspicious captures are the 59GB (6GB or 10% new data) and the 139GB (2GB or 1.4% new data). In these cases, it would appear that a substantial amount of the data did not change but was recaptured because QRecall thought it might have changed.
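To make that comparison concrete, here is a small Python sketch that computes the share of new data for the four captures quoted above. The 25% cutoff is an arbitrary assumption of mine for illustration, not anything QRecall uses, and the 8GB capture is left out because its new-data figure isn't given here.

```python
# (captured_gb, new_gb) pairs quoted in this post.
captures = [(12, 9.0), (18, 8.5), (59, 6.0), (139, 2.0)]

# Hypothetical cutoff: flag captures where under 25% of the data read was new.
SUSPICIOUS_BELOW = 0.25

def new_data_ratio(captured_gb, new_gb):
    """Fraction of the captured data that was actually new to the archive."""
    return new_gb / captured_gb

def suspicious(captures, cutoff=SUSPICIOUS_BELOW):
    """Return the captures whose new-data fraction falls below the cutoff."""
    return [c for c in captures if new_data_ratio(*c) < cutoff]

for captured, new in captures:
    ratio = new_data_ratio(captured, new)
    print(f"{captured:>4} GB captured, {new:>4} GB new: {ratio:.1%}")

print("Worth a closer look:", suspicious(captures))
```

With that cutoff, only the 59GB and 139GB captures are flagged; the 12GB and 18GB captures are mostly genuine new data.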
Why QRecall thought the files changed is anyone's guess at this point, although you could compare the file info for the files that were captured with that of previous layers for clues. Any change in the file's creation/modification date, length, name, etc. will cause QRecall to recapture that file's data.
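If you'd rather watch the live filesystem than dig through layers, something like the following Python sketch could help. It records each file's size, modification time, and (where the OS reports it) creation time, so you can take one snapshot, wait for whatever is touching the files to run, take another, and see which files changed. This is only an approximation of the idea; the function names are mine, and it does not reproduce QRecall's actual change test.

```python
import os

def snapshot(root):
    """Map each file's path (relative to root) to (size, mtime_ns, created)."""
    info = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full, follow_symlinks=False)
            # st_birthtime is reported on macOS; use None where it is missing.
            created = getattr(st, "st_birthtime", None)
            rel = os.path.relpath(full, root)
            info[rel] = (st.st_size, st.st_mtime_ns, created)
    return info

def changed_files(before, after):
    """Paths present in both snapshots whose recorded metadata differs."""
    common = before.keys() & after.keys()
    return sorted(p for p in common if before[p] != after[p])
```

Call snapshot() on the same folder before and after the period in question, then pass both results to changed_files(). Files whose contents you know are the same but that still show up in the list are the ones to investigate.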
Another, often overlooked, cause for recapturing a file is renaming a folder. Let's say I have a "Working VMs" folder that I rename "Project VMs". QRecall sees this as a deleted folder and a new folder, and captures everything in the new folder as if the files were all new. Of course, 99% of the data in the "new" folder will be duplicates of what had already been captured in the "deleted" folder, but the files are still recaptured in their entirety, which takes time.
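A rough way to spot this after the fact is to compare two file listings and look for a folder that disappeared alongside a new folder with identical contents. The Python sketch below does that for top-level folders only. The file names and sizes are made up for illustration, and the matching rule (same relative names and sizes) is my own simplification, not how QRecall tracks folders.

```python
from collections import defaultdict

def by_top_folder(listing):
    """Group {path: size} by first path component into sets of (rest, size)."""
    groups = defaultdict(set)
    for path, size in listing.items():
        top, _, rest = path.partition("/")
        groups[top].add((rest, size))
    return {top: frozenset(items) for top, items in groups.items()}

def likely_renames(before, after):
    """Pair folders that vanished with new folders holding identical contents."""
    old, new = by_top_folder(before), by_top_folder(after)
    gone = {t: c for t, c in old.items() if t not in new}
    added = {t: c for t, c in new.items() if t not in old}
    return sorted((g, a) for g, gc in gone.items()
                  for a, ac in added.items() if gc == ac)

# Hypothetical listings (path -> size in MB) before and after the rename.
before = {"Working VMs/win.vmdk": 40_000, "Working VMs/linux.vmdk": 25_000,
          "Docs/notes.txt": 12}
after = {"Project VMs/win.vmdk": 40_000, "Project VMs/linux.vmdk": 25_000,
         "Docs/notes.txt": 12}

print(likely_renames(before, after))
```

Here the only pair reported is "Working VMs" and "Project VMs"; the untouched "Docs" folder is ignored because it exists in both listings.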
In conclusion, I don't see anything amiss, from QRecall's perspective. The question is what's touching or renaming files that might cause QRecall to recapture large swathes of data.