QRecall Community Forum

Capture reading everything back from archive
Forum Index » Beta Version
Adrian Chapman


Joined: Aug 16, 2010
Messages: 72
Offline
Almost every time I run a capture from my Mac Pro to the backup drive connected to my Mac Mini, the Mac Pro seems to download almost every file from the backup in order to compare it to the original, and the number of changed folders seems to be extremely high.

For instance, I performed a capture of my user directory with only a few changes, which completed very quickly. When I next tried, barely 45 minutes later, having done nothing more than check my e-mail and browse a few web sites, QRecall informed me that 801 folders had changed and then embarked on this comparison exercise on files in folders that haven't been opened or changed in years. The archive verifies OK.

Any ideas?
Adrian Chapman


Joined: Aug 16, 2010
Messages: 72
Offline
Hmm, bad practice replying to my own post, but I am adding to it rather than replying.

As I understand it QRecall uses FSEvents to monitor changes, but FSEvents only logs the fact that the contents of a folder have changed, and it is then down to QRecall to examine the contents of the folder and decide what needs to be updated in the archive.

I have probably got this completely wrong, but from my experience on my Mac Pro and subsequently on my wife's MacBook, it would appear that QRecall actually has to compare the archived file to the current one, byte by byte, and it seems to do this even for files whose created and modified dates are unchanged. If this is indeed how QRecall works, I can see horrendous problems with folders that contain, say, many very large files which rarely change and perhaps a few small text files which change frequently.

Please tell me I am wrong and why my machine seems to be behaving as I have described.
James Bucanek


Joined: Feb 14, 2007
Messages: 1572
Offline
Adrian,

All good questions. My short answer is that I don't know why the operating system is telling you that hundreds of folders have changed (beyond the obvious reason that hundreds of folders did, in fact, change). QRecall only recaptures items under certain circumstances (which I explain later), but it should not try to recapture an item that hasn't been touched in any way.

Let me see if I can accurately explain exactly what QRecall does and why.

Recapturing items can be largely divided into three phases: Folder changes, metadata, and file data.

In the first phase, QRecall requests the list of potentially modified folders from the FSEvents service. As you correctly observed, FSEvents only produces a list of folders that might contain changes. It's QRecall's job to examine the contents of those folders to find out what, if anything, has actually changed.

While that sounds simple enough, these things never are and there are a few caveats that you want to be aware of:

  • FSEvents is not infallible and QRecall only trusts it for a limited amount of time. The default is to trust FSEvent change history for 13.9 days, which will cause QRecall to perform an exhaustive search of your entire folder tree about every two weeks. This setting can be changed using the QRAuditFileSystemHistoryDays advanced setting. So the fact that QRecall occasionally runs out and rescans everything isn't, by itself, surprising.

  • The history of FSEvent information is kept on a per-captured-item basis, and can't always be applied to future captures. This means that QRecall might scan more folders than you expect if you have mixed or overlapping capture actions defined. For example, if you have one capture action that captures your entire home folder and another that captures just your Documents, capturing your Documents folder doesn't help the history that's saved for your home folder. The next time you capture your home folder, it will rescan all of the historic changes in your Documents folder too because it can only use the history information that entirely encompasses the item that it's capturing. I admit that this is confusing, but it has to work that way or QRecall might miss changes.

  • QRecall will ignore the FSEvent information and perform an exhaustive scan of your folders if the previous layer was incomplete, or contains any information from layers that are incomplete or marked as damaged by a repair. Items marked as "-Damaged-" are always recaptured in their entirety.
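
The three caveats above boil down to one question: can a saved FSEvents history be trusted for the item being captured? The following is a minimal sketch of that decision in Python, not QRecall's actual code. The names `HistoryRecord` and `history_is_usable` and all of the fields are hypothetical; only the 13.9-day default and the three conditions come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import PurePosixPath

# Default trust window described above (the QRAuditFileSystemHistoryDays setting).
AUDIT_HISTORY_DAYS = 13.9


@dataclass
class HistoryRecord:
    """Hypothetical record of the FSEvents history saved for one captured item."""
    root: PurePosixPath         # the item this history was saved for
    last_full_scan: datetime    # when that tree was last scanned exhaustively
    previous_layer_clean: bool  # False if the prior layer was incomplete or damaged


def history_covers(history_root: PurePosixPath, item: PurePosixPath) -> bool:
    """The history must entirely encompass the item being captured."""
    return item == history_root or history_root in item.parents


def history_is_usable(record: HistoryRecord, item: PurePosixPath, now: datetime,
                      trust_days: float = AUDIT_HISTORY_DAYS) -> bool:
    """Return True if this history can stand in for a scan of `item`."""
    if not record.previous_layer_clean:
        return False  # incomplete or damaged layer: do not trust the history
    if not history_covers(record.root, item):
        return False  # e.g. Documents history cannot vouch for the whole home folder
    return now - record.last_full_scan <= timedelta(days=trust_days)
```

In this sketch, a history saved for a Documents folder is rejected when the item being captured is the enclosing home folder, which mirrors the overlapping-actions example above. When no record is usable, the capture has to fall back on older history or an exhaustive scan.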


    So, now that QRecall has its marching orders, it's time to examine the individual items in each folder. This is the metadata phase. QRecall starts by comparing the metadata for each item with what's in the archive. "Metadata," for those new to this, is "information about information." In this case, the metadata of an item is things like its name, created date, last modified date, ownership, permissions, extended attributes, launch services attributes, custom icon, and so on. It's basically everything that the operating system knows about the item except its actual contents.

    If all of the metadata for an item is identical to what's in the archive, the item is skipped. If any changes are noted, the metadata for that item is recaptured. This doesn't mean the entire file is recaptured (that's later), just that the metadata about that file is recaptured so that QRecall always has the latest information about that file.

    The reading, testing, and even recapturing of metadata is pretty fast and most folders only require a fraction of a second to determine what items in that folder need to be recaptured.

    If QRecall finds any changes in a file's metadata that might indicate its contents could have been modified (creation date, last modified date, attribute modified date, name, extended attributes, number of data forks, length of data forks), it proceeds to recapture the data for that item. This consists of reading every byte of the file and comparing that to what's already in the archive. This is the data phase of the capture, and the one that takes the most time.
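
The metadata phase described above can be sketched as a three-way decision per item: skip it, recapture only its metadata, or go on to the data phase. This is an illustrative Python sketch under stated assumptions, not QRecall's implementation. The `ItemMetadata` fields are a hypothetical subset of the real metadata, and `CONTENT_FIELDS` simply mirrors the list of triggers given in the previous paragraph.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Action(Enum):
    SKIP = "skip"
    RECAPTURE_METADATA = "recapture metadata only"
    RECAPTURE_DATA = "recapture metadata and data"


@dataclass(frozen=True)
class ItemMetadata:
    """Hypothetical subset of the metadata kept for an item."""
    name: str
    created: float
    modified: float
    attribute_modified: float
    owner: int
    permissions: int
    extended_attributes: tuple
    fork_lengths: tuple  # one length per data fork, so it also encodes the fork count


# A change in any of these suggests the contents may have been modified.
CONTENT_FIELDS = {"name", "created", "modified", "attribute_modified",
                  "extended_attributes", "fork_lengths"}


def changed_fields(archived: ItemMetadata, current: ItemMetadata) -> set:
    """Names of the metadata fields that differ between archive and disk."""
    return {f.name for f in fields(ItemMetadata)
            if getattr(archived, f.name) != getattr(current, f.name)}


def decide(archived: ItemMetadata, current: ItemMetadata) -> Action:
    changed = changed_fields(archived, current)
    if not changed:
        return Action.SKIP                # identical metadata: nothing to do
    if changed & CONTENT_FIELDS:
        return Action.RECAPTURE_DATA      # contents may differ: read every byte
    return Action.RECAPTURE_METADATA      # e.g. only ownership or permissions changed
```

Under these assumptions, a file whose permissions alone changed costs only a metadata update, while a touched modification date sends the whole file through the slow data phase even if its contents turn out to be identical.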

    If you believe that QRecall is recapturing file items that it should be breezing past, you can find out why with a little sleuthing.

    The advanced setting QRLogCaptureDecisions (see the Advanced QRecall Settings post) will log the reason that QRecall decided to capture each item. Note that it only logs the first reason; there could be more than one. This will tell you something about what it is about the item that triggered QRecall's decision to (re)capture the item. Warning: This setting logs a ridiculous amount of information to the log file, so don't leave the setting on once you've found the information that you're looking for.

    If you find that all of these files have really been modified, then I would go hunting for some background or system process that is surreptitiously rummaging around your file system in the background.

    - QRecall Development -
    Adrian Chapman


    Joined: Aug 16, 2010
    Messages: 72
    Offline
    Hi James

    Thanks yet again for a speedy and detailed reply.

    James Bucanek wrote:
  • FSEvents is not infallible and QRecall only trusts it for a limited amount of time. The default is to trust FSEvent change history for 13.9 days, which will cause QRecall to perform an exhaustive search of your entire folder tree about every two weeks. This setting can be changed using the QRAuditFileSystemHistoryDays advanced setting. So the fact that QRecall occasionally runs out and rescans everything isn't, by itself, surprising.


    I understand this. Could I suggest that when QRecall decides it is time to go rummaging, it flags up some sort of warning so that the deep search can be either postponed or rescheduled? Would it be possible to have a preference so that the time when this occurs could be set? It could be very inconvenient if QRecall embarks on an action that could last many hours just as you need to shut down your machine.

    James Bucanek wrote:
  • The history of FSEvent information is kept on a per-captured-item basis, and can't always be applied to future captures. This means that QRecall might scan more folders than you expect if you have mixed or overlapping capture actions defined. For example, if you have one capture action that captures your entire home folder and another that captures just your Documents, capturing your Documents folder doesn't help the history that's saved for your home folder. The next time you capture your home folder, it will rescan all of the historic changes in your Documents folder too because it can only use the history information that entirely encompasses the item that it's capturing. I admit that this is confusing, but it has to work that way or QRecall might miss changes.


    My present arrangement is to back up my entire system EXCEPT my home directory once per day, and to back up my home directory at intervals of 2 hours. As it happens, both actions occur at the same time (10:00am), so one or the other gets queued. Would it be better to make the full system action back up my home directory too, and make the separate home directory backup skip the 10:00am one?

    James Bucanek wrote:
  • QRecall will ignore the FSEvent information and perform an exhaustive scan of your folders if the previous layer was incomplete, or contains any information from layers that are incomplete or marked as damaged by a repair. Items marked as "-Damaged-" are always recaptured in their entirety.


    This may well be what has been causing me problems, because sometimes I have cancelled a backup. Would it avoid the full rescan if the incomplete layer is deleted prior to the next scheduled backup?

    James Bucanek wrote:
    If all of the metadata for an item is identical to what's in the archive, the item is skipped. If any changes are noted, the metadata for that item is recaptured. This doesn't mean the entire file is recaptured (that's later), just that the metadata about that file is recaptured so that QRecall always has the latest information about that file.

    The reading, testing, and even recapturing of metadata is pretty fast and most folders only require a fraction of a second to determine what items in that folder need to be recaptured.

    If QRecall finds any changes in a file's metadata that might indicate its contents could have been modified (creation date, last modified date, attribute modified date, name, extended attributes, number of data forks, length of data forks), it proceeds to recapture the data for that item. This consists of reading every byte of the file and comparing that to what's already in the archive. This is the data phase of the capture, and the one that takes the most time.


    Yes I understand this and it's what I had expected QRecall to do.


    James Bucanek wrote:
    If you believe that QRecall is recapturing file items that it should be breezing past, you can find out why with a little sleuthing.

    The advanced setting QRLogCaptureDecisions (see the Advanced QRecall Settings post) will log the reason that QRecall decided to capture each item. Note that it only logs the first reason; there could be more than one. This will tell you something about what it is about the item that triggered QRecall's decision to (re)capture the item. Warning: This setting logs a ridiculous amount of information to the log file, so don't leave the setting on once you've found the information that you're looking for.

    If you find that all of these files have really been modified, then I would go hunting for some background or system process that is surreptitiously rummaging around your file system in the background.


    This thought had passed through my mind, and the problem only seems to have occurred on the Mac Pro since I started playing around with Path Finder but I can't believe that is tinkering with the file system in such a way that so many files need to be checked. Neither does it explain why QRecall on my wife's MacBook today suddenly decided it needed to rummage through 1GB of files and save just 80MB or so of modified files.

    Anyway, thanks again for all the information. I will try the QRLogCaptureDecisions idea if the problem persists.

    I love this application and I am sure my copy of QRecall and I can come to some sort of mutually acceptable way of working

    Adrian
    James Bucanek


    Joined: Feb 14, 2007
    Messages: 1572
    Offline
    Adrian Chapman wrote:
    Would it be possible to have a preference so that the time when this occurs could be set?

    I'll add that to the wish list.
    It could be very inconvenient if QRecall embarks on an action that could last many hours just as you need to shut down your machine.

    That's what the Stop and Reschedule action menu item (in the monitor window) was designed for. If it's time to shutdown, stop and reschedule the action to run in a few minutes, or at least enough time to shutdown. When the system reboots, the action will pick right back up where it left off.

    My present arrangement is to back up my entire system EXCEPT my home directory once per day, and to back up my home directory at intervals of 2 hours.

    I would recommend that you NOT exclude your home folder. The Exclude items feature is designed to exclude (thus the name) items from ever being captured. QRecall treats excluded items as if they did not exist. When you capture your entire volume and exclude your home folder, you create a layer in the archive where your home folder does not exist. If you restored your system using this layer, you wouldn't have a home folder.

    Set one action to capture your entire volume, and a second action to periodically capture your home folder during the day. Now you can recall from any layer and you'll get your system files up to the last day, and your personal documents up to the last hour.

    I have plans for a new "Ignore" feature that will do what you're trying to use the Exclude filter for, but it hasn't been implemented yet.

    This may well be what has been causing me problems, because sometimes I have cancelled a backup. Would it avoid the full rescan if the incomplete layer is deleted prior to the next scheduled backup?

    If that's what's causing the rescan, then yes.

    This thought had passed through my mind, and the problem only seems to have occurred on the Mac Pro since I started playing around with Path Finder but I can't believe that is tinkering with the file system in such a way that so many files need to be checked.

    Hard to say. Since Path Finder (ironically, one of my first Apple ][ applications was a program called Pathfinder for ProDOS) is a file browser, it might (like the Finder) store information in invisible files (like .DS_Store) or use extended attributes. Any changes like this would cause QRecall to recapture items.

    I love this application and I am sure my copy of QRecall and I can come to some sort of mutually acceptable way of working


    - QRecall Development -
    James Bucanek


    Joined: Feb 14, 2007
    Messages: 1572
    Offline
    (Replying to my own post...)
    James Bucanek wrote:
    If you restored your system using this layer, you wouldn't have a home folder.

    It also occurred to me that this is one more way in which QRecall will be forced to recapture everything in your home folder.

    When you capture your entire volume and exclude your home folder, it's just as if your home folder doesn't exist. The next time you capture your home folder, QRecall sees a brand new folder with thousands of items in it! It has no history or metadata to compare with, so it performs a first-time capture of every file.
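
The effect described above can be illustrated with a toy model in which a layer is just the set of paths it recorded. This is a simplified Python sketch, not how QRecall stores layers: the function names and the example paths are hypothetical, and a real archive keeps metadata and data for each item rather than only its presence.

```python
from typing import Optional, Set


def capture(items: Set[str], exclude: Optional[str] = None) -> Set[str]:
    """Return the paths recorded in a new layer.

    Excluded items are treated as if they did not exist on disk.
    """
    if exclude is None:
        return set(items)
    root = exclude.rstrip("/")
    return {p for p in items if p != root and not p.startswith(root + "/")}


def first_time_captures(previous_layer: Set[str], scanned: Set[str]) -> Set[str]:
    """Items with no counterpart in the previous layer have nothing to compare against."""
    return scanned - previous_layer


# Hypothetical volume contents.
volume = {"/System/kernel", "/Users/me",
          "/Users/me/Documents/notes.txt", "/Users/me/Music/song.mp3"}
home = {p for p in volume if p.startswith("/Users/me")}

# Daily capture with the home folder excluded: the layer has no home folder at all.
daily_excluding_home = capture(volume, exclude="/Users/me")
# The next home folder capture therefore sees every item as brand new.
recaptured_with_exclude = first_time_captures(daily_excluding_home, home)

# Daily capture without the exclusion: the home folder items already have history.
daily_full = capture(volume)
recaptured_without_exclude = first_time_captures(daily_full, home)
```

In this model `recaptured_with_exclude` contains every item in the home folder, while `recaptured_without_exclude` is empty, which matches the recommendation to capture the whole volume without excluding the home folder.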

    - QRecall Development -
    Adrian Chapman


    Joined: Aug 16, 2010
    Messages: 72
    Offline
    James Bucanek wrote:
    I would recommend that you NOT exclude your home folder. The Exclude items feature is designed to exclude (thus the name) items from ever being captured. QRecall treats excluded items as if they did not exist. When you capture your entire volume and exclude your home folder, you create a layer in the archive where your home folder does not exist. If you restored your system using this layer, you wouldn't have a home folder.

    Set one action to capture your entire volume, and a second action to periodically capture your home folder during the day. Now you can recall from any layer and you'll get your system files up to the last day, and your personal documents up to the last hour.


    James

    As usual, you have hit the nail on the head. It was the exclusion of my home folder in the daily full backup that was causing the problem. QRecall has been running perfectly for several days now, and it's fast.

    Thanks again.
     