QRecall Community Forum

Recapture of unmodified .pdf files
Forum Index » Cookbook and FAQ
Pierpaolo Remelli


Joined: Jan 11, 2012
Messages: 13
Hi,
I'm using beta version 1.2.0.55 of QRecall.

I noticed that opening and closing a .pdf file (in either Preview or Acrobat) without making any modifications still results in the file being captured as if it had been modified. QRecall recognizes that the file has not changed (the log says the data is 100% duplicate), but the file is captured again anyway.

Is there a way to avoid this behavior?

Thanks,
Pierpaolo
James Bucanek


Joined: Feb 14, 2007
Messages: 1572
pirem71 wrote: I noticed that opening and closing a .pdf file (in either Preview or Acrobat) without making any modifications still results in the file being captured as if it had been modified.

Something is changing.

QRecall looks at a variety of information about each item to determine whether it should recapture it. This includes the file's dates (creation date, data modification date, attribute modification date), permissions, ownership, extended attributes, Finder information, resource forks, access control lists, and so on. If any of that has changed, QRecall will recapture the item. In this case, since the actual data of the items hasn't changed, the only thing QRecall will add to the archive is the modified metadata.

If you want to know exactly what's changing, you can temporarily set the QRLogCaptureDecisions option (see the advanced settings). Perform a capture, set the option to true, open and close a PDF, perform another capture, and then delete the setting (or set it to false). The log will record the first reason that QRecall decided to recapture each item.
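
As a rough illustration of that kind of decision (a minimal sketch, not QRecall's actual code), the following Python compares two metadata snapshots of a file and reports the first difference it finds. The function names and the set of fields are assumptions made for this example; things QRecall also checks, such as extended attributes, ACLs and resource forks, are left out.

```python
import os
import stat
from typing import Dict, Optional


def metadata_snapshot(path: str) -> Dict[str, object]:
    """Collect a few metadata values that a backup tool might compare.

    Hypothetical helper for illustration only; the field names are made up.
    """
    st = os.stat(path, follow_symlinks=False)
    snap: Dict[str, object] = {
        "content modification date": st.st_mtime_ns,
        # On Unix-like systems st_ctime is the inode (attribute) change time.
        "attribute modification date": st.st_ctime_ns,
        "permissions": stat.S_IMODE(st.st_mode),
        "owner": (st.st_uid, st.st_gid),
        "size": st.st_size,
    }
    # The creation date is only exposed on some platforms (e.g. macOS, BSD).
    if hasattr(st, "st_birthtime"):
        snap["creation date"] = st.st_birthtime
    return snap


def first_capture_reason(old: Dict[str, object],
                         new: Dict[str, object]) -> Optional[str]:
    """Return the first difference between two snapshots, or None if equal."""
    for key, old_value in old.items():
        new_value = new.get(key)
        if new_value != old_value:
            return f"{key} different; was {old_value}, now {new_value}"
    return None
```

Taking a snapshot before and after opening a PDF and passing both to `first_capture_reason` gives a quick, if partial, view of what an application touched.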

- QRecall Development -
Pierpaolo Remelli


Hi James,
I set the QRLogCaptureDecisions option and ran a few tests (opening and closing one .pdf file at a time in Preview or Acrobat, sometimes only scrolling the document, sometimes doing absolutely nothing).
I ended up with different behaviors, but I can't find a correlation between what I did and the result of the capture action:

1. In only one case, the capture action didn't create a layer (the log says, among other things, "Nothing captured")

2. 90% of the time, a new layer was created capturing 1 item (the log says "Action 2012-01-13 15:00:40 Minutia Capture Decision; capture file .DS_Store because content modification date different; was 2012-01-13 14:52:50 +0100, now 2012-01-13 14:59:55 +0100").
If I check the new layer in the main Archive window (I don't know if this is the correct name) I cannot see the modified .DS_Store file (since they are not displayed) but the folder containing the .DS_Store is marked as modified in its Timeline.

3. 10% of the time, a new layer was created capturing 2 items: the .DS_Store and the viewed .pdf file itself.
The log says (for the .pdf only) "Action 2012-01-13 15:02:12 Minutia Capture Decision; capture file Decisione_CE_3_maggio_2000,_n._532.pdf because attribute modification date different; was 2011-11-15 21:25:01 +0100, now 2012-01-13 15:01:59 +0100".
However, in the Finder the modification date for that file is still 2011-11-15 21:25.

A few thoughts about the results:

1. A folder is marked as changed if its .DS_Store file is changed even if it's not shown (since .DS_Store files are actually captured). Is it possible to avoid this behavior or even exclude .DS_Store files from capture? Are there any advantages in capturing .DS_Store files that I'm not aware of?

2. It would be nice to be able to select a Layer in the main Archive window and choose a command "Show capture log" (that shows the log in a separate window or, as an alternative, opens the standard Log window and highlights the matching row) instead of manually searching the Log window and comparing the time column.

3. My guess about the modification time issue: maybe Preview makes a "temporary" modification to the time when you open the file and discards it when you close the file without making changes. The temporary date could be stored somewhere (.DS_Store?) and mislead QRecall.


Note: I made a copy/paste of Log file rows; is there a better way to include portions of the Log in a post?

Regards,
Pierpaolo

James Bucanek


pirem71 wrote: 1. In only one case, the capture action didn't create a layer (the log says, among other things, "Nothing captured")

In that case, nothing changed. If a capture can't find anything to capture, it doesn't create a layer.

2. 90% of the time, a new layer was created capturing 1 item (the log says "Action 2012-01-13 15:00:40 Minutia Capture Decision; capture file .DS_Store because content modification date different; was 2012-01-13 14:52:50 +0100, now 2012-01-13 14:59:55 +0100").

QRecall found that a .DS_Store file in the folder had changed and captured it.

If I check the new layer in the main Archive window (I don't know if this is the correct name) I cannot see the modified .DS_Store file (since they are not displayed) but the folder containing the .DS_Store is marked as modified in its Timeline.

.DS_Store files are normally invisible. Choose View > Show Invisible Items in the QRecall browser to see invisible items.

3. 10% of the time, a new layer was created capturing 2 items: the .DS_Store and the viewed .pdf file itself.
The log says (for the .pdf only) "Action 2012-01-13 15:02:12 Minutia Capture Decision; capture file Decisione_CE_3_maggio_2000,_n._532.pdf because attribute modification date different; was 2011-11-15 21:25:01 +0100, now 2012-01-13 15:01:59 +0100".
However, in the Finder the modification date for that file is still 2011-11-15 21:25.

The data modification date and the attribute modification date of a file are different things. You normally don't see the attribute modification date, but it's one of the metadata values that QRecall checks. The attribute modification date is changed whenever some process sets the attributes, permissions, extended attributes, or other metadata for a file.
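
The difference between the two dates can be observed directly. The sketch below is only an illustration in Python, with a made-up function name: it changes a file's permissions and reports both timestamps. It assumes a Unix-like system such as macOS, where `st_mtime` is the data modification date and `st_ctime` is the attribute (inode) change date; on Windows `st_ctime` means something else.

```python
import os
import stat


def touch_attributes_only(path: str) -> dict:
    """Change a file's permissions and report both timestamps before and after.

    Hypothetical helper for illustration; it leaves the file data untouched.
    """
    before = os.stat(path)
    # Toggle the group-read bit: a metadata-only change.
    os.chmod(path, stat.S_IMODE(before.st_mode) ^ 0o040)
    after = os.stat(path)
    return {
        # Stays False: the file's contents were never written.
        "data_mtime_changed": after.st_mtime_ns != before.st_mtime_ns,
        # The attribute change time moves forward (or stays equal if both
        # operations fall within the file system's timestamp resolution).
        "attr_ctime_before": before.st_ctime_ns,
        "attr_ctime_after": after.st_ctime_ns,
    }
```

The Finder shows the first of these dates, which is why a file can look untouched there even though its attribute modification date has moved.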

1. A folder is marked as changed if its .DS_Store file is changed even if it's not shown (since .DS_Store files are actually captured).

That's correct. In QRecall, a folder is considered to have changed if its metadata or immediate contents have changed.
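
To make the rule concrete, here is a small Python sketch under stated assumptions: `FolderSnapshot`, `folder_changed` and `changed_children` are hypothetical names, and each item is reduced to a single comparable value (for example, its modification date). It is not how QRecall represents folders internally.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, List

_MISSING = object()  # marks a child present in only one snapshot


@dataclass
class FolderSnapshot:
    """A folder's own metadata plus one signature per immediate child."""
    metadata: Hashable
    children: Dict[str, Hashable] = field(default_factory=dict)


def changed_children(old: FolderSnapshot, new: FolderSnapshot) -> List[str]:
    """Names of immediate children that were added, removed, or modified."""
    names = old.children.keys() | new.children.keys()
    return sorted(
        name for name in names
        if old.children.get(name, _MISSING) != new.children.get(name, _MISSING)
    )


def folder_changed(old: FolderSnapshot, new: FolderSnapshot) -> bool:
    """A folder counts as changed if its metadata or immediate contents differ."""
    return old.metadata != new.metadata or bool(changed_children(old, new))
```

With this model, a folder whose only difference is a newer .DS_Store still reports as changed, which matches the behavior described above.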

Is it possible to avoid this behavior

I can't imagine why one would want to.

or even exclude .DS_Store files from capture?


Not at this time, but arbitrary file filters are on the to-do list.

Are there any advantages in capturing .DS_Store files that I'm not aware of?

.DS_Store files keep all of the display information used by the Finder. It includes the screen position and size of the folder's window, its display mode (icon, column, ...), sorting preference, and a host of other details. The Finder is constantly updating these files as you interact with Finder windows.

2. It would be nice to be able to select a Layer in the main Archive window and choose a command "Show capture log" (that shows the log in a separate window or, as an alternative, opens the standard Log window and highlights the matching row) instead of manually searching the Log window and comparing the time column.

Layers and log records do not have a one-to-one correlation.

Note: I made a copy/paste of Log file rows; is there a better way to include portions of the Log in a post?

The forum has a "code" tag that might make them easier to read, but other than that, copying and pasting seems like the best way.

- QRecall Development -
Pierpaolo Remelli


Hi James,
good to hear from you that exclusions based on filters are under development.

Back to the capture process: with my previous post I didn't want to imply in any way that QRecall is doing something strange or even wrong.
If QR captured a revision of a file, something MUST be different about that file (or its attributes, metadata, etc).

My only concerns are from the perspective of a low-level user:

1. I'm still not able to tell how the captured instances differ from each other, and that creates a sense of not being in control of what's going on. I should probably blame my limited knowledge of the Mac environment, but I'm still uncomfortable with that (I hope you can understand me).

2. Let's consider the moment you have to make a restore (the moment of truth for a backup system). If files accessed for viewing are continuously captured, after a few years I will have 100-200 (or more) instances of a certain file spread along its timeline.
Restoring a specific version of a file could become difficult (since usually one doesn't know the modification date of the file, and maybe not even exactly which version one is looking for).
My approach in that situation would be to restore all the instances within a time range into a folder and feed them to a program that can highlight differences.

Is it the right strategy or am I misunderstanding the restore process?

Is there any possibility of performing an automatic recall of all versions of a file over a time range defined by shading/un-shading layers (e.g. adding an incremental suffix to the name)?

Regards,
Pierpaolo
James Bucanek


pirem71 wrote: 1. I'm still not able to tell how the captured instances differ from each other, and that creates a sense of not being in control of what's going on. I should probably blame my limited knowledge of the Mac environment, but I'm still uncomfortable with that (I hope you can understand me).

This problem isn't unique to QRecall, and I'm afraid I don't have a simple answer. What constitutes a significant change in a file is subjective. QRecall's #1 job is to capture all changes to a file over time, and almost by definition that means it's going to capture changes that you're not particularly interested in.

Restoring a specific version of a file could become difficult (since usually one doesn't know the modification date of the file, and maybe not even exactly which version one is looking for).
...

In this particular case, you shouldn't have to look far. QRecall is recapturing the item because metadata (permissions, extended attributes, ...) is changing. But that won't change the modification date of the file, which should only change if the data of the file changes. So you can simply view the most recently captured version of the file and look at its modification date. This will tell you the last time the actual contents of the file changed. All intermediate captures are probably uninteresting.
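
The same reasoning can be applied to a whole list of captures. The following Python sketch assumes you have written down, by hand, when each version was captured and what data modification date it shows; the record type, the function and the sample layout are all hypothetical, since QRecall does not export such a list. It keeps only the earliest capture for each distinct data modification date, on the assumption that captures sharing a date hold the same contents.

```python
from typing import Iterable, List, NamedTuple


class CapturedVersion(NamedTuple):
    """One captured instance of a file (hypothetical record for illustration)."""
    captured_at: str        # ISO 8601 timestamp of the capture
    data_modified_at: str   # ISO 8601 data modification date of that version


def content_versions(versions: Iterable[CapturedVersion]) -> List[CapturedVersion]:
    """Keep the earliest capture for each distinct data modification date."""
    first_seen = {}
    # ISO 8601 strings in the same format sort chronologically.
    for version in sorted(versions, key=lambda v: v.captured_at):
        first_seen.setdefault(version.data_modified_at, version)
    # Dicts preserve insertion order, so the result is in capture order.
    return list(first_seen.values())
```

Only the versions this returns are worth restoring and comparing; the rest differ in metadata alone.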

Is there any possibility of performing an automatic recall of all versions of a file over a time range defined by shading/un-shading layers (e.g. adding an incremental suffix to the name)?

Not at this time. (Again, you could probably write a simple shell script to do this once a command-line version of QRecall is available.) I have a more general solution to this kind of problem on the drawing board, which I hope to attack after 1.2.0 is released.

- QRecall Development -