Matt Washchuk wrote: Do you have any suggestions on how I can keep bandwidth utilization to a minimum?
Not really, because QRecall already does everything it can to minimize the amount of data read from, and written to, the archive.
QRecall performs block-level de-duplication. When it captures a file, it reads each block of the file's data and generates a hashcode from that block. It then looks in the archive's quanta index to see whether a data block with that hashcode has already been added to the archive. If not, the data is unique and is added (written) to the archive. If it has, the matching block in the archive is read and compared with the block in the file to make sure they really are identical (which they almost always are). This repeats, block by block, until the entire file has been captured. (There are a few exceptions, but that's what QRecall is doing 99% of the time.)
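Here's a rough sketch of that capture loop in Swift. The Archive shape, the in-memory quanta index, the block size, and the use of SHA-256 are all assumptions for illustration; QRecall's actual internals aren't public.

```swift
import Foundation
import CryptoKit

let blockSize = 64 * 1024  // assumed block size, for illustration only

struct Archive {
    var index: [Data: Int] = [:]   // hashcode -> block offset (the "quanta index")
    var blocks: [Data] = []        // archived block data

    mutating func capture(block: Data) {
        let hash = Data(SHA256.hash(data: block))
        if let offset = index[hash] {
            // The hashcode is already in the quanta index: read the
            // archived block back and compare, because a matching hash
            // alone doesn't prove the data is identical.
            if blocks[offset] == block {
                return  // duplicate block; nothing is written
            }
            // Hash collision (astronomically rare): fall through and
            // store the block as new data.
        }
        index[hash] = blocks.count  // record the hash in the quanta index
        blocks.append(block)        // write the unique block to the archive
    }
}

func captureFile(at url: URL, into archive: inout Archive) throws {
    let handle = try FileHandle(forReadingFrom: url)
    defer { try? handle.close() }
    while let block = try handle.read(upToCount: blockSize), !block.isEmpty {
        archive.capture(block: block)
    }
}
```

Note that the read-back comparison is why even a duplicate block still costs a read from the archive.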
So it's about as efficient as it can be. In general, capturing a 500MB file will cause roughly 500MB of data to be read from the archive, written to the archive, or some mixture of the two; either way, about 500MB is going to fly across the wire.
There is one optimization that can reduce this. When you re-capture a large file with very few changes (like a log file or a disk image), QRecall may skip reading the archived data blocks for some of the blocks in the file, significantly reducing the amount of data that has to be transferred.
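As a sketch of how such a shortcut might work (the exact heuristics QRecall uses aren't documented): if the archive kept the block hashes recorded during the file's previous capture, a matching hash at the same position could be trusted without reading the archived data back at all.

```swift
// A sketch of the re-capture shortcut (extends the Archive type above).
// Hypothetically, the archive remembers each block hash from the file's
// previous capture; a block whose hash matches what was recorded at the
// same position is trusted without touching the archive.
extension Archive {
    mutating func recapture(block: Data, at position: Int, previousHashes: [Data]) {
        let hash = Data(SHA256.hash(data: block))
        if position < previousHashes.count, previousHashes[position] == hash {
            return  // unchanged block: no archive read, no archive write
        }
        capture(block: block)  // changed block: normal de-duplication path
    }
}
```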
Matt Washchuk wrote: I guess I already assumed it would be best to run the verify/merge/compact actions directly from the file server rather than the local machines, ...
That's a very good guess, and absolutely correct.