QRecall Community Forum

Observations on transfer rates
maxbraketorque


Joined: Apr 7, 2020
Messages: 19
Offline
I'm in the process of backing up all the external drives attached to my stationary Mac. Three of the four drives are connected to the Mac via the somewhat outdated FW800 protocol. The QR archives are hosted on a 4-drive RAID 10 NAS with IronWolf drives. Both machines are networked at 2.5 GbE through a multi-gigabit switch. I've clocked large-file reads from the NAS at 250 MB/s and large-file writes to the NAS at 150 MB/s.

I'm currently backing up one of the FW800 drives, which consists primarily of files >500 MB in size. When the backup operation started, I was seeing write speeds to the NAS of 80-90 MB/s, which is right at the limit of the FW800 protocol. But now, 5+ hours into the backup, write speeds have slowed to 25-35 MB/s. Is there some behavior in QR that results in periods of reduced write speed?
James Bucanek


Joined: Feb 14, 2007
Messages: 1568
Offline
QRecall does not copy files. It breaks them into blocks of data and adds the unique blocks to a massive database.

As the archive grows, the corpus of previously captured data grows with it. Every new block of data must be tested against the captured set to determine whether it's unique. This involves several layers of hash tables and indexes, and many of these tests require data to be read from the archive, usually in a highly random pattern.

So I/O performance will necessarily be less than what you'd see if you simply wrote the files. There will be occasional reading of the archive during a capture, and the drive's seek time becomes an important performance metric. In short, it's a lot of work.
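
To make that concrete, here is a minimal, hypothetical sketch of block-level de-duplication. It is not QRecall's actual code: the 64 KB block size, the SHA-256 digests, and the in-memory Set standing in for the archive's layered hash tables and indexes are all illustrative assumptions.

```swift
import Foundation
import CryptoKit

// Hypothetical sketch -- not QRecall's implementation. The in-memory Set
// stands in for the archive's layered hash tables and indexes.
var knownBlocks = Set<String>()   // the "corpus of previously captured data"

func captureFile(at url: URL, blockSize: Int = 64 * 1024) throws -> (new: Int, dup: Int) {
    let handle = try FileHandle(forReadingFrom: url)
    defer { try? handle.close() }
    var newCount = 0, dupCount = 0
    // Break the file into blocks and test each one against the captured set.
    while let block = try handle.read(upToCount: blockSize), !block.isEmpty {
        let digest = SHA256.hash(data: block)
            .map { String(format: "%02x", $0) }
            .joined()
        if knownBlocks.insert(digest).inserted {
            newCount += 1   // unique: this block would be appended to the archive
        } else {
            dupCount += 1   // already captured: only a reference is recorded
        }
    }
    return (newCount, dupCount)
}
```

In a real archive the index is far too large to keep in memory, which is why each uniqueness test can turn into a random read against the archive, and why seek time matters so much.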

- QRecall Development -
maxbraketorque


Joined: Apr 7, 2020
Messages: 19
Offline
OK, thanks much for the explanation. Transfer speeds have drifted up in the last few hours.
maxbraketorque


Joined: Apr 7, 2020
Messages: 19
Offline
After backing up ~4 TB of data, I'm seeing an average speed of ~100 GB/hr for these first capture operations. Not bad, I guess. I think Time Machine would be faster for these initial backups, but TM gives me no option to break my drives out into their own backups, it's very slow on subsequent incremental backups (especially as the archive grows), and QR backups appear to be much more resistant to corruption, which is important for backups over a network.
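
For context, ~100 GB/hr works out to roughly 28 MB/s (100,000 MB ÷ 3,600 s ≈ 27.8 MB/s), which lines up with the 25-35 MB/s write speeds I saw once the capture had been running for a few hours.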
maxbraketorque


Joined: Apr 7, 2020
Messages: 19
Offline
My stationary computer is working its way through the first round of daily automated capture updates. Everything went quickly for three of the incremental backups, but for one backup covering a disk where I moved around ~80 GB of large files (maybe 200 files in total) without any file name changes, QR is really crawling. Based on the current progress, it's going to take ~8 hrs for the incremental backup to run. Rather fascinating that it's so slow to update the backup for files that were merely moved around on the disk. This is about 1/10th the speed of the initial backup of the disk.
James Bucanek


Joined: Feb 14, 2007
Messages: 1568
Offline
QRecall doesn't track files. It doesn't matter if you delete a file and create a new one, move the file, rename the file, swap two files, or... well, you get the idea. All QRecall knows is that a file that was there yesterday no longer exists, and a file that wasn't there yesterday now exists.

The disappearance of the old file will be noted, and the new file will be captured, de-duplicating all of its data against what has been previously captured.

So no new data gets added to the archive (except a little metadata), but it does take time to perform all of that de-duplication.
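
To illustrate with the hypothetical captureFile(at:) sketch from earlier in the thread (the paths here are made up): capturing the same bytes under a new path re-reads and re-hashes every block, yet nothing new is appended, because every block already exists in the index.

```swift
// Hypothetical paths; reuses the captureFile(at:) sketch from the earlier post.
let original = URL(fileURLWithPath: "/Volumes/Media/projects/clip.mov")
let moved    = URL(fileURLWithPath: "/Volumes/Media/archive/clip.mov")

let first  = try! captureFile(at: original)   // all blocks new: normal write cost
// ... the file is moved on disk between captures ...
let second = try! captureFile(at: moved)      // every block hits the index
print(second.new)   // 0 -- no new data, only a little metadata
print(second.dup)   // equals first.new: everything was read and hashed again
```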

If fast captures during the day are important and you have plenty of extra disk space, there is a feature just for this. In the capture action there's a "defer de-duplication" option. If you set that, no data de-duplication is performed during the capture; all new file data (~80 GB in your case) is simply appended to the archive. The next compact action to run begins by de-duplicating all of that deferred data. Note that this is slower overall than if the data had been de-duplicated during the capture, but it gives you the option of performing the de-duplication when it's more convenient.
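
Here is a rough sketch of that trade-off, continuing the hypothetical model above. The type and method names are invented for illustration; QRecall's real compact action does considerably more than this.

```swift
import Foundation
import CryptoKit

// Hypothetical model of the "defer de-duplication" trade-off; not QRecall's code.
struct Archive {
    var index = Set<String>()     // hash index of de-duplicated blocks
    var deferred: [Data] = []     // blocks appended verbatim during capture

    private func key(_ block: Data) -> String {
        SHA256.hash(data: block).map { String(format: "%02x", $0) }.joined()
    }

    // Normal capture: every block is looked up immediately (slower, space-efficient).
    mutating func capture(_ block: Data) {
        index.insert(key(block))  // duplicate blocks are dropped here
    }

    // Deferred capture: no lookups, so the capture runs at raw write speed,
    // but the archive temporarily grows by the full size of the new data.
    mutating func captureDeferred(_ block: Data) {
        deferred.append(block)
    }

    // The next compact action begins by de-duplicating the deferred data.
    mutating func compact() {
        for block in deferred { capture(block) }
        deferred.removeAll()
    }
}
```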

- QRecall Development -
maxbraketorque


Joined: Apr 7, 2020
Messages: 19
Offline
OK, I'll consider deferred de-duplication. Thanks much.
 