QRecall

Christian Roth · screenshot 2018-10-04 at 13.02.53.png

Hi,

I had to do a repair of my archive, which lives on a NAS. While initially everything ran with the performance I'd expect for my big archive, I notice that the final "Reindexing Layers" step seems to slow down more and more as it nears the end of the archive. The attached screenshot shows the state after running for about 43 hours, and progress seems to have come almost to a halt compared to about 6 hours ago.

Is this expected due to the data structures that need to be read/updated in this stage, or could it be I am thrashing (my Mac has 18 GB of RAM, so I think QRecall should definitely use the maximum of 8 GB, as I can see from the description in the Advanced preferences window for this setting, which is set to "Actual physical memory")? Is a 1.6 TB archive just too big?

Thanks,
Christian

James Bucanek

A 1.6 TB archive is pretty big for being hosted on a NAS. The problem isn't so much the size of the archive as the number of records you've accumulated and the nature of networked volume transactions. It's also not a memory issue.

The early part of the repair (checking archive data) is highly optimized to read large chunks of the archive, sequentially. This can be very efficient even on a networked volume, and is usually constrained only by the network's speed.

Some of the later phases, like reindexing the layers, are highly transactional. QRecall is going back to each file and directory record and reconstructing the tree of items captured in each layer. It reads these (small) records one at a time.

Most filesystems are pretty smart about this, and the operating system will both cache data (so that reads of multiple small records come from RAM) and read ahead (so that it's likely the next read request can be satisfied from RAM). Networked filesystems tend to be transactional, which means that every tiny read gets packaged up as a request and sent to the server, the server performs that request, and then it returns the results. Wash, rinse, repeat.

This means that the latency of requests for networked storage is much higher than for local storage. You can probably see this in the Activity Monitor. The total network throughput will be rather modest, but you'll see that the number of network requests per second is in the thousands.

You have some 26,000,000 file and directory records to read and process. If you're processing only 1,000 records per second, that's over 7 hours just to read them all. If your latency is higher, you might only be processing a few hundred a second, which could mean 18 or 36 hours. In contrast, a local drive would typically be able to read 2,000 to 3,000 records per second.
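
To make that arithmetic concrete, here is a minimal sketch of the estimate. The function name and the sample rates are only illustrative; the 26,000,000 record count is the one figure taken from this archive.

```python
def hours_to_read(records: int, records_per_second: float) -> float:
    """Hours needed to read `records` one at a time at a steady rate."""
    return records / records_per_second / 3600


RECORDS = 26_000_000  # file and directory records in this archive

# Sample rates: slow NAS, typical NAS, good NAS, local drive (illustrative values)
for rate in (200, 400, 1_000, 2_500):
    print(f"{rate:>5} records/s -> {hours_to_read(RECORDS, rate):5.1f} hours")
```

At 1,000 records per second this works out to a bit over 7 hours, and at 400 or 200 records per second to roughly 18 or 36 hours.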

I'm working on future optimizations so that QRecall can group multiple record reads into a single request, which I hope will speed up networked volumes. But for now, it relies on the native filesystem drivers for whatever optimization it can get.
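
As a rough illustration of why grouping reads should help, here is a toy model. It is not QRecall's actual code. It assumes each request costs one fixed round trip plus the time to transfer the data, and the 1 ms round trip, 512-byte record size, 100 MB/s throughput and batch sizes are made-up numbers for the sake of the example.

```python
import math


def read_time_hours(records, batch_size, round_trip_s, record_bytes, bytes_per_s):
    """Estimate total read time when `batch_size` records share one request."""
    requests = math.ceil(records / batch_size)        # one round trip per request
    latency_s = requests * round_trip_s               # time spent waiting on the network
    transfer_s = records * record_bytes / bytes_per_s # time spent moving the data itself
    return (latency_s + transfer_s) / 3600


RECORDS = 26_000_000  # record count from this archive; everything else is assumed

for batch in (1, 8, 64):
    hours = read_time_hours(RECORDS, batch, round_trip_s=0.001,
                            record_bytes=512, bytes_per_s=100e6)
    print(f"batch of {batch:>2}: {hours:.2f} hours")
```

In this model the transfer time is the same in every case; only the number of round trips changes, which is why latency dominates when records are read one at a time.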

In conclusion, if a repair is a once-in-a-blue-moon occurrence, you might just live with the performance for now. Another option might be to split up the archive. On my Mac mini, one archive captures the media (iTunes) folder and a second archive captures everything else (by capturing the startup volume but excluding the iTunes folder). These archives are stored on an AirPort Time Capsule, and splitting them keeps each archive down to a manageable size.