2021.12.16 09:01 "[Tiff] Ensuring buffer alignment when creating tiff files", by Milian Wolff

2021.12.16 16:56 "Re: [Tiff] Ensuring buffer alignment when creating tiff files", by Milian Wolff

Correct. We can handle the mmap alignment on the client/reader side, so that is less of an issue. Ideally the offset would be page-aligned too, to reduce the memory "waste" at the start of the first page, but that matters less, imo.
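A minimal sketch of the reader-side handling, assuming a POSIX system: mmap requires the file offset to be a multiple of the page size, so the reader rounds the strip offset down to a page boundary and skips the leading "slack" bytes. The names page_align_down, mapped_range and map_file_range are hypothetical helpers for illustration, not part of libtiff.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Round a file offset down to a multiple of `page` (must be a power of two). */
uint64_t page_align_down(uint64_t offset, uint64_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return offset & ~(page - 1);
}

/* Hypothetical bookkeeping for one read-only mapping of a file range. */
struct mapped_range {
    void *base;                /* address returned by mmap, needed for munmap */
    size_t map_len;            /* length of the whole mapping, slack included */
    const unsigned char *data; /* first byte of the requested range */
};

/* Map `length` bytes starting at `offset` of `fd` read-only.
 * Returns 0 on success and -1 on failure. */
int map_file_range(int fd, uint64_t offset, size_t length,
                   struct mapped_range *out)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || length == 0)
        return -1;

    uint64_t start = page_align_down(offset, (uint64_t)page);
    size_t slack = (size_t)(offset - start); /* "wasted" bytes before the data */

    void *base = mmap(NULL, length + slack, PROT_READ, MAP_PRIVATE,
                      fd, (off_t)start);
    if (base == MAP_FAILED)
        return -1;

    out->base = base;
    out->map_len = length + slack;
    out->data = (const unsigned char *)base + slack;
    return 0;
}
```

The slack is at most one page minus one byte per mapping, which is the "waste" mentioned above. Release the mapping with munmap(out->base, out->map_len).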

Filesystem block-size could help as another optimization, but that sounds pretty low level to me.

The filesystem block size is the unit that the OS may actually be reading and caching. If this is the case, it makes sense to get it into your app while the data is still hot.

The pixel/data sample alignment on the other hand is a crucial requirement. Without that, one cannot use the mmapped data after all, as it would lead to potential bus errors on ARM and UBSAN warnings about misalignment on x86.
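To illustrate why the sample alignment matters, here is a rough sketch, assuming 16-bit samples in host byte order. Dereferencing a uint16_t pointer is only valid when the address is suitably aligned; otherwise each sample has to be copied out with memcpy. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if `p` is a multiple of `align` (must be a power of two). */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Read one 16-bit sample from possibly misaligned memory.
 * memcpy is valid for any address, unlike a direct uint16_t load. */
uint16_t load_sample_u16(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Sum `count` 16-bit samples starting at `data`. */
uint64_t sum_samples_u16(const unsigned char *data, size_t count)
{
    uint64_t sum = 0;
    if (is_aligned(data, _Alignof(uint16_t))) {
        /* Aligned: the mapped bytes can be used in place as samples. */
        const uint16_t *samples = (const uint16_t *)(const void *)data;
        for (size_t i = 0; i < count; i++)
            sum += samples[i];
    } else {
        /* Misaligned: fall back to copying every sample out. */
        for (size_t i = 0; i < count; i++)
            sum += load_sample_u16(data + i * sizeof(uint16_t));
    }
    return sum;
}
```

With aligned strip offsets only the first branch is ever taken, and the mapped buffer can be handed directly to code that expects a typed sample array.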

Yes, alignment is critical on some CPUs and improves performance on most CPUs.

<snip / reordered for the important part>

In summary, the use of mmap and carefully aligned input data might not provide actual benefit over larger programmed (or scheduled via async-I/O) reads into an aligned buffer, even though it clearly requires an extra memory copy.

What you are writing isn't wrong, but it's sadly completely beside the point for us.

I do agree with your requirement if you are using mmap.

So, do these two things mean that a patch to make libtiff write the buffers in an aligned fashion would be accepted upstream? What would that API look like? For our purposes, a minimal non-optional API that always ensures the strip buffer offsets are BitsPerSample-aligned would be enough. Would that be acceptable upstream, or does it have to be user-configurable? If so, could you please give me a rough outline of your expectations? Then I will work on this and prepare a patch.
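As a sketch of the arithmetic such a writer would need, assuming "BitsPerSample-aligned" means that each strip offset is a multiple of the sample size in bytes: the writer pads the file up to the next aligned offset before emitting a strip. This is not libtiff code or an existing libtiff API; sample_alignment, align_up and strip_padding are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment in bytes for a given BitsPerSample value.
 * Assumption: sub-byte samples only need byte alignment. */
uint64_t sample_alignment(unsigned bits_per_sample)
{
    uint64_t bytes = (bits_per_sample + 7u) / 8u;
    return bytes == 0 ? 1 : bytes;
}

/* Round `offset` up to the next multiple of `alignment`. */
uint64_t align_up(uint64_t offset, uint64_t alignment)
{
    assert(alignment != 0);
    uint64_t rem = offset % alignment;
    return rem == 0 ? offset : offset + (alignment - rem);
}

/* Number of padding bytes to write before a strip that would otherwise
 * start at `current_offset`. */
uint64_t strip_padding(uint64_t current_offset, unsigned bits_per_sample)
{
    uint64_t aligned = align_up(current_offset,
                                sample_alignment(bits_per_sample));
    return aligned - current_offset;
}
```

For example, with 32-bit samples a strip that would start at offset 13 needs 3 padding bytes so that it starts at offset 16. The cost is at most the sample size minus one byte per strip.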

We control who's reading the file, so there's just going to be the single mmap and no other copy of it.

If there is more than one file, then there will be another copy. If there are multiple applications reading from the same file at approximately the same time, then mmap will provide a win if the system memory is large enough.

This is unrelated to the issue at hand, but it piques my interest nevertheless: Can you please expand on this "another copy"? We have some files A, B, C, and then each has potentially multiple image buffers, say A1, A2,... Then we create read-only mmaps for these buffers, say M_A1, M_A2,..., M_B1,... On access we'll trigger a page fault that will make the kernel load the data from the disk to fill the page mapping. But there's only going to be one single "copy" of a given file page here in this scenario. Where is the "other copy"?

When the OS runs short of memory, memory pages allocated for mmap (which continue to exist in memory even after your app has removed the mapping) need to be identified and evicted to make room for new mappings. In the past, I have noticed that some OSs have rather poor eviction rates (e.g. it is done by a kernel thread which wakes up) and so it is possible to allocate memory faster than it is reclaimed.

If you are using a Unix/Linux type OS, then madvise(2) may provide some measure of control over how long the OS retains the memory (e.g. MADV_WILLNEED/MADV_DONTNEED), and also provide a clue about sequential reading (MADV_SEQUENTIAL). Unfortunately, systems vary widely as to how they handle the advice, or if an option is regarded at all.
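A minimal sketch of how these hints might be issued on a Unix/Linux system for a region obtained from mmap. The wrapper names are hypothetical, and as noted above the kernel is free to ignore any of the advice, so a failure is treated as non-fatal.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hint that [addr, addr + len) will be read front to back in the near
 * future. `addr` must be page-aligned, as returned by mmap.
 * Returns 0 if both hints were accepted, -1 otherwise. */
int advise_sequential_read(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
    int rc = 0;
    if (madvise(addr, len, MADV_SEQUENTIAL) != 0)
        rc = -1; /* advice only: the caller may ignore the failure */
    if (madvise(addr, len, MADV_WILLNEED) != 0)
        rc = -1;
    return rc;
}

/* Hint that the range is no longer needed, so its pages may be reclaimed. */
int advise_done(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
    return madvise(addr, len, MADV_DONTNEED);
}
```

A reader would typically call advise_sequential_read right after mapping an image buffer and advise_done once it has finished processing that buffer.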

Linux and macOS cope well without madvise in our case. Eviction rates are also never an issue with what we are throwing at the system. Only Windows is problematic in some cases, due to its poor eviction of dirty pages. But clean read-only mapped segments from TIFF files, or from other formats that are directly mmappable, are mostly fine in our experience.

Cheers

Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer