2021.12.16 09:01 "[Tiff] Ensuring buffer alignment when creating tiff files", by Milian Wolff

2021.12.16 15:44 "Re: [Tiff] Ensuring buffer alignment when creating tiff files", by Bob Friesenhahn

Correct. We can handle the mmap alignment on the client/reader side, so that is less of an issue. Ideally the offset would be page aligned too, to reduce the memory "waste" at the start of the first page, but that matters less imo.
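As a rough illustration of handling the mmap alignment on the reader side, the sketch below rounds a data offset down to a page boundary and reports how many "wasted" bytes precede the data. The names map_plan and plan_mapping are hypothetical and not part of libtiff; the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: how to map a byte range whose file offset
 * is not necessarily page aligned. */
typedef struct {
    uint64_t map_offset;  /* page-aligned file offset to pass to mmap */
    size_t   slack;       /* unused bytes between map_offset and the data */
    size_t   map_length;  /* length to pass to mmap (data size + slack) */
} map_plan;

/* Round data_offset down to a multiple of page_size and widen the
 * mapping so that it still covers the whole requested range. */
static map_plan plan_mapping(uint64_t data_offset, size_t data_size,
                             size_t page_size)
{
    /* Assumption: page_size is a nonzero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    map_plan plan;
    plan.slack      = (size_t)(data_offset & (uint64_t)(page_size - 1));
    plan.map_offset = data_offset - plan.slack;
    plan.map_length = data_size + plan.slack;
    return plan;
}
```

A caller would typically obtain the page size from sysconf(_SC_PAGESIZE), pass map_offset and map_length to mmap, and add slack to the returned address to reach the first data byte. When the data offset is already page aligned, slack is zero and nothing is wasted.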

Filesystem block-size could help as another optimization, but that sounds pretty low level to me.

The filesystem block size is the unit that the OS may actually be reading and caching. If this is the case, it makes sense to get it into your app while the data is still hot.
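A minimal sketch of how an application might query a block size to size its reads, assuming a POSIX system. Note that st_blksize is the preferred I/O size reported by fstat, which may or may not match the on-disk block size. The helper names preferred_io_size and round_up_to_block are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Return the preferred I/O block size for fd as reported by fstat,
 * or the caller-supplied fallback if the query fails. */
static size_t preferred_io_size(int fd, size_t fallback)
{
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_blksize <= 0)
        return fallback;
    return (size_t)st.st_blksize;
}

/* Round a read size up to a whole number of blocks. */
static size_t round_up_to_block(size_t size, size_t block)
{
    assert(block != 0);
    return ((size + block - 1) / block) * block;
}
```

Reads sized with round_up_to_block(n, preferred_io_size(fd, 4096)) would then cover whole blocks, so the application consumes everything the OS fetched in one go.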

The pixel/data sample alignment on the other hand is a crucial requirement. Without that, one cannot use the mmapped data after all, as it would lead to potential bus errors on ARM and UBSAN warnings about misalignment on x86.
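The alignment requirement can be checked before the mapped bytes are reinterpreted as samples. The sketch below assumes 16-bit samples and a mapping base that is itself suitably aligned (mmap returns page-aligned addresses). The function names are illustrative and are not libtiff API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if file_offset is a multiple of alignment (a power of two). */
static bool offset_is_aligned(uint64_t file_offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (file_offset & (uint64_t)(alignment - 1)) == 0;
}

/* True if pointer p is aligned to alignment (a power of two). */
static bool pointer_is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}

/* Return the strip data as 16-bit samples, or NULL if using it in place
 * would be a misaligned access; the caller must then copy the data. */
static const uint16_t *samples_u16(const unsigned char *map_base,
                                   uint64_t strip_offset)
{
    const unsigned char *p = map_base + strip_offset;
    if (!pointer_is_aligned(p, _Alignof(uint16_t)))
        return NULL;
    return (const uint16_t *)(const void *)p;
}
```

Because the mapping base is page aligned, the pointer check succeeds exactly when the strip or tile offset stored in the file is a multiple of the sample alignment, which is why the writer has to place the data at aligned offsets.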

Yes, alignment is critical on some CPUs and improves performance on most CPUs.

We control who's reading the file, so there's just going to be the single mmap and no other copy of it.

If there is more than one file, then there will be another copy. If there are multiple applications reading from the same file at approximately the same time, then mmap will provide a win if the system memory is large enough.

When the OS runs short of memory, memory pages allocated for mmap (which continue to exist in memory even after your app has removed the mapping) need to be identified and evicted to make room for new mappings. In the past, I have noticed that some OSs have rather poor eviction rates (e.g. it is done by a kernel thread which wakes up) and so it is possible to allocate memory faster than it is reclaimed.

If you are using a Unix/Linux type OS, then madvise(2) may provide some measure of control over how long the OS retains the memory (e.g. MADV_WILLNEED/MADV_DONTNEED), and can also give it a hint about sequential reading (MADV_SEQUENTIAL). Unfortunately, systems vary widely in how they handle the advice, or whether a given option is honored at all.
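A small sketch of passing such hints, assuming a system that provides madvise(2) with these flags. The access_hint enum and advise_mapping wrapper are hypothetical names for this example. As noted above, a successful return only means the call was accepted, not that the OS acts on it.

```c
#define _DEFAULT_SOURCE  /* exposes madvise and MADV_* on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical, application-level names for the hints discussed above. */
typedef enum {
    HINT_SEQUENTIAL,  /* pages will be read front to back */
    HINT_PREFETCH,    /* pages will be needed soon */
    HINT_DONE         /* pages are no longer needed */
} access_hint;

/* Forward a hint for [addr, addr + length) to madvise.
 * addr must be page aligned, as an address returned by mmap is.
 * Returns 0 on success and -1 (with errno set) on failure. */
static int advise_mapping(void *addr, size_t length, access_hint hint)
{
    assert(addr != NULL && length > 0);

    int advice = MADV_NORMAL;
    switch (hint) {
    case HINT_SEQUENTIAL: advice = MADV_SEQUENTIAL; break;
    case HINT_PREFETCH:   advice = MADV_WILLNEED;   break;
    case HINT_DONE:       advice = MADV_DONTNEED;   break;
    }
    return madvise(addr, length, advice);
}
```

A reader might call advise_mapping(base, length, HINT_SEQUENTIAL) right after mmap and HINT_DONE once a region has been processed. Whether either call changes caching behavior has to be measured on the target system.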

In summary, the use of mmap and carefully aligned input data might not provide actual benefit over larger programmed (or scheduled via async-I/O) reads into an aligned buffer, even though the latter clearly requires an extra memory copy.
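For comparison, the alternative of a programmed read into an aligned buffer could look roughly like the sketch below, assuming POSIX posix_memalign and pread. The function name read_range_aligned is hypothetical, and the loop omits EINTR retry handling for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Read size bytes starting at offset from path into a freshly allocated
 * buffer aligned to alignment (a power of two and a multiple of
 * sizeof(void *)). Returns NULL on any failure, including a short read.
 * The caller releases the buffer with free(). */
static unsigned char *read_range_aligned(const char *path, off_t offset,
                                         size_t size, size_t alignment)
{
    assert(path != NULL && offset >= 0 && size > 0);

    void *buf = NULL;
    if (posix_memalign(&buf, alignment, size) != 0)
        return NULL;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return NULL;
    }

    size_t done = 0;
    while (done < size) {
        ssize_t n = pread(fd, (unsigned char *)buf + done, size - done,
                          offset + (off_t)done);
        if (n <= 0)  /* error or unexpected end of file */
            break;
        done += (size_t)n;
    }
    close(fd);

    if (done != size) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

Here the data in the file can sit at any offset, since the alignment comes from the destination buffer rather than from the file layout. The cost is the copy performed by the read itself.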

What you are writing isn't wrong, but it's sadly completely beside the point for us.

I do agree with your requirement if you are using mmap.

Bob

Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt