2021.12.16 09:01 "[Tiff] Ensuring buffer alignment when creating tiff files", by Milian Wolff

2021.12.16 14:43 "Re: [Tiff] Ensuring buffer alignment when creating tiff files", by Milian Wolff

As you can see, these offsets are misaligned for a 2byte/16bit greyscale image.

Looking at the libtiff API, we cannot find anything that would allow us to ensure that the SubIFDs are aligned correctly. Are we missing something or is this simply not possible currently?

We think that it would only require a small change in the code base, namely ensuring that the seek at [1] ends at an aligned address based on the BitsPerSample for the current IFD.
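To illustrate the kind of rounding we have in mind, here is a minimal sketch. The helper name `align_offset_for_samples` is hypothetical and not part of libtiff; it only shows how an offset could be rounded up to a multiple of the sample size derived from BitsPerSample.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libtiff function): round `offset` up to the
 * next multiple of the sample size in bytes implied by BitsPerSample. */
static uint64_t align_offset_for_samples(uint64_t offset, uint16_t bits_per_sample)
{
    assert(bits_per_sample > 0);

    /* Round the bit depth up to whole bytes, e.g. 16 bits -> 2 bytes. */
    uint64_t bytes = ((uint64_t)bits_per_sample + 7u) / 8u;
    if (bytes <= 1)
        return offset; /* 8-bit or smaller samples need no padding */

    uint64_t rem = offset % bytes;
    return rem == 0 ? offset : offset + (bytes - rem);
}
```

A writer would call something like this on the current end-of-file position before emitting strip or tile data, and pad the gap with dead bytes.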

The notion of writing 'aligned' data (which requires inserting some dead space to ensure alignment) is interesting and seems useful, mostly when the data is not compressed. I have not heard of this before in the context of the TIFF format, but some other formats take care to ensure it. Obviously, alignment can only be guaranteed if the file is written by a TIFF writer that enforces it.

You seem to be talking about aligning the TIFF data samples (a good thing), but there may be other beneficial alignment factors such as alignment to mmap memory page boundaries, or filesystem block-size boundaries.

Correct. The mmap alignment we can handle on the client/reader side, that is less of an issue. Obviously, it would be ideal if the offset would be page aligned too, to reduce the memory "waste" at the start of the first page, but that's less of an issue imo.
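For reference, handling the page alignment on the reader side could look roughly like the sketch below. It assumes a POSIX system; `page_split`, `map_strip` and `mapped_strip` are hypothetical names made up for this example, and the strip offset and size would come from the TIFF directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record describing one mapping. */
typedef struct {
    void *base;       /* page-aligned pointer returned by mmap */
    size_t length;    /* mapped length, including the leading slack */
    const void *data; /* pointer to the requested file offset */
} mapped_strip;

/* Split a file offset into a page-aligned start and the remainder. */
static void page_split(uint64_t offset, uint64_t page_size,
                       uint64_t *start, uint64_t *delta)
{
    assert(page_size > 0);
    *delta = offset % page_size;
    *start = offset - *delta;
}

/* Map `size` bytes at an arbitrary `offset`; returns 0 on success. */
static int map_strip(int fd, uint64_t offset, size_t size, mapped_strip *out)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    uint64_t start, delta;
    page_split(offset, (uint64_t)page, &start, &delta);

    size_t length = size + (size_t)delta;
    void *base = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, (off_t)start);
    if (base == MAP_FAILED)
        return -1;

    out->base = base;
    out->length = length;
    out->data = (const unsigned char *)base + delta;
    return 0;
}
```

The `delta` bytes at the start of the mapping are the "waste" mentioned above. Note that `data` is only as aligned as the file offset itself, which is why the sample alignment has to come from the writer.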

Filesystem block-size could help as another optimization, but that sounds pretty low level to me.

The pixel/data sample alignment on the other hand is a crucial requirement. Without that, one cannot use the mmapped data after all, as it would lead to potential bus errors on ARM and UBSAN warnings about misalignment on x86.
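A small sketch of the problem, with hypothetical helper names: a reader can test whether a mapped pointer is suitably aligned for 16-bit samples, and the only portable fallback for a misaligned pointer is a per-sample `memcpy`, which rules out handing the buffer to other code as a plain `uint16_t` array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if `p` may be dereferenced as a uint16_t pointer. */
static int is_aligned_for_u16(const void *p)
{
    return ((uintptr_t)p % _Alignof(uint16_t)) == 0;
}

/* Alignment-safe load of one 16-bit sample in native byte order. */
static uint16_t load_u16(const unsigned char *p)
{
    assert(p != NULL);
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```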

Regardless, take care not to be sucked into the vortex of using memory mapping to read files. When using memory mapping to read data, your program loses control and the thread which is doing the reading is put to sleep while the I/O is taking place. The I/Os are usually the size of a memory page, which is often just 4k. This means that your program gets put to sleep more often than desired, with many more context switches than if a larger copy-based I/O was used. If the data has been read recently, memory mapping seems great, since the data is likely to be cached in memory already.

I am aware of all that, but the size of the images we are dealing with leaves us no alternative. It's simply not feasible for us to copy so many GB of data into memory. And because we are integrating with other software stacks, we also cannot pass through an API around `TIFF*` to do the reading on demand. We really need contiguous buffers...

When reading from a file, it is common for the operating system to try to deduce if the reading is sequential or random. If it is able to deduce that the reading is sequential, then it may pre-read data in order to lessen the hit (time spent sleeping) when the data is read in order. Operating systems may not have useful support for detecting sequential reads when using mmap to do the reading. TIFF requires random access and so the operating system might be slow to detect and optimize for a sequential read.

If the operating system does not provide a "unified page cache" with the filesystem, then there may be a filesystem data cache, and another copy of the data for use with mmap. This increases memory usage and does not avoid a data copy. It seems like the "unified page cache" approach has fallen by the wayside since it is difficult to implement with the many filesystems available. Instead, operating systems have moved toward offering "direct I/O" to lessen caching and data copies.

We control who's reading the file, so there's just going to be the single mmap and no other copy of it.

In summary, the use of mmap and carefully aligned input data might not provide actual benefit over larger programmed (or scheduled via async-I/O) reads into an aligned buffer, even though the latter clearly requires an extra memory copy.
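As a rough sketch of the copy-based alternative, assuming a POSIX system: the hypothetical helper `read_aligned` below allocates a buffer with a chosen alignment and fills it with `pread`, so the in-memory samples are aligned regardless of where they sit in the file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read `size` bytes at `offset` into a freshly
 * allocated buffer aligned to `alignment` (a power of two that is a
 * multiple of sizeof(void *)). Returns NULL on error or short file;
 * the caller releases the buffer with free(). */
static void *read_aligned(int fd, off_t offset, size_t size, size_t alignment)
{
    assert(size > 0);

    void *buf = NULL;
    if (posix_memalign(&buf, alignment, size) != 0)
        return NULL;

    size_t done = 0;
    while (done < size) {
        ssize_t n = pread(fd, (char *)buf + done, size - done,
                          offset + (off_t)done);
        if (n < 0 && errno == EINTR)
            continue; /* interrupted by a signal, retry */
        if (n <= 0) { /* I/O error or unexpected end of file */
            free(buf);
            return NULL;
        }
        done += (size_t)n;
    }
    return buf;
}
```

This costs one copy of the data, which is exactly the trade-off described above.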

What you are writing isn't wrong, but it's sadly completely beside the point for us.

Cheers

Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts