2020.12.30 15:15 "[Tiff] Multithreading support?", by OnlineCop

2020.12.30 23:18 "Re: [Tiff] Multithreading support?", by Larry Gritz

You can open the same file many times from different threads and they can all read different parts of the file, but you can't use a single TIFF* handle from more than one thread at the same time.
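
As a rough illustration of that rule, here is a minimal Python sketch that uses ordinary file objects as a stand-in for TIFF* handles; it does not call libtiff. The helper names (`read_range`, `read_chunks_concurrently`) are hypothetical. Each thread opens its own handle on the same file and reads a different region, so no file position or buffer is shared between threads.

```python
import os
import tempfile
import threading


def read_range(path, offset, length, results, index):
    """Read `length` bytes at `offset` through a handle private to this thread."""
    # Opening the file here (not sharing one handle) is the point of the sketch.
    with open(path, "rb") as handle:
        handle.seek(offset)
        results[index] = handle.read(length)


def read_chunks_concurrently(path, chunk_size, num_chunks):
    """Read consecutive chunks of one file, one thread and one handle per chunk."""
    results = [None] * num_chunks
    threads = [
        threading.Thread(
            target=read_range,
            args=(path, i * chunk_size, chunk_size, results, i),
        )
        for i in range(num_chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Build a small throwaway file so the demo is self-contained.
    fd, demo_path = tempfile.mkstemp()
    os.write(fd, bytes(range(256)) * 4)
    os.close(fd)
    try:
        chunks = read_chunks_concurrently(demo_path, 256, 4)
        print([len(c) for c in chunks])
    finally:
        os.remove(demo_path)
```

The cost of this pattern is one open handle per thread per file, which is the trade-off discussed in the second use case further down.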

It may not be obvious why this is important, so I'll give a little background and my perspective as the lead author of OpenImageIO, which is one of the main libraries through which applications for high-end VFX and animation interact with TIFF files. There are two really important use cases for which the lack of threading support in libtiff has been problematic:

  1. A single application thread that needs to read a large portion or all of a TIFF image... and it just takes a long time because the I/O and decompression are completely serialized inside libtiff itself, and therefore use only a tiny fraction of the computational resources of a modern machine. An example of a library that approaches this quite differently is OpenEXR, which maintains a thread pool so that when you use the API calls that read many scanlines or tiles at a time, a lot of the data copying, conversion, and decompression happens in parallel. I did the same thing for TIFF files in OpenImageIO -- my TIFF reader uses libtiff to read raw strips and then submits them to a thread pool for decoding and data conversion in parallel (and similarly for writing) -- and I frequently get a 5-10x or more speedup. But this is a big PITA and is limited: it's a lot of tricky code on my side that I wish were inside libtiff, I have only implemented it for the one or two common compression codecs that were convenient for me to implement outside libtiff, and it only works if the range of scanlines being requested aligns to strip boundaries.
  2. Multiple threads needing to read different parts of the same file at once. For us, the most important application of this is using multi-resolution tiled TIFF files as a format for pageable textures, and this texture caching engine is used inside several of the leading film renderers. The renderers are all multithreaded, so it's common for multiple threads to simultaneously need different tiles from the same TIFF file that are not currently in the in-memory cache, and then they have to block against each other because the two tiles can't be read simultaneously. (Having a separate open TIFF* handle for each of the potentially dozens of threads defeats the whole purpose of the cache, which is to drastically reduce the amount of memory and the number of simultaneously open file handles -- we commonly have scenes that reference 15000+ texture files totalling 1-2TB of pixel data, and we render those frames using an in-memory texture cache of only 1-4GB and keeping only a couple hundred files open at a time.)
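
The workaround described in item 1 (read the raw compressed strips serially, then decode them on a thread pool) can be sketched roughly as follows. This is an illustration under stated assumptions, not OpenImageIO's actual code: it uses Python's `zlib` as a stand-in for a Deflate-style codec, fabricates the "raw strips" in memory instead of reading them with a call such as libtiff's `TIFFReadRawStrip`, and the helper names (`make_fake_strips`, `decode_strips_parallel`) are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_fake_strips(num_strips, strip_bytes):
    """Build compressed buffers that stand in for raw strips read from a file."""
    strips = []
    for i in range(num_strips):
        raw = bytes([i % 256]) * strip_bytes  # strip i is filled with the value i
        strips.append(zlib.compress(raw))
    return strips


def decode_strips_parallel(compressed_strips, max_workers=4):
    """Decompress all strips on a thread pool; output keeps the strip order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, whichever worker finishes first.
        return list(pool.map(zlib.decompress, compressed_strips))


if __name__ == "__main__":
    # The "I/O" step is serial (here: just building buffers);
    # only the decoding is spread across threads.
    strips = make_fake_strips(8, 64 * 1024)
    decoded = decode_strips_parallel(strips)
    image = b"".join(decoded)
    print(len(image))
```

Note that the sketch only handles whole strips, which mirrors the limitation mentioned above: a request that does not align to strip boundaries would need extra handling.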

So, anyway, I'm not trying to beat up on the wonderful libtiff, which has been a crucial and reliable dependency for us for over a decade (and for other things I have written for an additional decade or two before that).

But it would dramatically improve libtiff to have any or all of the following:

Many applications could see an order of magnitude improvement in the I/O performance for TIFF files.

-- lg

Larry Gritz
lg@larrygritz.com