2016.07.05 09:41 "[Tiff] Proposed API improvements", by Yakov Galka

Hello everyone,

I know that libtiff is very old and doesn't change much, so I'm not sure how open you are to significant modifications, even backward-compatible ones. Still, I inherited a codebase that uses libtiff heavily, so I would like to offer some input based on my experience with real-world uses of libtiff and the lessons learned, and to suggest what could be done to improve libtiff in 2016.

Uniformization: Read strips as tiles
=============

Despite the differences between tiles and strips in the specification, they are essentially two ways of segmenting the image. Code that reads a TIFF does the same thing in both cases: it gets the dimensions of the tiles/strips, computes the indices of the tiles/strips that fall within the region of interest, and fetches the data from those tiles/strips. The codebase I maintain had a lot of code that supported both tiled and strip-based TIFFs by branching and calling one interface or the other. I refactored it to use a set of TIFFBlockXXX functions that dispatch to TIFFTileXXX or TIFFStripXXX as needed, thus hiding the differences. Although this is a trivial change, I believe others would benefit if it made it into libtiff. Since libtiff can extend the behavior of its existing APIs, we could change the TIFFTileXXX interface to support reading strips rather than introducing a new API as I did. The only function that would have to be added is TIFFTileDims(TIFF *t, uint32 *tw, uint32 *th, uint32 *td), which would return the dimensions by looking at the tags appropriate for the TIFF at hand.
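
To illustrate the index computation described above, here is a minimal sketch that treats a strip as a tile spanning the full image width. It deliberately does not call libtiff: block_layout, block_range, layout_from_tiles, layout_from_strips, blocks_in_roi and block_index are hypothetical names of mine, and the sketch assumes a single sample plane and a depth of 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, libtiff-independent description of how an image is segmented. */
typedef struct {
    uint32_t image_w, image_h; /* image size in pixels */
    uint32_t block_w, block_h; /* size of one tile, or of one strip */
} block_layout;

/* Inclusive range of block columns and rows touched by a region of interest. */
typedef struct {
    uint32_t col0, col1, row0, row1;
} block_range;

static block_layout layout_from_tiles(uint32_t image_w, uint32_t image_h,
                                      uint32_t tile_w, uint32_t tile_h)
{
    block_layout l = { image_w, image_h, tile_w, tile_h };
    return l;
}

/* A strip is modelled as a tile that is as wide as the image. rows_per_strip
 * is clamped to the image height, so a huge value means a single strip. */
static block_layout layout_from_strips(uint32_t image_w, uint32_t image_h,
                                       uint32_t rows_per_strip)
{
    block_layout l = { image_w, image_h, image_w,
                       rows_per_strip < image_h ? rows_per_strip : image_h };
    return l;
}

/* Blocks intersecting the w-by-h region whose top-left pixel is (x, y). */
static block_range blocks_in_roi(const block_layout *l,
                                 uint32_t x, uint32_t y,
                                 uint32_t w, uint32_t h)
{
    block_range r;
    assert(l->block_w > 0 && l->block_h > 0);
    assert(w > 0 && h > 0);
    assert(x < l->image_w && w <= l->image_w - x);
    assert(y < l->image_h && h <= l->image_h - y);
    r.col0 = x / l->block_w;
    r.col1 = (x + w - 1) / l->block_w;
    r.row0 = y / l->block_h;
    r.row1 = (y + h - 1) / l->block_h;
    return r;
}

/* Row-major block number; for strips this reduces to the strip number. */
static uint32_t block_index(const block_layout *l, uint32_t col, uint32_t row)
{
    uint32_t blocks_across =
        l->image_w / l->block_w + (l->image_w % l->block_w != 0 ? 1u : 0u);
    return row * blocks_across + col;
}
```

With this shape, the reading loop is the same for both layouts: iterate over the rows and columns of the block_range, and only the final fetch has to dispatch to the tile or the strip interface.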

Having something like this for writing TIFFs would be useful too, though I haven't looked into it yet.

Note that libtiff already provides a somewhat similar feature for reading scanlines through the strip interface (aka 'strip chopping'), though there is a significant difference between the two: strips provide fast random access, which scanlines do not.

Side note on strip chopping: it would have been nice if the reported tag values were always what is written in the TIFF (or what is implied), with the actual tile sizes reported by TIFFTileDims. I acknowledge, though, that this would break compatibility and thus isn't very relevant to the current discussion.

Error handling
=========

This was already raised in, at least, 2000 and 2007:

http://www.asmail.be/msg0054963626.html
http://www.asmail.be/msg0054639755.html

The current error handling routines are overridden globally, which causes trouble in multi-threaded environments and in library code that uses libtiff internally but must not interfere with other users of libtiff (e.g. application code or sibling libraries that also use libtiff).

It was suggested to use TIFFErrorHandlerExt as a potential solution. But TIFFErrorHandlerExt CANNOT BE RELIABLY USED, because whoever installs a TIFFErrorHandlerExt is not in control of what clientdata the handler will receive.

In one of the above threads, it was proposed to duplicate each of the TIFFOpen, TIFFOpenW, TIFFFdOpen and TIFFClientOpen functions so that they accept an error handling procedure. I want to note that this does not make sense either: if the clientdata isn't specified by the author of the error handler (as is the case with TIFFOpen, TIFFOpenW and TIFFFdOpen), then there is nothing she can do with it in the callback. Therefore only TIFFClientOpen has to be extended.
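
To make the point concrete, here is a minimal sketch of what a per-handle error handler could look like. Nothing in it is libtiff API: demo_handle, demo_open, demo_report, error_log and log_error are hypothetical names, and the sketch only assumes that the open call stores a callback together with a user pointer chosen by the same code that supplies the callback.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical callback type: user_data is chosen by whoever supplies the
 * callback, so the callback always knows what it points to. */
typedef void (*demo_error_fn)(void *user_data, const char *module,
                              const char *fmt, va_list ap);

/* Hypothetical handle: the handler lives in the handle, not in a global. */
typedef struct {
    demo_error_fn on_error;
    void *error_user_data;
} demo_handle;

static demo_handle demo_open(demo_error_fn on_error, void *error_user_data)
{
    demo_handle h = { on_error, error_user_data };
    return h;
}

/* What the library would call internally when something goes wrong. */
static void demo_report(const demo_handle *h, const char *module,
                        const char *fmt, ...)
{
    va_list ap;
    assert(h != NULL);
    if (h->on_error == NULL)
        return;
    va_start(ap, fmt);
    h->on_error(h->error_user_data, module, fmt, ap);
    va_end(ap);
}

/* Example sink: each handle can get its own log, with no shared state. */
typedef struct {
    int count;
    char last[128];
} error_log;

static void log_error(void *user_data, const char *module,
                      const char *fmt, va_list ap)
{
    error_log *log = user_data;
    int n = snprintf(log->last, sizeof log->last, "%s: ", module);
    if (n >= 0 && (size_t)n < sizeof log->last)
        vsnprintf(log->last + n, sizeof log->last - (size_t)n, fmt, ap);
    log->count++;
}
```

Two handles opened with different error_log objects report independently, so a library using this internally would not disturb the application's own handlers, and no locking around a global handler is needed.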

Surely these features require a lot of laborious work (mostly the second one). I may participate in some of the effort. Given the scope of the changes, I would like to first reach a consensus on whether such changes would be welcome at all.

Sincerely,

Yakov