2004.12.07 20:45 "Re: [Tiff] Having problems adding IPTC to a TIFF", by Joris Van Damme

Well maybe the first thing to do would be to see what Photoshop IS doing. Since Adobe is in charge of the TIFF specs, if they're writing IPTC as a LONG array, then THAT IS the spec, even if it's conceptually incorrect. If they're writing UNDEFINED or whatever, then libtiff should definitely follow.

Wrong. If Adobe breaks the TIFF spec, that is a different thing from Adobe changing the TIFF spec.

<quote>

Each TIFF field has an associated Count. This means that all fields are actually one-dimensional arrays, even though most fields contain only a single value. For example, to store a complicated data structure in a single private field, use the UNDEFINED field type and set the Count to the number of bytes required to hold the data structure.

</unquote>
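To make the quoted rule concrete, here is a minimal sketch of a 12-byte IFD entry for an IPTC block stored as UNDEFINED, with Count equal to the number of bytes. The helper name `ifd_entry_undefined`, the sample payload and the offset 0x100 are made up for illustration; the sketch assumes the payload is longer than 4 bytes, so the last entry member is an offset rather than the value itself.

```python
import struct

TIFFTAG_RICHTIFFIPTC = 33723  # tag number used for IPTC data in TIFF
TIFF_UNDEFINED = 7            # field type code: opaque 8-bit bytes

def ifd_entry_undefined(tag, payload_len, value_offset, little_endian=True):
    """Pack an IFD entry: tag (2 bytes), type (2), count (4), value offset (4)."""
    order = "<" if little_endian else ">"
    # For UNDEFINED, Count is simply the number of bytes in the payload.
    return struct.pack(order + "HHII", tag, TIFF_UNDEFINED, payload_len, value_offset)

# Made-up 10-byte IPTC-like payload, assumed to live at file offset 0x100.
iptc = b"\x1c\x02\x78\x00\x05hello"
entry = ifd_entry_undefined(TIFFTAG_RICHTIFFIPTC, len(iptc), 0x100)
```

Only the entry itself has a byte order here; the payload is written byte for byte, whatever the byte order of the file.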

It's really simple: the data isn't an array of long, so it isn't an array of long. This bit about the use of the UNDEFINED type is quoted from the TIFF spec itself, the decisive authority, and respecting it is logically required for the data to still make sense after byte swapping has been applied!
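The byte swapping argument can be sketched in a few lines. The two functions below are hypothetical stand-ins for what a reader does when file and host byte order differ: a field typed LONG gets every 4-byte group reversed, a field typed UNDEFINED is left alone. The payload is a made-up 8-byte IPTC-like block.

```python
import struct

def swap_as_long(raw):
    """Reverse each 4-byte group, as a reader does for a LONG array
    when the file byte order differs from the host byte order."""
    if len(raw) % 4:
        raise ValueError("a LONG array is always a multiple of 4 bytes")
    n = len(raw) // 4
    return struct.pack("<%dI" % n, *struct.unpack(">%dI" % n, raw))

def swap_as_undefined(raw):
    """UNDEFINED data is opaque bytes, so there is nothing to swap."""
    return raw

payload = b"\x1c\x02\x78\x00\x03abc"   # made-up 8-byte IPTC-like block
mangled = swap_as_long(payload)         # b"\x00\x78\x02\x1ccba\x03"
intact = swap_as_undefined(payload)     # identical to payload
```

The block typed as LONG comes out scrambled on any machine whose byte order differs from that of the file, while the block typed as UNDEFINED survives untouched.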

As to Adobe's current practice, Chris Cox clearly said "Photoshop can read it with any tag type, but writes as type long for compatibility with some unspecified application (and the comment has been in there a long time)." This goes to show: readers that inappropriately compensate for writer bugs are one thing, and can be argued to be good. But when writers (the issue here!) compensate for reader bugs, incorrect data is produced and bugs are propagated, and in the long run matters get worse instead of being corrected.

I would like to add that, logically, just about every coder who built a reader that compensates for this writer bug must have been very confused at some stage, and must have carefully examined the matter to find the source of the corruption. Coders who did that must have realized what the correct datatype is, and most probably made their code accept it as well. Thus, by correcting the mistake, we would most probably be breaking only the dumbest of libraries. In any case, we wouldn't be breaking Photoshop's reading of IPTC data; that's clear from Chris' comment.

Bottom line: we will not be breaking apps that were broken anyhow by the corrupted data. We will instead restore their correct operation, by curing this bug. We will also not be breaking the vast majority of bug-compensating readers. Some writers write correct data, most coders of compensating readers must have examined the issue, and thus most of these compensating readers are very likely to also accept correct data. We might be breaking a single heap of pasted code snippets... so what? The alternative is to keep writing corrupt data, and keep breaking the apps that only accept good data.

This whole mess, with umpteen files containing corrupted data out there, clearly indicates that people with contributing or even CVS write access to LibTiff should be required to read and understand the TIFF spec. Otherwise, a threat to the integrity of the library and of the written data arises.

I have made my point at least half a dozen times now. I'm gratefully withdrawing, adding only that I'm a bit put off by this discussion... Aren't we the ones that should ensure specs are being honoured, instead? Isn't that the single main pillar of open data exchange? I strongly dislike the fact that this is being doubted, especially since, by correcting this bug, we would probably only be breaking the single <quote>unspecified application</unquote> that got this bad corruption started in the first place, and un-breaking the correct libraries.

So cure the bug, and document the historical mistake and how to compensate for it in existing TIFFs, to try and somewhat make up for the damage that is done.
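As a starting point for that documentation, here is a rough sketch of how a reader might compensate for existing files. It is a heuristic, not anything LibTiff does: `recover_iptc` and its arguments are hypothetical, and it assumes the reader hands over the field type, the Count and the field bytes (possibly already swapped as LONGs), and that a sane IPTC block starts with the 0x1C tag marker.

```python
import struct

TIFF_BYTE, TIFF_LONG, TIFF_UNDEFINED = 1, 4, 7
IPTC_MARKER = b"\x1c"  # first byte of an IPTC dataset

def swap32(raw):
    """Reverse each complete 4-byte group in raw."""
    n = len(raw) // 4
    return struct.pack("<%dI" % n, *struct.unpack(">%dI" % n, raw[:n * 4]))

def recover_iptc(field_type, count, data):
    """Return the IPTC bytes of a field, undoing a bogus LONG swap if needed."""
    if field_type == TIFF_LONG:
        size = count * 4          # Count is in LONGs for the buggy files
    elif field_type in (TIFF_BYTE, TIFF_UNDEFINED):
        size = count              # Count is in bytes for correct files
    else:
        raise ValueError("unexpected field type for IPTC data")
    data = data[:size]
    if field_type == TIFF_LONG and not data.startswith(IPTC_MARKER):
        unswapped = swap32(data)
        if unswapped.startswith(IPTC_MARKER):
            return unswapped      # the reader had swapped opaque bytes
    return data
```

Since a LONG field is necessarily a multiple of four bytes long, data recovered from such a field may carry a few padding bytes at the end that a correct UNDEFINED field would not have.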

Joris Van Damme
info@awaresystems.be
http://www.awaresystems.be
Download your free TIFF tag viewer for windows here:
http://www.awaresystems.be/imaging/tiff/astifftagviewer.html