2002.01.03 16:31 "Updates to unrecognised tag handling", by Frank Warmerdam
Over the holidays I started work on something I have wanted for a long time. Basically, I have often wished that libtiff could carry unrecognised tags as well as the extensive list of recognised tags. The main justification for me is that it makes writing TIFF extensions (like GeoTIFF) much easier.
To that end, I have added logic (not yet in CVS) to:
- add new TIFFFieldInfo definitions to the internal tag list structure (managed in tif_dirinfo.c) when unrecognised tags are encountered while reading. These definitions default to the name "Tag %d", with the min/max counts set to variable and the passcount flag set on.
- tif_dirread.c will now read these custom (I think they will be renamed generic) tags into an "extras" list in the TIFFDirectory structure.
- tif_dirwrite.c knows how to write these custom tags.
- TIFFGetField() and TIFFSetField() know how to handle these tags (this part isn't really complete though it works for common cases).
- I have added an API for discovering the unrecognised tags that are available.
- I exposed the API for getting and adding TIFFFieldInfo entries in tiffio.h. This is now a public API so that extensions like GeoTIFF don't need access to tiffiop.h.
- On the libgeotiff side, I have successfully reengineered libgeotiff's libxtiff code to use the new mechanism. Now the geotiff tags are read and handled directly by libtiff, so xtiff.c is very simple. Adding tag types really just requires calling TIFFMergeFieldInfo() with the appropriate definitions. The SetField, GetField and PrintField methods no longer need to be overridden. Now libgeotiff just uses TIFFGetField() to fetch the geotiff tag values from libtiff.
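To make the first item in the list above concrete, here is a rough sketch of the "look up the tag, and if it is unknown add a default definition" idea. This is a toy model only, not the libtiff code: ToyFieldInfo, ToyFieldList, toy_find() and toy_find_or_create() are hypothetical names made up for illustration, and the real TIFFFieldInfo structure carries more fields than are shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a field definition.
   This is NOT the real TIFFFieldInfo layout. */
#define VARIABLE_COUNT (-1)

typedef struct {
    unsigned tag;
    int read_count;   /* VARIABLE_COUNT means "variable length" */
    int write_count;  /* VARIABLE_COUNT means "variable length" */
    int pass_count;   /* 1 = caller passes the value count explicitly */
    char name[32];
} ToyFieldInfo;

typedef struct {
    ToyFieldInfo *items;
    size_t count;
} ToyFieldList;

/* Return the definition for `tag`, or NULL if the tag is not known. */
static ToyFieldInfo *toy_find(ToyFieldList *list, unsigned tag)
{
    for (size_t i = 0; i < list->count; i++)
        if (list->items[i].tag == tag)
            return &list->items[i];
    return NULL;
}

/* Return the definition for `tag`, appending an anonymous default
   definition when the tag is unrecognised. Returns NULL only if the
   list could not be grown. Earlier pointers into the list may be
   invalidated when a new entry is appended. */
static ToyFieldInfo *toy_find_or_create(ToyFieldList *list, unsigned tag)
{
    assert(list != NULL);

    ToyFieldInfo *fi = toy_find(list, tag);
    if (fi != NULL)
        return fi;

    ToyFieldInfo *grown =
        realloc(list->items, (list->count + 1) * sizeof *list->items);
    if (grown == NULL)
        return NULL;
    list->items = grown;

    fi = &list->items[list->count++];
    fi->tag = tag;
    fi->read_count = VARIABLE_COUNT;
    fi->write_count = VARIABLE_COUNT;
    fi->pass_count = 1;
    snprintf(fi->name, sizeof fi->name, "Tag %u", tag);
    return fi;
}
```

A directory reader built along these lines would call toy_find_or_create() for every tag it meets, so an unknown tag ends up with a usable (if anonymous) definition instead of being dropped.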
The benefit for libgeotiff is that it no longer needs any private include files for libtiff to build, and changes to the TIFF and TIFFDirectory structure will have no effect on libgeotiff. As things stand now, building an application using libgeotiff with a separate libtiff.so is prone to violent crashes if a different libtiff.so is picked up at runtime, even a slightly different one, because libgeotiff depends on structure layouts from libtiff. I get regular reports of this. This is no longer true in the new approach.
The benefit for other libtiff users is that new application specific tags can be added very easily. No overriding a whole bunch of functions, just merge in the field info at the appropriate point.
Furthermore, applications can carry and access arbitrary new fields, potentially even copying them to a new file in a systematic way.
Now the downside. The changes are significant, and of course the new libgeotiff depends on the new libtiff interfaces. For these reasons I am planning that the next release with this new approach will be libtiff 3.6.0, and libgeotiff 1.2.0. These first new releases will likely be considered experimental for a while.
A few questions have come up while I was working on this upgrade:
- Should I drop a bunch of tags that are currently handled individually from the TIFFDirectory structure, and just handle them generically? They would still have a proper TIFFFieldInfo definition ensuring they continue to have the same SetField()/GetField() semantics, but cutting down on the amount of per-tag code. Since the main benefit here is simplification of tif_dir.c, I think I will leave this till after the new approach has been proven under fire.
- Currently the TIFFFieldInfo list is kept on a per-file (TIFF*) basis. Application specific tags have to be incorporated by a tag extender function called for each file opened (see TIFFSetTagExtender()). It seems like it would be nice if applications could just register new tags once, essentially extending the static tiffFieldInfo list maintained in tif_dirinfo.c. Actually, reviewing the way TIFFSetTagExtender() works now, this change is likely not worthwhile.
- In libgeotiff I can now completely do away with the need to call XTIFFOpen() and XTIFFClose() instead of TIFFOpen() and TIFFClose(), with the possible exception of a need for some sort of call to initialize libgeotiff, to ensure that TIFFSetTagExtender() is called once before the first TIFFOpen() call.
- Given the substantial rework going on, are there any other features I should be incorporating at this time? One thing I would really like is much better "re-write in place" support from libtiff, allowing directories in existing files to be updated and rewritten. Currently this only works under very limited circumstances, and it leaks all the file space used up by the previous directory (at least sometimes). However, this is a big job, and I could use a "sponsor" to support the work if any are interested.
- Should I create a branch for the 3.5.x libtiff so that a new "classic" libtiff 3.5.8 could be issued even after the 3.6.0 work is in CVS?
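For anyone trying to picture the "initialize once, then the extender runs on every open" arrangement from the questions above, here is a small sketch of the pattern. It is a toy model rather than libtiff or libgeotiff code: ToyFile, toy_set_extender(), toy_open(), ext_extender() and ext_initialize() are hypothetical names, the tag count is arbitrary, and it assumes the hook-setting call hands back the previously installed hook so that extensions can chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an open file handle. */
typedef struct {
    int merged_tags;  /* number of extra tag definitions merged in */
} ToyFile;

typedef void (*ToyExtender)(ToyFile *);

/* Library side: a single global hook that runs on every open. */
static ToyExtender g_extender = NULL;

/* Install a new hook and return the previous one so callers can chain. */
static ToyExtender toy_set_extender(ToyExtender ext)
{
    ToyExtender prev = g_extender;
    g_extender = ext;
    return prev;
}

/* Open a file: reset per-file state, then give the hook a chance to
   merge its tag definitions into this file. */
static void toy_open(ToyFile *f)
{
    assert(f != NULL);
    f->merged_tags = 0;
    if (g_extender != NULL)
        g_extender(f);
}

/* Extension side (the role an extension library would play). */
#define EXT_TAG_COUNT 3  /* arbitrary number, for illustration only */

static ToyExtender g_parent = NULL;
static int g_initialized = 0;

/* Called for each opened file: merge our tags, then pass control to
   whatever hook was installed before ours. */
static void ext_extender(ToyFile *f)
{
    f->merged_tags += EXT_TAG_COUNT;
    if (g_parent != NULL)
        g_parent(f);
}

/* One-time setup; calling it again is harmless. */
static void ext_initialize(void)
{
    if (g_initialized)
        return;
    g_initialized = 1;
    g_parent = toy_set_extender(ext_extender);
}
```

With something like ext_initialize() called before the first open, the plain open call is enough and no wrapper open/close pair is needed. The guard flag matters: without it a second initialization would install the hook as its own parent and recurse.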
I would appreciate any input from libtiff (and libgeotiff) users with suggestions or concerns about the proposed changes. If anyone wants to review the new approach let me know and I will let you know when the code is in CVS.
I set the clouds in motion - turn up | Frank Warmerdam, firstname.lastname@example.org
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | Geospatial Programmer for Rent