1998.09.22 14:49 "TIFF Questions - TIFF 7.0 ?", by Ed Grissom

1998.09.22 14:51 "Large images (>2gb & >4gb)", by Ed Grissom

This message is one of a series of messages with questions about advanced raster topics and TIFF. See the message entitled "TIFF Questions - TIFF 7.0 ?" for more info.

We sell hardware for scanning aerial photos. Typical aerial photos are 10" by 10" and are scanned in 24-bit color at 7.5 microns (~3400 DPI). It is possible that both higher resolutions and greater bit depths will be used in the future.

Our imagery exploitation systems use tiling and a "full set" of overviews (1/2, 1/4, ...) to optimize performance. Usually there are about 8 overviews in a full set. With a full 10" by 10" RGB scan and a full set of overviews, the uncompressed data size approaches 5 Gigabytes for a single image. In the uncompressed form, this exceeds the 4 Gigabyte range that TIFF's 32-bit offsets can address.
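As a rough check of these figures, here is a minimal Python sketch of the arithmetic. It assumes a square 10-inch scan, 3 bytes per pixel, 8 overviews that each halve both dimensions (rounding up), and no tile padding or header overhead; the function name and its parameters are illustrative only.

```python
MICRONS_PER_INCH = 25_400

def scan_size_bytes(side_inches=10, pixel_microns=7.5,
                    bytes_per_pixel=3, overviews=8):
    """Return (base_bytes, total_bytes) for a square scan plus its overviews."""
    pixels_per_side = round(side_inches * MICRONS_PER_INCH / pixel_microns)
    base = pixels_per_side * pixels_per_side * bytes_per_pixel
    total = base
    side = pixels_per_side
    for _ in range(overviews):
        side = (side + 1) // 2          # each overview halves both dimensions
        total += side * side * bytes_per_pixel
    return base, total

base, total = scan_size_bytes()
print(f"base image:      {base / 1e9:.2f} GB")
print(f"with overviews:  {total / 1e9:.2f} GB")
print("base fits below 4 Gig: ", base < 2**32)
print("total fits below 4 Gig:", total < 2**32)
```

Under these assumptions the full-resolution image alone is already past the 2 Gigabyte mark, and adding the overviews pushes the total past what a 32-bit offset can reach.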

Of course, we could write compressed files instead of uncompressed. Our experience with compression of these types of images has been:

The 2 Gigabyte limit is not a limit of TIFF itself, since offsets are defined to be unsigned longs. However, many TIFF readers will blindly hand off the offset to the "fseek" function. Since the standard C library fseek takes a signed long offset, offsets above 2 Gigabytes will be negative wherever a long is 32 bits. This may even be true on 64-bit OSes if sign extension is automatically done when converting from a 32-bit long to a 64-bit long.
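The sign problem can be illustrated with a small Python sketch. It only mimics what happens when a 32-bit unsigned offset is reinterpreted as a signed 32-bit value and then widened; the helper names are made up for this example, and the 3 GiB offset is an arbitrary sample value.

```python
import struct

def as_signed_32(offset):
    """Reinterpret a 32-bit unsigned TIFF offset as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", offset))[0]

def sign_extended_bits_64(offset):
    """64-bit bit pattern produced if the signed 32-bit value is sign-extended."""
    return as_signed_32(offset) & 0xFFFFFFFFFFFFFFFF

offset = 3 * 2**30                      # a strip at 3 GiB: legal as a TIFF LONG
print(as_signed_32(offset))             # negative, so a seek from file start fails
print(hex(sign_extended_bits_64(offset)))
print(as_signed_32(2**31 - 1))          # the largest offset that stays positive
```

A reader that keeps the offset in an unsigned (or wider) type until the seek call avoids this, provided the platform offers a seek function that accepts offsets of that size.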

Files with offsets to strips or tiles that are past the 4Gig limit are plainly not allowed in TIFF.

It is apparent that larger files are in the works: the recently released NITF format allows images up to 17 Gigabytes in size.

Niles Ritter suggested that one solution to this dilemma could be to have the first IFD in the file be a reduced resolution version of the image that adheres to the current TIFF spec, and always contains offsets less than 2Gig (to avoid either problem).

A "private" tag in the first IFD could point to other IFDs much the same way the TIFF TREES proposal in TTN1 does. These IFDs would contain advanced features that a naive TIFF reader would not understand, but only apps that understand the private tag would be accessing these IFDs.

To implement this, we need some new tags, and some new tag types.

Obviously a tag type of "LONG64 - A 64 bit unsigned integer" is needed, and while we are at it we might as well include the signed version: "SLONG64".
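To make the proposed types concrete, here is a hedged Python sketch of how a reader might decode such values from a buffer. It assumes the values are stored as plain 8-byte integers in the file's byte order ("II" for little-endian, "MM" for big-endian, as in the existing TIFF header); the function names are hypothetical, and no type numbers are assumed.

```python
import struct

def _prefix(byte_order):
    """Map the TIFF byte-order mark to a struct format prefix."""
    if byte_order == "II":
        return "<"
    if byte_order == "MM":
        return ">"
    raise ValueError("byte order must be 'II' or 'MM'")

def read_long64(buf, pos, byte_order="II"):
    """Decode one unsigned 64-bit value (proposed LONG64) at position pos."""
    return struct.unpack_from(_prefix(byte_order) + "Q", buf, pos)[0]

def read_slong64(buf, pos, byte_order="II"):
    """Decode one signed 64-bit value (proposed SLONG64) at position pos."""
    return struct.unpack_from(_prefix(byte_order) + "q", buf, pos)[0]

# Example: a 5 GiB offset, which cannot be stored in a 32-bit LONG.
raw = struct.pack("<Q", 5 * 2**30)
print(read_long64(raw, 0, "II"))
```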

New definitions of the offset tags for both strips and tiles are needed. I would assume that for single strip and single tile compatibility, new definitions of the bytecounts tags will also be needed, as well as a new RowsPerStrip definition.

The question now is: What numbers do we assign these new tags and types? I realize that with a private tag and essentially private IFDs, we are free to use any numbers we see fit. However, if any of these changes are planned for TIFF 7.0, I would like to be compatible with them.

Conversely, what should I do to get this into TIFF 7.0?

Write up TTN3?

Here is a stab at the basics....

========================================================================

TAG TYPES:

YY = "LONG64 - A 64 bit unsigned integer"
YY+1 = "SLONG64 - A 64 bit signed integer"
!!! DONT USE THESE TYPES - THIS IS NOT YET APPROVED!!!

TAGS:

IFDOffset64 !!! DONT USE THIS TAG - THIS IS NOT YET APPROVED!!!
  Tag = XXX
  Type = LONG
  Count = 1
  Points to an IFD that contains LONG64 data types in the tag list.

RowsPerStrip64 !!! DONT USE THIS TAG - THIS IS NOT YET APPROVED!!!
  Tag = XXX+1
  Type = LONG64

The number of rows in each strip (except possibly the last strip.) For example, if ImageLength is 24, and RowsPerStrip is 10, then there are 3 strips, with 10 rows in the first strip, 10 rows in the second strip, and 4 rows in the third strip. (The data in the last strip is not padded with 6 extra rows of dummy data.)

StripOffsets64 !!! DONT USE THIS TAG - THIS IS NOT YET APPROVED!!!
  Tag = XXX+2
  Type = LONG64
  For each strip, the byte offset of that strip.

StripByteCounts64 !!! DONT USE THIS TAG - THIS IS NOT YET APPROVED!!!
  Tag = XXX+3
  Type = LONG64
  For each strip, the number of bytes in that strip after any compression.

TileOffsets64 !!! DONT USE THIS TAG - THIS IS NOT YET APPROVED!!!
  Tag = XXX+4
  Type = LONG64
  N = TilesPerImage for PlanarConfiguration = 1
    = SamplesPerPixel * TilesPerImage for PlanarConfiguration = 2

For each tile, the byte offset of that tile, as compressed and stored on disk. The offset is specified with respect to the beginning of the TIFF file. Note that this implies that each tile has a location independent of the locations of other tiles. Offsets are ordered left-to-right and top-to-bottom. For PlanarConfiguration = 2, the offsets for the first component plane are stored first, followed by all the offsets for the second component plane, and so on.

No default. See also TileWidth, TileLength, TileByteCounts64.

TileByteCounts64 !!! DONT USE THIS TAG - THIS IS NOT YET APPROVED!!!
  Tag = XXX+5
  Type = LONG64
  N = TilesPerImage for PlanarConfiguration = 1
    = SamplesPerPixel * TilesPerImage for PlanarConfiguration = 2

For each tile, the number of (compressed) bytes in that tile. See TileOffsets64 for a description of how the byte counts are ordered.

No default. See also TileWidth, TileLength, TileOffsets64.

========================================================================
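The counting rules in the descriptions above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and TilesPerImage is computed as tiles across times tiles down with each dimension rounded up, as in the existing TIFF definition of tiled images.

```python
def strip_row_counts(image_length, rows_per_strip):
    """Rows held by each strip; only the last strip may be short."""
    strips = (image_length + rows_per_strip - 1) // rows_per_strip
    return [min(rows_per_strip, image_length - i * rows_per_strip)
            for i in range(strips)]

def tile_entry_count(image_width, image_length, tile_width, tile_length,
                     samples_per_pixel, planar_configuration):
    """Number of entries (N) in the tile offset and byte count arrays."""
    tiles_across = (image_width + tile_width - 1) // tile_width
    tiles_down = (image_length + tile_length - 1) // tile_length
    tiles_per_image = tiles_across * tiles_down
    if planar_configuration == 1:
        return tiles_per_image
    if planar_configuration == 2:
        return samples_per_pixel * tiles_per_image
    raise ValueError("PlanarConfiguration must be 1 or 2")

# The RowsPerStrip example: ImageLength 24, RowsPerStrip 10.
print(strip_row_counts(24, 10))
# A hypothetical 1000 x 500 RGB image with 256 x 256 tiles.
print(tile_entry_count(1000, 500, 256, 256, 3, 1))
print(tile_entry_count(1000, 500, 256, 256, 3, 2))
```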

I can see at least one problem with this approach. Programs that add to or modify header values "in-place" (i.e. without re-writing the entire file) may try to re-write the IFD or some data pointed to by the TAGs in the IFD at the end of the image. Such programs will either fail or write a corrupted image if they try to modify one of these files.

Even programs that understand the LONG64 construct will be hard pressed to do this correctly since there is no free space in the <2gig area to write additional data for the first image.

Perhaps an additional tag that points to some free space that was purposefully left in the <2gig area could help here. (Bring back the deprecated "FreeOffsets" and "FreeByteCounts" tags?)

The only other valid solution I can see is so drastic that I hate to even bring it up. We would need to modify the TIFF Version number to be something other than "42" and use an 8-byte initial IFD offset along with the definitions for the new tags above (or re-cast the current implementation of the necessary TAGs to accept LONG64 values).
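For illustration only, here is a Python sketch of what a reader would have to do under that drastic option. The classic header is a 2-byte byte-order mark, the 2-byte version 42, and a 4-byte offset to the first IFD. The wide variant below is purely hypothetical: the version value 99 is a placeholder and not an assigned number, and placing an 8-byte offset directly after the version field is just one possible layout.

```python
import struct

CLASSIC_VERSION = 42
WIDE_VERSION = 99      # hypothetical placeholder; no such value has been assigned

def parse_header(data):
    """Return (version, first_ifd_offset) from the start of a TIFF-like file."""
    order = data[:2]
    if order == b"II":
        prefix = "<"                    # little-endian
    elif order == b"MM":
        prefix = ">"                    # big-endian
    else:
        raise ValueError("missing TIFF byte-order mark")
    (version,) = struct.unpack_from(prefix + "H", data, 2)
    if version == CLASSIC_VERSION:
        (first_ifd,) = struct.unpack_from(prefix + "I", data, 4)   # 4-byte offset
    elif version == WIDE_VERSION:
        (first_ifd,) = struct.unpack_from(prefix + "Q", data, 4)   # 8-byte offset
    else:
        raise ValueError(f"unknown version {version}")
    return version, first_ifd

classic = b"II" + struct.pack("<HI", CLASSIC_VERSION, 8)
wide = b"MM" + struct.pack(">HQ", WIDE_VERSION, 6 * 2**30)
print(parse_header(classic))
print(parse_header(wide))
```

An existing reader that checks for 42 would reject the second file outright instead of misreading it, which is the point of changing the version number.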

--
ed grissom
egrissom@ingr.com