2017.09.17 16:27 "Re: [Tiff] TIFF tile size limit", by Even Rouault
On Sunday 17 September 2017 11:29:24 CEST Kemp Watson wrote:
"Tile sizes are already allowed to be larger than the image dimensions by the TIFF specification since they can spill over the right and bottom of the image. A one pixel image could be in a 1kx1k "tile"."
Ugh. That's the root of the issue, for sure.
Another issue, more with the implementation of libtiff than with the spec itself, is that if you have a large number of tiles or strips, the allocation of the StripByteCounts and StripOffsets arrays can be very costly. Let's take the case of a 1 million x 1 million image with tiles of 16 x 16. The allocation cost for those arrays is (1e6 / 16) * (1e6 / 16) * sizeof(uint64) * 2 = 62 GB (actually libtiff would refuse to read the content of a tag if it is more than 2GB large), and currently they are allocated and read entirely as soon as you read the first tile. libtiff should rather read from the file just the byte count & offset values for the tile it is going to read. Even with a more reasonable tile size of 256x256, the cost is still 244 MB that you need to allocate and read at file opening.
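To make the arithmetic above easy to re-check, here is a small sketch in Python. The function name and parameters are made up for illustration and are not part of libtiff; it only assumes two arrays (offsets and byte counts) of 8-byte entries, one entry per tile, as in the formula above.

```python
def tile_index_bytes(width, height, tile_w, tile_h, entry_size=8, arrays=2):
    """Bytes needed to hold the per-tile offset and byte-count arrays.

    Hypothetical helper, not a libtiff function. entry_size=8 corresponds
    to sizeof(uint64); arrays=2 covers offsets plus byte counts.
    """
    # Partial tiles on the right and bottom edges still need an entry,
    # so round the tile counts up.
    tiles_across = -(-width // tile_w)
    tiles_down = -(-height // tile_h)
    return tiles_across * tiles_down * entry_size * arrays


if __name__ == "__main__":
    for tile in (16, 256):
        size = tile_index_bytes(1_000_000, 1_000_000, tile, tile)
        print(f"{tile}x{tile} tiles: {size / 1e9:.3f} GB")
```

With a divisor of 16 this gives 62.5e9 bytes, and with 256x256 tiles about 244e6 bytes, which matches the figures quoted above.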
Yes, the compression is not really the issue (LZW will compress whitespace to almost nothing), it's the allocation of the raw rasters/tiles for the decompressed data.
I just went back and re-read the Tiled Image specification. You are completely correct, there is no formal restriction on tile size vs image size, although there's a lot of "not recommended" verbiage. Personally, I'd be inclined to limit the tile size to the image size (technically, to the 16-pixel boundary just above the image size, or perhaps to the next larger power-of-two size to provide 'quadrants'). That would break the compatibility with the TIFF spec, though. Would that be a problem in practice?
You'd probably break reading very small images (let's say 20x20) that use a standard tiling size of 256x256 (e.g. someone using GDAL's "gdal_translate in out.tif -co TILED=YES"). So if we went in this direction, we would probably need to allow a "reasonable" tile size of, let's say, 2K x 2K even for 1x1 images.
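A rough sketch of what such a check could look like, in Python. This is purely hypothetical and not existing libtiff behaviour: the function names are invented, the 16-pixel rounding follows the suggestion quoted above, and the 2048 floor is just the "let's say 2K" figure.

```python
def round_up(value, multiple):
    """Round value up to the next multiple (e.g. the next 16-pixel boundary)."""
    return -(-value // multiple) * multiple


def tile_size_acceptable(image_w, image_h, tile_w, tile_h, floor=2048):
    """Hypothetical policy: a tile may not exceed the image size rounded up
    to a 16-pixel boundary, except that tiles up to floor x floor are
    always allowed so that small images with standard tiling still open."""
    max_w = max(floor, round_up(image_w, 16))
    max_h = max(floor, round_up(image_h, 16))
    return tile_w <= max_w and tile_h <= max_h


if __name__ == "__main__":
    print(tile_size_acceptable(20, 20, 256, 256))            # small image, standard tiling
    print(tile_size_acceptable(1, 1, 1_000_000, 1_000_000))  # 1x1 image, huge tile
```

Under this policy the 20x20 image with 256x256 tiles is still accepted, while a 1x1 image claiming a 1 million x 1 million tile is rejected.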
Adobe's not really keeping the spec up to date with modern needs anyway, and BigTIFF is not a spec either. The reality is that libtiff is diverging from the TIFF 6.0 specification.
W. Kemp Watson
On 2017-09-17, 11:04 AM, "Bob Friesenhahn" <firstname.lastname@example.org> wrote:
>On Sun, 17 Sep 2017, Kemp Watson wrote:
>> I know that we use very large bigtiffs, sometimes terabytes in size, but
>> currently we allocate only 512-pixel tiles. I could see that going to 4K
>> maximum in the near future, very practically.
>> But, would limiting the tile size to be up to but not more than the full
>> image dimensions not essentially guarantee that a tiled implementation
>> would not use more memory than a full rasterized image (barring pointers
>> and small stuff)?
>Tile sizes are already allowed to be larger than the image dimensions
>by the TIFF specification since they can spill over the right and
>bottom of the image. A one pixel image could be in a 1kx1k "tile".
>> I may well be missing some critical detail here - what in those sample
>> files is the root cause of the large allocations?
>That is a good question. Some compressors are capable of remarkable
>compression ratios and so the files can claim large pixel dimensions
>although the file is very small. In some cases it is not easy to know
>what a decoder can produce from a very small input.
> >Even Rouault tells me that one of the compressors is th