2019.01.14 17:44 "Re: [Tiff] tiffcp altering image contents (in contrast to what the manual says)?", by Daniel McCoy
It might be worthwhile to look at the output of "tiffinfo -s". That will show the strip offsets and strip lengths.
If tiffcp were just squeezing out unused gaps in the file, the number of strips
and the strip byte counts would stay the same, but the strip offsets would change.
If that is the case, you wouldn't even have to build your multi-page TIFF to check:
you could just run "tiffcp" on each of the files individually and then compare the
output of "tiffinfo -s" for the before and after versions. If only the offsets change,
then the actual image data is probably the same and the file has just been "defragmented".
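If you would rather check the strip bytes themselves than eyeball two tiffinfo listings, something along these lines might do. It is only a rough sketch, not anything that ships with libtiff: the names read_strips and compare_strips are made up for illustration, and it assumes a classic (non-BigTIFF) file, strips rather than tiles, and only looks at the first directory.

```python
import struct

# Field types that StripOffsets / StripByteCounts use in classic TIFF:
# 3 = SHORT (16 bit), 4 = LONG (32 bit). Anything else raises KeyError.
TYPE_FMT = {3: "H", 4: "I"}

STRIP_OFFSETS = 273      # tag number of StripOffsets
STRIP_BYTE_COUNTS = 279  # tag number of StripByteCounts


def read_strips(data):
    """Return (offsets, byte_counts) from the first directory of a classic TIFF."""
    bo = {b"II": "<", b"MM": ">"}[data[:2]]          # byte order mark
    magic, ifd = struct.unpack(bo + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")
    (n_entries,) = struct.unpack_from(bo + "H", data, ifd)
    found = {}
    for i in range(n_entries):
        entry = ifd + 2 + 12 * i                     # each entry is 12 bytes
        tag, typ, count = struct.unpack_from(bo + "HHI", data, entry)
        if tag not in (STRIP_OFFSETS, STRIP_BYTE_COUNTS):
            continue
        fmt = TYPE_FMT[typ]
        size = struct.calcsize(fmt) * count
        if size <= 4:
            pos = entry + 8                          # values stored inline
        else:
            (pos,) = struct.unpack_from(bo + "I", data, entry + 8)
        found[tag] = list(struct.unpack_from(bo + str(count) + fmt, data, pos))
    return found[STRIP_OFFSETS], found[STRIP_BYTE_COUNTS]


def compare_strips(before, after):
    """Compare the strips of two TIFF files given as bytes objects."""
    off_a, cnt_a = read_strips(before)
    off_b, cnt_b = read_strips(after)
    same_counts = cnt_a == cnt_b
    # Only compare payloads when the byte counts already agree.
    same_data = same_counts and all(
        before[oa:oa + c] == after[ob:ob + c]
        for oa, ob, c in zip(off_a, off_b, cnt_a)
    )
    return {
        "same_counts": same_counts,
        "offsets_moved": off_a != off_b,
        "same_data": same_data,
    }
```

Read both files with open(path, "rb").read() and pass the two bytes objects to compare_strips. If same_counts and same_data come back True while offsets_moved is True, the file was only repacked. If same_counts is False, the strips were rewritten in some way.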
Why: Some tiff writing programs flush incomplete directories to the file while writing.
As the directory grows with each strip that is added, it has to be relocated to the end of the file again and again, leaving unused gaps between some strips. This can happen with programs that do not know
the whole image beforehand and want partial images to be recoverable (renderers, scanners, ...).
If this is the case, then running the file through tiffcp essentially would perform garbage collection on the file, resulting in a smaller file with exactly the same data in it.
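To get a feeling for how much such a file could shrink, you can take the offsets and byte counts that "tiffinfo -s" reports and look for holes between consecutive strips. A small sketch follows; the function name unused_gaps and the numbers are invented for illustration. Keep in mind that a hole is not necessarily garbage, since the live directory or other tag data may sit between strips too.

```python
def unused_gaps(offsets, byte_counts):
    """Return (start, length) for every hole between consecutive strips."""
    strips = sorted(zip(offsets, byte_counts))   # order strips by file position
    gaps = []
    for (off, cnt), (next_off, _) in zip(strips, strips[1:]):
        end = off + cnt                          # first byte after this strip
        if next_off > end:
            gaps.append((end, next_off - end))
    return gaps


# Hypothetical values, as if copied from a "tiffinfo -s" listing.
offsets = [8, 1100, 2300]
byte_counts = [1000, 1000, 1000]

gaps = unused_gaps(offsets, byte_counts)
print(gaps)                             # [(1008, 92), (2100, 200)]
print(sum(length for _, length in gaps))  # 292
```

If the total of the holes is close to the difference in file size before and after tiffcp, that supports the garbage collection explanation.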
Dan McCoy - Pixar
On Sun, Jan 13, 2019 at 3:12 PM Binarus <email@example.com> wrote:
thank you very much for your impressive answer.
On 13.01.2019 22:11, Richard Nolde wrote:
Bob is certainly correct in stating that the issue is that the output is written using YCBCR encoding.
One of my previous messages shows the output from tiffinfo for each of the source files. tiffinfo shows that the source files are indeed written using YCbCr encoding, so this is true.
But I don't understand why this imposes an issue. Since OJPEG does not seem to be a problem in this case, why does tiffcp (obviously) alter the image data? Why doesn't it just copy the YCbCr encoded data byte by byte when merging the images (just altering the directory, endianness, offsets,... accordingly)?
Why not simply use another compression algorithm, and why use Graphics Magick at all if you are just compressing and combining them?
Two questions in one sentence :-)
We can't use another compression algorithm for the 24 BPP files because they will get huge if we do. On one hand, we can afford the degradation which is caused by encoding as JPEG with 90% quality if the degradation is guaranteed to happen only once. On the other hand, the size of those files will be at least 5 times their current size if we use any other compression than JPEG. Since we will have to handle some 100000 of them, this is a problem.
The current compression scheme is well-crafted and approved. The problem is that degradation must only happen once. We couldn't accept that tiffcp would re-encode the image data, and I therefore would like to understand exactly what is happening here and why tiffcp obviously touches the image data at all.
> Likewise, we don't want to use another compression r