2003.11.20 07:59 "[Tiff] tiff2pdf contribution", by Ross Finlayson

Hi Chris,

That's a good idea, that's pretty much what I had in mind.

I just started drafting the requirements, and I think what I'll do is work on both modes: matrix modification and also rotating and flipping the buffers.

Basically no matter how you flip or rotate the (uncompressed) buffer it is the same size, except for pathological cases with less than eight bits per sample. For example, an image with one bit per sample that has a width that is a multiple of eight, like a standard fax image, might have an odd length, and the buffer that could hold the image rotated 90 or 270 degrees would have to be larger than the buffer that held it originally.
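
To put numbers on that, here is a minimal sketch of the size arithmetic, assuming one sample per pixel and scanlines that start on a byte boundary. The function names are made up for illustration and are not libtiff calls.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for one scanline when every row starts on a byte boundary. */
size_t row_bytes(size_t width, unsigned bits_per_sample)
{
    assert(bits_per_sample > 0);
    return (width * bits_per_sample + 7) / 8;
}

/* Bytes needed for the whole uncompressed single-channel image. */
size_t image_bytes(size_t width, size_t length, unsigned bits_per_sample)
{
    return row_bytes(width, bits_per_sample) * length;
}
```

With a 1728 wide, 1099 long, one bit per sample image, image_bytes(1728, 1099, 1) is 216 * 1099 = 237384 bytes, while the rotated shape image_bytes(1099, 1728, 1) is 138 * 1728 = 238464 bytes, because each 1099 bit row is padded up to 138 bytes.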

When the data is compressed losslessly, it's not a difficult matter to preserve the dimensions, except sometimes, as mentioned in the case above, where scanlines are to start on a byte boundary, or sometimes a word boundary.

When the compression has data units that do not fit within a byte, the "addressable atom" of storage, for example how JPEG stores its data units in 8x8 blocks of samples of 8 or 12 bits per sample, again the dimensions then sometimes require larger, or in some cases I guess smaller, buffers to store the rotated data.

Another issue with JPEG is that it is usually a lossy compression method, so uncompressing and recompressing the JPEG data is to be avoided. I was looking at libjpeg the other day and noticed that I think it has some functions for lossless transformation, including perhaps clipping and rotation; I'm not quite sure about rotation. I'll look here.

Again, transforming the CTM, the Current Transformation Matrix, is probably a more direct route, but if the output format doesn't have a CTM, as TIFF doesn't, then functions to orient the data would be useful to fully implement TIFFReadRGBAImageOriented, for example.

Also, to do matrix math I would have to study for days or at least hours before I came across the right equations, although as we are talking about right angles sine and cosine are 0 and +-1.
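
For right angles the matrix math collapses to table lookups. A minimal sketch, assuming the six-number PDF style matrix where x' = a*x + c*y + e and y' = b*x + d*y + f; the Matrix type and both function names are hypothetical, not from tiff2pdf.

```c
#include <assert.h>

/* Six-number affine matrix, PDF style:
 * x' = a*x + c*y + e,  y' = b*x + d*y + f. */
typedef struct { double a, b, c, d, e, f; } Matrix;

/* Counterclockwise rotation about the origin by quarter_turns * 90 degrees.
 * cos and sin come from small tables, so no trigonometry is needed. */
Matrix quarter_turn_matrix(int quarter_turns)
{
    static const int cos_tab[4] = { 1, 0, -1, 0 };
    static const int sin_tab[4] = { 0, 1, 0, -1 };
    int q = ((quarter_turns % 4) + 4) % 4;   /* also accepts negative turns */
    Matrix m;
    m.a = cos_tab[q];
    m.b = sin_tab[q];
    m.c = -sin_tab[q];
    m.d = cos_tab[q];
    m.e = 0.0;
    m.f = 0.0;
    return m;
}

/* Apply the matrix to the point (x, y). */
void apply_matrix(const Matrix *m, double x, double y, double *xo, double *yo)
{
    assert(m && xo && yo);
    *xo = m->a * x + m->c * y + m->e;
    *yo = m->b * x + m->d * y + m->f;
}
```

One quarter turn sends the point (1, 0) to (0, 1). To place a rotated image somewhere on a page, the e and f entries would still need a translation, which this sketch leaves at zero.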

So in TIFF there are four rotations from the image that is generally top-down and left-right in organization: the image can be rotated 0, 90, 180, or 270 degrees, and then for each it can be flipped as in a mirror image. In some documents, when they diagram how the output looks, they use a capital letter F, which is unsymmetrical among letters.

So anyways when the rotation is 0 or 180 then a buffer with the same extent and scanline pointers can be used, in a way, although I guess scanline pointers would flip to either side of the rectangular matrix of the sample space.
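
As a minimal sketch of the 180 degree case, assuming 8 bit single-channel samples and rows packed with no padding, the whole buffer can simply be reversed in place. The function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate an 8-bit single-channel image by 180 degrees in place.
 * With packed rows this is the same as reversing the whole buffer,
 * so the extent and the scanline pointers stay valid. */
void rotate180_in_place(uint8_t *buf, size_t width, size_t length)
{
    size_t n = width * length;
    assert(buf != NULL || n == 0);
    if (n < 2)
        return;
    for (size_t i = 0, j = n - 1; i < j; i++, j--) {
        uint8_t t = buf[i];   /* one sample of scratch space */
        buf[i] = buf[j];
        buf[j] = t;
    }
}
```

Multi-byte pixels or padded rows would need the swap done per pixel and per row rather than per byte.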

Here TIFF defines these constants for rotations and flips:

#define TIFFTAG_ORIENTATION             274     /* +image orientation */
#define     ORIENTATION_TOPLEFT         1       /* row 0 top, col 0 lhs */
#define     ORIENTATION_TOPRIGHT        2       /* row 0 top, col 0 rhs */
#define     ORIENTATION_BOTRIGHT        3       /* row 0 bottom, col 0 rhs */
#define     ORIENTATION_BOTLEFT         4       /* row 0 bottom, col 0 lhs */
#define     ORIENTATION_LEFTTOP         5       /* row 0 lhs, col 0 top */
#define     ORIENTATION_RIGHTTOP        6       /* row 0 rhs, col 0 top */
#define     ORIENTATION_RIGHTBOT        7       /* row 0 rhs, col 0 bottom */
#define     ORIENTATION_LEFTBOT         8       /* row 0 lhs, col 0 bottom */

Types 1, 2, 3, and 4 are the 0's and 180s, and 5, 6, 7, and 8 are the 90's and 270's.
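
Reading the comments on those constants literally, the mapping from a stored sample to where it is displayed can be sketched as below. This is my reading of the "row 0 ..., col 0 ..." comments, with display x running right and y running down; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero for orientations 5 to 8, where stored rows run vertically on
 * the display, so displayed width and height are swapped. */
int orientation_swaps_axes(int orientation)
{
    assert(orientation >= 1 && orientation <= 8);
    return orientation >= 5;
}

/* Map a stored (row, col) to a displayed (x, y); x runs right, y runs down.
 * rows and cols describe the raster as stored. */
void stored_to_display(int orientation, uint32_t rows, uint32_t cols,
                       uint32_t row, uint32_t col, uint32_t *x, uint32_t *y)
{
    assert(orientation >= 1 && orientation <= 8);
    assert(row < rows && col < cols);
    uint32_t r_flip = rows - 1 - row;
    uint32_t c_flip = cols - 1 - col;
    switch (orientation) {
    case 1: *x = col;    *y = row;    break;  /* TOPLEFT  */
    case 2: *x = c_flip; *y = row;    break;  /* TOPRIGHT */
    case 3: *x = c_flip; *y = r_flip; break;  /* BOTRIGHT */
    case 4: *x = col;    *y = r_flip; break;  /* BOTLEFT  */
    case 5: *x = row;    *y = col;    break;  /* LEFTTOP  */
    case 6: *x = r_flip; *y = col;    break;  /* RIGHTTOP */
    case 7: *x = r_flip; *y = c_flip; break;  /* RIGHTBOT */
    case 8: *x = row;    *y = c_flip; break;  /* LEFTBOT  */
    }
}
```

For a raster stored as 2 rows by 3 columns with orientation 6, the first stored sample lands at x = 1, y = 0, the top right corner of a display that is 2 wide and 3 tall.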

Most programs use either top-left or bottom-left. I'm kind of unclear at the moment but I think PDF uses a bottom-left coordinate matrix. The XObject image that is the raster element in PDF is however filled top-left, in the same way as the default TIFF orientation.

The TIFFReadRGBAImage by default returns an image in bot-left. I think that is because that is how SGI was doing it. The TIFFReadRGBAImageOriented function was recently added to fill the adequately allocated image buffer according to the user specification.

In terms of the coding to rotate and flip or reflect the image, it can be most memory efficient in the case of the transforms among 1, 2, 3, 4 and among 5, 6, 7, 8 separately. In either case, you could use a single sample of scratch space and flip the buffer in place, albeit slowly; with larger scratch buffers the operation can be accomplished with memory copies basically at the level of the scanline. Between 1 and 4 the scanline is the same, just copied around the buffer, as it is between 2 and 3. Between 1 and 2, for example, or 1 and 3, the scanline has to be reversed, as the computer memory is not going anywhere. Between 1, 2, 3, 4 and 5, 6, 7, 8, the scanline row for 1 is a column for 5: where the address for contiguous elements of a row increments by one, the address for contiguous elements of a column increments by the row width. This is where in C we basically have row-major matrices. The act of rotation is basically converting the matrix to a column-major matrix in terms of linear addressing. At this point I have been tacking together words to attempt to sound urbane and clinical.
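
A minimal sketch of the three building blocks, assuming 8 bit single-channel samples in packed rows, and going by the row 0 / col 0 comments on the constants above (so 1 to 4 is a vertical flip, 1 to 2 a mirror, 1 to 5 a transpose). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* 1 <-> 4 (and 2 <-> 3): scanlines keep their contents, only their order
 * reverses. Uses one scanline of scratch. Returns 0, or -1 if out of memory. */
int flip_vertical(uint8_t *buf, size_t width, size_t length)
{
    if (width == 0 || length < 2)
        return 0;
    uint8_t *tmp = malloc(width);
    if (tmp == NULL)
        return -1;
    for (size_t top = 0, bot = length - 1; top < bot; top++, bot--) {
        memcpy(tmp, buf + top * width, width);
        memcpy(buf + top * width, buf + bot * width, width);
        memcpy(buf + bot * width, tmp, width);
    }
    free(tmp);
    return 0;
}

/* 1 <-> 2 (and 3 <-> 4): every scanline is reversed where it sits. */
void flip_horizontal(uint8_t *buf, size_t width, size_t length)
{
    if (width < 2)
        return;
    for (size_t r = 0; r < length; r++) {
        uint8_t *row = buf + r * width;
        for (size_t i = 0, j = width - 1; i < j; i++, j--) {
            uint8_t t = row[i];
            row[i] = row[j];
            row[j] = t;
        }
    }
}

/* 1 <-> 5: rows become columns. src has `length` rows of `width` samples;
 * dst must hold `width` rows of `length` samples. */
void transpose(const uint8_t *src, uint8_t *dst, size_t width, size_t length)
{
    assert(src != dst);
    for (size_t r = 0; r < length; r++)
        for (size_t c = 0; c < width; c++)
            dst[c * length + r] = src[r * width + c];
}
```

The other orientations can be reached by composing these, for example a transpose followed by a mirror. The transpose is the one that strides through memory by the row width and so is the least friendly to large images.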

So anyways when we think of a function that is going to either a: reorient the contents of a buffer in place, or b: fill a newly allocated buffer with an oriented version of the original buffer, we can keep in mind various classes of implementations that will save the computer a cycle or two, for which it will not care because it is a piece of metal or ceramic. Only software cares.

Once we decide to reorient an image that is composed of constituent strips or tiles, then at least we are in TIFF instead of the superbly overengineered JPEG2000 component registration. Anyways, we could reconstitute the entire image and then split it back into strips or tiles after reorientation from 1<->5 in the troublesome case, or, in the case of converting where the tiles start in the same corner, do the tiles individually. This is an example where for huge images the tiled configuration is more resource efficient.
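
For the tile by tile route, each tile would be transposed on its own and then moved to its new slot in the grid. A minimal sketch of the slot arithmetic for the 1<->5 case, assuming tiles are numbered left to right, top to bottom, and ignoring the partially filled tiles at the right and bottom edges; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Given a tile index in a grid of tiles_across by tiles_down tiles, numbered
 * row-major, return the index the tile takes once the grid is transposed.
 * The transposed grid is tiles_down across and tiles_across down. */
uint32_t transposed_tile_index(uint32_t index, uint32_t tiles_across,
                               uint32_t tiles_down)
{
    assert(tiles_across > 0 && tiles_down > 0);
    assert(index < tiles_across * tiles_down);
    uint32_t tx = index % tiles_across;  /* tile column in the source grid */
    uint32_t ty = index / tiles_across;  /* tile row in the source grid */
    return tx * tiles_down + ty;         /* column and row trade places */
}
```

Only one tile needs to be in memory at a time this way, which is where the tiled layout pays off.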

Back to the JPEG lossless transformation for a second, here I will browse libjpeg and see if my addled mind can quickly retrace its path to that descriptive content. Lane mentioned where it was described in one of the text format .doc extension files there, let's see,

[space:~/Desktop/libjpeg/jpeg-6b] space% ls *.doc
coderules.doc install.doc libjpeg.doc usage.doc
filelist.doc jconfig.doc structure.doc wizard.doc

He (I assume Tom's a he) mentioned it in libjpeg.doc. Here we go, block quote:

Really raw data: DCT coefficients
libjpeg.doc (85%)
---------------------------------

It is possible to read or write the contents of a JPEG file as raw DCT coefficients. This facility is mainly intended for use in lossless transcoding between different JPEG file formats. Other possible applications include lossless cropping of a JPEG image, lossless reassembly of a multi-strip or multi-tile TIFF/JPEG file into a single JPEG datastream, etc.

To read the contents of a JPEG file as DCT coefficients, open the file and do jpeg_read_header() as usual. But instead of calling jpeg_start_decompress() and jpeg_read_scanlines(), call jpeg_read_coefficients(). This will read the entire image into a set of virtual coefficient-block arrays, one array per component. The return value is a pointer to an array of virtual-array descriptors. Each virtual array can be accessed directly using the JPEG memory manager's access_virt_barray method (see Memory management, below, and also read structure.doc's discussion of virtual array handling). Or, for simple transcoding to a different JPEG file format, the array list can just be handed directly to jpeg_write_coefficients().

Each block in the block arrays contains quantized coefficient values in normal array order (not JPEG zigzag order). The block arrays contain only DCT blocks containing real data; any entirely-dummy blocks added to fill out interleaved MCUs at the right or bottom edges of the image are discarded during reading and are not stored in the block arrays. (The size of each block array can be determined from the width_in_blocks and height_in_blocks fields of the component's comp_info entry.) This is also the data format expected by jpeg_write_coefficients().

When you are done using the virtual arrays, call jpeg_finish_decompress() to release the array storage and return the decompression object to an idle state; or just call jpeg_destroy() if you don't need to reuse the object.

If you use a suspending data source, jpeg_read_coefficients() will return NULL if it is forced to suspend; a non-NULL return value indicates successful completion. You need not test for a NULL return value when using a non-suspending data source.

It is also possible to call jpeg_read_coefficients() to obtain access to the decoder's coefficient arrays during a normal decode cycle in buffered-image mode. This frammish might be useful for progressively displaying an incoming image and then re-encoding it without loss. To do this, decode in buffered-image mode as discussed previously, then call jpeg_read_coefficients() after the last jpeg_finish_output() call. The arrays will be available for your use until you call jpeg_finish_decompress().

To write the contents of a JPEG file as DCT coefficients, you must provide the DCT coefficients stored in virtual block arrays. You can either pass block arrays read from an input JPEG file by jpeg_read_coefficients(), or allocate virtual arrays from the JPEG compression object and fill them yourself. In either case, jpeg_write_coefficients() is substituted for jpeg_start_compress() and jpeg_write_scanlines(). Thus the sequence is

* Create compression object
* Set all compression parameters as necessary
* Request virtual arrays if needed
* jpeg_write_coefficients()
* jpeg_finish_compress()
* Destroy or re-use compression object

jpeg_write_coefficients() is passed a pointer to an array of virtual block array descriptors; the number of arrays is equal to cinfo.num_components.

The virtual arrays need only have been requested, not realized, before jpeg_write_coefficients() is called. A side-effect of jpeg_write_coefficients() is to realize any virtual arrays that have been requested from the compression object's memory manager. Thus, when obtaining the virtual arrays from the compression object, you should fill the arrays after calling jpeg_write_coefficients(). The data is actually written out when you call jpeg_finish_compress(); jpeg_write_coefficients() only writes the file header.

When writing raw DCT coefficients, it is crucial that the JPEG quantization tables and sampling factors match the way the data was encoded, or the resulting file will be invalid. For transcoding from an existing JPEG file, we recommend using jpeg_copy_critical_parameters(). This routine initializes all the compression parameters to default values (like jpeg_set_defaults()), then copies the critical information from a source decompression object. The decompression object should have just been used to read the entire JPEG input file --- that is, it should be awaiting jpeg_finish_decompress().

jpeg_write_coefficients() marks all tables stored in the compression object as needing to be written to the output file (thus, it acts like jpeg_start_compress(cinfo, TRUE)). This is for safety's sake, to avoid emitting abbreviated JPEG files by accident. If you really want to emit an abbreviated JPEG file, call jpeg_suppress_tables(), or set the tables' individual sent_table flags, between calling jpeg_write_coefficients() and jpeg_finish_compress().

End quote. I seemed to remember a mention of an actual implementation of that in one of the utility programs, let's see here. I look in filelist.doc, and here it's mentioned: transupp.c, "Support code for jpegtran: lossless image manipulations." It says Guido Vollbeding wrote the initial design and code.

/*
  * Lossless image transformation routines.  These routines work on DCT
  * coefficient arrays and thus do not require any lossy decompression
  * or recompression of the image.
  * Thanks to Guido Vollbeding for the initial design and code of this feature.
  *
  * Horizontal flipping is done in-place, using a single top-to-bottom
  * pass through the virtual source array.  It will thus be much the
  * fastest option for images larger than main memory.
  *
  * The other routines require a set of destination virtual arrays, so they
  * need twice as much memory as jpegtran normally does.  The destination
  * arrays are always written in normal scan order (top to bottom) because
  * the virtual array manager expects this.  The source arrays will be scanned
  * in the corresponding order, which means multiple passes through the source
  * arrays for most of the transforms.  That could result in much thrashing
  * if the image is larger than main memory.
  *...
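
To see why this can be lossless, here is a minimal sketch of mirroring a single 8x8 block of quantized coefficients, assuming natural (row-major) order as the quoted libjpeg.doc text describes. It is my own illustration and not the transupp.c code: it relies on the DCT property that mirroring the samples negates the odd-frequency terms along that axis.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_DIM 8

/* Mirror one block left to right: coefficients in odd-numbered columns
 * (odd horizontal frequencies) change sign, the rest are untouched. */
void mirror_block_horizontal(short block[BLOCK_DIM * BLOCK_DIM])
{
    assert(block != NULL);
    for (size_t row = 0; row < BLOCK_DIM; row++)
        for (size_t col = 1; col < BLOCK_DIM; col += 2)
            block[row * BLOCK_DIM + col] = (short)-block[row * BLOCK_DIM + col];
}

/* Mirror one block top to bottom: coefficients in odd-numbered rows
 * (odd vertical frequencies) change sign. */
void mirror_block_vertical(short block[BLOCK_DIM * BLOCK_DIM])
{
    assert(block != NULL);
    for (size_t row = 1; row < BLOCK_DIM; row += 2)
        for (size_t col = 0; col < BLOCK_DIM; col++)
            block[row * BLOCK_DIM + col] = (short)-block[row * BLOCK_DIM + col];
}
```

Mirroring a whole image would also have to reverse the order of the blocks within each block row (or the order of the block rows), and deal with partial blocks at the edges, none of which is shown here.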

I think that's plenty to get going on rotating the images. I'll try and figure it out tomorrow, besides I figure there is already some support for image rotation in libtiff.

When sending the output to something like PDF, where we might use the CTM transform and let the PDF viewer do the lifting, I'm wondering about the Rotate entry of the Page structure. If I recall correctly (IIRC), Rotate is set to 90, 180, or 270 and the page is rotated accordingly in viewing and printing. See, I don't want to use that, because the only reason to rotate the page is to print it portrait and view it landscape, as for ledgers. I haven't tried it to see what it does, but we don't have to worry about it because it's PDF and not TIFF.

Anyways, back to libtiff, if there are already functions in place to do something then they should be used and extended, in a way that will make them easy to break off later.

In tif_getimage.c there is TIFFReadRGBAImageOriented, which implements most of the transformation. That logic lives inside that function, though, so it can't really be called on its own, and the idea here is to orient the image in a function that does not go through TIFFReadRGBAImage. What it is is food for the compiler.

In the tiff2pdf program, all the page offset information and the PDF rectangles are calculated in the t2p_compose_page function. It figures out what the media box is going to be and where the image is going on the media box; then, if the image is tiled, it determines where to place the tiles. This is separate from any of the data transfer logic. The argument to the function is a T2P*, and a field for the page's orientation could be added to the T2P_PAGE entry for the page in the t2p->tiff_pages member. That's one nice option with that program: it can offload the rotation handling to the PDF interpreter. Now, I'm wondering how to integrate setting the rotation field with tiffcp, for example, or with a separate rotation tool, so that viewer programs that don't handle the rotation and reflection field could still get oriented images.

I'm wondering whether to break the implementation out of TIFFReadRGBAImageOriented for general use from, say, tiffio.h. Refactoring the code to have a prototype in the public interface would also make it available for a tool concept as mentioned above.

Right on.

Ross F.