2019.04.23 18:55 "[Tiff] TIFFWriteScanLine - buffers to RAM before flushing to disc?", by Paul Hemmer

2019.04.24 11:28 "Re: [Tiff] TIFFWriteScanLine - buffers to RAM before flushing to disc?", by Paul Hemmer

Hi Bob, Kemp, thanks for the replies.

I'm on 64-bit Windows 10 using LibTIFF v4.0.10.

This is data I generated from scratch (microscope images), and it is held in an OpenCV matrix prior to writing.

I'm working exclusively with 16-bit grayscale data. If the image is large enough to require BigTIFF, I do not use compression. (Aside: is it true that BigTIFF can't be compressed?)

I instantiate my TIFF and set up the basic tags for height/width/bit depth/contiguous layout, etc.
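Roughly like this (a sketch; the file name and the width/height variables are placeholders, but the tags are the standard LibTIFF set and the '8' mode letter is what requests BigTIFF):

// assumes #include <tiffio.h>
TIFF *myTIFF = TIFFOpen("composite.tif", "w8");     // 'w' write, '8' BigTIFF
TIFFSetField(myTIFF, TIFFTAG_IMAGEWIDTH, width);    // ~150k pixels
TIFFSetField(myTIFF, TIFFTAG_IMAGELENGTH, height);  // ~2500 rows
TIFFSetField(myTIFF, TIFFTAG_BITSPERSAMPLE, 16);    // 16-bit samples
TIFFSetField(myTIFF, TIFFTAG_SAMPLESPERPIXEL, 1);   // grayscale
TIFFSetField(myTIFF, TIFFTAG_PHOTOMETRIC, PHOTOMETRIC_MINISBLACK);
TIFFSetField(myTIFF, TIFFTAG_PLANARCONFIG, PLANARCONFIG_CONTIG);
TIFFSetField(myTIFF, TIFFTAG_COMPRESSION, COMPRESSION_NONE);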

Then I get the scanline size:

scanlineSize = TIFFScanlineSize(myTIFF);

and setup a buffer to hold a single scanline which I reuse:

uchar *buf = (uchar *)_TIFFmalloc(scanlineSize);

I then loop over each row in my OpenCV matrix and copy one line at a time into that buffer. A "line" here is ~150k pixels wide; the number of rows is much smaller, say 2,500.

for (size_t y = 0; y < (size_t)myOpenCV.rows; y++) {
    // copy one matrix row into the reusable scanline buffer
    memcpy(buf, myOpenCV.data + myOpenCV.step * y, scanlineSize);
    // TIFFWriteScanline returns -1 on error
    if (TIFFWriteScanline(myTIFF, buf, (uint32)y, 0) < 0)
        break;
}

(I also wrap that loop in a try/catch, and I've verified that no exception is ever thrown.)

Finally, I close the TIFF and free the buffer:

TIFFClose(myTIFF);
_TIFFfree(buf);

With breakpoints before and after the loop, I can watch memory grow with each call to TIFFWriteScanline, and just before TIFFClose() the file still shows as 0 KB in Windows.

TIFFClose() can take several seconds, and then the file in Windows shows the expected size and the memory drops all the way back.
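To narrow down where that time goes, I suppose I could flush explicitly before closing and time the two calls separately (a sketch I haven't actually run; TIFFFlush is the stock API for forcing pending writes out):

TIFFFlush(myTIFF);   // writes any buffered image data plus the directory
TIFFClose(myTIFF);   // should return quickly if the flush already did the work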

If LibTIFF is memory mapping the file, that would support what I'm seeing. Is there a way to turn that off?
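One thing I may try: the TIFFOpen mode letter 'm' is documented to disable memory-mapped files (though the docs suggest it matters when reading, so this is just a guess on my part):

TIFF *myTIFF = TIFFOpen("composite.tif", "w8m");   // 'm' disables memory mapping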

________________________________
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
Sent: Tuesday, April 23, 2019 5:24 PM

To: Paul Hemmer
Cc: Tiff List

Subject: Re: [Tiff] TIFFWriteScanLine - buffers to RAM before flushing to disc?

> I noticed that when using LibTIFF to write a BigTIFF in a scanline-based way, TIFFWriteScanLine doesn't seem to immediately write to disc and flush memory (even though I see a call to TIFFFlush inside the code for TIFFWriteScanLine). If I write lines in a loop, RAM utilization increases, and when I finally call TIFFClose() there is a delay as the file seems to be actually written, and then all the memory frees. I haven't checked yet whether the behavior is similar with tiled output.

Can you tell us more about your program and the operating system you are using? Is the program generating image data from scratch, or is it being read from a different file?

Libtiff prefers to memory-map its input file if it can. This can result in an apparent decrease in overall available system memory as the input file is read, since memory mapping is a form of caching, even though the memory can be returned to the OS on demand.

The operating system normally provides a filesystem cache and uses it to cache data which has not yet been flushed to disk. For some filesystems (e.g. zfs) the amount of memory the system might use for large and fast writes may be very large.

> Is this expected behavior? These are large images, where a given scanline can easily be 150,000+ pixels. Is there a way to stream lines to disc without the internal buffering?

I doubt that this internal buffering exists. The only buffering I am aware of is the strip-chopping feature, which allows huge strips to be handled incrementally as per-row scanlines. It works by reducing the amount of memory the application needs at the cost of more I/O operations.
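For what it's worth, strip chopping can be toggled per file with TIFFOpen mode letters ('C' enables it, 'c' disables it); a sketch straight from the documented flags, with a placeholder file name:

TIFF *chopped = TIFFOpen("huge.tif", "rC");     // 'C' enables strip chopping on read
TIFF *unchopped = TIFFOpen("huge.tif", "rc");   // 'c' disables it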

If you can reveal the operating system and filesystem you are using, we can surely provide more assistance.

bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/