2017.06.28 10:09 "Re: [Tiff] Excessive memory allocation while chopping strips", by Even Rouault

On mardi 27 juin 2017 22:26:40 CEST Olivier Paquet wrote:

2017-06-27 18:52 GMT-04:00 Even Rouault <even.rouault@spatialys.com>:

Fuzzers (specifically the address sanitizer) crash on this situations, assuming that attempts of huge memory allocations are a bug. libtiff itself will not crash (or it is a real bug) if a malloc() fails, but this isn't a good practice to attempt allocating a lot of memory if it is not needed. And indeed the files processed by fuzzers are just a few kilobytes large at most.

I think we should be careful not to break libtiff because of a broken tool then.

I wouldn't call the tool "broken". It is a reasonable behaviour in most cases This is the -rss_limit_mb setting of http://llvm.org/docs/LibFuzzer.html which is used by OSS Fuzz underneath and is set up to 2 GB by default. It really helps identifying places in the code where large memory allocations are done (and sometimes this isn't just reserving virtual memory, but really pinning it, which can cause OOM killer to trigger)

It's great if we can reduce the huge allocations but it seems to me that because of the nature of the TIFF format, they will be hard to eliminate completely without breaking some valid files or writing a lot of complex code (read: code with new, unknown bugs). Just something to keep in mind.

Agreed. I'd encourage people using libtiff to regularly track the HEAD version to help spotting potential new regressions.

With that said, I don't mind limiting the patch to split huge strips. I don't think we're helping anyone by applying it to files millions of pixels high. However, won't it just push the potential allocation problem to strip size? Or are there other checks there?

For strip reading, there's the mechanism I already put in place and described in my previous email. Small files that would declare large strip byte count will no longer cause the full allocation to be done before beginning reading the strip content.

An API to set an allocation limit is a little tricky as TIFFOpen() will read the first IFD unless you use 'h' and that could contain large tags.

Good point (and thanks for advertizing this flag I didn't know), but that's acceptable I think for someone using advanced API.

But otherwise, it looks like something which might be useful to someone required to deal with unsafe input.

Given the number of teams that fuzz libtiff / tiff tools and report issues in Bugzilla, it seems there's a lot of interest for such use cases.

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com