2010.02.08 12:37 "[Tiff] Libtiff and UTF-8 filename support", by Graeme Gill

2010.02.08 17:03 "Re: [Tiff] Libtiff and UTF-8 filename support", by Bob Friesenhahn

I was looking through the libtiff V4.0.0beta5 sources, and noticed that there is probably a discrepancy in libtiff's behavior between MS Windows and Unix-like platforms with regard to filename encoding.

On Unix-like systems (tif_unix.c) TIFFOpen calls open() with a possibly UTF-8 filename, while in tif_win32.c it calls CreateFileA(), which doesn't take a UTF-8 argument.

Many Unix systems are character-set agnostic. File names are just binary blobs supporting null termination. Some Unix filesystems do support an indication of allowed character sets.

If a filesystem is set to a specific encoding, it may reject certain file names because they are not compatible with that encoding. It is also possible that a stored filename won't match a freshly provided one that differs only in some property the OS normally ignores (e.g. case).
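
To make the mismatch concrete, here is a minimal C sketch contrasting a raw byte comparison with a case-insensitive one. Both helper names are hypothetical and not part of libtiff, and the ASCII-only case folding is a simplification; a real filesystem applies its own, usually more elaborate, rules.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: compare two file names as raw bytes, which is
 * effectively what a character-set-agnostic filesystem does. */
static int names_equal_bytes(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Hypothetical helper: compare two file names ignoring ASCII case only.
 * This stands in for a filesystem that ignores case; it does not
 * attempt to fold non-ASCII characters. */
static int names_equal_ascii_nocase(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;  /* equal only if both strings ended together */
}
```

With these helpers, "Scan.TIF" and "scan.tif" compare as different by bytes but as equal once case is ignored, so an application that compares names itself can disagree with the filesystem about whether two names refer to the same file.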

Wouldn't it be better to make TIFFOpen support UTF-8 on both Unix and MS Windows, by making tif_win32.c call MultiByteToWideChar() and then use CreateFileW()? (Or would this create problems in code that expects there to be differences between the platforms?)
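
The conversion proposed above might look roughly like the following C sketch. The helper name can_open_utf8 is hypothetical and is not a libtiff function; the access and sharing flags are assumptions chosen for a simple read-only probe, and the MB_ERR_INVALID_CHARS flag is assumed to be accepted for CP_UTF8, which holds on Windows Vista and later. The non-Windows branch only shows the existing open() behavior for comparison.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <unistd.h>
#endif

/* Hypothetical helper (not part of libtiff): return 1 if the file named
 * by a UTF-8 string can be opened for reading, 0 otherwise. */
static int can_open_utf8(const char *utf8_name)
{
    assert(utf8_name != NULL);
#ifdef _WIN32
    /* With a NULL output buffer, the call returns the required length
     * in wide characters, including the terminating NUL. */
    int wlen = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                   utf8_name, -1, NULL, 0);
    if (wlen <= 0)
        return 0;  /* invalid UTF-8 or another conversion failure */

    wchar_t *wname = malloc((size_t)wlen * sizeof *wname);
    if (wname == NULL)
        return 0;

    HANDLE h = INVALID_HANDLE_VALUE;
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            utf8_name, -1, wname, wlen) > 0) {
        /* Wide-character variant, so the name is not limited to the
         * current ANSI code page as it is with CreateFileA(). */
        h = CreateFileW(wname, GENERIC_READ, FILE_SHARE_READ, NULL,
                        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    }
    free(wname);
    if (h == INVALID_HANDLE_VALUE)
        return 0;
    CloseHandle(h);
    return 1;
#else
    /* On Unix-like systems the bytes are passed through unchanged. */
    int fd = open(utf8_name, O_RDONLY);
    if (fd < 0)
        return 0;
    close(fd);
    return 1;
#endif
}
```

A real change to tif_win32.c would also have to map the TIFFOpen mode string to the matching creation and access flags, which this sketch leaves out.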

This seems like a good option to support, but there may be side-effects such as inability to compare two file names, or inability to store a file name.

Bob
--
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/