2006.04.25 14:49 "[Tiff] Re: Tiff Digest, Vol 23, Issue 23", by Glenn Widener
>
> Message: 1
> Date: Mon, 24 Apr 2006 13:39:41 +0200
> From: "Gerben Vos" <Gerben@ZyLAB.COM>
> Subject: RE: [SPAM HEADER] - [Tiff] Microsoft Document Imaging status
> / snapshot - Email found in subject
> To: <tiff@lists.maptools.org>
> Message-ID:
> <FC840EC0BF7BFA45A0F94548AF36920DABF224@zynlms01.ZyLAB.WAN>
> Content-Type: text/plain; charset="iso-8859-1"
0xef 0x82 0xa7 = some kind of bullet point symbol
0xef 0x82 0xb7 = some kind of bullet point symbol (different to a7)
0xe2 0x80 0x93 = em-dash
0xe2 0x80 0x9c = `` (smart doublequotes, left side of quoted material)
0xe2 0x80 0x9d = '' (smart doublequotes, right side of quoted material)
0xe2 0x80 0x99 = ' (apostrophe of some kind)
0xe2 0x80 0xa6
0xe2 0x80 0x94 = short dash?
0xc3 0xa9 = e with grave. (00a9 is the unicode equivalent, perhaps this will form some pattern)
These are clearly UTF-8 encoded Unicode characters:
U+F0A7 = (user-defined)
U+F0B7 = (user-defined)
U+2013 = en-dash (shorter than em-dash!)
U+201C = left double quote
U+201D = right double quote
U+2019 = right single quote
U+2026 = ellipsis (three dots)
U+2014 = em-dash (longer than en-dash!)
U+00E9 = e-acute
Some of the ones you list (e.g., the first two bullets) are in the "implementation defined" (Private Use) Unicode area, but lists with the Microsoft assignments in there are easy to find on the Internet.
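The byte-to-code-point mapping in this list can be checked mechanically rather than by hand. What follows is only a minimal sketch in Python, not anything taken from the MDI format itself: the names SEQUENCES and describe are made up for illustration, and the byte sequences are simply the ones quoted in this thread.

```python
import unicodedata

# Byte sequences quoted in this thread, written as hex.
# (Illustrative input only; not read from an actual MDI/TIFF file.)
SEQUENCES = [
    "ef 82 a7", "ef 82 b7", "e2 80 93", "e2 80 9c", "e2 80 9d",
    "e2 80 99", "e2 80 a6", "e2 80 94", "c3 a9",
]

def describe(hex_bytes):
    """Decode one UTF-8 sequence; return (code point label, Unicode name)."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    if len(text) != 1:
        raise ValueError("expected exactly one character: %r" % hex_bytes)
    # Private Use characters have no standard name, so supply a fallback.
    name = unicodedata.name(text, "(no standard name, e.g. Private Use)")
    return "U+%04X" % ord(text), name

for seq in SEQUENCES:
    label, name = describe(seq)
    print(seq, "->", label, name)
```

Run this way, Python's decoder reports the last sequence, c3 a9, as U+00E9 (LATIN SMALL LETTER E WITH ACUTE), and the first two sequences fall back to the placeholder name because they sit in the Private Use range.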
From our experience deciphering "Word Smart Quotes" in Windows print driver output, it also produces:
U+2018 = left single quote.
By the way, thanks for posting this; I was intending to try to figure this out, but had to postpone it.
Likewise. I'm finishing up a release and will be diving into this later this week. Beyond wanting to read/write MS's TIFF text info, I will be pondering whether their format or a variant might be the basis for a "standard" TIFF selectable text extension. Note that I say "selectable" - text bounding boxes are an essential requirement for us.
--
Glenn Widener
SwiftView Tools Product Manager
SwiftView Inc. - quality PCL portable document tools and services
www.swiftview.com
Work: (971)223-2621
Cell: (503)351-1178