2021.05.02 00:36 "[Tiff] SIMD optimizations", by Larry Bank
-
2021.05.02 09:48 "Re: [Tiff] SIMD optimizations", by Even Rouault
- 2021.05.02 10:33 "Re: [Tiff] SIMD optimizations", by ZdPo Ster
-
2021.05.02 13:55 "Re: [Tiff] SIMD optimizations", by Bob Friesenhahn
-
2021.05.02 14:09 "Re: [Tiff] SIMD optimizations", by Even Rouault
-
2021.05.02 15:35 "Re: [Tiff] SIMD optimizations", by Larry Bank
- 2021.05.02 17:20 "Re: [Tiff] SIMD optimizations", by Even Rouault
- 2021.05.03 15:58 "Re: [Tiff] SIMD optimizations", by Even Rouault
-
2021.05.03 19:16 "Re: [Tiff] SIMD optimizations", by Jeff Breidenbach
-
2021.05.03 20:40 "Re: [Tiff] SIMD optimizations", by Bob Friesenhahn
- 2021.05.03 21:09 "Re: [Tiff] SIMD optimizations", by Even Rouault
- 2021.05.03 21:44 "Re: [Tiff] SIMD optimizations", by Larry Bank
- 2021.05.03 22:50 "Re: [Tiff] SIMD optimizations", by Akira Urushibata
2021.05.04 21:02 "Re: [Tiff] SIMD optimizations", by Larry Bank
There's no need to write asm code to make it fast. The problem is the awful way that the libtiff G4 encoder abuses memory; clean C code will compile into a good result for both G4 encoding and decoding. The decode side of the libtiff G4 codec isn't terrible, because it doesn't do anything awful with memory. The right way to count runs of 1-bit pixels is to use the CLZ (count leading zeros) instruction on the native integer size. GCC provides an intrinsic for it (__builtin_clz, plus long and long long variants) and generates efficient fallback code for the few targets that lack the instruction.
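To illustrate the CLZ idea, here is a minimal sketch, not libtiff code. It assumes GCC or Clang (for __builtin_clz), a 32-bit unsigned int, and a bitmap packed into 32-bit words with the leftmost pixel in the most significant bit. The names clz32 and run_length are made up for this example. A 64-bit variant would use __builtin_clzll in the same way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count leading zeros of a 32-bit word.
 * __builtin_clz(0) is undefined, so zero is handled explicitly. */
static inline unsigned clz32(uint32_t x)
{
    return x ? (unsigned)__builtin_clz(x) : 32u;
}

/* Hypothetical helper: length of the run of pixels equal to `color`
 * (0 or 1) that starts at bit position `start`, in a row of `nbits`
 * pixels. Assumes bit 31 of words[0] is pixel 0 (MSB-first packing). */
size_t run_length(const uint32_t *words, size_t nbits, size_t start, int color)
{
    size_t pos = start;

    if (start >= nbits)
        return 0;

    while (pos < nbits) {
        unsigned off   = (unsigned)(pos % 32); /* bit offset inside the word */
        unsigned avail = 32u - off;            /* pixels left in this word   */
        uint32_t w     = words[pos / 32];
        unsigned n;

        if (color)
            w = ~w;      /* make pixels of the run color read as 0 bits */
        w <<= off;       /* drop the pixels to the left of pos          */

        n = clz32(w);    /* zeros at the top = pixels still in the run  */
        if (n < avail) { /* the run ends inside this word               */
            pos += n;
            break;
        }
        pos += avail;    /* the run covers the rest of the word         */
    }

    if (pos > nbits)     /* ignore padding bits past the end of the row */
        pos = nbits;
    return pos - start;
}
```

Each loop iteration consumes up to 32 pixels with one shift and one CLZ, instead of testing pixels one at a time or going through per-byte lookup tables.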
Larry B.