I did some benchmarking today to see where my raw (non-RDO) BC7 encoder stands. Depending on the profile, it's up to 2.26x faster than ispc_texcomp while also achieving higher average quality, as measured with linear colorspace metrics.
BC7 mode utilization, ispc_texcomp's basic profile, with my encoder set to a roughly similar profile:
Here's ispc_texcomp's slow profile, my encoder at a higher quality profile, and DirectXTex with BC_FLAGS_USE_3SUBSETS (with the pbit bug fixed so this is a fair comparison).
This was a multithreaded test. The timings are the total CPU time spent on encoding only, summed across all threads.
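For readers who want to reproduce this kind of measurement, here is a minimal C++ sketch of summing encode-only time across worker threads. It is not the harness used for these results: `timed_parallel_encode` and the `encode_block` callback are hypothetical names, and it uses `std::chrono::steady_clock` around each worker's encode loop, which only approximates CPU time when the workers are not being preempted.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs encode_block(i) for every i in [0, num_blocks),
// striped across num_threads workers, and returns the sum of the time each
// worker spent inside its encode loop, in nanoseconds. Thread creation and
// joining are deliberately left outside the timed region.
inline int64_t timed_parallel_encode(unsigned num_threads, unsigned num_blocks,
                                     const std::function<void(unsigned)>& encode_block) {
    assert(num_threads > 0);
    std::atomic<int64_t> total_ns{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            const auto start = std::chrono::steady_clock::now();
            for (unsigned i = t; i < num_blocks; i += num_threads)
                encode_block(i);  // only the encoding work is timed
            const auto stop = std::chrono::steady_clock::now();
            total_ns += std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        });
    }
    for (auto& w : workers) w.join();
    return total_ns.load();
}
```

Because the result is a sum over threads, it can exceed the wall-clock duration of the run, which is what makes it usable for comparing encoders independently of how many threads were used.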
My encoder favors mode 6 on grayscale inputs (one large texture is grayscale). Mode 6 is also always the first mode checked for opaque blocks, so it tends to win on simple blocks. It has very good endpoint precision (7777.1) and large 4-bit indices, so it does fairly well even on complex blocks.
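To make the "7777.1" notation concrete: each mode 6 endpoint stores 7 bits per RGBA component plus one p-bit that becomes the shared low bit, so every component decodes to a full 8-bit value. The sketch below follows the BC7 format's decoding rules; the function names `mode6_dequant` and `bc7_interp4` are mine for illustration and are not taken from any of the encoders discussed here.

```cpp
#include <cassert>
#include <cstdint>

// BC7 interpolation weights for 4-bit indices, as defined by the format.
static const uint8_t kWeights4[16] = {0, 4, 9, 13, 17, 21, 26, 30,
                                      34, 38, 43, 47, 51, 55, 60, 64};

// Expand one mode 6 endpoint component: 7 stored bits with the endpoint's
// p-bit appended as the least significant bit, giving a full 8-bit value.
inline uint8_t mode6_dequant(uint8_t q7, uint8_t pbit) {
    assert(q7 < 128 && pbit < 2);
    return static_cast<uint8_t>((q7 << 1) | pbit);
}

// Interpolate between two decoded 8-bit endpoint components using a
// 4-bit selector index (16 possible values per pixel).
inline uint8_t bc7_interp4(uint8_t e0, uint8_t e1, unsigned index) {
    assert(index < 16);
    const unsigned w = kWeights4[index];
    return static_cast<uint8_t>((e0 * (64 - w) + e1 * w + 32) >> 6);
}
```

For example, `mode6_dequant(127, 1)` gives 255 and `bc7_interp4(0, 255, 15)` gives 255, so a single-subset block can reach both ends of the 8-bit range with 16 steps in between.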