Got grayscale ETC1 to DXT5A conversion working, using a small 32*8*3 entry table. This work is for DXT5 support in the universal texture format. Now that this is working I can proceed to finishing the full universal encoder.
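To make the table idea concrete, here is a minimal sketch of how one entry per (5-bit base value, intensity table) pair could be precomputed. It is not the table layout used here: the third factor in 32*8*3 isn't explained in this post, so the sketch covers only the 32*8 part. The function names are hypothetical, the endpoints are simply the min and max ETC1 values rather than an error-minimizing search, and the palette uses truncating division, which real decoders may round differently.

```python
# ETC1 intensity modifier tables, as defined by the ETC1 format.
ETC1_MODIFIERS = [
    (-8, -2, 2, 8),
    (-17, -5, 5, 17),
    (-29, -9, 9, 29),
    (-42, -13, 13, 42),
    (-60, -18, 18, 60),
    (-80, -24, 24, 80),
    (-106, -33, 33, 106),
    (-183, -47, 47, 183),
]


def etc1_gray_values(base5, table):
    """Return the four 8-bit values an ETC1 (base, table) pair can decode to."""
    base8 = (base5 << 3) | (base5 >> 2)  # expand 5 bits to 8 bits
    return [min(255, max(0, base8 + m)) for m in ETC1_MODIFIERS[table]]


def dxt5a_palette(a0, a1):
    """Return the 8-value DXT5A palette for the a0 > a1 interpolation mode."""
    assert a0 > a1
    # Entries 2..7 are interpolated between the two endpoints.
    # Truncating division is a simplification (assumption).
    return [a0, a1] + [((7 - i) * a0 + i * a1) // 7 for i in range(1, 7)]


def build_entry(base5, table):
    """Hypothetical table entry: DXT5A endpoints plus a selector remap.

    Uses the min/max ETC1 values as endpoints. This is a simple heuristic,
    not an optimal fit.
    """
    vals = etc1_gray_values(base5, table)
    hi, lo = max(vals), min(vals)  # always hi > lo for these modifier tables
    pal = dxt5a_palette(hi, lo)
    # Map each of the four ETC1 selectors to the nearest DXT5A selector.
    selectors = [min(range(8), key=lambda s: abs(pal[s] - v)) for v in vals]
    return hi, lo, selectors


# 32 base values * 8 intensity tables = 256 precomputed entries.
TABLE = {(b, t): build_entry(b, t) for b in range(32) for t in range(8)}
```

With a table like this, transcoding a grayscale block would be a lookup plus a selector remap, with no per-block search at runtime.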
The groundwork is laid and it's all downhill from here. My main worry now is the size of the ETC1S->DXT1 lookup table, which is currently around 3-4MB. It can be computed quickly at startup, computed on the fly as needed, or precomputed into the executable.
Note that none of these images were created with my best ETC1 encoder. They use an early prototype from late 2016 that has so-so quality. The main point of these experiments is to prove that efficiently converting ETC1 data to DXT1/5 is practical and looks reasonable. The encoder is not aware of DXT5A transcoding, but it is aware of the ETC1S->DXT1 transcoding (which helps a lot).
All stats are in dB vs. the original image. This image's subtle gradients are hard to handle; you can see this in the DXT1 version.
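For anyone wanting to reproduce numbers like these: the post doesn't name the metric behind the dB figures, so the sketch below assumes they are PSNR over 8-bit samples. The function name and the flat-list input format are illustrative only.

```python
import math


def psnr_db(original, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    if len(original) != len(decoded) or len(original) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak * peak / mse)
```

Higher is better. A drop of a couple of dB between ETC1S and the transcoded output is the kind of loss the captions below are reporting.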
To those who argue that a universal GPU texture format that is based off ETC1/DXT1 isn't high quality enough: You would be amazed at the low quality levels teams use with crunch/Basis. This tech isn't about achieving highest texture quality. It's about enabling easy distribution of supercompressed GPU texture data. It's a "JPEG-like format for GPU texture data", usable on mobile or desktop.
Original
ETC1 near-optimal 48.903
ETC1S 46.322 (universal format base image in ETC1 mode)
ETC1S->DXT1 45.664
ETC1S green channel converted to DXT5A 43.878
Original
ETC1 near-optimal 51.141
ETC1S 46.461
ETC1S->DXT1 44.865
ETC1S green channel converted to DXT5A 46.107