Channel: Richard Geldreich's Blog
Browsing all 302 articles

A few Intel SPMD Compiler (ispc) C porting tips

I took notes as I was porting my new BC7 encoder from C to ispc. First, be sure to read and re-read the user guide, performance guide, and FAQ. This compiler tech kicks ass and I hope Intel keeps...

ispc_texcomp BC7 issues

I've been studying ispc_texcomp today to better understand why it's so slow compared to my encoder (currently by a factor of 2 at max quality). We do many of the same things, so why is it slower? Overall,...

BC7 showdown: ispc_texcomp vs. my ispc encoder

This benchmark compares the Fast ISPC Texture Compressor's BC7 encoder vs. my new ispc vectorized encoder. I've just barely begun to profile and optimize it, but it's already looking really strong. To...

RDO BC7 encoder planning

I'm building my first RDO BC7/BC6H encoders for our product (Basis), and I'm trying to decide which BC7 modes to implement first. I've decided on modes 1+6 for opaque textures. For alpha, I've boiled...

Proper pbit computation in the BC7 texture format

The BC7 GPU texture format supports the clever concept of endpoint "pbits", where the LSBs of RGB(A) endpoints are forced to the same value so only 1 bit (vs. 3 or 4) needs to be coded. BC7's use of...
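
To make the shared-LSB idea concrete, here is a minimal sketch in C. It is not the method from the post, which is truncated here; it is a naive brute-force approach that tries both pbit values for one endpoint and keeps the one with the lower squared error. The function names (`expand_to_8`, `quantize_with_pbit`, `choose_pbit`) and the `bits` parameter are hypothetical, and the sketch assumes at most 4 components per endpoint.

```c
#include <stdint.h>

/* Expand a total_bits-wide value (color bits plus pbit) to 8 bits by
   replicating its top bits into the low bits. Assumes 5 <= total_bits <= 8. */
static uint8_t expand_to_8(uint32_t v, int total_bits)
{
    v <<= (8 - total_bits);
    return (uint8_t)(v | (v >> total_bits));
}

/* For a fixed pbit, brute-force the best `bits`-wide value for each of the
   n components of c. Writes the choices to out_q and returns the summed
   squared error. Hypothetical helper, not taken from any real encoder. */
static uint32_t quantize_with_pbit(const uint8_t *c, int n, int bits,
                                   uint32_t pbit, uint8_t *out_q)
{
    uint32_t total = 0;
    for (int i = 0; i < n; i++) {
        uint32_t best_err = UINT32_MAX;
        for (uint32_t q = 0; q < (1u << bits); q++) {
            int d = (int)expand_to_8((q << 1) | pbit, bits + 1) - (int)c[i];
            uint32_t e = (uint32_t)(d * d);
            if (e < best_err) {
                best_err = e;
                out_q[i] = (uint8_t)q;
            }
        }
        total += best_err;
    }
    return total;
}

/* Try pbit = 0 and pbit = 1 and keep whichever gives the lower error.
   Returns the chosen pbit; out_q receives the quantized components.
   Assumes n <= 4 (RGB or RGBA). */
static int choose_pbit(const uint8_t *c, int n, int bits, uint8_t *out_q)
{
    uint8_t q0[4], q1[4];
    uint32_t e0 = quantize_with_pbit(c, n, bits, 0, q0);
    uint32_t e1 = quantize_with_pbit(c, n, bits, 1, q1);
    const uint8_t *best = (e1 < e0) ? q1 : q0;
    for (int i = 0; i < n && i < 4; i++)
        out_q[i] = best[i];
    return e1 < e0;
}
```

For example, calling `choose_pbit` on an all-255 RGBA endpoint with `bits = 7` picks pbit 1, since only odd 8-bit values can reach 255 exactly. A real encoder would likely replace the inner brute-force loop with direct rounding.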

DirectXTex BC7 3 subset weirdness

I brought Microsoft's DirectXTex project (latest code) into my test project, to see how it fares vs. ispc_texcomp and my encoder. Unfortunately, it appears broken. Across 31 test images (kodim and...

LZHAM decompressor vectorization

This is just a sketch of an idea. It might not pan out; there are too many details to know until I try it. LZHAM is an old codec now, but maybe I could stretch it a bit further before moving on to a new...

GPU texture compression error metrics

While working on an encoder I conduct a lot of experiments (probably thousands over time) to improve it. To check for regressions or silly bugs, you must use some sort of error metric, otherwise it'll...

BC7 mode utilization comparison of three encoders

I've been doing some benchmarking today to see where I stand with raw (non-RDO) BC7 encoding. Depending on the profile, I'm up to 2.26x faster at higher average quality using linear colorspace metrics...

Comparing ispc_texcomp alpha performance vs. my encoder

Most papers and encoders focus on opaque performance with BC7, but alpha textures are very important too. BC7's alpha support is somewhat weaker than opaque, especially with alpha signals that are...

A tale of multiple BC7 encoders

To get going with BC7, I found an open source block decompressor and first created a BC7 struct (using bitfields) to create a single mode 6 block. I filled in the fields for a simple mode 6 block,...

New BC7 encoder open sourced

bc7m16.c/.h is one of the strongest CPU BC7 encoders available at the moment for opaque textures: https://github.com/richgel999/bc7enc16 In perceptual mode using luma PSNR on opaque textures it beats...

BC7 showdown #2: Basis ispc vs. NVidia Texture Tools vs. ispc_texcomp slow

Got my BC7 encoder test app linking with NVTT. The BC7 encoder in NVTT is the granddaddy of them all, from what I understand. It's painfully slow but very high quality. I called it using the blob of...

One last non-RDO BC7 benchmark: ispc_texcomp slow vs. my encoder in...

"ISPC" is Intel's Fast ISPC Texture Compressor. (Both of our encoders use ispc.) In perceptual mode, you basically trade off around 1 dB of RGB PSNR for a gain of 2.6 dB Luma PSNR, relative to...

BC7 mode 0-only encoding examples

BC7 mode 0 (3 subsets, 444.1 endpoints with a unique p-bit per endpoint, 3-bit indices, 16 partitions) is probably the most difficult mode to handle correctly. If you don't do the pbits right, the...
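
The mode 0 layout described above can be checked with a little arithmetic, sketched here in C. The subset count, 4-bit RGB components, per-endpoint p-bits, 3-bit indices and 16 partitions come from the text. The 1-bit mode field and the one bit saved per subset on its anchor index are taken from the BC7 format definition rather than from this excerpt, so treat them as assumptions of the sketch. The function name `bc7_mode0_bits` is hypothetical.

```c
/* Field sizes for BC7 mode 0. */
enum {
    MODE_BITS      = 1,            /* mode 0 is signalled by a single bit (assumed from the BC7 spec) */
    PARTITION_BITS = 4,            /* 16 partitions */
    SUBSETS        = 3,
    ENDPOINTS      = SUBSETS * 2,  /* two endpoints per subset */
    CHANNELS       = 3,            /* RGB only, no alpha */
    COMPONENT_BITS = 4,            /* the "444" in 444.1 */
    INDEX_BITS     = 3,
    PIXELS         = 16            /* 4x4 block */
};

/* Total bits used by a mode 0 block. */
static int bc7_mode0_bits(void)
{
    int endpoint_bits = ENDPOINTS * CHANNELS * COMPONENT_BITS; /* 72 */
    int pbit_bits     = ENDPOINTS;                             /* 6, one per endpoint */
    /* Each subset's anchor index drops its top bit (assumed from the BC7 spec). */
    int index_bits    = PIXELS * INDEX_BITS - SUBSETS;         /* 45 */
    return MODE_BITS + PARTITION_BITS + endpoint_bits + pbit_bits + index_bits;
}
```

The fields sum to exactly 128 bits, the size of one BC7 block. With only 4 explicit bits per component, the p-bit is a fifth of each endpoint's precision, which is consistent with the text's warning that getting the pbits wrong is costly in this mode.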

Basis non-RDO BC7 benchmark

This graph shows the performance of ispc_texcomp at each of its supported opaque profiles (from ultrafast to slow) vs. Basis's non-RDO BC7 encoder at various settings. I haven't decided on the settings...

Graphing our BC7 encoder's quality vs. encoding time for opaque textures

I encoded a 4k test texture, built from random blocks chosen from a few thousand input textures, to non-RDO BC7 5000 times, using random codec settings for each trial encode. I recorded all the results...

BC7 opaque encoding sweetspots

By running our non-RDO codec in an automated test thousands of times, I've identified 3-4 major encoding time vs. quality sweetspots. These regions are somewhat specific to our codec and how it's...

Graphing BC7 with random codec options: modes 1 and 6

I used a different set of data (and a different random seed) to compute this scattergraph vs. my previous post. This time, I've highlighted all random solutions that enable mode 1: Modes 1 and 6 are the...

Some Basis baseline universal format details

The "Rosetta stone" for a basic universal GPU format is a subset of ETC1 we call ETC1S/ETC1SA. It uses no flips and disables individual mode, i.e. each 4x4 block has a single 5:5:5 color, a 3-bit table...
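
A minimal sketch in C of how the colors of such a block could be derived from the single 5:5:5 base color and the 3-bit table index. The modifier values are the standard ETC1 intensity tables. The function and type names are hypothetical, and the darkest-to-brightest ordering of the four output colors is this sketch's own convention, not necessarily the selector order Basis uses.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb8;

/* Standard ETC1 intensity modifier tables: {small, large} magnitude per
   3-bit table index. */
static const int etc1_modifiers[8][2] = {
    {2, 8}, {5, 17}, {9, 29}, {13, 42}, {18, 60}, {24, 80}, {33, 106}, {47, 183}
};

static uint8_t clamp255(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Expand a 5-bit component to 8 bits by bit replication. */
static uint8_t expand5(uint32_t c5)
{
    return (uint8_t)((c5 << 3) | (c5 >> 2));
}

/* Compute the four colors a block can use, ordered darkest to brightest:
   base - large, base - small, base + small, base + large, each clamped. */
static void etc1s_block_colors(uint32_t r5, uint32_t g5, uint32_t b5,
                               uint32_t table, rgb8 out[4])
{
    const int lo = etc1_modifiers[table & 7][0];
    const int hi = etc1_modifiers[table & 7][1];
    const int deltas[4] = { -hi, -lo, lo, hi };
    const int r = expand5(r5), g = expand5(g5), b = expand5(b5);
    for (int i = 0; i < 4; i++) {
        out[i].r = clamp255(r + deltas[i]);
        out[i].g = clamp255(g + deltas[i]);
        out[i].b = clamp255(b + deltas[i]);
    }
}
```

Because the same modifier is added to all three channels, the four colors always lie on a line parallel to the grayscale axis through the base color (before clamping).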
