Channel: Richard Geldreich's Blog

UASTC block format encoding

UASTC is a 15-mode, 4x4 pixel, LDR-only subset of the ASTC specification with a simpler 128-bit block format. It can be losslessly transcoded to the standard ASTC block format, quickly transcoded to BC7 with very low quality loss (about 0.75 dB RGB PSNR on average), or re-encoded to high quality ETC1/2 with a small amount of per-pixel work. There are 8 opaque modes, 1 solid color mode, and 6 alpha modes. Each of these UASTC modes maps to one of 6 BC7 modes (all except 0 and 4). UASTC is the first high quality universal GPU texture format that supports block partitioning.

This post shows how each mode is laid out in a 128-bit UASTC block at the bit level. Bits are written starting from the beginning of the block (at the first byte's LSB) working "down" towards bit 128. The mode field is always first and is stored at bit 0 in the block (bit 0 of byte 0).

See the previous post for a description of each UASTC mode: How many subsets, the weight/endpoint BISE ranges, number of planes, etc.

Unlike ASTC, the weights are not stored in reverse bit order starting from the end of the block. Instead they are stored immediately following the endpoint bits in regular (LSB first) bit order.

The CEM field is always 8 (RGB Direct) for modes 0-7, and 12 (RGBA Direct) for modes 9-14. Blue Contraction isn't supported, so the endpoints can be stored in arbitrary order, which we exploit to save a few index bits like BC7 does. Mode 8 is void-extent.

This is a snapshot of the current encoding. This may change somewhat over the next few weeks.

Field Definitions:


Mode: Huffman coded mode (2, 4, or 5 bits). One mode (15) is saved for future expansion. The Huffman codes and code lengths are (first bit of Huffman code is the LSB):

{ 0xB, 5 }, { 0x1B, 5 }, { 0x7, 5 }, { 0x17, 5 }, { 0xF, 5 }, { 0x1F, 5 }, { 0x2, 4 },
{ 0xA, 4 }, { 0x6, 4 }, { 0xE, 4 }, { 0x1, 4 }, { 0x0, 2 }, { 0x9, 4 }, { 0x5, 4 }, { 0xD, 4 }, { 0x3, 4 }
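
As a rough illustration of how the mode field could be read, here is a minimal sketch that matches the low bits of the block against the code table above. The names `ModeCode`, `kModeCodes`, `read_bits` and `decode_uastc_mode` are made up for this example and are not taken from any encoder or transcoder source. The linear search works because no code in the table is a prefix of another.

```cpp
#include <cassert>
#include <cstdint>

// Huffman code table from the post: { code, length in bits }, indexed by mode.
struct ModeCode { uint8_t code; uint8_t len; };
static const ModeCode kModeCodes[16] = {
    { 0xB, 5 }, { 0x1B, 5 }, { 0x7, 5 }, { 0x17, 5 }, { 0xF, 5 }, { 0x1F, 5 }, { 0x2, 4 }, { 0xA, 4 },
    { 0x6, 4 }, { 0xE, 4 }, { 0x1, 4 }, { 0x0, 2 }, { 0x9, 4 }, { 0x5, 4 }, { 0xD, 4 }, { 0x3, 4 }
};

// Hypothetical helper: reads num_bits starting at bit_offset, LSB first,
// where bit 0 is the LSB of byte 0 of the 16-byte block.
inline uint32_t read_bits(const uint8_t* block, uint32_t bit_offset, uint32_t num_bits)
{
    uint32_t v = 0;
    for (uint32_t i = 0; i < num_bits; i++)
    {
        const uint32_t b = bit_offset + i;
        v |= ((block[b >> 3] >> (b & 7)) & 1u) << i;
    }
    return v;
}

// Returns the mode index [0,15] and optionally the length of its code.
// Mode 15 is reserved, so a real decoder would probably reject it.
inline int decode_uastc_mode(const uint8_t* block, uint32_t* code_len)
{
    for (int mode = 0; mode < 16; mode++)
    {
        const uint32_t len = kModeCodes[mode].len;
        if (read_bits(block, 0, len) == kModeCodes[mode].code)
        {
            if (code_len)
                *code_len = len;
            return mode;
        }
    }
    return -1; // not reached: the 16 codes cover every possible bit pattern
}
```

The returned code length is also the bit offset of the first field after the mode (ETC1F in every mode except 8).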

ETC1F, ETC1D, ETC1I0, ETC1I1: 8 bits of ETC1 transcode hints (flip, differential, intensity table 0, intensity table 1).

These hints are used by the transcoder to quickly create ETC1 blocks from the unpacked UASTC texels. To use them, the transcoder computes each 4x2 or 2x4 subblock's average color, quantizes them to 555:333 or 444:444 bits, then computes the selectors in luma space. No other work is necessary (because all the hard work was done in the UASTC encoder).

ETC2TM: 8-bits of ETC2 EAC A8 transcode hints (4-bit table, 4-bit multiplier)

This is similar to how ETC1 blocks are packed, except these hints are for the alpha portion of ETC2 EAC A8 blocks. These bits are only present in modes 9-14 (the alpha modes).

ETQ: Packed endpoint trits/quints values. A simplified form of BISE is used in UASTC, see:
https://www.khronos.org/registry/DataFormat/specs/1.1/dataformat.1.1.html#astc-integer-sequence-encoding

See the "UASTC BISE Endpoint Ranges table" below for the # of trits or quints for each endpoint range. Some of the ranges don't have trits or quints, so there will be no ETQ fields.

We store the trits/quints first, followed by each value's bits. The bit interleaving and trit/quint rearranging and preprocessing in section 18.2 aren't used. Instead the encoded trits/quints are stored in UASTC as-is.

For quints, each encoded value packs up to 3 quints into up to 7 bits: quint2*25+quint1*5+quint0. Trits are similar, except each encoded value packs up to 5 trits into up to 8 bits. When the number of endpoint values isn't a multiple of 5 (trits) or 3 (quints), the final code uses only the minimum # of bits necessary to represent the encoded value (to save bits).
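
The ETQ field sizes in the mode listings further down can be reproduced from this rule. Below is a small sketch of that arithmetic. The function names are invented for this example, and the grouping (5 trits or 3 quints per full code) is inferred from the 8-bit and 7-bit code sizes described above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Smallest number of bits that can represent every value in [0, n - 1].
inline uint32_t bits_for_count(uint32_t n)
{
    uint32_t bits = 0;
    while ((1u << bits) < n)
        bits++;
    return bits;
}

// Returns the size in bits of each ETQ field needed for num_values endpoint values.
// radix is 3 for trits (assumed 5 per full field) or 5 for quints (assumed 3 per full field).
inline std::vector<uint32_t> etq_field_sizes(uint32_t num_values, uint32_t radix)
{
    assert(radix == 3 || radix == 5);
    const uint32_t per_field = (radix == 3) ? 5 : 3;
    std::vector<uint32_t> sizes;
    for (uint32_t left = num_values; left > 0; )
    {
        const uint32_t n = (left < per_field) ? left : per_field;
        uint32_t combos = 1;
        for (uint32_t i = 0; i < n; i++)
            combos *= radix; // number of distinct encoded values in this field
        sizes.push_back(bits_for_count(combos));
        left -= n;
    }
    return sizes;
}
```

For example, mode 0 has 6 endpoint values with one trit each, which gives fields of 8 and 2 bits, and mode 3 has 18 values, which gives 8, 8, 8 and 5 bits, matching the listings.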

EBITS: Endpoint bits (one set of bits per ASTC endpoint value). See the "UASTC BISE Endpoint Ranges table" below for the # of bits for each endpoint range. Endpoint order is the same as ASTC's: RL, RH, GL, GH, BL, BH, etc. Max of 18 values (RGB 3-subsets: 3*2*3).

To retrieve the endpoint values, you extract the trits/quints from the encoded ETQ values, shift each one left by the appropriate number of bits (depending on the UASTC mode's endpoint range), and bitwise OR in the EBITS values.
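
A minimal sketch of that reconstruction step follows. The helper names are hypothetical, and the digit order for trits (least significant digit first, mirroring the quint formula above) is an assumption based on "similar for trits".

```cpp
#include <cassert>
#include <cstdint>

// Splits one encoded ETQ code into its trits (radix 3) or quints (radix 5).
// digits[0] is the least significant digit, e.g. quint0 in quint2*25+quint1*5+quint0.
inline void unpack_etq(uint32_t code, uint32_t radix, uint32_t count, uint32_t* digits)
{
    assert(radix == 3 || radix == 5);
    for (uint32_t i = 0; i < count; i++)
    {
        digits[i] = code % radix;
        code /= radix;
    }
}

// Combines one trit/quint with its EBITS field: the trit/quint forms the high part
// and the plain bits form the low part of the quantized endpoint value.
inline uint32_t make_endpoint_value(uint32_t digit, uint32_t ebits, uint32_t num_ebits)
{
    assert(ebits < (1u << num_ebits));
    return (digit << num_ebits) | ebits;
}
```

With endpoint range 12 (1 quint plus 3 bits), the largest value this produces is (4 << 3) | 7 = 39, which agrees with the 40 quantization levels listed for that range.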

Endpoint values are a sequence of integers that must be dequantized to [0,255] by following the ASTC spec in section 18.13, see:
https://www.khronos.org/registry/DataFormat/specs/1.1/dataformat.1.1.html#astc-endpoint-unquantization

WEIGHTS: Encoded weight indices. Just like BC7, the weight index at each subset's "anchor" texel always has an MSB of 0, so these weights can be encoded with one less bit than the others. (UASTC doesn't use Blue Contraction, so the endpoints can be freely swapped, which is what makes this trick possible.)

Weights are always encoded as plain bits (no BISE necessary). Weight ordering is the same as ASTC's (raster order, left to right/top to bottom scanline). In dual plane mode, the ordering is also ASTC's: p0 p1, p0 p1, p0 p1, etc. (two weight indices per texel).
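
The size of the WEIGHTS field in each mode follows from these rules. The sketch below uses a made-up function name, and its formula is inferred from the mode listings in this post: one bit is saved per subset, and the dual plane modes (which always have 1 subset) save a single bit in total.

```cpp
#include <cassert>
#include <cstdint>

// Total size in bits of the WEIGHTS field for a 4x4 block.
// bits_per_weight is 1..4, num_planes is 1 or 2, num_subsets is 1..3.
inline uint32_t uastc_weight_bits(uint32_t bits_per_weight, uint32_t num_planes, uint32_t num_subsets)
{
    assert(bits_per_weight >= 1 && bits_per_weight <= 4);
    assert(num_planes == 1 || num_planes == 2);
    assert(num_subsets >= 1 && num_subsets <= 3);
    assert(num_planes == 1 || num_subsets == 1); // dual plane modes always have 1 subset
    // 16 texels per plane, minus one implied-zero MSB per subset anchor.
    return 16 * num_planes * bits_per_weight - num_subsets;
}
```

For instance, mode 2 (3-bit weights, 2 subsets) gives 16*3 - 2 = 46 bits, and mode 6 (2-bit weights, dual plane) gives 32*2 - 1 = 63 bits.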

The weights are dequantized to 6-bit interpolation values in the same way as ASTC's:
https://www.khronos.org/registry/DataFormat/specs/1.1/dataformat.1.1.html#_weight_unquantization
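
Because UASTC weights are always 1 to 4 plain bits, the ASTC unquantization reduces to bit replication up to 6 bits followed by a +1 adjustment for values above 32. The function below is a sketch of that special case only (its name is invented here); it reproduces the ASTC weight tables listed at the end of this post.

```cpp
#include <cassert>
#include <cstdint>

// Dequantizes a plain-bit weight index (1..4 bits) to the ASTC [0,64] weight range.
inline uint32_t unquant_weight(uint32_t w, uint32_t bits)
{
    assert(bits >= 1 && bits <= 4);
    assert(w < (1u << bits));
    uint32_t v = 0;
    switch (bits)
    {
    case 1: v = w ? 63 : 0; break;               // 1 bit replicated 6 times
    case 2: v = (w << 4) | (w << 2) | w; break;  // 2 bits replicated 3 times
    case 3: v = (w << 3) | w; break;             // 3 bits replicated twice
    case 4: v = (w << 2) | (w >> 2); break;      // 4 bits plus their top 2 bits
    }
    if (v > 32)
        v++; // stretches [0,63] to [0,64]
    return v;
}
```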

And the endpoints are interpolated in the same way as ASTC's:
https://www.khronos.org/registry/DataFormat/specs/1.1/dataformat.1.1.html#astc_weight_application
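
For comparison, here is a rough sketch of the two interpolation paths for a single 8-bit channel. It assumes linear (non-sRGB) ASTC decoding to 8-bit output, where each endpoint is expanded to 16 bits by byte replication and the top 8 bits of the result are kept. Both function names are invented for this example, and the linked spec section remains the authority on the exact ASTC behavior.

```cpp
#include <cassert>
#include <cstdint>

// ASTC-style interpolation: 16-bit endpoints, weight w in [0,64], rounded.
// Assumption: linear LDR decode with the 16-bit result truncated to its top 8 bits.
inline uint32_t astc_interp8(uint32_t e0, uint32_t e1, uint32_t w)
{
    assert(e0 <= 255 && e1 <= 255 && w <= 64);
    const uint32_t c0 = (e0 << 8) | e0; // expand 8 -> 16 bits
    const uint32_t c1 = (e1 << 8) | e1;
    const uint32_t c = (c0 * (64 - w) + c1 * w + 32) >> 6;
    return c >> 8;
}

// BC7-style interpolation: 8-bit endpoints, weight w in [0,64], rounded.
inline uint32_t bc7_interp8(uint32_t e0, uint32_t e1, uint32_t w)
{
    assert(e0 <= 255 && e1 <= 255 && w <= 64);
    return (e0 * (64 - w) + e1 * w + 32) >> 6;
}
```

Both paths return the endpoints exactly at w = 0 and w = 64. In between, the results can differ slightly because the rounding happens at different precisions.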

PAT: Index into the common BC7/ASTC partition pattern table. This table contains BC7 pattern indices, ASTC pattern seeds, and permutation/flip flags which indicate how to map ASTC pattern subset indices to BC7's. There are three tables and 60 total partition patterns.

A UASTC decoder can either use ASTC's partition pattern generator or BC7's partition tables. To map ASTC's partition patterns to BC7's, the pattern subset indices are either used as-is, inverted, permuted, and/or combined to get BC7 partition pattern subset indices (see the tables/example code at the very bottom). These simple transformations correspond to changing the order of the encoded BC7 endpoints, or setting 2 endpoints in a 3-subset BC7 block to the same color/alpha values. Every ASTC pattern included in the below common tables maps to a BC7 pattern without loss (i.e. there is no subset "crosstalk" when mapping a UASTC to a BC7 pattern).

COMPSEL: ASTC's Color Component Selector field. Only present on Dual Plane modes.
This maps to BC7 mode 5's 2-bit component rotation field (the value must be remapped).

Other notes:


- The number of color components is 3 for modes [0,7], or 4 for modes [8,14].
- The number of subsets is [1,3].
- The total number of endpoint values is num_comps * 2 * num_subsets.
- The number of planes is 1 or 2.
- The total number of weight values is either 16 (non-dual plane modes) or 32 (dual plane modes).
- Dual plane modes always have 1 subset in UASTC.
- Weight indices are always 1, 2, 3, or 4-bits for compatibility with BC7. BISE is not used at all for weight indices, only endpoints.
- Various endpoint value ordering examples for 1 and 2 subsets (this is the same as ASTC):
1 subset RGB: RL0 RH0 GL0 GH0 BL0 BH0
1 subset RGBA: RL0 RH0 GL0 GH0 BL0 BH0 AL0 AH0
2 subset RGB: RL0 RH0 GL0 GH0 BL0 BH0 RL1 RH1 GL1 GH1 BL1 BH1
2 subset RGBA: RL0 RH0 GL0 GH0 BL0 BH0 AL0 AH0 RL1 RH1 GL1 GH1 BL1 BH1 AL1 AH1
- Transcoding UASTC->ASTC is always a 100% lossless operation. The endpoints may need to be swapped (and the corresponding weight indices inverted) to disable blue contraction, but this is a lossless transformation.
- The primary source of loss when transcoding UASTC->BC7 is mapping UASTC endpoints to BC7 endpoints. This is done using a simple scale with optional optimal p-bit computation. The UASTC weight indices are either copied as-is, or converted to the closest corresponding BC7 weight indices using a lookup table. The partition patterns are lossless, the weight tables are the same for 2/3-bits and very similar for 4-bits, and the endpoint interpolation method is nearly the same (16-bits in UASTC/ASTC, 8-bits with BC7, and both formats use [0,64] weights with rounding in the linear interpolation).

Modes:

Format is "field: bit_offset num_bits"

**** Mode: 0 (CEM 8)
DualPlane: 0, WeightRange: 8 (16), Subsets: 1, EndpointRange: 19 (192) MODE6 RGB
mode: 0 5
ETC1F: 5 1
ETC1D: 6 1
ETC1I0: 7 3
ETC1I1: 10 3
ETQ: 13 8
ETQ: 21 2
EBITS: 23 6
EBITS: 29 6
EBITS: 35 6
EBITS: 41 6
EBITS: 47 6
EBITS: 53 6
WEIGHTS: 59 63
Total bits: 122, endpoint bits: 46, weight bits: 63

**** Mode: 1 (CEM 8)
DualPlane: 0, WeightRange: 2 (4), Subsets: 1, EndpointRange: 20 (256) MODE3
mode: 0 5
ETC1F: 5 1
ETC1D: 6 1
ETC1I0: 7 3
ETC1I1: 10 3
EBITS: 13 8
EBITS: 21 8
EBITS: 29 8
EBITS: 37 8
EBITS: 45 8
EBITS: 53 8
WEIGHTS: 61 31
Total bits: 92, endpoint bits: 48, weight bits: 31

**** Mode: 2 (CEM 8)
DualPlane: 0, WeightRange: 5 (8), Subsets: 2, EndpointRange: 8 (16) MODE1
mode: 0 5
ETC1F: 5 1
ETC1D: 6 1
ETC1I0: 7 3
ETC1I1: 10 3
PAT: 13 5
EBITS: 18 4
EBITS: 22 4
EBITS: 26 4
EBITS: 30 4
EBITS: 34 4
EBITS: 38 4
EBITS: 42 4
EBITS: 46 4
EBITS: 50 4
EBITS: 54 4
EBITS: 58 4
EBITS: 62 4
WEIGHTS: 66 46
Total bits: 112, endpoint bits: 48, weight bits: 46

**** Mode: 3 (CEM 8)
DualPlane: 0, WeightRange: 2 (4), Subsets: 3, EndpointRange: 7 (12) MODE2
mode: 0 5
ETC1F: 5 1
ETC1D: 6 1
ETC1I0: 7 3
ETC1I1: 10 3
PAT: 13 4
ETQ: 17 8
ETQ: 25 8
ETQ: 33 8
ETQ: 41 5
EBITS: 46 2
EBITS: 48 2
EBITS: 50 2
EBITS: 52 2
EBITS: 54 2
EBITS: 56 2
EBITS: 58 2
EBITS: 60 2
EBITS: 62 2
EBITS: 64 2
EBITS: 66 2
EBITS: 68 2
EBITS: 70 2
EBITS: 72 2
EBITS: 74 2
EBITS: 76 2
EBITS: 78 2
EBITS: 80 2
WEIGHTS: 82 29
Total bits: 111, endpoint bits: 65, weight bits: 29

**** Mode: 4 (CEM 8)
DualPlane: 0, WeightRange: 2 (4), Subsets: 2, EndpointRange: 12 (40) MODE3
mode: 0 5
ETC1F: 5 1
ETC1D: 6 1
ETC1I0: 7 3
ETC1I1: 10 3
PAT: 13 5
ETQ: 18 7
ETQ: 25 7
ETQ: 32 7
ETQ: 39 7
EBITS: 46 3
EBITS: 49 3
EBITS: 52 3
EBITS: 55 3
EBITS: 58 3
EBITS: 61 3
EBITS: 64 3
EBITS: 67 3
EBITS: 70 3
EBITS: 73 3
EBITS: 76 3
EBITS: 79 3
WEIGHTS: 82 30
Total bits: 112, endpoint bits: 64, weight bits: 30

**** Mode: 5 (CEM 8)
DualPlane: 0, WeightRange: 5 (8), Subsets: 1, EndpointRange: 20 (256) MODE6 RGB
mode: 0 5
ETC1F: 5 1
ETC1D: 6 1
ETC1I0: 7 3
ETC1I1: 10 3
EBITS: 13 8
EBITS: 21 8
EBITS: 29 8
EBITS: 37 8
EBITS: 45 8
EBITS: 53 8
WEIGHTS: 61 47
Total bits: 108, endpoint bits: 48, weight bits: 47

**** Mode: 6 (CEM 8)
DualPlane: 1, WeightRange: 2 (4), Subsets: 1, EndpointRange: 18 (160) MODE5 RGB
mode: 0 4
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
COMPSEL: 12 2
ETQ: 14 7
ETQ: 21 7
EBITS: 28 5
EBITS: 33 5
EBITS: 38 5
EBITS: 43 5
EBITS: 48 5
EBITS: 53 5
WEIGHTS: 58 63
Total bits: 121, endpoint bits: 44, weight bits: 63

**** Mode: 7 (CEM 8)
DualPlane: 0, WeightRange: 2 (4), Subsets: 2, EndpointRange: 12 (40) MODE2
mode: 0 4
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
PAT: 12 5
ETQ: 17 7
ETQ: 24 7
ETQ: 31 7
ETQ: 38 7
EBITS: 45 3
EBITS: 48 3
EBITS: 51 3
EBITS: 54 3
EBITS: 57 3
EBITS: 60 3
EBITS: 63 3
EBITS: 66 3
EBITS: 69 3
EBITS: 72 3
EBITS: 75 3
EBITS: 78 3
WEIGHTS: 81 30
Total bits: 111, endpoint bits: 64, weight bits: 30

**** Mode: 8 (Void-Extent)
Void-Extent: Solid Color RGBA (MODE5 or MODE6)
mode: 0 4
R: 4 8
G: 12 8
B: 20 8
A: 28 8
Total bits: 36
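
Mode 8 is the simplest layout, so it makes a convenient worked example of the "field: bit_offset num_bits" notation. The sketch below reads the solid color out of a block using a hypothetical LSB-first bit reader. The names `Rgba8`, `read_bits` and `read_uastc_solid_color` are illustrative only. The mode 8 Huffman code is 0x6 (4 bits), taken from the code table near the top of this post.

```cpp
#include <cassert>
#include <cstdint>

struct Rgba8 { uint8_t r, g, b, a; };

// Hypothetical helper: reads num_bits starting at bit_offset, LSB first,
// where bit 0 is the LSB of byte 0 of the 16-byte block.
inline uint32_t read_bits(const uint8_t* block, uint32_t bit_offset, uint32_t num_bits)
{
    uint32_t v = 0;
    for (uint32_t i = 0; i < num_bits; i++)
    {
        const uint32_t b = bit_offset + i;
        v |= ((block[b >> 3] >> (b & 7)) & 1u) << i;
    }
    return v;
}

// Returns false if the block is not a mode 8 (void-extent) block.
inline bool read_uastc_solid_color(const uint8_t* block, Rgba8* out)
{
    if (read_bits(block, 0, 4) != 0x6) // mode: 0 4
        return false;
    out->r = static_cast<uint8_t>(read_bits(block, 4, 8));  // R: 4 8
    out->g = static_cast<uint8_t>(read_bits(block, 12, 8)); // G: 12 8
    out->b = static_cast<uint8_t>(read_bits(block, 20, 8)); // B: 20 8
    out->a = static_cast<uint8_t>(read_bits(block, 28, 8)); // A: 28 8
    return true;
}
```

As a usage example, a block starting with the bytes F6 0F 00 08 04 decodes to R=255, G=0, B=128, A=64.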

**** Mode: 9 (CEM 12)
DualPlane: 0, WeightRange: 2 (4), Subsets: 2, EndpointRange: 8 (16) MODE7
mode: 0 4
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
ETC2TM: 12 8
PAT: 20 5
EBITS: 25 4
EBITS: 29 4
EBITS: 33 4
EBITS: 37 4
EBITS: 41 4
EBITS: 45 4
EBITS: 49 4
EBITS: 53 4
EBITS: 57 4
EBITS: 61 4
EBITS: 65 4
EBITS: 69 4
EBITS: 73 4
EBITS: 77 4
EBITS: 81 4
EBITS: 85 4
WEIGHTS: 89 30
Total bits: 119, endpoint bits: 64, weight bits: 30

**** Mode: 10 (CEM 12)
DualPlane: 0, WeightRange: 8 (16), Subsets: 1, EndpointRange: 13 (48) MODE6
mode: 0 4
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
ETC2TM: 12 8
ETQ: 20 8
ETQ: 28 5
EBITS: 33 4
EBITS: 37 4
EBITS: 41 4
EBITS: 45 4
EBITS: 49 4
EBITS: 53 4
EBITS: 57 4
EBITS: 61 4
WEIGHTS: 65 63
Total bits: 128, endpoint bits: 45, weight bits: 63

**** Mode: 11 (CEM 12)
DualPlane: 1, WeightRange: 2 (4), Subsets: 1, EndpointRange: 13 (48) MODE5
mode: 0 2
ETC1F: 2 1
ETC1D: 3 1
ETC1I0: 4 3
ETC1I1: 7 3
ETC2TM: 10 8
COMPSEL: 18 2
ETQ: 20 8
ETQ: 28 5
EBITS: 33 4
EBITS: 37 4
EBITS: 41 4
EBITS: 45 4
EBITS: 49 4
EBITS: 53 4
EBITS: 57 4
EBITS: 61 4
WEIGHTS: 65 63
Total bits: 128, endpoint bits: 45, weight bits: 63

**** Mode: 12 (CEM 12)
DualPlane: 0, WeightRange: 5 (8), Subsets: 1, EndpointRange: 19 (192) MODE6
mode: 0 4
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
ETC2TM: 12 8
ETQ: 20 8
ETQ: 28 5
EBITS: 33 6
EBITS: 39 6
EBITS: 45 6
EBITS: 51 6
EBITS: 57 6
EBITS: 63 6
EBITS: 69 6
EBITS: 75 6
WEIGHTS: 81 47
Total bits: 128, endpoint bits: 61, weight bits: 47

**** Mode: 13 (CEM 12)
DualPlane: 1, WeightRange: 0 (2), Subsets: 1, EndpointRange: 20 (256) MODE5
mode: 0 4
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
ETC2TM: 12 8
COMPSEL: 20 2
EBITS: 22 8
EBITS: 30 8
EBITS: 38 8
EBITS: 46 8
EBITS: 54 8
EBITS: 62 8
EBITS: 70 8
EBITS: 78 8
WEIGHTS: 86 31
Total bits: 117, endpoint bits: 64, weight bits: 31

**** Mode: 14 (CEM 12)
DualPlane: 0, WeightRange: 2 (4), Subsets: 1, EndpointRange: 20 (256) MODE6
mode: 0 4
ETC1F: 4 1
ETC1D: 5 1
ETC1I0: 6 3
ETC1I1: 9 3
ETC2TM: 12 8
EBITS: 20 8
EBITS: 28 8
EBITS: 36 8
EBITS: 44 8
EBITS: 52 8
EBITS: 60 8
EBITS: 68 8
EBITS: 76 8
WEIGHTS: 84 31
Total bits: 115, endpoint bits: 64, weight bits: 31

UASTC BISE Endpoint Ranges table:

Range    Bits Trits Quints       UASTC Modes   Quant. Levels
7        2    1                  3             12
8        4                       2 9           16
12       3          1            4 7           40
13       4    1                  10 11         48
18       5          1            6             160
19       6    1                  0 12          192
20       8                       1 5 13 14     256


UASTC/BC7 2-subset partition pattern table:


const uint32_t TOTAL_ASTC_BC7_COMMON_PARTITIONS2 = 30;

struct
{
  int m_bc7_pattern;
  int m_astc_seed;
// if true, invert the BC7 pattern's subset index to match ASTC's subset index
  bool m_invert;
} g_astc_bc7_common_partitions2[TOTAL_ASTC_BC7_COMMON_PARTITIONS2] =

{
  { 0, 28, false  }, { 1, 20, false }, { 2, 16, true }, { 3, 29, false },
  { 4, 91, true }, { 5, 9, false }, { 6, 107, true }, { 7, 72, true },
  { 8, 149, false }, { 9, 204, true }, { 10, 50, false }, { 11, 114, true },
  { 12, 496, true }, { 13, 17, true }, { 14, 78, false }, { 15, 39, true }, 
  { 17, 252, true }, { 18, 828, true }, { 19, 43, false }, { 20, 156, false }, 
  { 21, 116, false }, { 22, 210, true }, { 23, 476, true }, { 24, 273, false },
  { 25, 684, true }, { 26, 359, false }, { 29, 246, true }, { 32, 195, true },
  { 33, 694, true }, { 52, 524, true }
};


UASTC/BC7 3-subset partition pattern table:


const uint32_t TOTAL_ASTC_BC7_COMMON_PARTITIONS3 = 11;

const struct
{
  uint8_t m_bc7;
  uint16_t m_astc;

// maps ASTC to BC7 subset indices using g_astc_bc7_subset_index_perm_tables[][]
  uint8_t m_astc_to_bc7_perm;
} g_astc_bc7_common_partitions3[TOTAL_ASTC_BC7_COMMON_PARTITIONS3] =
{
  { 4, 260, 0 },  { 8, 74, 5 },  { 9, 32, 5 },  { 10, 156, 2 },
  { 11, 183, 2 },  { 12, 15, 0 },  { 13, 745, 4 },  { 20, 0, 1 },
  { 35, 335, 1 },  { 36, 902, 5 },  { 57, 254, 0 }
};


const uint8_t g_astc_bc7_subset_index_perm_tables[6][3] = 
{
{ 0, 1, 2 },{ 1, 2, 0 },{ 2, 0, 1 },{ 2, 1, 0 },{ 0, 2, 1 },{ 1, 0, 2 }
};

UASTC/BC7 2-subset partition pattern table (mapped to the BC7 3-subset patterns, used only in UASTC mode 7):


const uint32_t TOTAL_BC73_ASTC2_COMMON_PARTITIONS = 19;

const struct
{
uint8_t m_bc73;
uint16_t m_astc2;
// [0,5] - how to modify the BC7 3-subset pattern to match the ASTC pattern (LSB=invert). See convert_subset_index_3_to_2().
uint8_t k;
} g_bc73_astc2_common_partitions[TOTAL_BC73_ASTC2_COMMON_PARTITIONS] =
{
{ 10, 36, 4 },{ 11, 48, 4 },{ 0, 61, 3 },{ 2, 137, 4 },
{ 8, 161, 5 },{ 13, 183, 4 },{ 1, 226, 2 },{ 33, 281, 2 },
{ 40, 302, 3 },{ 20, 307, 4 },{ 21, 479, 0 },{ 58, 495, 3 },
{ 3, 593, 0 },{ 32, 594, 2 },{ 59, 605, 1 },{ 34, 799, 3 },
{ 20, 812, 1 },{ 14, 988, 4 },{ 31, 993, 3 }
};

uint32_t convert_subset_index_3_to_2(uint32_t p, uint32_t k)
{
    assert(k < 6);
    switch (k >> 1)
    {
    case 0:
        if (p <= 1)
            p = 0;
        else 
            p = 1;
        break;
    case 1:
        if (p == 0)
            p = 0;
        else 
            p = 1;
        break;
    case 2:
        if ((p == 0) || (p == 2))
            p = 0;
        else 
            p = 1;
        break;
    }
    if (k & 1)
        p = 1 - p;
    return p;
}


UASTC weight tables:


const uint32_t g_astc_bc7_weights1[2] = { 0, 64 };
const uint32_t g_astc_bc7_weights2[4] = { 0, 21, 43, 64 };
const uint32_t g_astc_bc7_weights3[8] = { 0, 9, 18, 27, 37, 46, 55, 64 };
const uint32_t g_bc7_weights4[16] = { 0, 4, 9, 13, 17, 21, 26, 30, 34, 38, 43, 47, 51, 55, 60, 64 };
const uint32_t g_astc_weights4[16] = { 0, 4, 8, 12, 17, 21, 25, 29, 35, 39, 43, 47, 52, 56, 60, 64 };

Note BC7 and ASTC use the same 2 and 3 bit weight tables, while the 4-bit tables are slightly different.
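
To put a number on "slightly different", the sketch below compares the two 4-bit tables. The tables are copied from above under new names, and the helper functions are invented for this example; it is not the lookup table the transcoder actually uses. Each ASTC 4-bit weight is closest to the BC7 weight at the same index, and the two never differ by more than 1.

```cpp
#include <cassert>
#include <cstdint>

// Copies of g_bc7_weights4 and g_astc_weights4 from the post.
static const uint32_t kBc7Weights4[16] = { 0, 4, 9, 13, 17, 21, 26, 30, 34, 38, 43, 47, 51, 55, 60, 64 };
static const uint32_t kAstcWeights4[16] = { 0, 4, 8, 12, 17, 21, 25, 29, 35, 39, 43, 47, 52, 56, 60, 64 };

inline uint32_t abs_diff(uint32_t a, uint32_t b) { return (a > b) ? (a - b) : (b - a); }

// Index of the BC7 4-bit weight closest to the given ASTC 4-bit weight.
inline uint32_t nearest_bc7_weight4_index(uint32_t astc_index)
{
    assert(astc_index < 16);
    const uint32_t target = kAstcWeights4[astc_index];
    uint32_t best = 0;
    for (uint32_t i = 1; i < 16; i++)
        if (abs_diff(kBc7Weights4[i], target) < abs_diff(kBc7Weights4[best], target))
            best = i;
    return best;
}

// Largest difference between same-index entries of the two tables.
inline uint32_t max_weight4_table_error()
{
    uint32_t max_err = 0;
    for (uint32_t i = 0; i < 16; i++)
    {
        const uint32_t err = abs_diff(kBc7Weights4[i], kAstcWeights4[i]);
        if (err > max_err)
            max_err = err;
    }
    return max_err;
}
```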
