
gpu: suboptimal preference of formats - emulated formats can have better performance #263

Closed
ruihe774 opened this issue May 18, 2024 · 0 comments

ruihe774 (Contributor) commented May 18, 2024

In cmp_fmt(), non-emulated formats with more caps are always preferred. However, some GPUs, e.g. my Intel Arc A750 (and perhaps other Intel GPUs), have better performance with rgba16f, which is an emulated format, than with rgba32f, which is non-emulated. This is confirmed by my testing.

Content of my gpu->formats:

NAME                 TYPE   SIZE COMP CAPS         EMU DEPTH         HOST_BITS     GLSL_TYPE  GLSL_FMT   FOURCC
r8                   UNORM  1    R    SsLRbBV--HWG n   {8  0  0  0 } {8  0  0  0 } float      r8         R8    
rg8                  UNORM  2    RG   SsLRbBV--HWG n   {8  8  0  0 } {8  8  0  0 } vec2       rg8        GR88  
rgba8                UNORM  4    RGBA SsLRbBV--HWG n   {8  8  8  8 } {8  8  8  8 } vec4       rgba8      AB24  
bgra8                UNORM  4    BGRA SsLRbBV--HWG n   {8  8  8  8 } {8  8  8  8 } vec4       rgba8      AR24  
r16                  UNORM  2    R    SsLRbBV--HWG n   {16 0  0  0 } {16 0  0  0 } float      r16        R16   
rg16                 UNORM  4    RG   SsLRbBV--HWG n   {16 16 0  0 } {16 16 0  0 } vec2       rg16       GR32  
rgba16               UNORM  8    RGBA SsLRbBV--HWG n   {16 16 16 16} {16 16 16 16} vec4       rgba16           
r32f                 FLOAT  4    R    SsLRbBV--HWG n   {32 0  0  0 } {32 0  0  0 } float      r32f             
rg32f                FLOAT  8    RG   SsLRbBV--HWG n   {32 32 0  0 } {32 32 0  0 } vec2       rg32f            
rgba32f              FLOAT  16   RGBA SsLRbBV--HWG n   {32 32 32 32} {32 32 32 32} vec4       rgba32f          
r8u                  UINT   1    R    Ss-R-BV--HWG n   {8  0  0  0 } {8  0  0  0 } uint       r8ui             
rg8u                 UINT   2    RG   Ss-R-BV--HWG n   {8  8  0  0 } {8  8  0  0 } uvec2      rg8ui            
rgba8u               UINT   4    RGBA Ss-R-BV--HWG n   {8  8  8  8 } {8  8  8  8 } uvec4      rgba8ui          
r16u                 UINT   2    R    Ss-R-BV--HWG n   {16 0  0  0 } {16 0  0  0 } uint       r16ui            
rg16u                UINT   4    RG   Ss-R-BV--HWG n   {16 16 0  0 } {16 16 0  0 } uvec2      rg16ui           
rgba16u              UINT   8    RGBA Ss-R-BV--HWG n   {16 16 16 16} {16 16 16 16} uvec4      rgba16ui         
r16f                 FLOAT  4    R    SsLRbB---HWG y   {16 0  0  0 } {32 0  0  0 } float      r16f             
rg16f                FLOAT  8    RG   SsLRbB---HWG y   {16 16 0  0 } {32 32 0  0 } vec2       rg16f            
rgba16f              FLOAT  16   RGBA SsLRbB---HWG y   {16 16 16 16} {32 32 32 32} vec4       rgba16f          
rgb8                 UNORM  3    RGB  S-LRbBV--H-G y   {8  8  8  0 } {8  8  8  0 } vec3                  BG24  
rgb16                UNORM  6    RGB  S-LRbBV--H-G y   {16 16 16 0 } {16 16 16 0 } vec3                        
rgb32f               FLOAT  12   RGB  S-LRbBV--H-G y   {32 32 32 0 } {32 32 32 0 } vec3                        
rgb16f               FLOAT  12   RGB  S-LRbB---H-G y   {16 16 16 0 } {32 32 32 0 } vec3                        
rgb8u                UINT   3    RGB  S-----V--H-G y   {8  8  8  0 } {8  8  8  0 } uvec3                       
rgb16u               UINT   6    RGB  S-----V--H-G y   {16 16 16 0 } {16 16 16 0 } uvec3                       

It is not surprising that rgba16f performs better in practice even though it is emulated: the GPU can use internal SIMD on 16-bit floats.

@ruihe774 ruihe774 changed the title gpu: suboptimal selection of format in pl_find_fmt() gpu: suboptimal preference of formats - emulated formats can have better performance May 18, 2024
@ruihe774 ruihe774 closed this as completed Jun 1, 2024