diff options
author | H.J. Lu <hjl.tools@gmail.com> | 2024-02-25 10:21:04 -0800 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2024-02-25 20:25:19 -0800 |
commit | 4972f97a265c574d51e20373ddefd66576051e5c (patch) | |
tree | 70c88063dc0bed75284b04fb6585ca705a1252b2 /gcc/tree-vect-loop-manip.cc | |
parent | ad178a2be7ea099d0dfc1452186035748c0828dd (diff) | |
download | gcc-4972f97a265c574d51e20373ddefd66576051e5c.zip gcc-4972f97a265c574d51e20373ddefd66576051e5c.tar.gz gcc-4972f97a265c574d51e20373ddefd66576051e5c.tar.bz2 |
x86: Properly implement AMX-TILE load/store intrinsics
ldtilecfg and sttilecfg take a 512-byte memory block. With
_tile_loadconfig implemented as
extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
__asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}
GCC sees:
(parallel [
(asm_operands/v ("ldtilecfg %X0") ("") 0
[(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
(const_int -64 [0xffffffffffffffc0])) [1 MEM[(const void * *)&tile_data]+0 S8 A128])]
[(asm_input:DI ("m"))]
(clobber (reg:CC 17 flags))])
and the memory operand size is 1 byte. As the result, the rest of 511
bytes is ignored by GCC. Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.
gcc/ChangeLog:
PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.
gcc/testsuite/ChangeLog:
PR target/114098
* gcc.target/i386/amxtile-4.c: New test.
Diffstat (limited to 'gcc/tree-vect-loop-manip.cc')
0 files changed, 0 insertions, 0 deletions