riscv-gnu-toolchain/llvm.git

author	Jessica Del <50999226+OutOfCache@users.noreply.github.com>	2023-10-30 16:23:49 +0100
committer	GitHub <noreply@github.com>	2023-10-30 16:23:49 +0100
commit	849297c97d9e87584cae7c83fcca9686f784d54a (patch)
tree	be596ade81faec9c00e8841301e6db04720f84bd /llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
parent	72e6c1c70d5e07bbc8cb7cae2ed915108daf93aa (diff)

[AMDGPU][wmma] - Add tied wmma intrinsic (#69903)

These new intrinsics, `amdgcn_wmma_tied_f16_16x16x16_f16` and `amdgcn_wmma_tied_bf16_16x16x16_bf16`, explicitly tie the destination accumulator matrix to the input accumulator matrix. The `wmma_f16` and `wmma_bf16` intrinsics write only 16 bits of each 32-bit destination VGPR; the `op_sel` argument determines which half. The other half of each destination register remains unchanged. In some cases, however, we expect the destination to carry over the other halves from the input accumulator, for instance when packing two separate accumulator matrices into one. In that case, the two matrices share the same registers but occupy separate halves, so the values of the other matrix must be copied to the new destination.
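The packing case described above can be illustrated with a minimal C++ sketch of a single 32-bit register lane. This is not LLVM or hardware code: `write_half_tied` and `pack_two` are hypothetical helper names, and the mapping of `op_sel = 0` to the low half and `op_sel = 1` to the high half is an assumption made for illustration only.

```cpp
#include <cstdint>

// Hypothetical model of one 32-bit destination VGPR lane (illustration only).
// Writes a 16-bit result into the half selected by op_sel and carries the
// other half over from the input accumulator, which is the tied behaviour.
// Assumption: op_sel == false selects bits [15:0], op_sel == true bits [31:16].
uint32_t write_half_tied(uint32_t acc_in, uint16_t result, bool op_sel) {
    if (op_sel) {
        // Result goes to the high half; the low half is preserved from acc_in.
        return (static_cast<uint32_t>(result) << 16) | (acc_in & 0x0000FFFFu);
    }
    // Result goes to the low half; the high half is preserved from acc_in.
    return (acc_in & 0xFFFF0000u) | static_cast<uint32_t>(result);
}

// Packs one element from each of two separate accumulator matrices into a
// single register by performing two tied writes to different halves.
uint32_t pack_two(uint16_t lo_matrix_value, uint16_t hi_matrix_value) {
    uint32_t reg = 0;
    reg = write_half_tied(reg, lo_matrix_value, /*op_sel=*/false);
    reg = write_half_tied(reg, hi_matrix_value, /*op_sel=*/true);
    return reg;
}
```

In this sketch the second write keeps the value stored by the first only because the input register is passed through as `acc_in`. If the destination were not tied to the input accumulator, nothing would guarantee that the untouched half still holds the other matrix's value.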

Diffstat (limited to 'llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp')

0 files changed, 0 insertions, 0 deletions
