path: root/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
author: Jessica Del <50999226+OutOfCache@users.noreply.github.com> 2023-10-30 16:23:49 +0100
committer: GitHub <noreply@github.com> 2023-10-30 16:23:49 +0100
commit: 849297c97d9e87584cae7c83fcca9686f784d54a
tree: be596ade81faec9c00e8841301e6db04720f84bd /llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
parent: 72e6c1c70d5e07bbc8cb7cae2ed915108daf93aa
[AMDGPU][wmma] - Add tied wmma intrinsic (#69903)
These new intrinsics, `amdgcn_wmma_tied_f16_16x16x16_f16` and its `bf16` counterpart, explicitly tie the destination accumulator matrix to the input accumulator matrix.

The `wmma_f16` and `wmma_bf16` intrinsics write only 16 bits of each 32-bit destination VGPR; the `op_sel` argument determines which half. The other half of each destination register remains unchanged.

In some cases, however, the destination is expected to carry over the other halves from the input accumulator, for instance when packing two separate accumulator matrices into one. In that case, the two matrices are tied into the same registers but occupy separate halves, so the other matrix's values must be copied to the new destination.
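
The half-register behaviour described above can be illustrated with a small scalar model. This is only a sketch, not LLVM or hardware code: `write_half`, `wmma_model`, and the `stale` placeholder value are hypothetical names introduced for illustration, and the mapping of `op_sel` to the low or high half is an assumption made here for the example.

```python
MASK16 = 0xFFFF


def write_half(dst: int, value16: int, op_sel: bool) -> int:
    """Write a 16-bit value into one half of a 32-bit register.

    Assumption for this sketch: op_sel=False targets the low half and
    op_sel=True targets the high half. The other half of dst is kept.
    """
    value16 &= MASK16
    if op_sel:
        return (dst & MASK16) | (value16 << 16)
    return (dst & 0xFFFF0000) | value16


def wmma_model(acc_in, results16, op_sel, tied, stale=0xDEADBEEF):
    """Model the destination registers after a 16-bit accumulator write.

    tied=True:  the destination is the input accumulator register, so the
                untouched half still holds the other packed matrix.
    tied=False: the destination may be a different register; its untouched
                half holds whatever was there before (modelled by `stale`).
    """
    out = []
    for c, r in zip(acc_in, results16):
        dst = c if tied else stale
        out.append(write_half(dst, r, op_sel))
    return out


# One register: matrix A already sits in the high half (0x1111),
# and a new result for matrix B (0x2222) is written to the low half.
acc = [0x11110000]
new_b = [0x2222]

print([hex(v) for v in wmma_model(acc, new_b, op_sel=False, tied=True)])
print([hex(v) for v in wmma_model(acc, new_b, op_sel=False, tied=False)])
```

In the tied case the register keeps matrix A's value in its high half, while in the untied case that half holds unrelated data, which is the situation the tied intrinsics are meant to avoid.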
Diffstat (limited to 'llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp')
0 files changed, 0 insertions, 0 deletions