author | Tom de Vries <tdevries@suse.de> | 2022-01-27 15:03:59 +0100 |
---|---|---|
committer | Tom de Vries <tdevries@suse.de> | 2022-02-01 19:28:57 +0100 |
commit | bba61d403d05202deb698b352a4faef3feb1f04d (patch) | |
tree | 9cd5df78b7fdb39765d24988310979b11fc54d23 /gcc/tree.h | |
parent | 8ff0669f6d1d6126b7c010da02fa6532abb5e1ca (diff) | |
[nvptx] Add bar.warp.sync
On a GT 1030 (sm_61) with driver version 470.94, I run into:
...
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c \
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none \
-O2 execution test
...
which minimizes to the same test-case as listed in commit "[nvptx] Update
default ptx isa to 6.3".
The first divergent branch looks like:
...
{
.reg .u32 %x;
mov.u32 %x,%tid.x;
setp.ne.u32 %r59,%x,0;
}
@ %r59 bra $L15;
mov.u64 %r48,%ar0;
mov.u32 %r22,2;
ld.u64 %r53,[%r48];
mov.u32 %r55,%r22;
mov.u32 %r54,1;
$L15:
...
and when inspecting the generated SASS, the branch is not set up as a divergent
branch, but instead as a regular branch.
This causes us to execute a shfl.sync insn in divergent mode, which is likely
to cause trouble given a remark in the ptx isa version 6.3, which mentions
that for .target sm_6x or below, all threads must execute the same
shfl.sync instruction in convergence.
Fix this by placing a "bar.warp.sync 0xffffffff" at the desired convergence
point (in the example above, after $L15).
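As a rough sketch of the resulting code shape (not the literal compiler
output), the fragment above with the barrier added would look something like
the following. The instructions, register names and the 0xffffffff mask are
taken from the description above; the comments are annotations added here for
illustration only.

```
{
.reg .u32 %x;
mov.u32 %x,%tid.x;
setp.ne.u32 %r59,%x,0;
}
// Lanes with %tid.x != 0 branch over the block below, so the warp may
// diverge here.
@ %r59 bra $L15;
mov.u64 %r48,%ar0;
mov.u32 %r22,2;
ld.u64 %r53,[%r48];
mov.u32 %r55,%r22;
mov.u32 %r54,1;
$L15:
// Desired convergence point: wait for all 32 lanes named in the mask
// before any following shfl.sync insn is executed.
bar.warp.sync 0xffffffff;
```

The full mask assumes that all 32 lanes of the warp are expected to reach the
convergence point.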
Tested on x86_64 with nvptx accelerator.
gcc/ChangeLog:
2022-01-31 Tom de Vries <tdevries@suse.de>
* config/nvptx/nvptx.cc (nvptx_single): Use nvptx_warpsync.
* config/nvptx/nvptx.md (define_c_enum "unspecv"): Add
UNSPECV_WARPSYNC.
(define_insn "nvptx_warpsync"): New define_insn.