author: Tom de Vries <tdevries@suse.de> 2022-01-26 14:17:40 +0100
committer: Tom de Vries <tdevries@suse.de> 2022-02-01 19:28:52 +0100
commit: 8ff0669f6d1d6126b7c010da02fa6532abb5e1ca
tree: 18bf26420bad3519dcecb79ccff84814e9978e5b /gcc
parent: 57f971f99209cc950d7e706b7b52f4c9ef1d10b0
[nvptx] Update default ptx isa to 6.3

With the following example, minimized from parallel-dims.c:
...
int
main (void)
{
  int vectors_max = -1;
#pragma acc parallel num_gangs (1) num_workers (1) copy (vectors_max)
  {
    for (int i = 0; i < 2; i++)
      for (int j = 0; j < 2; j++)
#pragma acc loop vector reduction (max: vectors_max)
        for (int k = 0; k < 32; k++)
          vectors_max = k;
  }

  if (vectors_max != 31)
    __builtin_abort ();

  return 0;
}
...
I run into (T400, driver version 470.94):
...
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c \
  -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 \
  execution test
...

The FAIL does not happen with GOMP_NVPTX_JIT=-O0.

The problem seems to be that the shfl insns for the vector reduction are not
executed uniformly by the warp.  Enforcing this by using shfl.sync fixes the
problem.

Fix this by setting the ptx isa to 6.3 by default, which allows the use of
shfl.sync.

Tested on x86_64 with nvptx accelerator.

gcc/ChangeLog:

2022-01-27  Tom de Vries  <tdevries@suse.de>

        * config/nvptx/nvptx.opt (mptx): Set to PTX_VERSION_6_3 by default.

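As a rough illustration of why the vector reduction needs every lane of the warp to take part in each shuffle, here is a host-side C model of a butterfly max reduction over 32 lanes. It is only a sketch under stated assumptions: warp_reduce_max, WARP_SIZE and active_mask are hypothetical names, not GCC or PTX interfaces, and the rule that a non-participating lane is simply invisible to its partner is a simplification chosen for the example. It does not predict what real hardware does when shfl is executed non-uniformly.

```c
#include <assert.h>
#include <stddef.h>

#define WARP_SIZE 32

/* Hypothetical host-side model of a warp-wide max reduction built from
   xor ("butterfly") shuffles.  Bit N of ACTIVE_MASK says whether lane N
   takes part in the shuffles.  Assumption for illustration only: a lane
   whose partner does not take part keeps its own value for that step.  */
int
warp_reduce_max (const int lane_val[WARP_SIZE], unsigned int active_mask)
{
  int cur[WARP_SIZE], next[WARP_SIZE];

  assert (lane_val != NULL);

  for (int lane = 0; lane < WARP_SIZE; lane++)
    cur[lane] = lane_val[lane];

  /* log2 (WARP_SIZE) steps: offsets 16, 8, 4, 2, 1.  */
  for (int offset = WARP_SIZE / 2; offset > 0; offset /= 2)
    {
      for (int lane = 0; lane < WARP_SIZE; lane++)
        {
          int partner = lane ^ offset;
          int partner_active = (active_mask >> partner) & 1u;
          int other = partner_active ? cur[partner] : cur[lane];

          next[lane] = other > cur[lane] ? other : cur[lane];
        }

      /* All lanes advance together, as in a synchronized shuffle.  */
      for (int lane = 0; lane < WARP_SIZE; lane++)
        cur[lane] = next[lane];
    }

  /* Lane 0 holds the value this model reports as the reduction result.  */
  return cur[0];
}
```

With lane k holding the value k, as in the minimized test, a full mask of 0xffffffff makes this model return 31, which is what the test expects. With a mask of 0x0000ffff (lanes 16 to 31 not taking part) the model returns 15, so the maximum contributed by the upper half of the warp is lost.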
Diffstat (limited to 'gcc')
-rw-r--r-- gcc/config/nvptx/nvptx.opt | 2
1 files changed, 1 insertions, 1 deletions
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index 6514dd3..6e12b1f 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -89,5 +89,5 @@ EnumValue
 Enum(ptx_version) String(7.0) Value(PTX_VERSION_7_0)
 
 mptx=
-Target RejectNegative ToLower Joined Enum(ptx_version) Var(ptx_version_option) Init(PTX_VERSION_3_1)
+Target RejectNegative ToLower Joined Enum(ptx_version) Var(ptx_version_option) Init(PTX_VERSION_6_3)
 Specify the version of the ptx version to use.