author Tom de Vries <tdevries@suse.de> 2020-09-22 16:38:07 +0200
committer Tom de Vries <tdevries@suse.de> 2020-10-05 08:53:11 +0200
commit ab3f4b27abe8abc947e84ef84bfc9a18797c5868
tree fb97e9c578fe29054124c4db546c289101687455
parent 4347d36f934ac6eeb807f73d48c70b29fc3fd8fb
[omp, ftracer] Don't duplicate blocks in SIMT region

When running the libgomp testsuite on x86_64-linux with nvptx accelerator on
the test-case included in this patch, we run into:
...
FAIL: libgomp.fortran/pr95654.f90 -O3 -fomit-frame-pointer -funroll-loops \
  -fpeel-loops -ftracer -finline-functions execution test
...

The test-case is a minimal version of this FAIL:
...
FAIL: libgomp.fortran/pr66199-5.f90 -O3 -fomit-frame-pointer -funroll-loops \
  -fpeel-loops -ftracer -finline-functions execution test
...
but that one has stopped failing at commit c2ebf4f10de "openmp: Add support
for non-rect simd and improve collapsed simd support".

The problem is that ftracer duplicates a block containing GOMP_SIMT_VOTE_ANY.

That is, before ftracer we have (dropping the GOMP_SIMT_ prefix):
...
bb4(ENTER_ALLOC)
*----------+
|           \
|            \
|             v
|             *
v             bb8
*<------------*
bb5(VOTE_ANY)
*-------------+
|             |
|             |
|             |
|             |
|             v
|             *
v             bb7(XCHG_IDX)
*<------------*
bb6(EXIT)
...

The XCHG_IDX internal-fn does inter-SIMT-lane communication, which for nvptx
maps onto shfl, an operator which has the requirement that the warp executing
the operator is convergent.  The warp diverges at bb4, and reconverges at bb5,
and does not diverge by going to bb7, so the shfl is indeed executed by a
convergent warp.

After ftracer, we have:
...
bb4(ENTER_ALLOC)
*----------+
|           \
|            \
|             \
|              \
v               v
*               *
bb5(VOTE_ANY)   bb8(VOTE_ANY)
*               *
|\             /|
| \  +--------+ |
|  \/           |
|  /\           |
| /  +----------v
|/              *
v               bb7(XCHG_IDX)
*<--------------*
bb6(EXIT)
...

The warp diverges again at bb5, but does not reconverge again before bb6, so
the shfl is executed by a divergent warp, which causes the FAIL.

Fix this by making ftracer ignore blocks containing ENTER_ALLOC, VOTE_ANY and
EXIT, effectively treating the SIMT region conservatively.

An argument can be made that the test needs to be added in a more generic
place, like gimple_can_duplicate_bb_p or some such, and that ftracer then needs
to use the generic test.  But that's a discussion with a much broader scope, so
I'm leaving that for another patch.

Bootstrapped and reg-tested on x86_64-linux.

Built on x86_64-linux with nvptx accelerator, tested with libgomp.

gcc/ChangeLog:

        PR fortran/95654
        * tracer.c (ignore_bb_p): Ignore GOMP_SIMT_ENTER_ALLOC,
        GOMP_SIMT_VOTE_ANY and GOMP_SIMT_EXIT.

libgomp/ChangeLog:

2020-10-05  Tom de Vries  <tdevries@suse.de>

        PR fortran/95654
        * testsuite/libgomp.fortran/pr95654.f90: New test.

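The convergence requirement described in the commit message can be illustrated with a small toy model. This is only a sketch and not nvptx or GCC code: the 4-lane warp, the bitmask of active lanes, and the names `is_convergent` and `toy_xchg_idx` are all assumptions made for illustration. A real warp has 32 lanes, and a shuffle that reads from an inactive lane produces an undefined value, which the model represents with a caller-supplied fallback.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model only: a 4-lane "warp" with one bit per active lane.
constexpr int kLanes = 4;
constexpr std::uint32_t kFullMask = (1u << kLanes) - 1;

// In this model a warp is convergent when every lane is active.
bool is_convergent(std::uint32_t active_mask) {
  return (active_mask & kFullMask) == kFullMask;
}

// Hypothetical stand-in for an indexed lane exchange: return the value held
// by src_lane, or `fallback` when that lane is out of range or not active
// (standing in for the undefined result on real hardware).
int toy_xchg_idx(const std::array<int, kLanes>& lane_values,
                 std::uint32_t active_mask, int src_lane, int fallback) {
  if (src_lane < 0 || src_lane >= kLanes)
    return fallback;
  if (((active_mask >> src_lane) & 1u) == 0)
    return fallback;  // source lane diverged away from this code path
  return lane_values[src_lane];
}
```

With all four lanes active (mask `0xF`), reading lane 3 returns that lane's value. With only lanes 0 and 1 active (mask `0x3`), the same read returns the fallback, which mirrors why the exchange in bb7 misbehaves once the warp reaches it in a divergent state.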
-rw-r--r-- gcc/tracer.c 18
-rw-r--r-- libgomp/testsuite/libgomp.fortran/pr95654.f90 11
2 files changed, 29 insertions, 0 deletions
diff --git a/gcc/tracer.c b/gcc/tracer.c
index 82ede72..5e51752 100644
--- a/gcc/tracer.c
+++ b/gcc/tracer.c
@@ -108,6 +108,24 @@ ignore_bb_p (const_basic_block bb)
         return true;
     }

+  for (gimple_stmt_iterator gsi = gsi_start_bb (CONST_CAST_BB (bb));
+       !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *g = gsi_stmt (gsi);
+
+      /* An IFN_GOMP_SIMT_ENTER_ALLOC/IFN_GOMP_SIMT_EXIT call must be
+         duplicated as part of its group, or not at all.
+         The IFN_GOMP_SIMT_VOTE_ANY is currently part of such a group,
+         so the same holds there, but it could be argued that the
+         IFN_GOMP_SIMT_VOTE_ANY could be generated after that group,
+         in which case it could be duplicated. */
+      if (is_gimple_call (g)
+          && (gimple_call_internal_p (g, IFN_GOMP_SIMT_ENTER_ALLOC)
+              || gimple_call_internal_p (g, IFN_GOMP_SIMT_EXIT)
+              || gimple_call_internal_p (g, IFN_GOMP_SIMT_VOTE_ANY)))
+        return true;
+    }
+
   return false;
 }
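
The shape of the new check in ignore_bb_p (scan every statement in the block and refuse duplication if any of three marker calls is present) can be sketched outside GCC as follows. This is a standalone illustration, not the tracer.c code: `ToyStmt`, `ToyBlock` and `toy_ignore_bb_p` are hypothetical names, and a block is reduced to a plain list of statement kinds.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for GIMPLE statements; these are not GCC types.
enum class ToyStmt { Assign, SimtEnterAlloc, SimtVoteAny, SimtXchgIdx, SimtExit };
using ToyBlock = std::vector<ToyStmt>;

// Return true when the block must not be duplicated: it contains one of the
// three SIMT region markers that the patch tests for.
bool toy_ignore_bb_p(const ToyBlock& bb) {
  return std::any_of(bb.begin(), bb.end(), [](ToyStmt s) {
    return s == ToyStmt::SimtEnterAlloc
           || s == ToyStmt::SimtExit
           || s == ToyStmt::SimtVoteAny;
  });
}
```

As in the patch, a block that only contains the exchange itself (`SimtXchgIdx` here) is not rejected by this test; the check targets the ENTER_ALLOC, VOTE_ANY and EXIT markers.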
diff --git a/libgomp/testsuite/libgomp.fortran/pr95654.f90 b/libgomp/testsuite/libgomp.fortran/pr95654.f90
new file mode 100644
index 0000000..2dddd3d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/pr95654.f90
@@ -0,0 +1,11 @@
+! { dg-do run }
+program main
+  implicit none
+  integer :: d1
+  !$omp target map(from: d1)
+  !$omp teams distribute parallel do simd default(none) lastprivate(d1) num_teams (2) num_threads (1)
+  do d1 = 0, 31
+  end do
+  !$omp end target
+  if (d1 /= 32) stop 3
+end program main