author | Cesar Philippidis <cesar@codesourcery.com> | 2018-07-26 04:42:29 -0700 |
---|---|---|
committer | Tom de Vries <vries@gcc.gnu.org> | 2018-07-26 11:42:29 +0000 |
commit | 88a4654d03d0d05047aa168e45967ed2d94cb9ce (patch) | |
tree | 228f30ad516a6c3e25c474b2b38e70afb930256c /libgomp/plugin | |
parent | 0c6c2f5fc239121f70334a587e371aab2c7a60a4 (diff) | |
[libgomp, nvptx] Add error with recompilation hint for launch failure
Currently, when a kernel is launched with too many workers, it results in a CUDA
launch failure. This is triggered, for instance, by parallel-loop-1.c at -O0 on a
Quadro M1200.
This patch detects this situation, and errors out with a hint on how to fix it.
Build and reg-tested on x86_64 with nvptx accelerator.
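For illustration, the first form of the hint (setting num_workers on the offloaded region) might look like the sketch below. The function name `sum_array` and the worker count 8 are hypothetical; in practice the count would be the one printed in the error message. Without -fopenacc the pragma is ignored and the loop runs on the host.

```c
/* Hypothetical example: sum N floats in an OpenACC parallel loop whose
   worker count is capped explicitly.  The value 8 is an assumption; use
   the count suggested by the libgomp error message.  */
float
sum_array (const float *a, int n)
{
  float total = 0.0f;

  /* num_workers limits the workers per gang for this region only.  */
#pragma acc parallel loop num_workers(8) reduction(+:total) copyin(a[0:n])
  for (int i = 0; i < n; i++)
    total += a[i];

  return total;
}
```

The second form of the hint, '-fopenacc-dim=:8', would instead apply the same worker count at compile time without touching the source.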
2018-07-26 Cesar Philippidis <cesar@codesourcery.com>
Tom de Vries <tdevries@suse.de>
* plugin/plugin-nvptx.c (nvptx_exec): Error if the hardware doesn't have
sufficient resources to launch a kernel, and give a hint on how to fix
it.
Co-Authored-By: Tom de Vries <tdevries@suse.de>
From-SVN: r262997
Diffstat (limited to 'libgomp/plugin')
-rw-r--r-- | libgomp/plugin/plugin-nvptx.c | 15 |
1 files changed, 15 insertions, 0 deletions
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 5d9b515..3a4077a 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1204,6 +1204,21 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
         dims[i] = default_dims[i];
     }
 
+  /* Check if the accelerator has sufficient hardware resources to
+     launch the offloaded kernel.  */
+  if (dims[GOMP_DIM_WORKER] * dims[GOMP_DIM_VECTOR]
+      > targ_fn->max_threads_per_block)
+    {
+      int suggest_workers
+        = targ_fn->max_threads_per_block / dims[GOMP_DIM_VECTOR];
+      GOMP_PLUGIN_fatal ("The Nvidia accelerator has insufficient resources to"
+                         " launch '%s' with num_workers = %d; recompile the"
+                         " program with 'num_workers = %d' on that offloaded"
+                         " region or '-fopenacc-dim=:%d'",
+                         targ_fn->launch->fn, dims[GOMP_DIM_WORKER],
+                         suggest_workers, suggest_workers);
+    }
+
   /* This reserves a chunk of a pre-allocated page of memory mapped on both
      the host and the device. HP is a host pointer to the new chunk, and DP is
      the corresponding device pointer.  */
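
The arithmetic behind the new check can be tried in isolation. The sketch below is not libgomp code: `suggest_workers` is a hypothetical free function, and its plain integer parameters stand in for dims[GOMP_DIM_WORKER], dims[GOMP_DIM_VECTOR] and targ_fn->max_threads_per_block from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the check added to nvptx_exec.
   Returns 0 when workers * vector fits in one thread block; otherwise
   returns the largest worker count that fits, i.e. the value the patch
   prints in its recompilation hint.  */
int
suggest_workers (int workers, int vector, int max_threads_per_block)
{
  /* The vector length is used as a divisor below.  */
  assert (vector > 0);

  if (workers * vector <= max_threads_per_block)
    return 0;

  /* Integer division rounds down, so the suggestion never exceeds
     the per-block thread limit.  */
  return max_threads_per_block / vector;
}
```

For example, with a vector length of 32 and a kernel limited to 512 threads per block, a request for 32 workers would not fit and the suggested count would be 16; the actual limit depends on the kernel and the device.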