author | Sandra Loosemore <sloosemore@baylibre.com> | 2025-01-13 20:18:12 +0000 |
---|---|---|
committer | Sandra Loosemore <sloosemore@baylibre.com> | 2025-01-14 16:29:30 +0000 |
commit | 1294b819e1207c0ae76db29a75256f3fafd5f262 (patch) | |
tree | 1efda36c3f27c7234248f5779f20ebb9f66e1fa0 /gcc/testsuite/c-c++-common/gomp | |
parent | 210a090e33ec4b51248077b701d432d36ef43fb3 (diff) | |
OpenMP: Re-work and extend context selector resolution
This patch reimplements the middle-end support for "declare variant"
and extends the resolution mechanism to also handle metadirectives
(PR112779). It also adds partial support for dynamic selectors
(PR113904) and fixes a selector scoring bug reported as PR114596. The
rewrite is also intended to make the code easier to maintain, e.g. by
adding more comments that explain what it is doing.
In most cases, variant constructs can be resolved either in the front
end or during gimplification. If the variant with the highest score
has a static selector, only that variant is emitted. If it has a
dynamic selector, the construct is resolved into a (possibly nested)
if/then/else construct that tests the run-time predicate of each
variant in decreasing order of score until a variant with a static
selector is reached.
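The early-resolution order described above can be illustrated with a small
standalone C sketch. This is not GCC code: struct variant, resolve_variant,
demo_early_resolution and the two predicate functions are hypothetical names
used only for illustration. The sketch sorts candidates by decreasing score
and then walks them the way the generated if/then/else chain would, stopping
at the first candidate whose selector is static or whose run-time predicate
holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of one candidate variant; not a GCC data structure.  */
struct variant
{
  int id;                       /* Identifies the alternative.  */
  int score;                    /* Selector score; higher wins.  */
  bool (*dynamic_pred) (void);  /* NULL means the selector is static.  */
};

/* Stand-ins for run-time selector predicates.  */
static bool pred_false (void) { return false; }
static bool pred_true (void) { return true; }

/* qsort comparator: higher scores sort first.  */
static int
compare_score_desc (const void *a, const void *b)
{
  const struct variant *va = a;
  const struct variant *vb = b;
  return (vb->score > va->score) - (vb->score < va->score);
}

/* Return the id that the if/then/else chain would select, or BASE_ID
   if no candidate applies.  */
static int
resolve_variant (struct variant *v, size_t n, int base_id)
{
  assert (v != NULL || n == 0);
  qsort (v, n, sizeof *v, compare_score_desc);
  for (size_t i = 0; i < n; i++)
    /* A static selector ends the chain unconditionally; a dynamic one
       is taken only if its predicate is true at run time.  */
    if (v[i].dynamic_pred == NULL || v[i].dynamic_pred ())
      return v[i].id;
  return base_id;
}

/* Example: the highest-scoring candidate is dynamic and fails, the next
   one is dynamic and succeeds, so the static candidate is never reached.  */
int
demo_early_resolution (void)
{
  struct variant v[] = {
    { 1, 4, NULL },
    { 2, 27, pred_false },
    { 3, 16, pred_true },
  };
  return resolve_variant (v, sizeof v / sizeof v[0], 0);
}
```

In the compiler the chain is built at compile time and only the predicates
are evaluated at run time; the sketch folds both steps into one function for
brevity.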
In some cases, notably a variant construct in a "declare simd"
function which may or may not expand into a simd clone, it may not be
possible to score or sort the variants until later in compilation (the
ompdevlow pass). In this case the gimplifier emits a loop containing
a switch statement with the variants in arbitrary order and uses the
OMP_NEXT_VARIANT tree node as a placeholder to control which variant
is tested on each iteration of the loop. It looks something like:
    switch_var = OMP_NEXT_VARIANT (0, state);
    loop_label:
    switch (switch_var)
      {
      case 1:
        if (dynamic_selector_predicate_1)
          {
            alternative_1;
            goto end_label;
          }
        else
          {
            switch_var = OMP_NEXT_VARIANT (1, state);
            goto loop_label;
          }
      case 2:
        ...
      }
    end_label:
Note that when there are no dynamic selectors, the loop is unnecessary
and only the switch is emitted.
Finally, in the ompdevlow pass, the OMP_NEXT_VARIANT magic cookies are
resolved and replaced with constants. When compiling with -O we can
expect that the loop and switch will be discarded by subsequent
optimizations and replaced with direct jumps between the cases,
eventually arriving at code with similar control flow to the
early-resolution cases.
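The late-resolution scheme can be modelled in plain C as follows. This is a
sketch under stated assumptions, not the code GCC emits: next_case,
resolve_cookies, run_variant_construct and demo_late_resolution are
hypothetical names, and the next_case table stands in for the
OMP_NEXT_VARIANT placeholders after they have been replaced with constants.
Cases 1 and 2 are assumed to have dynamic selectors and case 3 a static one.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_CASES = 3 };

/* Stand-in for the resolved OMP_NEXT_VARIANT cookies: next_case[i] is
   the case to try after case I fails; index 0 gives the first case.  */
static int next_case[NUM_CASES + 1];

/* Model of the late pass: once scores are known, chain the cases in
   decreasing order of score.  SCORE[0] is unused.  */
static void
resolve_cookies (const int score[NUM_CASES + 1])
{
  bool used[NUM_CASES + 1] = { false };
  int prev = 0;
  for (int k = 0; k < NUM_CASES; k++)
    {
      int best = 0;
      for (int c = 1; c <= NUM_CASES; c++)
        if (!used[c] && (best == 0 || score[c] > score[best]))
          best = c;
      used[best] = true;
      next_case[prev] = best;
      prev = best;
    }
  next_case[prev] = 0;
}

/* Model of the emitted loop-plus-switch; the cases appear in arbitrary
   order and the table decides the order in which they are tested.  */
static int
run_variant_construct (bool pred1, bool pred2)
{
  int switch_var = next_case[0];
  for (;;)                      /* loop_label */
    switch (switch_var)
      {
      case 1:
        if (pred1)
          return 1;             /* alternative_1; goto end_label */
        switch_var = next_case[1];
        break;                  /* goto loop_label */
      case 2:
        if (pred2)
          return 2;
        switch_var = next_case[2];
        break;
      case 3:
        return 3;               /* static selector: always taken */
      default:
        assert (!"unreachable");
        return 0;
      }
}

/* Scores 5, 20 and 0 give the test order 2, 1, 3.  */
int
demo_late_resolution (bool pred1, bool pred2)
{
  static const int score[NUM_CASES + 1] = { 0, 5, 20, 0 };
  resolve_cookies (score);
  return run_variant_construct (pred1, pred2);
}
```

Because every entry of next_case is a constant once resolution has run, an
optimizer can in principle thread the jumps and drop the loop and switch,
which is the effect the message expects from -O.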
This approach is somewhat simpler than the one currently used for
handling declare variant in that all possible code paths are already
included in the output of the gimplifier, so it is not necessary to
maintain hidden references or data structures pointing to expansions of
not-yet-resolved variant constructs and special logic for passing them
through LTO (see PR lto/96680).
A possible disadvantage of this expansion strategy is that dead code
for unused variants in the switch can remain when compiling without
-O. If this turns out to be a critical problem (e.g., an unused case
includes calls to functions not available to the linker) perhaps some
further processing could be performed by default after ompdevlow to
simplify such constructs.
In order to make this patch more readable for review purposes, it
leaves the existing code for "declare variant" resolution (including
the above-mentioned LTO hack) in place, in some cases just ifdef-ing
out functions that won't compile due to changed interfaces for
dependencies. The next patch in the series will delete all the
now-unused code.
gcc/ChangeLog
PR middle-end/114596
PR middle-end/112779
PR middle-end/113904
* Makefile.in (GTFILES): Move omp-general.h earlier; required
because of moving score_wide_int declaration to that file.
* cgraph.h (struct cgraph_node): Add has_omp_variant_constructs flag.
* cgraphclones.cc (cgraph_node::create_clone): Propagate
has_omp_variant_constructs flag.
* gimplify.cc (omp_resolved_variant_calls): New.
(expand_late_variant_directive): New.
(find_supercontext): New.
(gimplify_variant_call_expr): New.
(gimplify_call_expr): Adjust parameters to make fallback available.
Update processing for "declare variant" substitution.
(is_gimple_stmt): Add OMP_METADIRECTIVE.
(omp_construct_selector_matches): Ifdef out unused function.
(omp_get_construct_context): New.
(gimplify_omp_dispatch): Replace call to deleted function
omp_resolve_declare_variant with equivalent logic.
(expand_omp_metadirective): New.
(gimplify_omp_metadirective): New.
(gimplify_expr): Adjust arguments to gimplify_call_expr. Add
cases for OMP_METADIRECTIVE, OMP_NEXT_VARIANT, and
OMP_TARGET_DEVICE_MATCHES.
(gimplify_function_tree): Initialize/clean up
omp_resolved_variant_calls.
* gimplify.h (omp_construct_selector_matches): Delete declaration.
(omp_get_construct_context): Declare.
* lto-cgraph.cc (lto_output_node): Write has_omp_variant_constructs.
(input_overwrite_node): Read has_omp_variant_constructs.
* omp-builtins.def (BUILT_IN_OMP_GET_NUM_DEVICES): New.
* omp-expand.cc (expand_omp_taskreg): Propagate
has_omp_variant_constructs.
(expand_omp_target): Likewise.
* omp-general.cc (omp_maybe_offloaded): Add construct_context
parameter; use it instead of querying gimplifier state. Add
comments.
(omp_context_name_list_prop): Do not test lang_GNU_Fortran in
offload compiler, just use the string as-is.
(expr_uses_parm_decl): New.
(omp_check_context_selector): Add metadirective_p parameter.
Remove sorry for target_device selector. Add additional checks
specific to metadirective or declare variant.
(make_omp_metadirective_variant): New.
(omp_construct_traits_match): New.
(omp_context_selector_matches): Temporarily ifdef out the previous
code, and add a new implementation based on the old one with
different parameters, some unnecessary loops removed, and code
re-indented.
(omp_target_device_matches_on_host): New.
(resolve_omp_target_device_matches): New.
(omp_construct_simd_compare): Support matching of "simdlen" and
"aligned" clauses.
(omp_context_selector_set_compare): Make static. Adjust call to
omp_construct_simd_compare.
(score_wide_int): Move declaration to omp-general.h.
(omp_selector_is_dynamic): New.
(omp_device_num_check): New.
(omp_dynamic_cond): New.
(omp_context_compute_score): Ifdef out the old version and
re-implement with different parameters.
(omp_complete_construct_context): New.
(omp_resolve_late_declare_variant): Ifdef out.
(omp_declare_variant_remove_hook): Likewise.
(omp_resolve_declare_variant): Likewise.
(sort_variant): New.
(omp_get_dynamic_candidates): New.
(omp_declare_variant_candidates): New.
(omp_metadirective_candidates): New.
(omp_early_resolve_metadirective): New.
(omp_resolve_variant_construct): New.
* omp-general.h (score_wide_int): Moved here from omp-general.cc.
(struct omp_variant): New.
(make_omp_metadirective_variant): Declare.
(omp_construct_traits_to_codes): Delete declaration.
(omp_check_context_selector): Adjust parameters.
(omp_context_selector_matches): Likewise.
(omp_context_selector_set_compare): Delete declaration.
(omp_resolve_declare_variant): Likewise.
(omp_declare_variant_candidates): Declare.
(omp_metadirective_candidates): Declare.
(omp_get_dynamic_candidates): Declare.
(omp_early_resolve_metadirective): Declare.
(omp_resolve_variant_construct): Declare.
(omp_dynamic_cond): Declare.
* omp-offload.cc (resolve_omp_variant_cookies): New.
(execute_omp_device_lower): Call the above function to resolve
variant directives. Remove call to omp_resolve_declare_variant.
(pass_omp_device_lower::gate): Check has_omp_variant_constructs bit.
* omp-simd-clone.cc (simd_clone_create): Propagate
has_omp_variant_constructs bit.
* tree-inline.cc (expand_call_inline): Likewise.
(tree_function_versioning): Likewise.
gcc/c/ChangeLog
PR middle-end/114596
PR middle-end/112779
PR middle-end/113904
* c-parser.cc (c_finish_omp_declare_variant): Update for changes
to omp-general.h interfaces.
gcc/cp/ChangeLog
PR middle-end/114596
PR middle-end/112779
PR middle-end/113904
* decl.cc (omp_declare_variant_finalize_one): Update for changes
to omp-general.h interfaces.
* parser.cc (cp_finish_omp_declare_variant): Likewise.
gcc/fortran/ChangeLog
PR middle-end/114596
PR middle-end/112779
PR middle-end/113904
* trans-openmp.cc (gfc_trans_omp_declare_variant): Update for changes
to omp-general.h interfaces.
gcc/testsuite/ChangeLog
PR middle-end/114596
PR middle-end/112779
PR middle-end/113904
* c-c++-common/gomp/declare-variant-12.c: Adjust expected behavior
per PR114596.
* c-c++-common/gomp/declare-variant-13.c: Test that this is resolvable
after gimplification, not just final resolution.
* c-c++-common/gomp/declare-variant-14.c: Tweak testcase to ensure
that -O causes dead code to be optimized away.
* gfortran.dg/gomp/declare-variant-12.f90: Adjust expected behavior
per PR114596.
* gfortran.dg/gomp/declare-variant-13.f90: Test that this is resolvable
after gimplification, not just final resolution.
* gfortran.dg/gomp/declare-variant-14.f90: Tweak testcase to ensure
that -O causes dead code to be optimized away.
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Sandra Loosemore <sandra@codesourcery.com>
Co-Authored-By: Marcel Vollweiler <marcel@codesourcery.com>
Diffstat (limited to 'gcc/testsuite/c-c++-common/gomp')
-rw-r--r-- | gcc/testsuite/c-c++-common/gomp/declare-variant-12.c | 14 |
-rw-r--r-- | gcc/testsuite/c-c++-common/gomp/declare-variant-13.c | 4 |
-rw-r--r-- | gcc/testsuite/c-c++-common/gomp/declare-variant-14.c | 2 |
3 files changed, 11 insertions, 9 deletions
diff --git a/gcc/testsuite/c-c++-common/gomp/declare-variant-12.c b/gcc/testsuite/c-c++-common/gomp/declare-variant-12.c
index 3515d9a..f915077 100644
--- a/gcc/testsuite/c-c++-common/gomp/declare-variant-12.c
+++ b/gcc/testsuite/c-c++-common/gomp/declare-variant-12.c
@@ -29,29 +29,29 @@ void f13 (void);
 void f14 (void);
 void f15 (void);
 void f16 (void);
-#pragma omp declare variant (f14) match (construct={teams,parallel,for}) /* 16+8+4 */
-#pragma omp declare variant (f15) match (construct={parallel},user={condition(score(19):1)}) /* 8+19 */
-#pragma omp declare variant (f16) match (implementation={atomic_default_mem_order(score(27):seq_cst)})
+#pragma omp declare variant (f14) match (construct={teams,parallel,for}) /* 1+8+16 */
+#pragma omp declare variant (f15) match (construct={parallel},user={condition(score(16):1)}) /* 8+16 */
+#pragma omp declare variant (f16) match (implementation={atomic_default_mem_order(score(24):seq_cst)})
 void f17 (void);
 void f18 (void);
 void f19 (void);
 void f20 (void);
-#pragma omp declare variant (f18) match (construct={teams,parallel,for}) /* 16+8+4 */
+#pragma omp declare variant (f18) match (construct={teams,parallel,for}) /* 1+8+6 */
 #pragma omp declare variant (f19) match (construct={for},user={condition(score(25):1)}) /* 4+25 */
 #pragma omp declare variant (f20) match (implementation={atomic_default_mem_order(score(28):seq_cst)})
 void f21 (void);
 void f22 (void);
 void f23 (void);
 void f24 (void);
-#pragma omp declare variant (f22) match (construct={parallel,for}) /* 2+1 */
+#pragma omp declare variant (f22) match (construct={parallel,for}) /* 8+16 */
 #pragma omp declare variant (f23) match (construct={for}) /* 0 */
 #pragma omp declare variant (f24) match (implementation={atomic_default_mem_order(score(2):seq_cst)})
 void f25 (void);
 void f26 (void);
 void f27 (void);
 void f28 (void);
-#pragma omp declare variant (f26) match (construct={parallel,for}) /* 2+1 */
-#pragma omp declare variant (f27) match (construct={for},user={condition(1)}) /* 4 */
+#pragma omp declare variant (f26) match (construct={parallel,for}) /* 8+16 */
+#pragma omp declare variant (f27) match (construct={for},user={condition(score(25):1)}) /* 16 + 25 */
 #pragma omp declare variant (f28) match (implementation={atomic_default_mem_order(score(3):seq_cst)})
 void f29 (void);
diff --git a/gcc/testsuite/c-c++-common/gomp/declare-variant-13.c b/gcc/testsuite/c-c++-common/gomp/declare-variant-13.c
index 68e6a89..83c3d85 100644
--- a/gcc/testsuite/c-c++-common/gomp/declare-variant-13.c
+++ b/gcc/testsuite/c-c++-common/gomp/declare-variant-13.c
@@ -20,5 +20,7 @@ test1 (int x)
 isa has score 2^2 or 2^3. We can't decide on whether avx512f will match or
 not, that also depends on whether it is a declare simd clone or not and which
 one, but the f03 variant has a higher score anyway. */
- return f05 (x); /* { dg-final { scan-tree-dump-times "f03 \\\(x" 1 "gimple" } } */
+ return f05 (x);
+ /* { dg-final { scan-tree-dump "f03 \\\(x" "gimple" } } */
+ /* { dg-final { scan-tree-dump-not "f05 \\\(x" "gimple" } } */
 }
diff --git a/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c b/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c
index 8a6bf09..8213b1a 100644
--- a/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c
+++ b/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { { i?86-*-* x86_64-*-* } && vect_simd_clones } } } */
-/* { dg-additional-options "-mno-sse3 -fdump-tree-gimple -fdump-tree-optimized" } */
+/* { dg-additional-options "-O -mno-sse3 -fdump-tree-gimple -fdump-tree-optimized" } */
 int f01 (int);
 int f02 (int);