author | Chung-Lin Tang <cltang@codesourcery.com> | 2023-05-19 12:14:04 -0700 |
---|---|---|
committer | Chung-Lin Tang <cltang@codesourcery.com> | 2023-05-19 12:14:04 -0700 |
commit | 5f881613fa9128edae5bbfa4e19f9752809e4bd7 (patch) | |
tree | 9677f855effa09243c00f6530cb3b5b03b70ffdd /gcc/builtins.cc | |
parent | 17c41b39078fc8ad67fd1b82f74ef5174f34452e (diff) | |
Use OpenACC code to process OpenMP target regions (branch: devel/omp/gcc-12)
This is a backport of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619003.html
This patch implements '-fopenmp-target=acc', which makes the compiler handle a
subset of OpenMP target regions internally as OpenACC parallel regions. The
subset covers the target, teams, parallel, distribute, and for/do constructs,
plus atomics. We change the internal construct kinds to their OpenACC
counterparts and let the OpenACC code paths handle them, with the necessary
adjustments throughout the middle-end and the nvptx backend. In this "OMPACC"
mode, if a target region contains something the patch does not handle, the
compiler issues a warning and reverts to normal OpenMP processing for that
target region.
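As an illustration of the kind of region this mode is aimed at, here is a minimal sketch in C. The function name `saxpy_target` and the `map` clauses are assumptions made for the example. The commit message does not say which clauses the OMPACC path accepts, and per the description above an unsupported region would produce a warning and fall back to normal processing.

```c
/* Hypothetical example, not part of the patch.  */

/* Compute y[i] += a * x[i] inside a combined OpenMP target construct.
   The construct uses only directives named in the commit message
   (target, teams, distribute, parallel, for).  */
void
saxpy_target (int n, float a, const float *x, float *y)
{
#pragma omp target teams distribute parallel for \
  map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; i++)
    y[i] += a * x[i];
}
```

With a compiler containing this patch, the file would presumably be built with `-fopenmp -fopenmp-target=acc`, since the ChangeLog states that the option is only allowed together with `-fopenmp`. Without `-fopenmp` the pragma is ignored and the loop simply runs on the host.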
gcc/ChangeLog:
* builtins.cc (expand_builtin_omp_builtins): New function.
(expand_builtin): Add expand cases for BUILT_IN_GOMP_BARRIER,
BUILT_IN_OMP_GET_THREAD_NUM, BUILT_IN_OMP_GET_NUM_THREADS,
BUILT_IN_OMP_GET_TEAM_NUM, and BUILT_IN_OMP_GET_NUM_TEAMS using
expand_builtin_omp_builtins, enabled under -fopenmp-target=acc.
* cgraphunit.cc (analyze_functions): Add call to
omp_ompacc_attribute_tagging, enabled under -fopenmp-target=acc.
* common.opt (fopenmp-target=): Add new option and enums.
* config/nvptx/mkoffload.cc (main): Handle -fopenmp-target=.
* config/nvptx/nvptx-protos.h (nvptx_expand_omp_get_num_threads): New
prototype.
(nvptx_mem_shared_p): Likewise.
* config/nvptx/nvptx.cc (omp_num_threads_sym): New global static RTX
symbol for number of threads in team.
(omp_num_threads_align): New var for alignment of omp_num_threads_sym.
(need_omp_num_threads): New bool for if any function references
omp_num_threads_sym.
(nvptx_option_override): Initialize omp_num_threads_sym/align.
(write_as_kernel): Disable normal OpenMP kernel entry under OMPACC mode.
(nvptx_declare_function_name): Disable shim function under OMPACC mode.
Disable soft-stack under OMPACC mode. Add generation of neutering init
code under OMPACC mode.
(nvptx_output_set_softstack): Return "" under OMPACC mode.
(nvptx_expand_call): Set parallelism to vector for function calls with
"ompacc for" attached.
(nvptx_expand_oacc_fork): Set mode to GOMP_DIM_VECTOR under OMPACC mode.
(nvptx_expand_oacc_join): Likewise.
(nvptx_expand_omp_get_num_threads): New function.
(nvptx_mem_shared_p): New function.
(nvptx_mach_max_workers): Return 1 under OMPACC mode.
(nvptx_mach_vector_length): Return 32 under OMPACC mode.
(nvptx_single): Add adjustments for OMPACC mode, which have
parallel-construct fork/joins, and regions of code where neutering is
dynamically determined.
(nvptx_reorg): Enable neutering under OMPACC mode when "ompacc for"
attribute is attached to function. Disable uniform-simt when under
OMPACC mode.
(nvptx_file_end): Write __nvptx_omp_num_threads out when needed.
(nvptx_goacc_fork_join): Return true under OMPACC mode.
* config/nvptx/nvptx.h (struct GTY(()) machine_function): Add
omp_parallel_predicate and omp_fn_entry_num_threads_reg fields.
* config/nvptx/nvptx.md (unspecv): Add UNSPECV_GET_TID,
UNSPECV_GET_NTID, UNSPECV_GET_CTAID, UNSPECV_GET_NCTAID,
UNSPECV_OMP_PARALLEL_FORK, UNSPECV_OMP_PARALLEL_JOIN entries.
(nvptx_shared_mem_operand): New predicate.
(gomp_barrier): New expand pattern.
(omp_get_num_threads): New expand pattern.
(omp_get_num_teams): New insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_team_num): Likewise.
(get_ntid): Likewise.
(nvptx_omp_parallel_fork): Likewise.
(nvptx_omp_parallel_join): Likewise.
* flag-types.h (omp_target_mode_kind): New flag value enum.
* gimplify.cc (struct gimplify_omp_ctx): Add 'bool ompacc' field.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE__OMPACC_.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_ctx_ompacc_p): New function.
(gimplify_omp_for): Handle combined loops under OMPACC.
* lto-wrapper.cc (append_compiler_options): Add OPT_fopenmp_target_.
* omp-builtins.def (BUILT_IN_OMP_GET_THREAD_NUM): Remove CONST.
(BUILT_IN_OMP_GET_NUM_THREADS): Likewise.
* omp-expand.cc (remove_exit_barrier): Disable addressable-var
processing for parallel construct child functions under OMPACC mode.
(expand_oacc_for): Add OMPACC mode handling.
(get_target_arguments): Force thread_limit clause value to 1 under
OMPACC mode.
(expand_omp): Under OMPACC mode, avoid child function expanding of
GIMPLE_OMP_PARALLEL.
* omp-general.cc (omp_extract_for_data): Adjustments for OMPACC mode.
* omp-low.cc (struct omp_context): Add 'bool ompacc_p' field.
(scan_sharing_clauses): Handle OMP_CLAUSE__OMPACC_.
(ompacc_ctx_p): New function.
(scan_omp_parallel): Handle OMPACC mode, avoid creating child function.
(scan_omp_target): Tag "ompacc"/"ompacc for" attributes for target
construct child function, remove OMP_CLAUSE__OMPACC_ clauses.
(lower_oacc_head_mark): Handle OMPACC mode cases.
(lower_omp_for): Adjust OMP_FOR kind from OpenMP to OpenACC kinds, add
vector/gang clauses as needed. Add other OMPACC handling.
(lower_omp_taskreg): Add call to lower_oacc_head_tail for OMPACC case.
(lower_omp_target): Do OpenACC gang privatization under OMPACC case.
(lower_omp_teams): Forward OpenACC privatization variables to outer
target region under OMPACC mode.
(lower_omp_1): Do OpenACC gang privatization under OMPACC case for
GIMPLE_BIND.
* omp-offload.cc (ompacc_supported_clauses_p): New function.
(struct target_region_data): New struct type for tree walk.
(scan_fndecl_for_ompacc): New function.
(scan_omp_target_region_r): New function.
(scan_omp_target_construct_r): New function.
(omp_ompacc_attribute_tagging): New function.
(oacc_dim_call): Add OMPACC case handling.
(execute_oacc_device_lower): Make parts explicitly only OpenACC enabled.
(pass_oacc_device_lower::gate): Enable pass under OMPACC mode.
* omp-offload.h (omp_ompacc_attribute_tagging): New prototype.
* opts.cc (finish_options): Only allow -fopenmp-target= when -fopenmp
and no -fopenacc.
* target-insns.def (gomp_barrier): New defined insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_num_threads): Likewise.
(omp_get_team_num): Likewise.
(omp_get_num_teams): Likewise.
* tree-core.h (enum omp_clause_code): Add new OMP_CLAUSE__OMPACC_ entry
for internal clause.
* tree-nested.cc (convert_nonlocal_omp_clauses): Handle
OMP_CLAUSE__OMPACC_.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE__OMPACC_.
* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE__OMPACC_ entry.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE__OMPACC__FOR): New macro for OMP_CLAUSE__OMPACC_.
* tree-ssa-loop.cc (pass_oacc_only::gate): Enable pass under OMPACC
mode cases.
libgomp/ChangeLog:
* config/nvptx/team.c (__nvptx_omp_num_threads): New global variable in
shared memory.
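The opts.cc entry above says the new option is only allowed with -fopenmp and without -fopenacc. That condition can be sketched as a small predicate. All names below (`demo_target_mode`, `demo_ompacc_mode_allowed`) are hypothetical, and the sketch only reports whether the mode would stay enabled; what the real finish_options does on a violation is not visible in this diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the option value; not the real GCC enum.  */
enum demo_target_mode
{
  DEMO_MODE_DEFAULT,
  DEMO_MODE_OMPACC
};

/* Return true when OMPACC handling may be active: the mode was
   requested, OpenMP is enabled, and OpenACC is not.  */
bool
demo_ompacc_mode_allowed (enum demo_target_mode mode,
			  bool openmp_enabled, bool openacc_enabled)
{
  return (mode == DEMO_MODE_OMPACC
	  && openmp_enabled
	  && !openacc_enabled);
}
```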
Diffstat (limited to 'gcc/builtins.cc')
-rw-r--r-- | gcc/builtins.cc | 71 |
1 files changed, 71 insertions, 0 deletions
diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index b8cd75d..f36fe15 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -6785,6 +6785,62 @@ expand_builtin_goacc_parlevel_id_size (tree exp, rtx target, int ignore)
   return target;
 }
 
+static rtx
+expand_builtin_omp_builtins (tree exp, rtx target, int ignore)
+{
+  rtx ret = NULL;
+  rtx_insn *(*gen_fn) (rtx) = NULL;
+
+  switch (DECL_FUNCTION_CODE (get_callee_fndecl (exp)))
+    {
+    case BUILT_IN_GOMP_BARRIER:
+      if (targetm.have_gomp_barrier ())
+        {
+          emit_insn (targetm.gen_gomp_barrier ());
+          return target;
+        }
+      break;
+
+    case BUILT_IN_OMP_GET_THREAD_NUM:
+      if (targetm.have_omp_get_thread_num ())
+        gen_fn = targetm.gen_omp_get_thread_num;
+      break;
+
+    case BUILT_IN_OMP_GET_NUM_THREADS:
+      if (targetm.have_omp_get_num_threads ())
+        gen_fn = targetm.gen_omp_get_num_threads;
+      break;
+
+    case BUILT_IN_OMP_GET_TEAM_NUM:
+      if (targetm.have_omp_get_team_num ())
+        gen_fn = targetm.gen_omp_get_team_num;
+      break;
+
+    case BUILT_IN_OMP_GET_NUM_TEAMS:
+      if (targetm.have_omp_get_num_teams ())
+        gen_fn = targetm.gen_omp_get_num_teams;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  if (ignore)
+    return const0_rtx;
+
+  if (gen_fn)
+    {
+      rtx reg = (MEM_P (target)
+                 ? gen_reg_rtx (GET_MODE (target))
+                 : target);
+      emit_insn (gen_fn (reg));
+      if (reg != target)
+        emit_move_insn (target, reg);
+      ret = target;
+    }
+  return ret;
+}
+
 /* Expand a string compare operation using a sequence of char comparison
    to get rid of the calling overhead, with result going to TARGET if
    that's convenient.
@@ -8113,6 +8169,21 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
     case BUILT_IN_GOACC_PARLEVEL_SIZE:
       return expand_builtin_goacc_parlevel_id_size (exp, target, ignore);
 
+    case BUILT_IN_GOMP_BARRIER:
+    case BUILT_IN_OMP_GET_THREAD_NUM:
+    case BUILT_IN_OMP_GET_NUM_THREADS:
+    case BUILT_IN_OMP_GET_TEAM_NUM:
+    case BUILT_IN_OMP_GET_NUM_TEAMS:
+      if (flag_openmp_target == OMP_TARGET_MODE_OMPACC
+          && lookup_attribute ("ompacc",
+                               DECL_ATTRIBUTES (current_function_decl)))
+        {
+          target = expand_builtin_omp_builtins (exp, target, ignore);
+          if (target)
+            return target;
+        }
+      break;
+
     case BUILT_IN_SPECULATION_SAFE_VALUE_PTR:
       return expand_speculation_safe_value (VOIDmode, exp, target, ignore);
 
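The control flow of the new expander (select a generator only if the target provides the pattern, otherwise signal the caller to fall back to the ordinary library call) can be modelled outside GCC. The following is a simplified sketch under stated assumptions: every name prefixed `demo_` is hypothetical, plain `int` values stand in for RTL, and a NULL function pointer stands in for a `targetm.have_*` check returning false.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the builtin codes; not real GCC enums.  */
enum demo_builtin
{
  DEMO_GET_THREAD_NUM,
  DEMO_GET_NUM_THREADS
};

/* A generator writes the result to *DEST and returns 1.  */
typedef int (*demo_gen_fn) (int *dest);

/* Hypothetical per-target table; a NULL entry means the target has no
   inline pattern for that builtin.  */
struct demo_hooks
{
  demo_gen_fn gen_get_thread_num;
  demo_gen_fn gen_get_num_threads;
};

/* Example generator that pretends the current thread number is 7.  */
int
demo_gen_thread_num (int *dest)
{
  *dest = 7;
  return 1;
}

/* Return 1 if CODE was expanded inline (result stored in *DEST), or 0
   to tell the caller to fall back to a normal call.  */
int
demo_expand_inline (const struct demo_hooks *hooks,
		    enum demo_builtin code, int *dest)
{
  demo_gen_fn gen_fn = NULL;

  switch (code)
    {
    case DEMO_GET_THREAD_NUM:
      gen_fn = hooks->gen_get_thread_num;
      break;
    case DEMO_GET_NUM_THREADS:
      gen_fn = hooks->gen_get_num_threads;
      break;
    }

  if (gen_fn == NULL)
    return 0;
  return gen_fn (dest);
}
```

A caller would fill `struct demo_hooks` with whichever generators its target supports; a 0 result corresponds to the `break` in `expand_builtin`, after which the builtin is emitted as a regular call.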