Age | Commit message (Collapse) | Author | Files | Lines |
|
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_vectorize_builtin_vectorized_function): Remove
unused in_mode/in_n variables.
(cherry picked from commit 124bc77ba9868db07d284b4dea31797da1468515)
|
|
While OpenMP 5.0 required a single structured block before and after the
'omp scan' directive, OpenMP 5.1 changed this to a 'structured block sequence,
denoting 2 or more executable statements in OpenMP 5.1 (whoops!) and zero or
more in OpenMP 5.2. This commit updates C/C++ to accept zero statements (but
till requires the '{' ... '}' for the final-loop-body) and updates Fortran
to accept zero or more than one statements.
If there is no preceeding or succeeding executable statement, a warning is
shown.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_scan_loop_body): Handle
zero exec statements before/after 'omp scan'.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_scan_loop_body): Handle
zero exec statements before/after 'omp scan'.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_resolve_omp_do_blocks): Handle zero
or more than one exec statements before/after 'omp scan'.
* trans-openmp.cc (gfc_trans_omp_do): Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/scan-1.c: New test.
* testsuite/libgomp.c/scan-23.c: New test.
* testsuite/libgomp.fortran/scan-2.f90: New test.
gcc/testsuite/ChangeLog:
* g++.dg/gomp/attrs-7.C: Update dg-error/dg-warning.
* gfortran.dg/gomp/loop-2.f90: Likewise.
* gfortran.dg/gomp/reduction5.f90: Likewise.
* gfortran.dg/gomp/reduction6.f90: Likewise.
* gfortran.dg/gomp/scan-1.f90: Likewise.
* gfortran.dg/gomp/taskloop-2.f90: Likewise.
* c-c++-common/gomp/scan-6.c: New test.
* gfortran.dg/gomp/scan-8.f90: New test.
(cherry picked from commit 1c101fcfaac8f609d618f83b124bd50aea012132)
|
|
Merge up to r13-7388-g8f1e18c75e032c2e84c4a4bc51b734c36b11c7fd (30th May 2023)
|
|
|
|
|
|
This patch is to enhance optimization for auto-vectorization.
Before this patch:
Loop:
vsetvl a5,a2...
vsetvl zero,a5...
vle
After this patch:
Loop:
vsetvl a5,a2
vle
gcc/ChangeLog:
PR target/109547
* config/riscv/riscv-vsetvl.cc (local_eliminate_vsetvl_insn): New function.
(vector_insn_info::skip_avl_compatible_p): Ditto.
(vector_insn_info::merge): Remove default value.
(pass_vsetvl::compute_local_backward_infos): Ditto.
(pass_vsetvl::cleanup_insns): Add local vsetvl elimination.
* config/riscv/riscv-vsetvl.h: Ditto.
gcc/testsuite/ChangeLog:
PR target/109547
* gcc.target/riscv/rvv/vsetvl/pr109547.c: New.
* gcc.target/riscv/rvv/vsetvl/vsetvl-17.c: Update scan
condition.
(cherry picked from commit d51f2456ee51bd59a79b4725ca0e488c25260bbf)
|
|
Seems like I screw up bare-metal toolchian multi lib selection during
finxing linux multi-lib selction...
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_compute_multilib):
Fix wrong select_kind...
(cherry picked from commit 008cbecf622a413ebcc8b41a737f30fd7e2a1abf)
|
|
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_select_multilib_by_abi):
Drop unused parameter.
(riscv_select_multilib): Ditto.
(riscv_compute_multilib): Update call site of
riscv_select_multilib_by_abi and riscv_select_multilib_by_abi.
(cherry picked from commit 7a7f6b26259d22115ee4813ce130622ad1073d16)
|
|
RISC-V Linux encodes the ABI into the path, so in theory, we can only use that
to select multi-lib paths, and no way to use different multi-lib paths between
`rv32i/ilp32` and `rv32ima/ilp32`, we'll mapping both to `/lib/ilp32`.
It's hard to do that with GCC's builtin multi-lib selection mechanism; builtin
mechanism did the option string compare and then enumerate all possible reuse
rules during the build time. However, it's impossible to RISC-V; we have a huge
number of combinations of `-march`, so implementing a customized multi-lib
selection becomes the only solution.
Multi-lib configuration is only used for determines which ISA should be used
when compiling the corresponding ABI variant after this patch.
During the multi-lib selection stage, only consider -mabi as the only key to
select the multi-lib path.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_select_multilib_by_abi): New.
(riscv_select_multilib): New.
(riscv_compute_multilib): Extract logic to riscv_select_multilib and
also handle select_by_abi.
* config/riscv/elf.h (RISCV_USE_CUSTOMISED_MULTI_LIB): Change it
to select_by_abi_arch_cmodel from 1.
* config/riscv/linux.h (RISCV_USE_CUSTOMISED_MULTI_LIB): Define.
* config/riscv/riscv-opts.h (enum riscv_multilib_select_kind): New.
(cherry picked from commit 17d683d4d3dc95d13096ec52ad9937a2b011f9a4)
|
|
Fix bug of vector.md which generate incorrect information to
VSETVL PASS when testing FMA auto vectorization ternop-3.c.
Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog:
* config/riscv/vector.md: Fix vimuladd instruction bug.
(cherry picked from commit f8af48d8755018272cdb0cf2f250cf278829d7be)
|
|
|
|
|
|
This OG12 patch was accidentally ported to OG13. The proper support
is in GCC 13 but was not backported to OG12, hence, the fully GCC 13
version can be used.
REVERTED:
gcc/testsuite/
* gfortran.dg/gomp/openmp-simd-8.f90: Remove dump test.
|
|
|
|
avr-common.cc introduces the following options that are set depending
on optimization level: -mgas-isr-prologues, -mmain-is-OS-task and
-fsplit-wide-types-early. The inliner thinks that different options
disallow cross-optimization inlining, so provide can_inline_p.
gcc/
PR target/104327
* config/avr/avr.cc (avr_can_inline_p): New static function.
(TARGET_CAN_INLINE_P): Define to that function.
|
|
There is already a pattern in avr.md that matches single-bit transfers
from one register to another one, but it only handled bit 0 of 8-bit
registers. This change makes that pattern more generic so it matches
more of similar single-bit transfers.
gcc/
PR target/82931
* config/avr/avr.md (*movbitqi.0): Rename to *movbit<mode>.0-6.
Handle any bit position and use mode QISI.
* config/avr/avr.cc (avr_rtx_costs_1) [IOR]: Return a cost
of 2 insns for bit-transfer of respective style.
gcc/testsuite/
PR target/82931
* gcc.target/avr/pr82931.c: New test.
|
|
|
|
(forward ported from devel/omp/gcc-12)
This is a backport of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619003.html
This patch implements '-fopenmp-target=acc', which enables internally handling
a subset of OpenMP target regions as OpenACC parallel regions. This basically
includes target, teams, parallel, distribute, for/do constructs, and atomics.
Essentially, we adjust the internal kinds to OpenACC type, and let OpenACC code
paths handle them, with various needed adjustments throughout middle-end and
nvptx backend. When using this "OMPACC" mode, if there are cases the patch
doesn't handle, it issues a warning, and reverts to normal processing for that
target region.
gcc/ChangeLog:
* builtins.cc (expand_builtin_omp_builtins): New function.
(expand_builtin): Add expand cases for BUILT_IN_GOMP_BARRIER,
BUILT_IN_OMP_GET_THREAD_NUM, BUILT_IN_OMP_GET_NUM_THREADS,
BUILT_IN_OMP_GET_TEAM_NUM, and BUILT_IN_OMP_GET_NUM_TEAMS using
expand_builtin_omp_builtins, enabled under -fopenmp-target=acc.
* cgraphunit.cc (analyze_functions): Add call to
omp_ompacc_attribute_tagging, enabled under -fopenmp-target=acc.
* common.opt (fopenmp-target=): Add new option and enums.
* config/nvptx/mkoffload.cc (main): Handle -fopenmp-target=.
* config/nvptx/nvptx-protos.h (nvptx_expand_omp_get_num_threads): New
prototype.
(nvptx_mem_shared_p): Likewise.
* config/nvptx/nvptx.cc (omp_num_threads_sym): New global static RTX
symbol for number of threads in team.
(omp_num_threads_align): New var for alignment of omp_num_threads_sym.
(need_omp_num_threads): New bool for if any function references
omp_num_threads_sym.
(nvptx_option_override): Initialize omp_num_threads_sym/align.
(write_as_kernel): Disable normal OpenMP kernel entry under OMPACC mode.
(nvptx_declare_function_name): Disable shim function under OMPACC mode.
Disable soft-stack under OMPACC mode. Add generation of neutering init
code under OMPACC mode.
(nvptx_output_set_softstack): Return "" under OMPACC mode.
(nvptx_expand_call): Set parallelism to vector for function calls with
"ompacc for" attached.
(nvptx_expand_oacc_fork): Set mode to GOMP_DIM_VECTOR under OMPACC mode.
(nvptx_expand_oacc_join): Likewise.
(nvptx_expand_omp_get_num_threads): New function.
(nvptx_mem_shared_p): New function.
(nvptx_mach_max_workers): Return 1 under OMPACC mode.
(nvptx_mach_vector_length): Return 32 under OMPACC mode.
(nvptx_single): Add adjustments for OMPACC mode, which have
parallel-construct fork/joins, and regions of code where neutering is
dynamically determined.
(nvptx_reorg): Enable neutering under OMPACC mode when "ompacc for"
attribute is attached to function. Disable uniform-simt when under
OMPACC mode.
(nvptx_file_end): Write __nvptx_omp_num_threads out when needed.
(nvptx_goacc_fork_join): Return true under OMPACC mode.
* config/nvptx/nvptx.h (struct GTY(()) machine_function): Add
omp_parallel_predicate and omp_fn_entry_num_threads_reg fields.
* config/nvptx/nvptx.md (unspecv): Add UNSPECV_GET_TID,
UNSPECV_GET_NTID, UNSPECV_GET_CTAID, UNSPECV_GET_NCTAID,
UNSPECV_OMP_PARALLEL_FORK, UNSPECV_OMP_PARALLEL_JOIN entries.
(nvptx_shared_mem_operand): New predicate.
(gomp_barrier): New expand pattern.
(omp_get_num_threads): New expand pattern.
(omp_get_num_teams): New insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_team_num): Likewise.
(get_ntid): Likewise.
(nvptx_omp_parallel_fork): Likewise.
(nvptx_omp_parallel_join): Likewise.
* flag-types.h (omp_target_mode_kind): New flag value enum.
* gimplify.cc (struct gimplify_omp_ctx): Add 'bool ompacc' field.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE__OMPACC_.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_ctx_ompacc_p): New function.
(gimplify_omp_for): Handle combined loops under OMPACC.
* lto-wrapper.cc (append_compiler_options): Add OPT_fopenmp_target_.
* omp-builtins.def (BUILT_IN_OMP_GET_THREAD_NUM): Remove CONST.
(BUILT_IN_OMP_GET_NUM_THREADS): Likewise.
* omp-expand.cc (remove_exit_barrier): Disable addressable-var
processing for parallel construct child functions under OMPACC mode.
(expand_oacc_for): Add OMPACC mode handling.
(get_target_arguments): Force thread_limit clause value to 1 under
OMPACC mode.
(expand_omp): Under OMPACC mode, avoid child function expanding of
GIMPLE_OMP_PARALLEL.
* omp-general.cc (omp_extract_for_data): Adjustments for OMPACC mode.
* omp-low.cc (struct omp_context): Add 'bool ompacc_p' field.
(scan_sharing_clauses): Handle OMP_CLAUSE__OMPACC_.
(ompacc_ctx_p): New function.
(scan_omp_parallel): Handle OMPACC mode, avoid creating child function.
(scan_omp_target): Tag "ompacc"/"ompacc for" attributes for target
construct child function, remove OMP_CLAUSE__OMPACC_ clauses.
(lower_oacc_head_mark): Handle OMPACC mode cases.
(lower_omp_for): Adjust OMP_FOR kind from OpenMP to OpenACC kinds, add
vector/gang clauses as needed. Add other OMPACC handling.
(lower_omp_taskreg): Add call to lower_oacc_head_tail for OMPACC case.
(lower_omp_target): Do OpenACC gang privatization under OMPACC case.
(lower_omp_teams): Forward OpenACC privatization variables to outer
target region under OMPACC mode.
(lower_omp_1): Do OpenACC gang privatization under OMPACC case for
GIMPLE_BIND.
* omp-offload.cc (ompacc_supported_clauses_p): New function.
(struct target_region_data): New struct type for tree walk.
(scan_fndecl_for_ompacc): New function.
(scan_omp_target_region_r): New function.
(scan_omp_target_construct_r): New function.
(omp_ompacc_attribute_tagging): New function.
(oacc_dim_call): Add OMPACC case handling.
(execute_oacc_device_lower): Make parts explicitly only OpenACC enabled.
(pass_oacc_device_lower::gate): Enable pass under OMPACC mode.
* omp-offload.h (omp_ompacc_attribute_tagging): New prototype.
* opts.cc (finish_options): Only allow -fopenmp-target= when -fopenmp
and no -fopenacc.
* target-insns.def (gomp_barrier): New defined insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_num_threads): Likewise.
(omp_get_team_num): Likewise.
(omp_get_num_teams): Likewise.
* tree-core.h (enum omp_clause_code): Add new OMP_CLAUSE__OMPACC_ entry
for internal clause.
* tree-nested.cc (convert_nonlocal_omp_clauses): Handle
OMP_CLAUSE__OMPACC_.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE__OMPACC_.
* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE__OMPACC_ entry.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE__OMPACC__FOR): New macro for OMP_CLAUSE__OMPACC_.
* tree-ssa-loop.cc (pass_oacc_only::gate): Enable pass under OMPACC
mode cases.
libgomp/ChangeLog:
* config/nvptx/team.c (__nvptx_omp_num_threads): New global variable in
shared memory.
(cherry picked from commit 5f881613fa9128edae5bbfa4e19f9752809e4bd7)
|
|
|
|
Some miscomputation of rtx_costs lead to sub-optimal code for
single-bit bit insertions. This patch implements TARGET_INSN_COST,
which has a chance to see the whole insn during insn combination;
in particular the SET_DEST of (set (zero_extract (...) ...)).
gcc/
* config/avr/avr.cc (avr_insn_cost): New static function.
(TARGET_INSN_COST): Define to that function.
|
|
The encoder for CONSTRUCTORs assumes that all bit-fields (DECL_BIT_FIELD)
have integral types, but that's not the case in Ada where they may have
pretty much any type, resulting in a wrong encoding for them
gcc/
* fold-const.cc (native_encode_initializer) <CONSTRUCTOR>: Apply the
specific treatment for bit-fields only if they have an integral type
and filter out non-integral bit-fields that do not start and end on
a byte boundary.
gcc/testsuite/
* gnat.dg/opt101.adb: New test.
* gnat.dg/opt101_pkg.ads: New helper.
|
|
|
|
|
|
On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P,
but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification,
otherwise it is a deoptimization, trading 2 ops for 3 and furthermore
running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
operands are another bit-wise operation with a common input. If so,
distribute the bit operations to save an operation and possibly two if
constants are involved. For example, convert
(A | B) & (A | C) into A | (B & C)
Further simplification will occur if B and C are constants. */
simplification which simplifies that
(x & CST2) | (CST1 & CST2) back to
CST2 & (x | CST1).
I went through all other places I could find where we have a simplification
with 2 CONSTANT_CLASS_P operands and perform some operation on those two,
while the other spots aren't that severe (just trade 2 operations for
another 2 if the two constants don't simplify, rather than as in the above
case trading 2 ops for 3), I still think all those spots really intend
to optimize only if the 2 constants simplify.
So, the following patch adds to those a ! modifier to ensure that,
even at GENERIC that modifier means !EXPR_P which is exactly what we want
IMHO.
2023-05-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2),
Combine successive equal operations with constants,
(A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A,
CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P
operands.
* gcc.target/aarch64/sve/pr109505.c: New test.
(cherry picked from commit f211757f6fa9515e3fd1a4f66f1a8b48e500c9de)
|
|
When working on a cost tweaking patch, I found that a newly
added test case has different dumpings with stage-1 and
bootstrapped gcc. By looking into it, the apparent reason
is vect_analyze_loop_2 doesn't get slp_done_for_suggested_uf
set expectedly, the following retrying will use the garbage
slp_done_for_suggested_uf instead. In fact, the setting of
slp_done_for_suggested_uf only happens when the previous
analysis succeeds, for the mentioned test case, its previous
analysis does fail, it's unexpected to use the value of
slp_done_for_suggested_uf any more.
In function vect_analyze_loop_1, we only return success when
res is true, which is the result of 1st analysis. It means
we never try to vectorize with unroll_vinfo if the previous
analysis fails. So this patch shouldn't break anything, and
just stop some useless analysis early.
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_1): Don't retry analysis with
suggested unroll factor once the previous analysis fails.
(cherry picked from commit a04bf39f61ce7814d197d712760f08c206daf4f1)
|
|
|
|
|
|
This patch removes the superfluous parallel in [u]divmod patterns in
the AVR backend. Effect of extra parallel is that add_clobbers reaches
gcc_unreachable() because the clobbers for [u]divmod are missing.
If an insn has multiple parts like clobbers, the parallel around the
parts of the insn pattern is implicit.
gcc/
PR target/105753
Backport from 2023-05-20 https://gcc.gnu.org/r14-1016
* config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi):
Remove superfluous "parallel" in insn pattern.
([u]divmod<mode>4): Tidy code. Use gcc_unreachable() instead of
printing error text to assembly.
gcc/testsuite/
PR target/105753
Backport from 2023-05-20 https://gcc.gnu.org/r14-1016
* gcc.target/avr/torture/pr105753.c: New test.
|
|
|
|
Now that we have support for inline subword atomic operations, it is no
longer necessary to link against libatomic. This also fixes testsuite
failures because the framework does not properly set up the linker flags
for finding libatomic.
The use of atomic operations is also independent of the use of libpthread.
gcc/
* config/riscv/linux.h (LIB_SPEC): Don't redefine.
|
|
This patch adds support for running constructors and destructors for
static (file-scope) aggregates for C++ objects which are marked with
"declare target" directives on OpenMP offload targets.
At present, space is allocated on the target for such aggregates, but
nothing ever constructs them properly, so they end up zero-initialised.
The approach taken is to generate a set of constructors to run on the
target: this currently works for AMD GCN, but fails on NVPTX due
to lack of constructor/destructor support there so far on mainline.
(See the new test static-aggr-constructor-destructor-3.C for a reason
why running constructors on the target is preferable to e.g. constructing
on the host and then copying the resulting object to the target.)
2023-05-12 Julian Brown <julian@codesourcery.com>
gcc/cp/
* decl2.cc (tree-inline.h): Include.
(static_init_fini_fns): Bump to four entries. Update comment.
(start_objects, start_partial_init_fini_fn): Add 'omp_target'
parameter. Support "declare target" decls. Update forward declaration.
(emit_partial_init_fini_fn): Add 'host_fn' parameter. Return tree for
the created function. Support "declare target".
(OMP_SSDF_IDENTIFIER): New macro.
(partition_vars_for_init_fini): Support partitioning "declare target"
variables also.
(generate_ctor_or_dtor_function): Add 'omp_target' parameter. Support
"declare target" decls.
(c_parse_final_cleanups): Support constructors/destructors on OpenMP
offload targets.
gcc/
* omp-builtins.def (BUILT_IN_OMP_IS_INITIAL_DEVICE): New builtin.
* tree.cc (get_file_function_name): Support names for on-target
constructor/destructor functions.
libgomp/
* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: New
test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C: New
test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-3.C: New
test.
|
|
add_list_candidates has logic to reject designated initialization of a
non-aggregate type, but this is inadvertently being suppressed if the type
has a list constructor due to the order of case analysis, which in the
below testcase leads to us incorrectly treating the initializer list as if
it's non-designated. This patch fixes this by making us check for invalid
designated initialization sooner.
PR c++/109871
gcc/cp/ChangeLog:
* call.cc (add_list_candidates): Check for invalid designated
initialization sooner and even for types that have a list
constructor.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/desig27.C: New test.
(cherry picked from commit d5e5007c4b534391c0a97be56f6024fde1a88682)
|
|
This adds the feature-test macro for PR0849R8, as per
https://github.com/cplusplus/CWG/issues/281.
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_auto_cast
for C++23.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/feat-cxx2b.C: Test __cpp_auto_cast.
(cherry picked from commit 32b81d897629b6c3bd9e2780831a1c45b38b5ac3)
|
|
The index variable initialization for the 'omp unroll'
directive with 'full' clause got lost and the testsuite
did not catch it.
Add the initialization and add -Wall to some tests
to detect uninitialized variable uses and other
potential problems in the code generation.
gcc/ChangeLog:
* omp-transform-loops.cc (full_unroll): Add initialization of index variable.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c:
Use -Wall and add -Wno-unknown-pragmas to disable warnings about empty pragmas.
Use -O2.
* testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C:
Copy of testsuite/libgomp.c-c++-common/matrix-no-directive-unroll-full-1.c,
but using -O0 which works only for C++.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c: Use -Wall
and use -Wno-unknown-pragmas to disable warnings about empty pragmas.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c:
Likewise and fix broken function calls found by -Wall.
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/109846
* expr.cc (gfc_check_vardef_context): Check appropriate pointer
attribute for CLASS vs. non-CLASS function result in variable
definition context.
gcc/testsuite/ChangeLog:
PR fortran/109846
* gfortran.dg/ptr-func-5.f90: New test.
(cherry picked from commit fa0569e90efe8a5cb895a3f50dd502f849940828)
|
|
Fortran allows overloading of intrinsic operators also for operands of
numeric intrinsic types. The intrinsic operator versions are used
according to the rules of F2018 table 10.2 and imply type conversion as
long as the operand ranks are conformable. Otherwise no type conversion
shall be performed to allow the resolution of a matching user-defined
operator.
gcc/fortran/ChangeLog:
PR fortran/109641
* arith.cc (eval_intrinsic): Check conformability of ranks of operands
for intrinsic binary operators before performing type conversions.
* gfortran.h (gfc_op_rank_conformable): Add prototype.
* resolve.cc (resolve_operator): Check conformability of ranks of
operands for intrinsic binary operators before performing type
conversions.
(gfc_op_rank_conformable): New helper function to compare ranks of
operands of binary operator.
gcc/testsuite/ChangeLog:
PR fortran/109641
* gfortran.dg/overload_5.f90: New test.
(cherry picked from commit 185da7c2014ba41f38dd62cc719873ebf020b076)
|
|
The order of the map clauses in the Gimple representation has changed.
2023-05-10 Kwok Cheung Yeung <kcy@codesourcery.com>
* gfortran.dg/gomp/target-exit-data.f90: Update expected outputs.
|
|
2023-05-03 Kwok Cheung Yeung <kcy@codesourcery.com>
* gfortran.dg/gomp/map-10a.f90: Update expected outputs.
|
|
NULLs cannot be added to a hash_set in GCC 13, so test for NULL before adding.
Should probably be a fixup to 'Kernels loops annotation: Fortran'.
2023-04-27 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/fortran/
* openmp.cc (compute_goto_targets): Check label before adding to
goto_targets.
|
|
The GOMP_MAP_NONCONTIG_ARRAY_* map types need to be handled in omp_group_base
(which was added in GCC 13).
This should probably be a fixup to 'Merge non-contiguous array support
patches'.
2023-04-18 Kwok Cheung Yeung <kcy@codesourcery.com>
* gimplify.cc (omp_group_base): Handle GOMP_MAP_NONCONTIG_ARRAY_*
map types.
|
|
GOMP_MAP_DECLARE_ALLOCATE and GOMP_MAP_DECLARE_DEALLOCATE need to be handled
in omp_group_base (which was added in GCC 13).
This should probably be a fixup to 'Fortran "declare create"/allocate support
for OpenACC'.
2023-04-17 Kwok Cheung Yeung <kcy@codesourcery.com>
* gimplify.cc (omp_group_base): Handle GOMP_MAP_DECLARE_ALLOCATE
and GOMP_MAP_DECLARE_DEALLOCATE.
|
|
Implemented for nvptx offloading via 'cuMemHostAlloc', 'cuMemHostRegister'.
gcc/
* doc/invoke.texi (-foffload-memory=pinned): Document.
include/
* cuda/cuda.h (CUresult): Add
'CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED'.
(CUdevice_attribute): Add
'CU_DEVICE_ATTRIBUTE_READ_ONLY_HOST_REGISTER_SUPPORTED'.
(CU_MEMHOSTREGISTER_READ_ONLY): Add.
(cuMemHostGetFlags, cuMemHostRegister, cuMemHostUnregister): Add.
libgomp/
* libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_free): Add
'struct goacc_asyncqueue *' formal parameter.
(GOMP_OFFLOAD_page_locked_host_register)
(GOMP_OFFLOAD_page_locked_host_unregister)
(GOMP_OFFLOAD_page_locked_host_p): Add.
* libgomp.h (always_pinned_mode)
(gomp_page_locked_host_register_dev)
(gomp_page_locked_host_unregister_dev): Add.
(struct splay_tree_key_s): Add 'page_locked_host_p'.
(struct gomp_device_descr): Add
'GOMP_OFFLOAD_page_locked_host_register',
'GOMP_OFFLOAD_page_locked_host_unregister',
'GOMP_OFFLOAD_page_locked_host_p'.
* libgomp.texi (-foffload-memory=pinned): Document.
* plugin/cuda-lib.def (cuMemHostGetFlags, cuMemHostRegister_v2)
(cuMemHostRegister, cuMemHostUnregister): Add.
* plugin/plugin-nvptx.c (struct ptx_device): Add
'read_only_host_register_supported'.
(nvptx_open_device): Initialize it.
(free_host_blocks, free_host_blocks_lock)
(nvptx_run_deferred_page_locked_host_free)
(nvptx_page_locked_host_free_callback, nvptx_page_locked_host_p)
(GOMP_OFFLOAD_page_locked_host_register)
(nvptx_page_locked_host_unregister_callback)
(GOMP_OFFLOAD_page_locked_host_unregister)
(GOMP_OFFLOAD_page_locked_host_p)
(nvptx_run_deferred_page_locked_host_unregister)
(nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback):
Add.
(GOMP_OFFLOAD_fini_device, GOMP_OFFLOAD_page_locked_host_alloc)
(GOMP_OFFLOAD_run): Call
'nvptx_run_deferred_page_locked_host_free'.
(struct goacc_asyncqueue): Add
'page_locked_host_unregister_blocks_lock',
'page_locked_host_unregister_blocks'.
(nvptx_goacc_asyncqueue_construct)
(nvptx_goacc_asyncqueue_destruct): Handle those.
(GOMP_OFFLOAD_page_locked_host_free): Handle
'struct goacc_asyncqueue *' formal parameter.
(GOMP_OFFLOAD_openacc_async_test)
(nvptx_goacc_asyncqueue_synchronize): Call
'nvptx_run_deferred_page_locked_host_unregister'.
(GOMP_OFFLOAD_openacc_async_serialize): Call
'nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback'.
* config/linux/allocator.c (linux_memspace_alloc)
(linux_memspace_calloc, linux_memspace_free)
(linux_memspace_realloc): Remove 'always_pinned_mode' handling.
(GOMP_enable_pinned_mode): Move...
* target.c: ... here.
(always_pinned_mode, verify_always_pinned_mode)
(gomp_verify_always_pinned_mode, gomp_page_locked_host_alloc_dev)
(gomp_page_locked_host_free_dev)
(gomp_page_locked_host_aligned_alloc_dev)
(gomp_page_locked_host_aligned_free_dev)
(gomp_page_locked_host_register_dev)
(gomp_page_locked_host_unregister_dev): Add.
(gomp_copy_host2dev, gomp_map_vars_internal)
(gomp_remove_var_internal, gomp_unmap_vars_internal)
(get_gomp_offload_icvs, gomp_load_image_to_device)
(gomp_target_rev, omp_target_memcpy_copy)
(omp_target_memcpy_rect_worker): Handle 'always_pinned_mode'.
(gomp_copy_host2dev, gomp_copy_dev2host): Handle
'verify_always_pinned_mode'.
(GOMP_target_ext): Add 'assert'.
(gomp_page_locked_host_alloc): Use
'gomp_page_locked_host_alloc_dev'.
(gomp_page_locked_host_free): Use
'gomp_page_locked_host_free_dev'.
(omp_target_associate_ptr): Adjust.
(gomp_load_plugin_for_device): Handle 'page_locked_host_register',
'page_locked_host_unregister', 'page_locked_host_p'.
* oacc-mem.c (memcpy_tofrom_device): Handle 'always_pinned_mode'.
* libgomp_g.h (GOMP_enable_pinned_mode): Adjust.
* testsuite/libgomp.c/alloc-pinned-7.c: Remove.
|
|
gcc/ChangeLog:
* omp-transform-loops.cc (walk_omp_for_loops): Handle
GIMPLE_OMP_METADIRECTIVE.
|
|
Add the parsing of loop transformations on inner loops of a loop-nest.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_nested_loop_transform_clauses):
Add argument for the level of loop-nest at which the clauses
appear, ...
(c_parser_omp_tile): ... adjust use here,
(c_parser_omp_unroll): ... and here,
(c_parser_omp_for_loop): ... and here. Stop treating loop
transformations like intervening code, parse them, and adjust
the loop-nest depth if necessary for tiling.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_is_pragma): New function.
(cp_parser_omp_nested_loop_transform_clauses):
Add argument for the level of loop-nest at which the clauses
appear, ...
(cp_parser_omp_tile): ... adjust use here,
(cp_parser_omp_unroll): ... and here,
(cp_parser_omp_for_loop): ... and here. Stop treating loop
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/loop-transforms/unroll-inner-1.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-inner-2.c: New test.
libgomp/ChangeLog
* testsuite/libgomp.c++/loop-transforms/tile-1.C: Deleted, replaced by
matrix-* tests.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h:
New header file for new tests.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h:
Likewise.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h:
New test.
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c:
New test.
|
|
So far the implementation of the "omp tile" and "omp unroll"
directives restricted their use to the outermost loop of a loop-nest.
This commit changes the Fortran front end to parse and verify the
directives on inner loops. The transformation clauses are extended to
carry the information about the level of the loop-nest at which a
transformation should be applied. The middle end transformation pass
is adjusted to apply the transformations at the right level of a loop
nest and to take their effect on the loop nest depth into account.
gcc/fortran/ChangeLog:
* openmp.cc (omp_unroll_removes_loop_nest): Move down in file.
(resolve_loop_transform_generic): Remove, and ...
(resolve_omp_unroll): ... inline and adapt here. Move function.
Move functin.
(find_nested_loop_in_block): New function.
(find_nested_loop_in_chain): New function, used ...
(is_outer_iteration_variable): ... here, and ...
(expr_is_invariant): ... here.
(resolve_omp_do): Adjust code for resolving loop transformations.
(resolve_omp_tile): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_TRANSFROM_LEVEL
on new clause.
(compute_transformed_depth): New function to compute the depth
("collapse") of a transformed loop nest, used
(gfc_trans_omp_do): ... here.
gcc/ChangeLog:
* omp-transform-loops.cc (gimple_assign_rhs_to_tree): Fix type
in comment.
(gomp_for_uncollapse): Adjust "collapse" value after uncollapse.
(partial_unroll): Add argument for the loop nest level to be transformed.
(tile): Likewise.
(transform_gomp_for): Pass level to transformatoin functions.
(optimize_transformation_clauses): Handle transformation clauses for all
levels recursively.
* tree-pretty-print.cc (dump_omp_clause): Print
OMP_CLAUSE_TRANSFORM_LEVEL for OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE.
* tree.cc: Increase number of operands of OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TRANSFORM_LEVEL): New macro to access
clause operand 0.
(OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Use operand 1 instead of 0.
(OMP_CLAUSE_TILE_SIZES): Likewise.
gcc/cp/ChangeLog
* parser.cc (cp_parser_omp_clause_unroll_full): Set new
OMP_CLAUSE_TRANSFORM_LEVEL operand to default value.
(cp_parser_omp_clause_unroll_partial): Likewise.
(cp_parser_omp_tile_sizes): Likewise.
(cp_parser_omp_loop_transform_clause): Likewise.
(cp_parser_omp_nested_loop_transform_clauses): Likewise.
(cp_parser_omp_unroll): Likewise.
* pt.cc (tsubst_omp_clauses): Adjust OMP_CLAUSE_UNROLL_PARTIAL
and OMP_CLAUSE_TILE handling to changed number of operands.
gcc/c/ChangeLog
* c-parser.cc (c_parser_omp_clause_unroll_full): Set new
OMP_CLAUSE_TRANSFORM_LEVEL operand to default value.
(c_parser_omp_clause_unroll_partial): Likewise.
(c_parser_omp_tile_sizes): Likewise.
(c_parser_omp_loop_transform_clause): Likewise.
(c_parser_omp_nested_loop_transform_clauses): Likewise.
(c_parser_omp_unroll): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/loop-transforms/unroll-8.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-9.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/inner-loops.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-3.f90: Adapt to
changed diagnostic messages.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New test.
|
|
This commit adds the C and C++ front end support for the "omp tile"
directive. The middle end support for the transformation is
implemented in a previous commit.
gcc/c-family/ChangeLog:
* c-omp.cc (c_omp_directives): Add PRAGMA_OMP_TILE.
* c-pragma.cc (omp_pragmas_simd): Likewise.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_TILE
gcc/c/ChangeLog:
* c-parser.cc (c_parser_nested_omp_unroll_clauses): Rename and
generalize ...
(c_parser_omp_nested_loop_transform_clauses): ... to this.
(c_parser_omp_for_loop): Handle "omp tile" parsing in loop nests.
(c_parser_omp_tile_sizes): Parse single "sizes" clause.
(c_parser_omp_loop_transform_clause): New function.
(c_parser_omp_tile): New function for parsing "omp tile"
(c_parser_omp_unroll): Adjust to renaming.
(c_parser_omp_construct): Handle PRAGMA_OMP_TILE.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_unroll_partial): Adjust.
(cp_parser_nested_omp_unroll_clauses): Rename ...
(cp_parser_omp_nested_loop_transform_clauses): ... to this.
(cp_parser_omp_for_loop): Handle "omp tile" parsing in loop nests.
(cp_parser_omp_tile_sizes): New function, parses single "sizes" clause
(cp_parser_omp_tile): New function for parsing "omp tile".
(cp_parser_omp_loop_transform_clause): New function.
(cp_parser_omp_unroll): Adjust to renaming.
(cp_parser_omp_construct): Handle PRAGMA_OMP_TILE.
(cp_parser_pragma): Likewise.
* pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_TILE.
* semantics.cc (finish_omp_clauses): Likewise.
gcc/ChangeLog:
* gimplify.cc (omp_for_drop_tile_clauses): New function, ...
(gimplify_omp_for): ... used here.
libgomp/ChangeLog:
* testsuite/libgomp.c++/loop-transforms/tile-1.C: New test.
* testsuite/libgomp.c++/loop-transforms/tile-2.C: New test.
* testsuite/libgomp.c++/loop-transforms/tile-3.C: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/loop-transforms/tile-1.c: New test.
* c-c++-common/gomp/loop-transforms/tile-2.c: New test.
* c-c++-common/gomp/loop-transforms/tile-3.c: New test.
* c-c++-common/gomp/loop-transforms/tile-4.c: New test.
* c-c++-common/gomp/loop-transforms/tile-5.c: New test.
* c-c++-common/gomp/loop-transforms/tile-6.c: New test.
* c-c++-common/gomp/loop-transforms/tile-7.c: New test.
* c-c++-common/gomp/loop-transforms/tile-8.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-2.c: Adapt
to changed diagnostic messages.
* g++.dg/gomp/loop-transforms/tile-1.h: New test.
* g++.dg/gomp/loop-transforms/tile-1a.C: New test.
* g++.dg/gomp/loop-transforms/tile-1b.C: New test.
|
|
This commit implements the Fortran front end support for the "omp
tile" directive and the corresponding middle end transformation.
gcc/fortran/ChangeLog:
* gfortran.h (enum gfc_statement): Add ST_OMP_TILE, ST_OMP_END_TILE.
(enum gfc_exec_op): Add EXEC_OMP_TILE.
(loop_transform_p): New declaration.
(struct gfc_omp_clauses): Add "tile_sizes" field.
* dump-parse-tree.cc (show_omp_clauses): Handle "tile_sizes" dumping.
(show_omp_node): Handle EXEC_OMP_TILE.
(show_code_node): Likewise.
* match.h (gfc_match_omp_tile): New declaration.
* openmp.cc (gfc_free_omp_clauses): Free "tile_sizes" field.
(match_tile_sizes): New function.
(OMP_TILE_CLAUSES): New macro.
(gfc_match_omp_tile): New function.
(resolve_omp_do): Handle EXEC_OMP_TILE.
(resolve_omp_tile): New function.
(omp_code_to_statement): Handle EXEC_OMP_TILE.
(gfc_resolve_omp_directive): Likewise.
* parse.cc (decode_omp_directive): Handle ST_OMP_END_TILE
and ST_OMP_TILE.
(next_statement): Handle ST_OMP_TILE.
(gfc_ascii_statement): Likewise.
(parse_omp_do): Likewise.
(parse_executable): Likewise.
* resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_TILE.
(gfc_resolve_code): Likewise.
* st.cc (gfc_free_statement): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle "tile_sizes" field.
(loop_transform_p): New function.
(gfc_expr_list_len): New function.
(gfc_trans_omp_do): Handle EXEC_OMP_TILE.
(gfc_trans_omp_directive): Likewise.
* trans.cc (trans_code): Likewise.
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_TILE.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_loop): Likewise.
* omp-transform-loops.cc (walk_omp_for_loops): New declaration.
(subst_var_in_op): New function.
(subst_var): New function.
(gomp_for_number_of_iterations): Adjust.
(gomp_for_iter_count_type): New function.
(gimple_assign_rhs_to_tree): New function.
(subst_defs): New function.
(gomp_for_uncollapse): Adjust.
(transformation_clause_p): Add OMP_CLAUSE_TILE.
(tile): New function.
(transform_gomp_for): Handle OMP_CLAUSE_TILE.
(optimize_transformation_clauses): Handle OMP_CLAUSE_TILE.
* omp-general.cc (omp_loop_transform_clause_p): Add
OMP_CLAUSE_TILE.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_TILE.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_TILE.
* tree.cc: Add OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TILE_SIZES): New macro.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/loop-transforms/tile-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-2.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/loop-transforms/tile-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-1a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-4.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: New test.
|
|
OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation
construct "omp tile".
gcc/ChangeLog:
* tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TILE_LIST): Rename to ...
(OMP_CLAUSE_OACC_TILE_LIST): ... this.
(OMP_CLAUSE_TILE_ITERVAR): Rename to ...
(OMP_CLAUSE_OACC_TILE_ITERVAR): ... this.
(OMP_CLAUSE_TILE_COUNT): Rename to ...
(OMP_CLAUSE_OACC_TILE_COUNT): this.
* gimplify.cc (gimplify_scan_omp_clauses): Adjust to renamings.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_for): Likewise.
* omp-general.cc (omp_extract_for_data): Likewise.
* omp-low.cc (scan_sharing_clauses): Likewise.
(lower_oacc_head_mark): Likewise.
* tree-nested.cc (convert_nonlocal_omp_clauses): Likewise.
(convert_local_omp_clauses): Likewise.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
* tree.cc: Likewise.
gcc/c-family/ChangeLog:
* c-omp.cc (c_oacc_split_loop_clauses): Adjust to renamings.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_collapse): Adjust to renamings.
(c_parser_oacc_clause_tile): Likewise.
(c_parser_omp_for_loop): Likewise.
* c-typeck.cc (c_finish_omp_clauses): Likewise.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_oacc_clause_tile): Adjust to renamings.
(cp_parser_omp_clause_collapse): Likewise.
(cp_parser_omp_for_loop): Likewise.
* pt.cc (tsubst_omp_clauses): Likewise.
* semantics.cc (finish_omp_clauses): Likewise.
(finish_omp_for): Likewise.
gcc/fortran/ChangeLog:
* openmp.cc (enum omp_mask2): Adjust to renamings.
(gfc_match_omp_clauses): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
|
|
This commit implements the C and the C++ front end changes to support
the "omp unroll" directive. The execution of the loop transformation
relies on the pass that has been added as a part of the earlier
Fortran patch.
gcc/c-family/ChangeLog:
* c-gimplify.cc (c_genericize_control_stmt): Handle OMP_UNROLL.
* c-omp.cc: Add "unroll" to omp_directives[].
* c-pragma.cc: Add "unroll" to omp_pragmas_simd[].
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_UNROLL to
pragma_kind and adjust PRAGMA_OMP__LAST_.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FULL and
PRAGMA_OMP_CLAUSE_PARTIAL.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_name): Handle "full" and
"partial" clauses.
(check_no_duplicate_clause): Change return type to bool and
return check result.
(c_parser_omp_clause_unroll_full): New function for parsing
the "unroll clause".
(c_parser_omp_clause_unroll_partial): New function for
parsing the "partial" clause.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FULL
and PRAGMA_OMP_CLAUSE_PARTIAL.
(c_parser_nested_omp_unroll_clauses): New function for parsing
"omp unroll" directives following another directive.
(OMP_UNROLL_CLAUSE_MASK): New definition.
(c_parser_omp_unroll): New function for parsing "omp unroll"
loops that are not associated with another directive.
(c_parser_omp_construct): Handle PRAGMA_OMP_UNROLL.
* c-typeck.cc (c_finish_omp_clauses): Handle
OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL,
and OMP_CLAUSE_UNROLL_NONE.
gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_gimplify_expr): Handle OMP_UNROLL.
(cp_fold_r): Likewise.
(cp_genericize_r): Likewise.
* parser.cc (cp_parser_omp_clause_name): Handle "full" clause.
(check_no_duplicate_clause): Change return type to bool and
return check result.
(cp_parser_omp_clause_unroll_full): New function for parsing
the "unroll clause".
(cp_parser_omp_clause_unroll_partial): New function for
parsing the "partial" clause.
(cp_parser_omp_all_clauses): Handle OMP_CLAUSE_UNROLL and
OMP_CLAUSE_FULL.
(cp_parser_nested_omp_unroll_clauses): New function for parsing
"omp unroll" directives following another directive.
(cp_parser_omp_for_loop): Handle "omp unroll" directives
between directive and loop.
(OMP_UNROLL_CLAUSE_MASK): New definition.
(cp_parser_omp_unroll): New function for parsing "omp unroll"
loops that are not associated with another directive.
(cp_parser_omp_construct): Handle PRAGMA_OMP_UNROLL.
(cp_parser_pragma): Handle PRAGMA_OMP_UNROLL.
* pt.cc (tsubst_omp_clauses): Handle
OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_FULL, and
OMP_CLAUSE_UNROLL_NONE.
(tsubst_expr): Handle OMP_UNROLL.
* semantics.cc (finish_omp_clauses): Handle
OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL,
and OMP_CLAUSE_UNROLL_NONE.
libgomp/ChangeLog:
* testsuite/libgomp.c++/loop-transforms/unroll-1.C: New test.
* testsuite/libgomp.c++/loop-transforms/unroll-2.C: New test.
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/loop-transforms/unroll-1.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-2.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-3.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-4.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-5.c: New test.
* c-c++-common/gomp/loop-transforms/unroll-6.c: New test.
* g++.dg/gomp/loop-transforms/unroll-1.C: New test.
* g++.dg/gomp/loop-transforms/unroll-2.C: New test.
* g++.dg/gomp/loop-transforms/unroll-3.C: New test.
|