Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
OpenMP 5.0 removed the restriction that multiple collapsed loops must
be perfectly nested, allowing "intervening code" (including nested
BLOCKs) before or after each nested loop. In GCC this code is moved
into the inner loop body by the respective front ends.
In the Fortran front end, most of the semantic processing happens during
the translation phase, so the parse phase just collects the intervening
statements, checks them for errors, and splices them around the loop body.
gcc/fortran/ChangeLog
* openmp.cc: Include omp-api.h.
(resolve_omp_clauses): Consolidate inscan reduction clause conflict
checking here.
(scan_for_next_loop_in_chain): New.
(scan_for_next_loop_in_block): New.
(gfc_resolve_omp_do_blocks): Set omp_current_do_collapse properly.
Handle imperfectly-nested loops when looking for nested omp scan.
Refactor to move inscan reduction clause conflict checking to
resolve_omp_clauses.
(gfc_resolve_do_iterator): Handle imperfectly-nested loops.
(struct icode_error_state): New.
(icode_code_error_callback): New.
(icode_expr_error_callback): New.
(diagnose_intervening_code_errors_1): New.
(diagnose_intervening_code_errors): New.
(restructure_intervening_code): New.
(resolve_nested_loops): Update error handling, and extend to
detect imperfect nesting errors and check validity of
intervening code. Call restructure_intervening_code if needed.
(resolve_omp_do): Rename collapse -> count.
gcc/testsuite/ChangeLog
* gfortran.dg/gomp/collapse1.f90: Adjust expected errors.
* gfortran.dg/gomp/collapse2.f90: Likewise.
* gfortran.dg/gomp/imperfect1.f90: New.
* gfortran.dg/gomp/imperfect2.f90: New.
* gfortran.dg/gomp/imperfect3.f90: New.
* gfortran.dg/gomp/imperfect4.f90: New.
* gfortran.dg/gomp/imperfect5.f90: New.
* gfortran.dg/gomp/loop-transforms/tile-1.f90: Adjust expected errors.
* gfortran.dg/gomp/loop-transforms/tile-2.f90: Likewise.
* gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: Likewise.
libgomp/ChangeLog
* testsuite/libgomp.fortran/imperfect-destructor.f90: New.
* testsuite/libgomp.fortran/imperfect-transform-1.f90: New.
* testsuite/libgomp.fortran/imperfect-transform-2.f90: New.
* testsuite/libgomp.fortran/imperfect1.f90: New.
* testsuite/libgomp.fortran/imperfect2.f90: New.
* testsuite/libgomp.fortran/imperfect3.f90: New.
* testsuite/libgomp.fortran/imperfect4.f90: New.
* testsuite/libgomp.fortran/target-imperfect-transform-1.f90: New.
* testsuite/libgomp.fortran/target-imperfect-transform-2.f90: New.
* testsuite/libgomp.fortran/target-imperfect1.f90: New.
* testsuite/libgomp.fortran/target-imperfect2.f90: New.
* testsuite/libgomp.fortran/target-imperfect3.f90: New.
* testsuite/libgomp.fortran/target-imperfect4.f90: New.
|
|
This patch rearranges some code previously added to support loop
transformations to simplify merging support for imperfectly-nested loops
in a subsequent patch. There is no new functionality added here.
gcc/fortran/ChangeLog
* openmp.cc (find_nested_loop_in_chain): Move up in file.
(find_nested_loop_in_block): Likewise.
(resolve_nested_loops): New helper function to consolidate code
from...
(resolve_omp_do, resolve_omp_tile): ...these functions. Also,
remove the redundant call to resolve_nested_loop_transforms, and
use uniform error message wording.
gcc/testsuite/ChangeLog
* gfortran.dg/gomp/collapse1.f90: Adjust expected error message.
* gfortran.dg/gomp/collapse2.f90: Likewise.
* gfortran.dg/gomp/loop-transforms/tile-2.f90: Likewise.
|
|
gcc/testsuite/ChangeLog
* c-c++-common/gomp/imperfect1.c: New.
* c-c++-common/gomp/imperfect2.c: New.
* c-c++-common/gomp/imperfect3.c: New.
* c-c++-common/gomp/imperfect4.c: New.
* c-c++-common/gomp/imperfect5.c: New.
libgomp/ChangeLog
* testsuite/libgomp.c-c++-common/imperfect-transform-1.c: New.
* testsuite/libgomp.c-c++-common/imperfect-transform-2.c: New.
* testsuite/libgomp.c-c++-common/imperfect1.c: New.
* testsuite/libgomp.c-c++-common/imperfect2.c: New.
* testsuite/libgomp.c-c++-common/imperfect3.c: New.
* testsuite/libgomp.c-c++-common/imperfect4.c: New.
* testsuite/libgomp.c-c++-common/imperfect5.c: New.
* testsuite/libgomp.c-c++-common/imperfect6.c: New.
* testsuite/libgomp.c-c++-common/target-imperfect-transform-1.c: New.
* testsuite/libgomp.c-c++-common/target-imperfect-transform-2.c: New.
* testsuite/libgomp.c-c++-common/target-imperfect1.c: New.
* testsuite/libgomp.c-c++-common/target-imperfect2.c: New.
* testsuite/libgomp.c-c++-common/target-imperfect3.c: New.
* testsuite/libgomp.c-c++-common/target-imperfect4.c: New.
|
|
OpenMP 5.0 removed the restriction that multiple collapsed loops must
be perfectly nested, allowing "intervening code" (including nested
BLOCKs) before or after each nested loop. In GCC this code is moved
into the inner loop body by the respective front ends.
This patch changes the C++ front end to use recursive descent parsing
on nested loops within an "omp for" construct, rather than an
iterative approach, in order to preserve proper nesting of compound
statements. Preserving cleanups (destructors) for class objects
declared in intervening code and loop initializers complicates moving
the former into the body of the loop; this is handled by parsing the
entire construct before reassembling any of it.
New common C/C++ testcases are in a separate patch.
gcc/cp/ChangeLog
* cp-tree.h (cp_convert_omp_range_for): Adjust declaration.
* parser.cc (struct omp_for_parse_data): New.
(cp_parser_postfix_expression): Diagnose calls to OpenMP runtime
in intervening code.
(check_omp_intervening_code): New.
(cp_parser_statement_seq_opt): Special-case nested OMP loops and
blocks in intervening code.
(cp_parser_iteration_statement): Reject loops in intervening code.
(cp_parser_omp_for_loop_init): Expand comments and tweak the
interface slightly to better distinguish input/output parameters.
(cp_parser_omp_range_for): Likewise.
(cp_convert_omp_range_for): Likewise.
(cp_parser_see_omp_loop_nest): New.
(cp_parser_omp_loop_nest): New, split from cp_parser_omp_for_loop
and largely rewritten. Add more comments.
(struct sit_data, substitute_in_tree_walker, substitute_in_tree):
New.
(fixup_blocks_walker): New.
(cp_parser_omp_for_loop): Rewrite to use recursive descent instead
of a loop. Add logic to reshuffle the bits of code collected
during parsing so intervening code gets moved to the loop body.
(cp_parser_omp_loop): Remove call to finish_omp_for_block, which
is now redundant.
(cp_parser_omp_simd): Likewise.
(cp_parser_omp_for): Likewise.
(cp_parser_omp_distribute): Likewise.
(cp_parser_oacc_loop): Likewise.
(cp_parser_omp_taskloop): Likewise.
(cp_parser_pragma): Reject OpenMP pragmas in intervening code.
* parser.h (struct cp_parser): Add omp_for_parse_state field.
* pt.cc (tsubst_omp_for_iterator): Adjust call to
cp_convert_omp_range_for.
* semantics.cc (struct fofb_data, finish_omp_for_block_walker): New.
(finish_omp_for_block): Allow variables to be bound in a BIND_EXPR
nested inside BIND instead of directly in BIND itself.
gcc/testsuite/ChangeLog
* c-c++-common/goacc/tile-2.c: Adjust expected error patterns.
* c-c++-common/gomp/loop-transforms/imperfect-loop-nest: Likewise.
* c-c++-common/gomp/loop-transforms/tile-1.c: Likewise.
* c-c++-common/gomp/loop-transforms/tile-2.c: Likewise.
* c-c++-common/gomp/loop-transforms/tile-3.c: Likewise.
* c-c++-common/gomp/loop-transforms/unroll-inner-2.c: Likewise.
* g++.dg/gomp/attrs-4.C: Likewise.
* g++.dg/gomp/for-1.C: Likewise.
* g++.dg/gomp/pr41967.C: Likewise.
* g++.dg/gomp/pr94512.C: Likewise.
libgomp/ChangeLog
* testsuite/libgomp.c++/imperfect-class-1.C: New.
* testsuite/libgomp.c++/imperfect-class-2.C: New.
* testsuite/libgomp.c++/imperfect-class-3.C: New.
* testsuite/libgomp.c++/imperfect-destructor.C: New.
* testsuite/libgomp.c++/imperfect-template-1.C: New.
* testsuite/libgomp.c++/imperfect-template-2.C: New.
* testsuite/libgomp.c++/imperfect-template-3.C: New.
|
|
OpenMP 5.0 removed the restriction that multiple collapsed loops must
be perfectly nested, allowing "intervening code" (including nested
BLOCKs) before or after each nested loop. In GCC this code is moved
into the inner loop body by the respective front ends.
This patch changes the C front end to use recursive descent parsing
on nested loops within an "omp for" construct, rather than an iterative
approach, in order to preserve proper nesting of compound statements.
New common C/C++ testcases are in a separate patch.
gcc/c/ChangeLog
* c-parser.cc (struct c_parser): Add omp_for_parse_state field.
(struct omp_for_parse_data): New.
(check_omp_intervening_code): New.
(c_parser_compound_statement_nostart): Recognize intervening code
and nested loops in OpenMP loop constructs, and handle each
appropriately.
(c_parser_while_statement): Error on loop in intervening code.
(c_parser_do_statement): Likewise.
(c_parser_for_statement): Likewise.
(c_parser_postfix_expression_after_primary): Error on calls to
the OpenMP runtime in intervening code.
(c_parser_pragma): Error on OpenMP pragmas in intervening code.
(c_parser_see_omp_loop_nest): New.
(c_parser_omp_loop_nest): New.
(c_parser_omp_for_loop): Rewrite to use recursive descent, calling
c_parser_omp_loop_nest to do the heavy lifting.
gcc/ChangeLog
* omp-api.h: New.
* omp-general.cc (omp_runtime_api_procname): New.
(omp_runtime_api_call): Moved here from omp-low.cc, and make
non-static.
* omp-general.h: Include omp-api.h.
* omp-low.cc (omp_runtime_api_call): Delete this copy.
gcc/testsuite/ChangeLog
* c-c++-common/goacc/collapse-1.c: Update for new C error behavior.
* c-c++-common/goacc/tile-2.c: Likewise.
* c-c++-common/gomp/loop-transforms/imperfect-loop-nest.c: Likewise.
* c-c++-common/gomp/loop-transforms/tile-1.c: Likewise.
* c-c++-common/gomp/loop-transforms/tile-2.c: Likewise.
* c-c++-common/gomp/loop-transforms/tile-3.c: Likewise.
* c-c++-common/gomp/loop-transforms/unroll-inner-2.c: Likewise.
* c-c++-common/gomp/metadirective-1.c: Likewise.
* gcc.dg/gomp/collapse-1.c: Likewise.
* gcc.dg/gomp/for-1.c: Likewise.
* gcc.dg/gomp/for-11.c: Likewise.
|
|
The new internal clauses introduced for loop transformations were missing
from the big switch statements over all clauses in these functions.
gcc/ChangeLog:
* tree-nested.cc (convert_nonlocal_omp_clauses): Handle loop
transformation clauses.
(convert_local_omp_clauses): Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/loop-transforms/nested-fn.f90: New test.
Co-Authored-By: Frederik Harwath <frederik@codesourcery.com>
|
|
On 2023-06-14T11:42:22+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 14.06.23 10:09, Thomas Schwinge wrote:
>> Let me know if I should also adjust the new 'target { ! offload_device }'
>> diagnostic "[...] MANDATORY but only the host device is available" to
>> include a comma before 'but', for consistency with the other existing
>> diagnostics (cited above)?
>
> I think it makes sense to be consistent. Thus: Yes, please add the commas.
Fix-up for recent commit 18c8b56c7d67a9e37acf28822587786f0fc0efbc
"OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory".
libgomp/
* target.c (resolve_device): Align a
'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others.
* testsuite/libgomp.c/target-51.c: Adjust.
(cherry picked from commit f2ef1dabbc18eb6efc0eb47bbb0eebbc6d72e09e)
|
|
..., and therefore, given 'target offload_device':
PASS: libgomp.c/target-51.c (test for excess errors)
PASS: libgomp.c/target-51.c execution test
[-FAIL:-]{+PASS:+} libgomp.c/target-51.c output pattern test
Fix-up for recent commit 18c8b56c7d67a9e37acf28822587786f0fc0efbc
"OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory".
libgomp/
* testsuite/libgomp.c/target-51.c: Fix typo.
(cherry picked from commit 58edbc8a16878bd679d8fad183b8191af4695c15)
|
|
Merge up to r13-7446-ga79f49f934bd0d19cefd57e1fea71ad07b5d1f83 (14th June 2023)
|
|
OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in
OpenMP 5.2 it was clarified/extended by having implications on the
default-device-var; additionally, omp_initial_device and omp_invalid_device
enum values/PARAMETERs were added; support for it was added
in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and
non-conforming device numbers. Only the mandatory handling was missing.
Namely, while the default-device-var is usually initialized to value 0,
with 'mandatory' it must have the value 'omp_invalid_device' if and only if
zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var
overrides this as it comes semantically after the initialization.)
To achieve this, default-device-var is now initialized to MIN_INT. If
there is no 'mandatory', it is set to 0 directly after env var parsing.
Otherwise, it is updated in gomp_target_init to either 0 or
omp_invalid_device. To ensure INT_MIN is never seen by the user, both
the omp_get_default_device API routine and omp_display_env (user call
and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case.
libgomp/ChangeLog:
* env.c (gomp_default_icv_values): Init default_device_var to
an nonconforming value - INT_MIN.
(initialize_env): After env-var parsing, set default_device_var to
device 0 unless OMP_TARGET_OFFLOAD=mandatory.
(omp_display_env): If default_device_var is INT_MIN, call
gomp_init_targets_once.
* icv-device.c (omp_get_default_device): Likewise.
* libgomp.texi (OMP_DEFAULT_DEVICE): Update init description.
(OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'.
* target.c (resolve_device): Improve error message device-num < 0
with 'mandatory' and no no-host devices available.
(gomp_target_init): Set default-device-var if INT_MIN.
* testsuite/libgomp.c/target-48.c: New test.
* testsuite/libgomp.c/target-49.c: New test.
* testsuite/libgomp.c/target-50.c: New test.
* testsuite/libgomp.c/target-50a.c: New test.
* testsuite/libgomp.c/target-51.c: New test.
* testsuite/libgomp.c/target-52.c: New test.
* testsuite/libgomp.c/target-53.c: New test.
* testsuite/libgomp.c/target-54.c: New test.
(cherry picked from commit 18c8b56c7d67a9e37acf28822587786f0fc0efbc)
|
|
I've noticed that standard_sse_constant_opcode emits some spurious
whitespace around tab, that isn't something which is done for
any other instruction and looks wrong.
2023-06-13 Jakub Jelinek <jakub@redhat.com>
* config/i386/i386.cc (standard_sse_constant_opcode): Remove
superfluous spaces around \t for vpcmpeqd.
(cherry picked from commit 1c188877bfc5a4aee3f777f0d4d60500bc222b54)
|
|
Since there's no evex version for vpcmpeq ymm, ymm, ymm.
gcc/ChangeLog:
PR target/110227
* config/i386/sse.md (mov<mode>_internal>): Use x instead of v
for alternative 2 since there's no evex version for vpcmpeqd
ymm, ymm, ymm.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110227.c: New test.
|
|
|
|
Alias analysis was treating .MASK_LOAD as storing a full vector
which means we disambiguate against decls of smaller than vector size.
This complements the previous patch handling .MASK_STORE and fixes
runtime execution FAILs of gfortran.dg/matmul_3.f90 and
gfortran.dg/inline_sum_2.f90 when using AVX512 with full masked loop
vectorization on Zen4.
* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): For
.MASK_LOAD and friends set the size of the access to unknown.
(cherry picked from commit 1c3661e224e3ddfc6f773b095740c0f5a7ddf5fc)
|
|
Alias analysis was treating .MASK_STORE as storing a full vector
which means we disambiguate against decls of smaller than vector size.
That's of course wrong and a similar issue was fixed for DSE already.
The following makes sure we set the size of the access to unknown
and only constrain max_size.
This fixes runtime execution FAILs of gfortran.dg/matmul_2.f90,
gfortran.dg/matmul_6.f90 and gfortran.dg/pr91577.f90 when using
AVX512 with full masked loop vectorization on Zen4.
* tree-ssa-alias.cc (call_may_clobber_ref_p_1): For
.MASK_STORE and friend set the size of the access to
unknown.
(cherry picked from commit 8d3eb3ad5388d2f523e4a6f886c4b3364f77f51f)
|
|
Add a testcase for 'omp requires unified_address' that is currently supported
by all devices but was not tested for.
libgomp/
PR libgomp/109837
* testsuite/libgomp.c-c++-common/requires-unified-addr-1.c: New test.
* testsuite/libgomp.fortran/requires-unified-addr-1.f90: New test.
(cherry picked from commit d5c58ad1ebaff924c2546df074174cffb128feb8)
|
|
C++ requires inline functions to be declared inline and defined in
every translation unit that uses them. frange_nextafter is used in
gimple-range-op.cc but it's only defined as inline in
range-op-float.cc. Drop the extraneous inline specifier.
Other non-static inline functions in range-op-float.cc are not
referenced elsewhere, so I'm making them static.
for gcc/ChangeLog
* range-op-float.cc (frange_nextafter): Drop inline.
(frelop_early_resolve): Add static.
(frange_float): Likewise.
(cherry picked from commit d438b67e005bf8fc9e4af26410bf69816c30e969)
|
|
|
|
Merge up to r13-7439-g73ae34bb693038829c05bed30d7ac623e67bde2e (12th June 2023)
|
|
Reduce number of enum values passed to libgomp as
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as
GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore);
that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also
abort if not present but copy data when present. This is is a follow-up to
the commit r14-1579-g4ede915d5dde93 done 6 days ago.
Additionally, the commit improves a libgomp run-time and a C/C++ compile-time
error wording and extends testcases a tiny bit.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
* semantics.cc (handle_omp_array_sections): Handle
GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values.
gcc/ChangeLog:
* gimplify.cc (gimplify_adjust_omp_clauses_1): Use
GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping.
(gimplify_adjust_omp_clauses): Change
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent
GOMP_MAP_FORCE_PRESENT.
* omp-low.cc (lower_omp_target): Remove handling of no-longer valid
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for
to/from clauses with present modifier.
include/ChangeLog:
* gomp-constants.h (enum gomp_map_kind): Change the enum values
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only.
(GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT.
libgomp/ChangeLog:
* target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace
GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ACLLOC} by GOMP_MAP_FORCE_PRESENT.
(gomp_map_vars_internal, gomp_update): Likewise; unify and improve
error message.
* testsuite/libgomp.c-c++-common/target-present-2.c: Update for
changed error message.
* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and
extend testcase to check that data is copied when needed.
* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump.
* c-c++-common/gomp/map-9.c: Likewise.
* gfortran.dg/gomp/defaultmap-8.f90: Likewise.
* gfortran.dg/gomp/map-11.f90: Likewise.
* gfortran.dg/gomp/target-update-1.f90: Likewise.
* gfortran.dg/gomp/map-12.f90: Likewise; also check original dump.
* c-c++-common/gomp/map-6.c: Update dg-error and also check
clause error with 'target (enter/exit) data'.
gcc/fortran/ChangeLog.omp:
* trans-openmp.cc (gfc_omp_deep_map_kind_p): Fix conditions for
present modifier.
(cherry picked from commit 38944ec2a6fa108d24e5cfbb24c52020f9aa3015)
|
|
The following fixes code GENERIC generation for (convert! ...)
which currently generates
if (TREE_TYPE (_o1[0]) != type)
_r1 = fold_build1_loc (loc, NOP_EXPR, type, _o1[0]);
if (EXPR_P (_r1))
goto next_after_fail867;
else
_r1 = _o1[0];
where obviously braces are missing.
PR middle-end/110200
* genmatch.cc (expr::gen_transform): Put braces around
the if arm for the (convert ...) short-cut.
(cherry picked from commit 820d1aec89c43dbbc70d3d0b888201878388454c)
|
|
|
|
|
|
This patch fixes a wrong-code bug in the wake of PR92729, the transition that
turned the AVR backend from cc0 to CCmode. In cc0, the insn that uses cc0 like
a conditional branch always follows the cc0 setter, which is no more the case
with CCmode where set and use of REG_CC might be in different basic blocks.
This patch removes the machine-dependent reorg pass in avr_reorg entirely.
It is replaced by a new, AVR specific mini-pass that runs prior to split2.
Canonicalization of comparisons away from the "difficult" codes GT[U] and LE[U]
is now mostly performed by implementing TARGET_CANONICALIZE_COMPARISON.
Moreover:
* Text peephole conditions get "dead_or_set_regno_p (*, REG_CC)" as needed.
* RTL peephole conditions get "peep2_regno_dead_p (*, REG_CC)" as needed.
* Conditional branches no more clobber REG_CC.
* insn output for compares looks ahead to determine the branch mode in use.
This needs also "dead_or_set_regno_p (*, REG_CC)".
* Add RTL peepholes for decrement-and-branch detection.
* Some of the patterns like "*cmphi.zero-extend.0" lost their
combine-ational part wit PR92729. Restore them.
Finally, it fixes some of the many indentation glitches left over from PR92729.
gcc/
PR target/109650
PR target/92729
Backport from 2023-05-10 master r14-1688.
* config/avr/avr-passes.def (avr_pass_ifelse): Insert new pass.
* config/avr/avr.cc (avr_pass_ifelse): New RTL pass.
(avr_pass_data_ifelse): New pass_data for it.
(make_avr_pass_ifelse, avr_redundant_compare, avr_cbranch_cost)
(avr_canonicalize_comparison, avr_out_plus_set_ZN)
(avr_out_cmp_ext): New functions.
(compare_condtition): Make sure REG_CC dies in the branch insn.
(avr_rtx_costs_1): Add computation of cbranch costs.
(avr_adjust_insn_length) [ADJUST_LEN_ADD_SET_ZN, ADJUST_LEN_CMP_ZEXT]:
[ADJUST_LEN_CMP_SEXT]Handle them.
(TARGET_CANONICALIZE_COMPARISON): New define.
(avr_simplify_comparison_p, compare_diff_p, avr_compare_pattern)
(avr_reorg_remove_redundant_compare, avr_reorg): Remove functions.
(TARGET_MACHINE_DEPENDENT_REORG): Remove define.
* config/avr/avr-protos.h (avr_simplify_comparison_p): Remove proto.
(make_avr_pass_ifelse, avr_out_plus_set_ZN, cc_reg_rtx)
(avr_out_cmp_zext): New Protos
* config/avr/avr.md (branch, difficult_branch): Don't split insns.
(*cbranchhi.zero-extend.0", *cbranchhi.zero-extend.1")
(*swapped_tst<mode>, *add.for.eqne.<mode>): New insns.
(*cbranch<mode>4): Rename to cbranch<mode>4_insn.
(define_peephole): Add dead_or_set_regno_p(insn,REG_CC) as needed.
(define_deephole2): Add peep2_regno_dead_p(*,REG_CC) as needed.
Add new RTL peepholes for decrement-and-branch and *swapped_tst<mode>.
Rework signtest-and-branch peepholes for *sbrx_branch<mode>.
(adjust_len) [add_set_ZN, cmp_zext]: New.
(QIPSI): New mode iterator.
(ALLs1, ALLs2, ALLs4, ALLs234): New mode iterators.
(gelt): New code iterator.
(gelt_eqne): New code attribute.
(rvbranch, *rvbranch, difficult_rvbranch, *difficult_rvbranch)
(branch_unspec, *negated_tst<mode>, *reversed_tst<mode>)
(*cmpqi_sign_extend): Remove insns.
(define_c_enum "unspec") [UNSPEC_IDENTITY]: Remove.
* config/avr/avr-dimode.md (cbranch<mode>4): Canonicalize comparisons.
* config/avr/predicates.md (scratch_or_d_register_operand): New.
* config/avr/constraints.md (Yxx): New constraint.
gcc/testsuite/
PR target/109650
Backport from 2023-05-10 master r14-1688.
* gcc.target/avr/torture/pr109650-1.c: New test.
* gcc.target/avr/torture/pr109650-2.c: New test.
|
|
|
|
So for the attached testcase, we assumed that zero_one_valued_p would
be the value [0,1] but currently zero_one_valued_p matches also
signed 1 bit integers.
This changes that not to match that and fixes the 2 new testcases at
all optimization levels.
OK for GCC 13? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/110165
PR tree-optimization/110166
gcc/ChangeLog:
* match.pd (zero_one_valued_p): Don't accept
signed 1-bit integers.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/pr110165-1.c: New test.
* gcc.c-torture/execute/pr110166-1.c: New test.
(cherry picked from commit 72e652f3425079259faa4edefe1dc571f72f91e0)
|
|
Merge up to r13-7433-gb6118b8155a679ced926e8ff900e0ed969cd23a7 (9th June 2023)
|
|
One of the testcases lacked variables in a map clause such that
the fail occurred too early. Additionally, it would have failed
for all those non-host devices where 'present' is always true, i.e.
non-host devices which can access all of the host memory
(shared-memory devices). [There are currently none.]
The commit now runs the code on all devices, which should succeed
for host fallback and for shared-memory devices, finding potenial issues
that way. Additionally, a checkpoint (required stdout output) is used
to ensure that the execution won't fail (with the same error) before
reaching the expected fail location.
2023-06-07 Thomas Schwinge <thomas@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
libgomp/
* testsuite/libgomp.c-c++-common/target-present-1.c: Run code
also for non-offload_device targets; check that it runs
successfully for those and for all until a checkpoint for all
* testsuite/libgomp.c-c++-common/target-present-2.c: Likewise.
* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
* testsuite/libgomp.fortran/target-present-2.f90: Likewise;
add missing vars to map clause.
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
(cherry picked from commit dd958667821e38b7d6b8efe448044901b4762b3a)
|
|
This implements support for the OpenMP 5.1 'present' modifier, which can be
used in map clauses in the 'target', 'target data', 'target data enter' and
'target data exit' constructs, and in the 'to' and 'from' clauses of the
'target update' construct. It is also supported in defaultmap.
The modifier triggers a fatal runtime error if the data specified by the
clause is not already present on the target device. It can also be combined
with 'always' in map clauses.
2023-06-06 Kwok Cheung Yeung <kcy@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
gcc/c/
* c-parser.cc (c_parser_omp_clause_defaultmap,
c_parser_omp_clause_map): Parse 'present'.
(c_parser_omp_clause_to, c_parser_omp_clause_from): Remove.
(c_parser_omp_clause_from_to): New; parse to/from clauses with
optional present modifer.
(c_parser_omp_all_clauses): Update call.
(c_parser_omp_target_data, c_parser_omp_target_enter_data,
c_parser_omp_target_exit_data): Handle new map enum values
for 'present' mapping.
gcc/cp/
* parser.cc (cp_parser_omp_clause_defaultmap,
cp_parser_omp_clause_map): Parse 'present'.
(cp_parser_omp_clause_from_to): New; parse to/from
clauses with optional 'present' modifier.
(cp_parser_omp_all_clauses): Update call.
(cp_parser_omp_target_data, cp_parser_omp_target_enter_data,
cp_parser_omp_target_exit_data): Handle new enum value for
'present' mapping.
* semantics.cc (finish_omp_target): Likewise.
gcc/fortran/
* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
modifier.
(show_omp_clauses): Display 'present' motion modifier for 'to'
and 'from' clauses.
* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
modifiers.
(struct gfc_omp_namelist): Add 'present_modifer'.
* openmp.cc (gfc_match_motion_var_list): New, handles optional
'present' modifier for to/from clauses.
(gfc_match_omp_clauses): Call it for to/from clauses; parse 'present'
in defaultmap and map clauses.
(resolve_omp_clauses): Allow 'present' modifiers on 'target',
'target data', 'target enter' and 'target exit' directives.
* trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers
to tree node for 'map', 'to' and 'from' clauses. Apply 'present' for
defaultmap.
gcc/
* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
set.
(omp_get_attachment): Handle map clauses with 'present' modifier.
(omp_group_base): Likewise.
(gimplify_scan_omp_clauses): Reorder present maps to come first.
Set GOVD flags for present defaultmaps.
(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
clauses.
(lower_omp_target): Handle map clauses with 'present' modifier.
Handle 'to' and 'from' clauses with 'present'.
* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
'from' clauses with 'present' modifier. Handle present defaultmap.
* tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define.
include/
* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
(GOMP_MAP_FLAG_FORCE): Redefine.
(GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
(enum gomp_map_kind): Add map kinds with 'present' modifiers.
(GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for
map variants with 'present'
(GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true
for map variants with 'always, present' modifiers.
(GOMP_MAP_ALWAYS): Redefine.
(GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New.
libgomp/
* libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for
defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses.
* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
modifier.
(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
(gomp_map_vars_internal, gomp_update, gomp_target_rev):
Emit runtime error if memory region not present.
* testsuite/libgomp.c-c++-common/target-present-1.c: New test.
* testsuite/libgomp.c-c++-common/target-present-2.c: New test.
* testsuite/libgomp.c-c++-common/target-present-3.c: New test.
* testsuite/libgomp.fortran/target-present-1.f90: New test.
* testsuite/libgomp.fortran/target-present-2.f90: New test.
* testsuite/libgomp.fortran/target-present-3.f90: New test.
gcc/testsuite/
* c-c++-common/gomp/map-6.c: Update dg-error, extend to test for
duplicated 'present' and extend scan-dump tests for 'present'.
* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
* gfortran.dg/gomp/map-7.f90: Extend parse and dump test for
'present'.
* gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present'
modifier checking.
* c-c++-common/gomp/defaultmap-4.c: New test.
* c-c++-common/gomp/map-9.c: New test.
* c-c++-common/gomp/target-update-1.c: New test.
* gfortran.dg/gomp/defaultmap-8.f90: New test.
* gfortran.dg/gomp/map-11.f90: New test.
* gfortran.dg/gomp/map-12.f90: New test.
* gfortran.dg/gomp/target-update-1.f90: New test.
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
(cherry picked from commit 4ede915d5dde935a16df2c6640aee5ab22348d30)
|
|
When folding two conversions in a row we use TYPE_PRECISION but
that's invalid for VECTOR_TYPE. The following fixes this by
using element_precision instead.
middle-end/110182
* match.pd (two conversions in a row): Use element_precision
to DTRT for VECTOR_TYPE.
(cherry picked from commit 3e12669a0eb968cfcbe9242b382fd8020935edf8)
|
|
This bug was essentially that darwin_rs6000_special_round_type_align()
was ignoring externally-imposed capping of field alignment.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR target/110044
gcc/ChangeLog:
* config/rs6000/rs6000.cc (darwin_rs6000_special_round_type_align):
Make sure that we do not have a cap on field alignment before altering
the struct layout based on the type alignment of the first entry.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/darwin-abi-13-0.c: New test.
* gcc.target/powerpc/darwin-abi-13-1.c: New test.
* gcc.target/powerpc/darwin-abi-13-2.c: New test.
* gcc.target/powerpc/darwin-structs-0.h: New test.
(cherry picked from commit 84d080a29a780973bef47171ba708ae2f7b4ee47)
|
|
The pr96024.f90 testcase ICEs on big-endian hosts. The problem is
that length->val.integer is accessed after checking
length->expr_type == EXPR_CONSTANT, but it is a CHARACTER constant
which uses length->val.character union member instead and on big-endian
we end up reading constant 0x100000000 rather than some small number
on little-endian and if target doesn't have enough memory for 4 times
that (i.e. 16GB allocation), it ICEs.
2023-06-09 Jakub Jelinek <jakub@redhat.com>
PR fortran/96024
* primary.cc (gfc_convert_to_structure_constructor): Only do
constant string ctor length verification and truncation/padding
if constant length has INTEGER type.
(cherry picked from commit 4cf6e322adc19f927859e0a5edfa93cec4b8c844)
|
|
Since mask < 0 will be always false for vector char when
-funsigned-char, but vpblendvb needs to check the most significant
bit. The patch explicitly VCE to vector signed char.
gcc/ChangeLog:
PR target/110108
* config/i386/i386.cc (ix86_gimple_fold_builtin): Explicitly
view_convert_expr mask to signed type when folding pblendvb
builtins.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110108-2.c: New test.
|
|
|
|
As the PR says we shouldn't be using qualifier_unsigned for the return type of the __ssat intrinsics.
UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS already exists for that.
This was just a thinko.
This patch fixes this and the warning with -Wconversion goes away.
Bootstrapped and tested on arm-none-linux-gnueabihf.
gcc/ChangeLog:
PR target/109939
* config/arm/arm-builtins.cc (SAT_BINOP_UNSIGNED_IMM_QUALIFIERS): Use
qualifier_none for the return operand.
gcc/testsuite/ChangeLog:
PR target/109939
* gcc.target/arm/pr109939.c: New test.
(cherry picked from commit 95542a6ec4b350c653b793b7c36a8210b0e9a89d)
|
|
|
|
PR106907 has few warnings spotted from cppcheck. In that addressing duplicate
expression issue here. Here the same expression is used twice in logical
AND(&&) operation which result in same result so removing that.
2023-06-06 Jeevitha Palanisamy <jeevitha@linux.ibm.com>
gcc/
PR target/106907
* config/rs6000/rs6000.cc (vec_const_128bit_to_bytes): Remove
duplicate expression.
(cherry picked from commit c4deccd44655c5d748dfed200a37f2b678c32fe8)
|
|
This reverts commit 6e3816fa47c56d87d21d90f67435a1ab1664d8db.
which then permits to apply the mainline patch of it more cleanly.
|
|
This reverts commit f719ab9a3ac51d798b012a5ab7757af2b81b4ae2.
in order to revert
commit 6e3816fa47c openmp: Add support for the 'present' modifier
which then permits to apply the mainline patches of those more cleanly.
|
|
In r11-966-g9a182ef9ee011935d827ab5c6c9a7cd8e22257d8 we introduce a
simplification to emit_move_insn that attempts to simplify moves of the form:
(set (subreg:M1 (reg:M2 ...)) (constant C))
where M1 and M2 are of equal mode size. That is problematic for the splitter
vfp.md:no_literal_pool_df_immediate in the arm backend, which tries to pun an
lvalue DFmode pseudo into DImode and assign a constant to it with
emit_move_insn, as the new transformation simply undoes this, and we end up
splitting indefinitely.
This patch changes things around in the arm backend so that we use a
DImode temporary (instead of DFmode) and first load the DImode constant
into the pseudo, and then pun the pseudo into DFmode as an rvalue in a
reg -> reg move. I believe this should be semantically equivalent but
avoids the pathalogical behaviour seen in the PR.
gcc/ChangeLog:
PR target/109800
* config/arm/arm.md (movdf): Generate temporary pseudo in DImode
instead of DFmode.
* config/arm/vfp.md (no_literal_pool_df_immediate): Rather than punning an
lvalue DFmode pseudo into DImode, use a DImode pseudo and pun it into
DFmode as an rvalue.
gcc/testsuite/ChangeLog:
PR target/109800
* gcc.target/arm/pure-code/pr109800.c: New test.
(cherry picked from commit f5298d9969b4fa34ff3aecd54b9630e22b2984a5)
|
|
Effectively, for GCN (as for nvptx) there is a common address space between
host and device, whether being accessible or not. Thus, this commit
permits to use 'omp requires unified_address' with GCN devices.
(nvptx accepts this requirement since r13-3460-g131d18e928a3ea.)
OG13 note: unified_address was supported before for GCN via
OG13 commit 3ddf3565fae amdgcn: libgomp plugin USM implementation
Thus, this cherry pick just uses the mainline/GCC14 syntax in the C file;
additionally, it updates the documentation.
libgomp/
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Regard
unified_address requirement as supported.
* libgomp.texi (OpenMP 5.0, AMD Radeon, nvptx): Remove
'unified_address' from the not-supported requirements.
(cherry picked from commit f1af7d65ff64fe7102d1490ef46ea491a533e641)
|
|
|
|
The monadic operations in std::expected always check has_value() so we
can avoid the execptional path in value() and the assertions in error()
by accessing _M_val and _M_unex directly. This means that the monadic
operations no longer require _M_unex to be copyable so that it can be
thrown from value(), as modified by LWG 3938.
This also fixes two incorrect uses of std::move in transform(F&&)& and
transform(F&&) const& which I found while making these changes.
Now that move-only error types are supported, it's possible to properly
test the constraints that LWG 3877 added to and_then and transform. The
lwg3877.cc test now does that.
libstdc++-v3/ChangeLog:
* include/std/expected (expected::and_then, expected::or_else)
(expected::transform_error): Use _M_val and _M_unex instead of
calling value() and error(), as per LWG 3938.
(expected::transform): Likewise. Remove incorrect std::move
calls from lvalue overloads.
(expected<void, E>::and_then, expected<void, E>::or_else)
(expected<void, E>::transform): Use _M_unex instead of calling
error().
* testsuite/20_util/expected/lwg3877.cc: Add checks for and_then
and transform, and for std::expected<void, E>.
* testsuite/20_util/expected/lwg3938.cc: New test.
(cherry picked from commit fe94f8b7e022b7e154f6c47cc292d4463bddac5e)
|
|
This was approved in Issaquah 2023. As well as fixing the value
categories, this fixes the fact that we were incorrectly testing E
instead of T in the or_else constraints.
libstdc++-v3/ChangeLog:
* include/std/expected (expected::and_then, expected::or_else)
(expected::transform, expected::transform_error): Fix exception
specifications as per LWG 3877.
(expected<void, E>::and_then, expected<void, E>::transform):
Likewise.
* testsuite/20_util/expected/lwg3877.cc: New test.
(cherry picked from commit ba490492e51834db645a3165d14f2ba0af62a8c7)
|
|
On sh target, there is a MULTILIB_DIRNAMES (or is it MULTILIB_OPTIONS) named m2,
this conflicts with the langauge m2. So when you do a `make clean`, it will remove
the m2 directory and then a build will fail. Now since r0-78222-gfa9585134f6f58,
the multilib directories are no longer created in the gcc directory as libgcc
was moved to the toplevel. So we can remove the part of clean that removes those
directories.
Tested on x86_64-linux-gnu and a cross to sh-elf that `make clean` followed by
`make` works again.
Committed as approved.
gcc/ChangeLog:
PR bootstrap/110085
* Makefile.in (clean): Remove the removing of
MULTILIB_DIR/MULTILIB_OPTIONS directories.
(cherry picked from commit afd87299cefd021daf0158d5b6276c37013996b9)
|
|
The size reported by stat is always zero for some special files such as
those under /proc, which means the current copy_file implementation
thinks there is nothing to copy. Instead of trusting the stat value, try
to read a character from a streambuf and check for EOF.
For the backport, we also need to avoid trying to use sendfile when stat
reports a zero size, so that we use streambufs to copy the file.
libstdc++-v3/ChangeLog:
PR libstdc++/108178
* src/filesystem/ops-common.h (do_copy_file): Check for empty
files by trying to read a character.
* testsuite/27_io/filesystem/operations/copy_file_108178.cc:
New test.
(cherry picked from commit 07a0e108247f23fcb919c61595adae143f1ea02a)
|
|
libstdc++-v3/ChangeLog:
* src/filesystem/ops-common.h (do_copy_file) [O_CLOEXEC]: Set
close-on-exec flag on file descriptors.
(cherry picked from commit 7e8e071c4b64f1b6ea5ddf528724fc793a0f0e36)
|
|
For 32-bit targets using -pedantic (or using Clang) makes the expression
_M_elems[0] ambiguous. The overloaded operator[] that we want to call
has a size_t parameter, but 0 is type ptrdiff_t for many ILP32 targets,
so using the implicit conversion from _M_elems to T* and then
subscripting that is also viable.
Change the 0 to (size_type)0 and also make the conversion to T*
explicit, so that's it's not viable here. The latter change requires a
static_cast in data() where we really do want to convert _M_elems to a
pointer.
libstdc++-v3/ChangeLog:
PR libstdc++/110139
* include/std/array (__array_traits<T, 0>::operator T*()): Make
conversion operator explicit.
(array::front): Use size_type as subscript operand.
(array::data): Use static_cast to make conversion explicit.
* testsuite/23_containers/array/element_access/110139.cc: New
test.
(cherry picked from commit 56001fad4ecc32396beead6644906e3846244b67)
|
|
It is not required that codecvt<char8_t, char, mbstate_t> facet be
supported by the locale, nor is it added as part of the default locale.
This can lead to dangerous behaviour when static_cast.
libstdc++-v3/ChangeLog:
* include/bits/locale_classes.tcc: Remove check for
codecvt<char8_t, char, mbstate_t> facet.
(cherry picked from commit 3d9b3ddb5fc9087c17645d53e6bcb1881e1955a4)
|