Age | Commit message (Collapse) | Author | Files | Lines |
|
Epilogue vectorization is not set up to re-use a vectorized
accumulator consisting of more than one vector. For non-SLP
we always reduce to a single but for SLP that isn't happening.
In such case we currenlty miscompile the epilog so avoid this.
PR tree-optimization/107160
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Do not register accumulator if we failed to reduce it
to a single vector.
* gcc.dg/vect/pr107160.c: New testcase.
(cherry picked from commit 5cbaf84c191b9a3e3cb26545c808d208bdbf2ab5)
|
|
The following fixes an unintended(?) side-effect of the special
MODIFY_EXPR expression entries we add for tail-merging during VN.
We shouldn't value-number the virtual operand differently here.
PR tree-optimization/107107
* tree-ssa-sccvn.cc (visit_reference_op_store): Do not
affect value-numbering when doing the tail merging
MODIFY_EXPR lookup.
* gcc.dg/pr107107.c: New testcase.
(cherry picked from commit 85333b9265720fc4e49397301cb16324d2b89aa7)
|
|
The following extends the skipping of same valued stores to
handle an arbitrary number of them as long as they are from the
same value (which we now record). That's an obvious extension
which allows to optimize the m_engaged member of std::optional
more reliably.
PR tree-optimization/106922
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Allow
an arbitrary number of same valued skipped stores.
* g++.dg/torture/pr106922.C: New testcase.
(cherry picked from commit af611afe5fcc908a6678b5b205fb5af7d64fbcb2)
|
|
On Thu, Sep 22, 2022 at 01:10:08PM +0200, Richard Biener via Gcc-patches wrote:
> * g++.dg/tree-ssa/pr106922.C: Adjust.
> --- a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
> +++ b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
> @@ -87,5 +87,4 @@ void testfunctionfoo() {
> }
> }
>
> -// { dg-final { scan-tree-dump-times "Found fully redundant value" 4 "pre" { xfail { ! lp64 } } } }
> -// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" { xfail { ! lp64 } } } }
> +// { dg-final { scan-tree-dump-not "m_initialized" "dce3" } }
I've noticed
+UNRESOLVED: g++.dg/tree-ssa/pr106922.C -std=gnu++20 scan-tree-dump-not dce3 "m_initialized"
+UNRESOLVED: g++.dg/tree-ssa/pr106922.C -std=gnu++2b scan-tree-dump-not dce3 "m_initialized"
with this change, both on x86_64 and i686.
The dump is still cddce3, additionally as the last reference to the pre
dump is gone, not sure it is worth creating that dump.
With the following patch, there aren't FAILs nor UNRESOLVED tests with
GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ RUNTESTFLAGS="--target_board=unix\{-m32,-m64\} dg.exp='pr106922.C'"
2022-09-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/106922
* g++.dg/tree-ssa/pr106922.C: Scan in cddce3 dump rather than
dce3. Remove -fdump-tree-pre-details from dg-options.
(cherry picked from commit a0de11d0d22054b6fd76a0730a3ec807542379d0)
|
|
The following enhances the store-with-same-value trick in
vn_reference_lookup_3 by not only looking for
a = val;
*ptr = val;
.. = a;
but also
*ptr = val;
other = x;
.. = a;
where the earlier store is more than one hop away. It does this
by queueing the actual value to compare until after the walk but
as disadvantage only allows a single such skipped store from a
constant value.
Unfortunately we cannot handle defs from non-constants this way
since we're prone to pick up values from the past loop iteration
this way and we have no good way to identify values that are
invariant in the currently iterated cycle. That's why we keep
the single-hop lookup for those cases. gcc.dg/tree-ssa/pr87126.c
would be a testcase that's un-XFAILed when we'd handle those
as well.
PR tree-optimization/106922
* tree-ssa-sccvn.cc (vn_walk_cb_data::same_val): New member.
(vn_walk_cb_data::finish): Perform delayed verification of
a skipped may-alias.
(vn_reference_lookup_pieces): Likewise.
(vn_reference_lookup): Likewise.
(vn_reference_lookup_3): When skipping stores of the same
value also handle constant stores that are more than a
single VDEF away by delaying the verification.
* gcc.dg/tree-ssa/ssa-fre-100.c: New testcase.
* g++.dg/tree-ssa/pr106922.C: Adjust.
(cherry picked from commit 9baee6181b4e427e0b5ba417e51424c15858dce7)
|
|
For example, for "g++-4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4", the recent
commit r13-3220-g45381d6f9f4e7b5c7b062f5ad8cc9788091c2d07
"amdgcn: add multiple vector sizes" broke the build:
In file included from [...]/source-gcc/gcc/coretypes.h:458:0,
from [...]/source-gcc/gcc/config/gcn/gcn.cc:24:
[...]/source-gcc/gcc/config/gcn/gcn.cc: In function ‘machine_mode VnMODE(int, machine_mode)’:
./insn-modes.h:42:71: error: temporary of non-literal type ‘scalar_int_mode’ in a constant expression
#define QImode (scalar_int_mode ((scalar_int_mode::from_int) E_QImode))
^
[...]/source-gcc/gcc/config/gcn/gcn.cc:405:10: note: in expansion of macro ‘QImode’
case QImode:
^
In file included from [...]/source-gcc/gcc/coretypes.h:478:0,
from [...]/source-gcc/gcc/config/gcn/gcn.cc:24:
[...]/source-gcc/gcc/machmode.h:410:7: note: ‘scalar_int_mode’ is not literal because:
class scalar_int_mode
^
[...]/source-gcc/gcc/machmode.h:410:7: note: ‘scalar_int_mode’ is not an aggregate, does not have a trivial default constructor, and has no constexpr constructor that is not a copy or move constructor
[...]
Addressing this like simiar issues have been addressed in the past.
gcc/
* config/gcn/gcn.cc (VnMODE): Use 'case E_QImode:' instead of
'case QImode:', etc.
(cherry picked from commit 612de72b0d2904b5a5a2b487ce4cb907c768a947)
|
|
'libgomp.c/reverse-offload-sm30.c'
That is, '-mptx=_' is only valid in '-foffload-options=nvptx-none', too.
Fix test case added in recent
commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 "nvptx/mkoffload.cc:
Warn instead of error when reverse offload is not possible".
libgomp/
* testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific
'-foffload-options' syntax.
(cherry picked from commit b61796663ba1fe8fb83203829398f3f89ec212b7)
|
|
|
|
|
|
|
|
This patch prevents compiler-generated artificial variables from being
treated as privatization candidates for OpenACC.
The rationale is that e.g. "gang-private" variables actually must be
shared by each worker and vector spawned within a particular gang, but
that sharing is not necessary for any compiler-generated variable (at
least at present, but no such need is anticipated either). Variables on
the stack (and machine registers) are already private per-"thread"
(gang, worker and/or vector), and that's fine for artificial variables.
Several tests need their scan output patterns adjusted to compensate.
2022-10-14 Julian Brown <julian@codesourcery.com>
gcc/
* omp-low.cc (oacc_privatization_candidate_p): Artificial vars are not
privatization candidates.
libgomp/
* testsuite/libgomp.oacc-fortran/declare-1.f90: Adjust scan output.
* testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise.
* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/print-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
|
|
The GCN backend uses a heuristic to determine whether to use FLAT or
GLOBAL addressing in a particular (offload) function: namely, if a
function takes a pointer-to-scalar parameter, it is assumed that the
pointer may refer to "flat scratch" space, and thus FLAT addressing must
be used instead of GLOBAL.
I came up with this heuristic initially whilst working on support for
moving OpenACC gang-private variables into local-data share (scratch)
memory. The assumption that only scalar variables would be transformed in
that way turned out to be wrong. For example, prior to the next patch in
the series, Fortran compiler-generated temporary structures were treated
as gang private and moved to LDS space, typically overflowing the region
allocated for such variables. That will no longer happen after that
patch is applied, but there may be other cases of structs moving to LDS
space now or in the future that this patch may be needed for.
2022-10-14 Julian Brown <julian@codesourcery.com>
gcc/
* config/gcn/gcn.cc (gcn_detect_incoming_pointer_arg): Any pointer
argument forces FLAT addressing mode, not just
pointer-to-non-aggregate.
|
|
This is the infamous PR rtl-optimization/38644 rearing its ugly head for
leaf functions on SPARC more than a decade later... Richard E.'s generic
solution has never been implemented so let's do as other RISC back-ends did.
gcc/
PR target/107248
* config/sparc/sparc.cc (sparc_expand_prologue): Emit a frame
blockage for leaf functions.
(sparc_flat_expand_prologue): Emit frame instead of full blockage.
(sparc_expand_epilogue): Emit a frame blockage for leaf functions.
(sparc_flat_expand_epilogue): Emit frame instead of full blockage.
|
|
|
|
Since r12-8066, in cxx_eval_vec_init we perform expand_vec_init_expr
while processing the default argument in this test. At this point
start_preparsed_function hasn't yet set current_function_decl.
expand_vec_init_expr then leads to maybe_splice_retval_cleanup which
checks DECL_CONSTRUCTOR_P (current_function_decl) without checking that
c_f_d is non-null first. It seems correct that c_f_d is null here, so
it seems to me that maybe_splice_retval_cleanup should check c_f_d as
in the following patch.
PR c++/106925
gcc/cp/ChangeLog:
* except.cc (maybe_splice_retval_cleanup): Check current_function_decl.
Make the bool const.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-defarg3.C: New test.
(cherry picked from commit 3130e70dab1e64a7b014391fe941090d5f3b6b7d)
|
|
gcc/
* doc/install.texi (Specific): Add missing items to bullet list.
(amdgcn): Update LLVM requirements, use version not date for newlib.
(nvptx): Use version not git hash for newlib.
(cherry picked from commit e886ebd17965d78f609b62479f4f48085108389c)
|
|
|
|
For actual arguments whose dummy is INTENT(OUT), we used to generate
clobbers on them at the same time we generated the argument reference
for the function call. This was wrong if for an argument coming
later, the value expression was depending on the value of the just-
clobbered argument, and we passed an undefined value in that case.
With this change, clobbers are collected separatedly and appended
to the procedure call preliminary code after all the arguments have been
evaluated.
PR fortran/106817
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_conv_procedure_call): Collect all clobbers
to their own separate block. Append the block of clobbers to
the procedure preliminary block after the argument evaluation
codes for all the arguments.
gcc/testsuite/ChangeLog:
* gfortran.dg/intent_optimize_4.f90: New test.
(cherry picked from commit 29919bf3b6449bafd02e795abbb1966e3990c1fc)
|
|
The fortran frontend, as result symbol for a function without
declared result symbol, uses the function symbol itself. This caused
an invalid clobber of a function decl to be emitted, leading to an
ICE, whereas the intended behaviour was to clobber the function result
variable. This change fixes the problem by getting the decl from the
just-retrieved variable reference after the call to
gfc_conv_expr_reference, instead of copying it from the frontend symbol.
PR fortran/105012
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_conv_procedure_call): Retrieve variable
from the just calculated variable reference.
gcc/testsuite/ChangeLog:
* gfortran.dg/intent_out_15.f90: New test.
(cherry picked from commit edaf1e005c90b311c39b46d85cea17befbece112)
|
|
This change inlines the clobber generation code from
gfc_conv_expr_reference to the single caller from where the add_clobber
flag can be true, and removes the add_clobber argument.
What motivates this is the standard making the procedure call a cause
for a variable to become undefined, which translates to a clobber
generation, so clobber generation should be closely related to procedure
call generation, whereas it is rather orthogonal to variable reference
generation. Thus the generation of the clobber feels more appropriate
in gfc_conv_procedure_call than in gfc_conv_expr_reference.
Behaviour remains unchanged.
gcc/fortran/ChangeLog:
* trans.h (gfc_conv_expr_reference): Remove add_clobber
argument.
* trans-expr.cc (gfc_conv_expr_reference): Ditto. Inline code
depending on add_clobber and conditions controlling it ...
(gfc_conv_procedure_call): ... to here.
(cherry picked from commit 2b393f6f83903cb836676bbd042c1b99a6e7e6f7)
|
|
The function was taken away by the "add multiple vector sizes" patch.
2022-10-11 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn.cc (gcn_expand_builtin_1): Change gcn_full_exec_reg
to get_exec.
|
|
The testsuite needs a few tweaks following my patches to add multiple vector
sizes for amdgcn.
gcc/testsuite/ChangeLog:
* gcc.dg/pr104464.c: Xfail on amdgcn.
* gcc.dg/signbit-2.c: Likewise.
* gcc.dg/signbit-5.c: Likewise.
* gcc.dg/vect/bb-slp-68.c: Likewise.
* gcc.dg/vect/bb-slp-cond-1.c: Change expectations on amdgcn.
* gcc.dg/vect/bb-slp-subgroups-3.c: Likewise.
* gcc.dg/vect/no-vfa-vect-depend-2.c: Change expectations for multiple
vector sizes.
* gcc.dg/vect/pr33953.c: Likewise.
* gcc.dg/vect/pr65947-12.c: Likewise.
* gcc.dg/vect/pr65947-13.c: Likewise.
* gcc.dg/vect/pr80631-2.c: Likewise.
* gcc.dg/vect/slp-reduc-4.c: Likewise.
* gcc.dg/vect/trapv-vect-reduc-4.c: Likewise.
* lib/target-supports.exp (available_vector_sizes): Add more sizes
for amdgcn.
|
|
Another example of the vectorizer needing explicit insns where the scalar
expander just works.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (neg<mode>2): New define_expand.
|
|
Implements vec_init when the input is a vector of smaller vectors, or of
vector MEM types, or a smaller vector duplicated several times.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (vec_init<V_ALL:mode><V_ALL_ALT:mode>): New.
* config/gcn/gcn.cc (GEN_VN): Add andvNsi3, subvNsi3.
(GEN_VNM): Add gathervNm_expr.
(GEN_VN_NOEXEC): Add vec_seriesvNsi.
(gcn_expand_vector_init): Add initialization of vectors from smaller
vectors.
|
|
Add vec_extract expanders for all valid pairs of vector types.
gcc/ChangeLog:
* config/gcn/gcn-protos.h (get_exec): Add prototypes for two variants.
* config/gcn/gcn-valu.md
(vec_extract<V_ALL:mode><V_ALL_ALT:mode>): New define_expand.
* config/gcn/gcn.cc (get_exec): Export the existing function. Add a
new overload variant.
|
|
GET_MODE_NUNITS isn't a compile time constant, so we end up with many
impossible insns in the machine description. Adding MODE_VF allows the insns
to be eliminated completely.
gcc/ChangeLog:
* config/gcn/gcn-valu.md
(<cvt_name><VCVT_MODE:mode><VCVT_FMODE:mode>2<exec>): Use MODE_VF.
(<cvt_name><VCVT_FMODE:mode><VCVT_IMODE:mode>2<exec>): Likewise.
* config/gcn/gcn.h (MODE_VF): New macro.
|
|
The vectors sizes are simulated using implicit masking, but they make life
easier for the autovectorizer and SLP passes.
gcc/ChangeLog:
* config/gcn/gcn-modes.def (VECTOR_MODE): Add new modes
V32QI, V32HI, V32SI, V32DI, V32TI, V32HF, V32SF, V32DF,
V16QI, V16HI, V16SI, V16DI, V16TI, V16HF, V16SF, V16DF,
V8QI, V8HI, V8SI, V8DI, V8TI, V8HF, V8SF, V8DF,
V4QI, V4HI, V4SI, V4DI, V4TI, V4HF, V4SF, V4DF,
V2QI, V2HI, V2SI, V2DI, V2TI, V2HF, V2SF, V2DF.
(ADJUST_ALIGNMENT): Likewise.
* config/gcn/gcn-protos.h (gcn_full_exec): Delete.
(gcn_full_exec_reg): Delete.
(gcn_scalar_exec): Delete.
(gcn_scalar_exec_reg): Delete.
(vgpr_1reg_mode_p): Use inner mode to identify vector registers.
(vgpr_2reg_mode_p): Likewise.
(vgpr_vector_mode_p): Use VECTOR_MODE_P.
* config/gcn/gcn-valu.md (V_QI, V_HI, V_HF, V_SI, V_SF, V_DI, V_DF,
V_QIHI, V_1REG, V_INT_1REG, V_INT_1REG_ALT, V_FP_1REG, V_2REG, V_noQI,
V_noHI, V_INT_noQI, V_INT_noHI, V_ALL, V_ALL_ALT, V_INT, V_FP):
Add additional vector modes.
(V64_SI, V64_DI, V64_ALL, V64_FP): New iterators.
(scalar_mode, SCALAR_MODE, vnsi, VnSI, vndi, VnDI, sdwa):
Add additional vector mode mappings.
(mov<mode>): Implement vector length conversions.
(ldexp<mode>3<exec>): Use VnSI.
(frexp<mode>_exp2<exec>): Likewise.
(VCVT_MODE, VCVT_FMODE, VCVT_IMODE): Add additional vector modes.
(reduc_<reduc_op>_scal_<mode>): Use V64_ALL.
(fold_left_plus_<mode>): Use V64_FP.
(*<reduc_op>_dpp_shr_<mode>): Use V64_1REG.
(*<reduc_op>_dpp_shr_<mode>): Use V64_DI.
(*plus_carry_dpp_shr_<mode>): Use V64_INT_1REG.
(*plus_carry_in_dpp_shr_<mode>): Use V64_SI.
(*plus_carry_dpp_shr_<mode>): Use V64_DI.
(mov_from_lane63_<mode>): Use V64_2REG.
* config/gcn/gcn.cc (VnMODE): New function.
(gcn_can_change_mode_class): Support multiple vector sizes.
(gcn_modes_tieable_p): Likewise.
(gcn_operand_part): Likewise.
(gcn_scalar_exec): Delete function.
(gcn_scalar_exec_reg): Delete function.
(gcn_full_exec): Delete function.
(gcn_full_exec_reg): Delete function.
(gcn_inline_fp_constant_p): Support multiple vector sizes.
(gcn_fp_constant_p): Likewise.
(A): New macro.
(GEN_VN_NOEXEC): New macro.
(GEN_VNM_NOEXEC): New macro.
(GEN_VN): New macro.
(GEN_VNM): New macro.
(GET_VN_FN): New macro.
(CODE_FOR): New macro.
(CODE_FOR_OP): New macro.
(gen_mov_with_exec): Delete function.
(gen_duplicate_load): Delete function.
(gcn_expand_vector_init): Support multiple vector sizes.
(strided_constant): Likewise.
(gcn_addr_space_legitimize_address): Likewise.
(gcn_expand_scalar_to_vector_address): Likewise.
(gcn_expand_scaled_offsets): Likewise.
(gcn_secondary_reload): Likewise.
(gcn_valid_cvt_p): Likewise.
(gcn_expand_builtin_1): Likewise.
(gcn_make_vec_perm_address): Likewise.
(gcn_vectorize_vec_perm_const): Likewise.
(gcn_vector_mode_supported_p): Likewise.
(gcn_autovectorize_vector_modes): New hook.
(gcn_related_vector_mode): Support multiple vector sizes.
(gcn_expand_dpp_shr_insn): Add FIXME comment.
(gcn_md_reorg): Support multiple vector sizes.
(print_reg): Likewise.
(print_operand): Likewise.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): New hook.
|
|
Add a vector length parameter needed by amdgcn without breaking aarch64.
All amdgcn vector masks are DImode, regardless of vector length, so we can't
tell what length is implied simply from the operator mode. (Even if we used
different integer modes there's no mode small enough to differenciate a 2 or
4 lane mask). Without knowing the intended length we end up using a mask with
too many lanes enabled, which leads to undefined behaviour..
The extra operand is not added for vector mask types so AArch64 does not need
to be adjusted.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (while_ultsidi): Limit mask length using
operand 3.
* doc/md.texi (while_ult): Document new operand 3 usage.
* internal-fn.cc (expand_while_optab_fn): Set operand 3 when lhs_type
maps to a non-vector mode.
|
|
|
|
Several MVE builtins incorrectly use the same predicate/constraint
pair for several modes, which does not match the specification.
This patch uses the appropriate iterator instead.
2022-09-06 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/mve.md (mve_vqshluq_n_s<mode>): Use
MVE_pred/MVE_constraint instead of mve_imm_7/Ra.
(mve_vqshluq_m_n_s<mode>): Likewise.
(mve_vqrshrnbq_n_<supf><mode>): Use MVE_pred3/MVE_constraint3
instead of mve_imm_8/Rb.
(mve_vqrshrunbq_n_s<mode>): Likewise.
(mve_vqrshrntq_n_<supf><mode>): Likewise.
(mve_vqrshruntq_n_s<mode>): Likewise.
(mve_vrshrnbq_n_<supf><mode>): Likewise.
(mve_vrshrntq_n_<supf><mode>): Likewise.
(mve_vqrshrnbq_m_n_<supf><mode>): Likewise.
(mve_vqrshrntq_m_n_<supf><mode>): Likewise.
(mve_vrshrnbq_m_n_<supf><mode>): Likewise.
(mve_vrshrntq_m_n_<supf><mode>): Likewise.
(mve_vqrshrunbq_m_n_s<mode>): Likewise.
(mve_vsriq_n_<supf><mode): Use MVE_pred2/MVE_constraint2 instead
of mve_imm_selective_upto_8/Rg.
(mve_vsriq_m_n_<supf><mode>): Likewise.
(cherry-picked from c3fb6658c7670e446f2fd00984404d971e416b3c)
|
|
The following avoids creating BIT_FIELD_REF of bitfields in
update-address-taken. The patch doesn't implement punning to
a full precision integer type but leaves a comment according to
that.
PR tree-optimization/106934
* tree-ssa.cc (non_rewritable_mem_ref_base): Avoid BIT_FIELD_REFs
of bitfields.
(maybe_rewrite_mem_ref_base): Likewise.
* gfortran.dg/pr106934.f90: New testcase.
(cherry picked from commit 05f5c42cb42c5088187d44cc45a5f671d19ad8c5)
|
|
PRE implicitely keeps virtual operands at the blocks incoming version
but the explicit updating point during PHI translation fails to trigger
when there are no PHIs at all in a block. Later lazy updating then
fails because of a too lose block check. A similar issues plagues
reference invalidation when checking the ANTIC_OUT to ANTIC_IN
translation. The following fixes both and makes the lazy updating
work.
The diagnostic testcase unfortunately requires boost so the
testcase is the one I reduced for a missed optimization in PRE.
The testcase fails with -m32 on x86_64 because we optimize too
much before PRE which causes PRE to not trigger so we fail to
eliminate a full redundancy. I'm going to open a separate bug
for this. Hopefully the !lp64 selector is good enough.
PR tree-optimization/106922
* tree-ssa-pre.cc (translate_vuse_through_block): Only
keep the VUSE if its def dominates PHIBLOCK.
(prune_clobbered_mems): Rewrite logic so we check whether
a value dies in a block when the VUSE def doesn't dominate it.
* g++.dg/tree-ssa/pr106922.C: New testcase.
(cherry picked from commit 5edf02ed2b6de024f83a023d046a6a18f645bc83)
|
|
When predictive commoning builds a reference for iteration N it
prematurely associates a constant offset into the MEM_REF offset
operand which can be invalid if the base pointer then points
outside of an object which alias-analysis does not consider valid.
PR tree-optimization/106892
* tree-predcom.cc (ref_at_iteration): Do not associate the
constant part of the offset into the MEM_REF offset
operand, across a non-zero offset.
* gcc.dg/torture/pr106892.c: New testcase.
(cherry picked from commit a8b0b13da7379feb31950a9d2ad74b98a29c547f)
|
|
The following avoids adding PHIs to the worklist for uninit processing
if we reach them following backedges. That confuses predicate analysis
because it assumes the use is happening in the same iteration as the the
definition. For the testcase in the PR the situation is like
void foo (int val)
{
int uninit;
# val = PHI <..> (B)
for (..)
{
if (..)
{
.. = val; (C)
val = uninit;
}
# val = PHI <..> (A)
}
}
and starting from (A) with 'uninit' as argument we arrive at (B)
and from there at (C). Predicate analysis then tries to prove
the predicate of (B) (not the backedge) can prove that the
path from (B) to (C) is unreachable which isn't really what it
necessary - that's what we'd need to do when the preheader
edge of the loop were the edge with the uninitialized def.
So the following makes those cases intentionally false negatives.
PR tree-optimization/105937
* tree-ssa-uninit.cc (find_uninit_use): Do not queue PHIs
on backedges.
(execute_late_warn_uninitialized): Mark backedges.
* g++.dg/uninit-pr105937.C: New testcase.
(cherry picked from commit c77fae1ca796d6ea06d5cd437909905c3d3d771c)
|
|
Merge up to r12-8817-g97374f25e1ee7ea45293c244f29425c9f9abcf5a (11th Oct 2022)
|
|
|
|
* ro.po: New.
|
|
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/100040
PR fortran/100029
* trans-expr.cc (gfc_conv_class_to_class): Add code to have
assumed-rank arrays recognized as full arrays and fix the type
of the array assignment.
(gfc_conv_procedure_call): Change order of code blocks such that
the free of ALLOCATABLE dummy arguments with INTENT(OUT) occurs
first.
gcc/testsuite/ChangeLog:
PR fortran/100029
* gfortran.dg/PR100029.f90: New test.
PR fortran/100040
* gfortran.dg/PR100040.f90: New test.
(cherry picked from commit 5299155bb80e90df822e1eebc9f9a0c8e4505a46)
|
|
|
|
|
|
|
|
Revert commit a24748da5b48f23ac83fa9f2d128f766e80567ed
that was added to OG11 as dg-error fix for the cherry-pick of
"Fortran/OpenMP: Add support for 'close' in map clause"
r12-944-gcdcec2f8505ea12c2236cf0184d77dd2f5de4832
2022-10-06 Tobias Burnus <tobias@codesourcery.com>
Revert:
2021-05-14 Tobias Burnus <tobias@codesourcery.com>
* c-c++-common/gomp/map-6.c: Remove two dg-error.
|
|
gcc/testsuite/
* gfortran.dg/gomp/depend-5.f90: Update scan-tree-dump.
* gfortran.dg/gomp/scope-6.f90: Likewise.
|
|
gcc/testsuite/
* gfortran.dg/gomp/affinity-clause-1.f90: Update pattern for
different array-reference handling in OG12.
* gfortran.dg/gomp/uses_allocators-3.f90: Fix dg-error string.
|
|
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1 Impl. Status): Mark 'assume' as 'Y'.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_assumes): New.
(show_omp_clauses, show_namespace): Call it.
(show_omp_node, show_code_node): Handle OpenMP ASSUME.
* gfortran.h (enum gfc_statement): Add ST_OMP_ASSUME,
ST_OMP_END_ASSUME, ST_OMP_ASSUMES and ST_NOTHING.
(gfc_exec_op): Add EXEC_OMP_ASSUME.
(gfc_omp_assumptions): New struct.
(gfc_get_omp_assumptions): New XCNEW #define.
(gfc_omp_clauses, gfc_namespace): Add assume member.
(gfc_resolve_omp_assumptions): New prototype.
* match.h (gfc_match_omp_assume, gfc_match_omp_assumes): New.
* openmp.cc (omp_code_to_statement): Forward declare.
(enum gfc_omp_directive_kind, struct gfc_omp_directive): New.
(gfc_free_omp_clauses): Free assume member and its struct data.
(enum omp_mask2): Add OMP_CLAUSE_ASSUMPTIONS.
(gfc_omp_absent_contains_clause): New.
(gfc_match_omp_clauses): Call it; optionally use passed
omp_clauses argument.
(omp_verify_merge_absent_contains, gfc_match_omp_assume,
gfc_match_omp_assumes, gfc_resolve_omp_assumptions): New.
(resolve_omp_clauses): Call the latter.
(gfc_resolve_omp_directive, omp_code_to_statement): Handle
EXEC_OMP_ASSUME.
* parse.cc (decode_omp_directive): Parse OpenMP ASSUME(S).
(next_statement, parse_executable, parse_omp_structured_block):
Handle ST_OMP_ASSUME.
(case_omp_decl): Add ST_OMP_ASSUMES.
(gfc_ascii_statement): Handle Assumes, optional return
string without '!$OMP '/'!$ACC ' prefix.
* parse.h (gfc_ascii_statement): Add optional bool arg to prototype.
* resolve.cc (gfc_resolve_blocks, gfc_resolve_code): Add
EXEC_OMP_ASSUME.
(gfc_resolve): Resolve ASSUMES directive.
* symbol.cc (gfc_free_namespace): Free omp_assumes member.
* st.cc (gfc_free_statement): Handle EXEC_OMP_ASSUME.
* trans-openmp.cc (gfc_trans_omp_directive): Likewise.
* trans.cc (trans_code): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/assume-1.f90: New test.
* gfortran.dg/gomp/assume-2.f90: New test.
* gfortran.dg/gomp/assumes-1.f90: New test.
* gfortran.dg/gomp/assumes-2.f90: New test.
(cherry picked from commit e2a228438919d846995bf2c839c9b657442224b2)
|
|
Split off from the 'Fortran: Add OpenMP's assume(s) directives' patch.
gcc/
* doc/invoke.texi (-fopenmp): Mention C++ attribut syntax.
(-fopenmp-simd): Likewise; update permitted directives.
gcc/fortran/
* parse.cc (decode_omp_directive): Handle '(end) loop' and 'scan'
also with -fopenmp-simd.
gcc/testsuite/
* gfortran.dg/gomp/openmp-simd-7.f90: New test.
(cherry picked from commit 8792047470073df0da4a5b91997d6058193d7676)
|
|
gcc/
* doc/install.texi (Specific): Add missing items to bullet list.
(amdgcn): Update LLVM requirements, use version not date for newlib.
(nvptx): Use version not git hash for newlib.
(cherry picked from commit e886ebd17965d78f609b62479f4f48085108389c)
|
|
|