gcc/fortran/ChangeLog:
PR fortran/93635
* symbol.cc (conflict_std): New helper function for reporting attribute
conflicts depending on the Fortran standard version.
(conf_std): New helper macro for checking standard-dependent conflicts.
(gfc_check_conflict): Use it.
gcc/testsuite/ChangeLog:
PR fortran/93635
* gfortran.dg/c-interop/c1255-2.f90: Adjust pattern.
* gfortran.dg/pr87907.f90: Likewise.
* gfortran.dg/pr93635.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
|
|
gcc/fortran/ChangeLog:
PR fortran/86100
* trans-array.cc (gfc_conv_ss_startstride): Use abridged_ref_name
to generate a more user-friendly name for bounds-check messages.
* trans-expr.cc (gfc_copy_class_to_class): Fix bounds check for
rank>1 by looping over the dimensions.
gcc/testsuite/ChangeLog:
PR fortran/86100
* gfortran.dg/bounds_check_25.f90: New test.
|
|
This lets it recognize more preprocessing floating constants.
gcc/c-family/
* c-ada-spec.cc (is_cpp_float): New predicate.
(dump_number): Deal with more preprocessing floating constants.
(dump_ada_macros) <CPP_NUMBER>: Use is_cpp_float.
|
|
We did not evaluate expressions with variably modified types correctly
in typeof and did not produce warnings when jumping over declarations
using typeof. After address-of or array-to-pointer decay we construct
new pointer types that have to be marked variably modified if the pointer
target is variably modified.
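As a minimal sketch (hypothetical, not the actual testcase), the kind of
code affected looks like:
void f (int n)
{
  int a[n][n];                   /* variably modified (VM) type */
  __typeof__ (&a[0]) p = &a[0];  /* address-of/decay yields int (*)[n],
                                    which must also be marked VM */
  (void) p;
}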
2024-05-18 Martin Uecker <uecker@tugraz.at>
PR c/114831
gcc/c/
* c-typeck.cc (array_to_pointer_conversion, build_unary_op):
Propagate flag to pointer target.
gcc/testsuite/
* gcc.dg/pr114831-1.c: New test.
* gcc.dg/pr114831-2.c: New test.
* gcc.dg/gnu23-varmod-1.c: New test.
* gcc.dg/gnu23-varmod-2.c: New test.
|
|
This fixes an ICE when a module directive is not given at global scope.
Although not explicitly mentioned, it seems implied from [basic.link] p1
and [module.global.frag] that a module-declaration must appear at the
global scope after preprocessing. Apart from this the patch also
slightly improves the errors given when accidentally using a module
control-line in other situations where it is not expected.
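A hypothetical reduction of the previously-ICEing input (the actual
testcase content is not shown in this log):
// compiled with -fmodules-ts
namespace ns {
  module M;  // error: module-declaration must be at global scope
}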
PR c++/115200
gcc/cp/ChangeLog:
* parser.cc (cp_parser_error_1): Special-case unexpected module
directives for better diagnostics.
(cp_parser_module_declaration): Check that the module
declaration is at global scope.
(cp_parser_import_declaration): Sync error message with that in
cp_parser_error_1.
gcc/testsuite/ChangeLog:
* g++.dg/modules/mod-decl-1.C: Update error messages.
* g++.dg/modules/mod-decl-6.C: New test.
* g++.dg/modules/mod-decl-7.C: New test.
* g++.dg/modules/mod-decl-8.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
This appears to be an oversight in the definition of module_has_cmi_p.
This change will allow us to use the function directly in more places
that need to do additional work only when generating a module CMI,
so that we do the extra work only when we know we need it.
gcc/cp/ChangeLog:
* cp-tree.h (module_has_cmi_p): Also include header units.
(module_maybe_has_cmi_p): Update comment.
* module.cc (set_defining_module): Only need to track
declarations for later exporting if the module may have a CMI.
(set_defining_module_for_partial_spec): Likewise.
* name-lookup.cc (pushdecl): Likewise.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
In r14-9530 we relaxed "depending on type with no-linkage" errors for
declarations that could actually be accessed from different TUs anyway.
However, this also enabled it for unnamed types, which never work.
In a normal module interface, an unnamed type is TU-local by
[basic.link] p15.2, and so cannot be exposed or the program is
ill-formed. We don't yet implement this checking but we should assume
that we will later; currently supporting this actually causes ICEs when
attempting to create the mangled name in some situations.
For a header unit, by [module.import] p5.3 it is unspecified whether two
TUs importing a header unit providing such a declaration are importing
the same header unit. In this case, we would require name mangling
changes to somehow allow the (anonymous) type exported by such a header
unit to correspond across different TUs in the presence of other
anonymous declarations, so for this patch just assume that this case
would be an ODR violation instead.
gcc/cp/ChangeLog:
* tree.cc (no_linkage_check): Anonymous types can't be accessed
in a different TU.
gcc/testsuite/ChangeLog:
* g++.dg/modules/linkage-1_a.C: Remove anonymous type test.
* g++.dg/modules/linkage-1_b.C: Likewise.
* g++.dg/modules/linkage-1_c.C: Likewise.
* g++.dg/modules/linkage-2.C: Add note about anonymous types.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
Testing with Zbs enabled by default showed a minor logic error. After
the loop clearing things with bclri, we can only use the sequence if we
were able to clear all the necessary bits. If any bits are still on,
then the bclr sequence turned out to not be profitable.
--
So this is conceptually similar to how we handled direct generation of
bseti for constant synthesis, but this time for bclr.
In the bclr case, we already have an expander for AND. So we just
needed to adjust the predicate to accept another class of constant
operands (those with a single bit clear).
With that in place constant synthesis is adjusted so that it counts the
number of bits clear in the high 33 bits of a 64-bit word. If that
number is small relative to the current best cost, then we try to
generate the constant with a lui based sequence for the low half which
implicitly sets the upper 32 bits as well. Then we bclri one or more of
those upper 33 bits.
So as an example, this code goes from 4 instructions down to 3:
> unsigned long foo_0xfffffffbfffff7ff(void) { return 0xfffffffbfffff7ffUL; }
Note the use of 33 bits above. That's meant to capture cases like this:
> unsigned long foo_0xfffdffff7ffff7ff(void) { return 0xfffdffff7ffff7ffUL; }
We can use lui+addi+bclri+bclri to synthesize that in 4 instructions
instead of 5.
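To make the arithmetic concrete, here is a C model of one plausible
lui+addi+bclri sequence for the first example (register choice and exact
instruction selection are illustrative):
unsigned long synth_0xfffffffbfffff7ff (void)
{
  unsigned long x = 0xfffffffffffff000UL; /* lui: 0xfffff000, sign-extended */
  x += 0x7ff;                             /* addi: 0xfffffffffffff7ff */
  x &= ~(1UL << 34);                      /* bclri 34: clear bit 34 */
  return x;                               /* 0xfffffffbfffff7ffUL */
}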
I'm including a handful of cases covering the two basic ideas above that
were found by the testing code.
And, no, we're not done yet. I see at least one more notable idiom
missing before exploring zbkb's potential to improve things.
Tested in my tester and waiting on Rivos CI system before moving forward.
gcc/
* config/riscv/predicates.md (arith_operand_or_mode_mask): Renamed to ...
(arith_or_mode_mask_or_zbs_operand): New predicate.
* config/riscv/riscv.md (and<mode>3): Update predicate for operand 2.
* config/riscv/riscv.cc (riscv_build_integer_1): Use bclri to clear
bits, particularly bits 31..63 when profitable to do so.
gcc/testsuite/
* gcc.target/riscv/synthesis-6.c: New test.
|
|
create_intersect_range_checks checks whether two access ranges
a and b are alias-free using something equivalent to:
end_a <= start_b || end_b <= start_a
It has two ways of doing this: a "vanilla" way that calculates
the exact exclusive end pointers, and another way that uses the
last inclusive aligned pointers (and changes the comparisons
accordingly). The comment for the latter is:
/* Calculate the minimum alignment shared by all four pointers,
then arrange for this alignment to be subtracted from the
exclusive maximum values to get inclusive maximum values.
This "- min_align" is cumulative with a "+ access_size"
in the calculation of the maximum values. In the best
(and common) case, the two cancel each other out, leaving
us with an inclusive bound based only on seg_len. In the
worst case we're simply adding a smaller number than before.  */
The problem is that the associated code implicitly assumed that the
access size was a multiple of the pointer alignment, and so the
alignment could be carried over to the exclusive end pointer.
The testcase started failing after g:9fa5b473b5b8e289b6542
because that commit improved the alignment information for
the accesses.
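As a self-contained sketch (invented names, not the GCC code), the two
forms of the test are:
#include <stdint.h>
/* Vanilla form: exact exclusive end pointers.  */
static int disjoint_exclusive (uintptr_t start_a, uintptr_t end_a,
                               uintptr_t start_b, uintptr_t end_b)
{
  return end_a <= start_b || end_b <= start_a;
}
/* Aligned form: inclusive last aligned addresses.  Only equivalent to
   the above when the access size is a multiple of min_align, which is
   the assumption the patch stops relying on.  */
static int disjoint_inclusive (uintptr_t start_a, uintptr_t end_a,
                               uintptr_t start_b, uintptr_t end_b,
                               uintptr_t min_align)
{
  return end_a - min_align < start_b || end_b - min_align < start_a;
}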
gcc/
PR tree-optimization/115192
* tree-data-ref.cc (create_intersect_range_checks): Take the
alignment of the access sizes into account.
gcc/testsuite/
PR tree-optimization/115192
* gcc.dg/vect/pr115192.c: New test.
|
|
This patch corrects the gm2.texi xref for the Modula-2 documentation.
gcc/ChangeLog:
* doc/gm2.texi: Replace all occurrences of xref {, , , gm2}
with xref {, , , m2}.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
The match.pd patterns that merge two vector permutes into one fail when a
potentially no-op view-convert expression sits between the two permutes.
This change lifts that restriction.
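A hypothetical example of the now-foldable shape (the actual testcase
may differ):
typedef int v4si __attribute__ ((vector_size (16)));
typedef unsigned int v4ui __attribute__ ((vector_size (16)));
v4si f (v4si a)
{
  v4si t = __builtin_shufflevector (a, a, 1, 0, 3, 2);
  v4ui u = (v4ui) t;  /* no-op view convert between same-size vectors */
  /* The two permutes now fold into the single permute { 3, 2, 1, 0 }.  */
  return (v4si) __builtin_shufflevector (u, u, 2, 3, 0, 1);
}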
gcc/ChangeLog:
* match.pd: Allow no-op view_convert between permutes.
gcc/testsuite/ChangeLog:
* gcc.dg/fold-perm-2.c: New test.
|
|
For some hardware which doesn't support unaligned vector memory access,
test case costmodel-vect-76b.c expects the cost model to decide that
peeling is not profitable, according to the commit history, the test case
comments and the way the check is written.
For now, the existing loop bound 14 works well for Power7, but it does
not for some targets on which the cost of the vec_perm operation differs
from Power7's, such as Power6, where it is 3 vs. 1. This difference
further changes the minimum iteration count for profitability (10 vs. 12)
and causes the failure. To keep the original test point,
this patch tweaks the loop bound to ensure it's not profitable
to vectorize for !vect_no_align with peeling.
Co-Authored-By: Kewen Lin <linkw@linux.ibm.com>
for gcc/testsuite/ChangeLog
* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
|
|
There's not really a good way to test what the testcase wants to
test; the following exchanges one dump scan for another (imperfect)
one.
* gcc.dg/vect/vect-gather-4.c: Scan for not vectorizing using
SLP.
|
|
When sinking code closer to its uses we already try to minimize the
distance we move by inserting at the start of the basic-block. The
following makes sure to sink closest to the control dependence
check of the region we want to sink to, as well as to
ignore control dependences that are only guarding exceptional code.
This restores somewhat the old profile check but without requiring
nearly even probabilities. The patch also makes sure not to give
up completely when the best sink location is one we do not want to
sink to, but instead possibly choose the next best one.
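A sketch of the kind of situation this is about (illustrative only):
extern int use (int);
int f (int *p, int flag, int cond)
{
  int x = *p + 1;        /* candidate for sinking */
  if (flag)
    __builtin_abort ();  /* control dependence that only guards
                            exceptional code: ignored when choosing
                            the sink location */
  if (cond)
    return use (x);      /* sink x just before its single use here */
  return 0;
}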
PR tree-optimization/115144
* tree-ssa-sink.cc (do_not_sink): New function, split out
from ...
(select_best_block): Here. First pick valid block to
sink to. From that search for the best valid block,
avoiding sinking across conditions to exceptional code.
(sink_code_in_bb): When updating vuses of stores in
paths we do not sink a store to, make sure we didn't
pick a dominating sink location.
* gcc.dg/tree-ssa/ssa-sink-22.c: New testcase.
|
|
gcc/testsuite/ChangeLog:
PR target/114148
* gcc.target/i386/pr106010-7b.c: Refine testcase.
|
|
I noticed that phiprop leaves around phi nodes which
define an SSA name that is unused. This just adds a
bitmap to mark those SSA names and then calls
simple_dce_from_worklist at the very end to remove
those phi nodes and all of their dependencies, if there
were any. This might allow us to optimize something earlier
due to the removal of the phi which was taking the address
of the variables.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* tree-ssa-phiprop.cc (phiprop_insert_phi): Add
dce_ssa_names argument. Add the phi's result to it.
(propagate_with_phi): Add dce_ssa_names argument.
Update call to phiprop_insert_phi.
(pass_phiprop::execute): Update call to propagate_with_phi.
Call simple_dce_from_worklist if there was a change.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
The following avoids splitting store dataref groups during SLP
discovery but instead forces (eventually single-lane) consecutive
lane SLP discovery for all lanes of the group, creating VEC_PERM
SLP nodes merging them so the store will always cover the whole group.
With this for example
int x[1024], y[1024], z[1024], w[1024];
void foo (void)
{
for (int i = 0; i < 256; i++)
{
x[4*i+0] = y[2*i+0];
x[4*i+1] = y[2*i+1];
x[4*i+2] = z[i];
x[4*i+3] = w[i];
}
}
which was previously using hybrid SLP can now be fully SLPed and
SSE code generated looks better (but of course you never know,
I didn't actually benchmark). We of course need a VF of four here.
.L2:
movdqa z(%rax), %xmm0
movdqa w(%rax), %xmm4
movdqa y(%rax,%rax), %xmm2
movdqa y+16(%rax,%rax), %xmm1
movdqa %xmm0, %xmm3
punpckhdq %xmm4, %xmm0
punpckldq %xmm4, %xmm3
movdqa %xmm2, %xmm4
shufps $238, %xmm3, %xmm2
movaps %xmm2, x+16(,%rax,4)
movdqa %xmm1, %xmm2
shufps $68, %xmm3, %xmm4
shufps $68, %xmm0, %xmm2
movaps %xmm4, x(,%rax,4)
shufps $238, %xmm0, %xmm1
movaps %xmm2, x+32(,%rax,4)
movaps %xmm1, x+48(,%rax,4)
addq $16, %rax
cmpq $1024, %rax
jne .L2
The extra permute nodes merging distinct branches of the SLP
tree might be unexpected for some code, esp. since
SLP_TREE_REPRESENTATIVE cannot be meaningfully set and we
cannot populate SLP_TREE_SCALAR_STMTS or SLP_TREE_SCALAR_OPS
consistently as we can have a mix of both.
The patch keeps the sub-trees formed from consecutive lanes, but that's
in principle not necessary if we for example have an even/odd
split which now would result in N single-lane sub-trees. That's
left for future improvements.
The interesting part is how VLA vector ISAs handle merging of
two vectors that's not trivial even/odd merging. The strategy
of how to build the permute tree might need adjustments for that
(in the end splitting each branch to single lanes and then doing
even/odd merging would be the brute-force fallback). Not sure
how much we can or should rely on the SLP optimize pass to handle
this.
The gcc.dg/vect/slp-12a.c case is interesting as we currently split
the 8 store group into lanes 0-5 which we SLP with an unroll factor
of two (on x86-64 with SSE) and the remaining two lanes are using
interleaving vectorization with a final unroll factor of four. Thus
we're using hybrid SLP within a single store group. After the change
we discover the same 0-5 lane SLP part as well as two single-lane
parts feeding the full store group. But that results in a load
permutation that isn't supported (I have WIP patches to rectify that).
So we end up cancelling SLP and vectorizing the whole loop with
interleaving which is IMO good and results in better code.
This is similar for gcc.target/i386/pr52252-atom.c where interleaving
generates much better code than hybrid SLP. I'm unsure how to update
the testcase though.
gcc.dg/vect/slp-21.c runs into similar situations. Note that when,
while analyzing SLP operations, we discard an instance, we currently
force the full loop to have no SLP because hybrid detection is
broken. It's probably not worth fixing this at this moment.
For gcc.dg/vect/pr97428.c we are not splitting the 16 store group
into two but merge the two 8 lane loads into one before doing the
store and thus have only a single SLP instance. A similar situation
happens in gcc.dg/vect/slp-11c.c but the branches feeding the
single SLP store only have a single lane. Likewise for
gcc.dg/vect/vect-complex-5.c and gcc.dg/vect/vect-gather-2.c.
gcc.dg/vect/slp-cond-1.c has an additional SLP vectorization
with a SLP store group of size two but two single-lane branches.
* tree-vect-slp.cc (vect_build_slp_instance): Do not split
store dataref groups on loop SLP discovery failure but create
a single SLP instance for the stores, branching to SLP sub-trees
merged with a series of VEC_PERM nodes.
* gcc.dg/vect/pr97428.c: Expect a single store SLP group.
* gcc.dg/vect/slp-11c.c: Likewise, if !vect_load_lanes.
* gcc.dg/vect/vect-complex-5.c: Likewise.
* gcc.dg/vect/slp-12a.c: Do not expect SLP.
* gcc.dg/vect/slp-21.c: Remove not important scanning for SLP.
* gcc.dg/vect/slp-cond-1.c: Expect one more SLP if !vect_load_lanes.
* gcc.dg/vect/vect-gather-2.c: Expect SLP to be used.
* gcc.target/i386/pr52252-atom.c: XFAIL test for palignr.
|
|
Constrained partial specialisations aren't all necessarily tracked on
the instantiation table. The modules code uses a separate
'partial_specializations' table to track them instead to ensure that
they get walked and emitted when emitting a module, but currently this
does not always happen.
The attached testcase fails in two ways. First, because the partial
specialisation is just a declaration (and not a definition),
'set_defining_module' never ends up getting called on it and so it never
gets added to the partial specialisation table. We fix this by ensuring
that when partial specializations are created they always get added, and
so we never miss one. To prevent adding partial specialisations multiple
times we split this out as a new function.
The second way it fails is that when exporting the primary interface for
a module with partitions, we also re-walk the specializations of all
imported partitions to merge them into a single BMI. So this patch
ensures that after calling 'match_mergeable_specialization', if the
name came from a partition, it gets added to the specialization
table so that a dependency is correctly created for it.
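A hypothetical reduction of the first failure mode (the real testcase
additionally involves constraints):
export module M;
template <typename T> struct S {};
// A partial specialization that is only declared, never defined:
// previously this never reached the partial specialization table.
template <typename T> struct S<T*>;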
PR c++/114947
gcc/cp/ChangeLog:
* cp-tree.h (set_defining_module_for_partial_spec): Declare.
* module.cc (trees_in::decl_value): Track partial specs coming
from partitions.
(set_defining_module): Don't track partial specialisations here
anymore.
(set_defining_module_for_partial_spec): New function.
* pt.cc (process_partial_specialization): Call it.
gcc/testsuite/ChangeLog:
* g++.dg/modules/partial-4_a.C: New test.
* g++.dg/modules/partial-4_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
Certain components of GORI were needed in order to process a COND_EXPR
expression and calculate the two operands as if they were on true and false
edges based on the condition. With GORI available from the range_query
object now, this can be moved into the fold_using_range code where it
really belongs.
* gimple-range-edge.h (range_query::condexpr_adjust): Delete.
* gimple-range-fold.cc (fold_using_range::range_of_range_op): Use
gori_ssa routine.
(fold_using_range::range_of_address): Likewise.
(fold_using_range::range_of_phi): Likewise.
(fold_using_range::condexpr_adjust): Relocated from gori_compute.
(fold_using_range::range_of_cond_expr): Use local condexpr_adjust.
(fur_source::register_outgoing_edges): Use gori_ssa routine.
* gimple-range-fold.h (gori_ssa): Rename from gori_bb.
(fold_using_range::condexpr_adjust): Add prototype.
* gimple-range-gori.cc (gori_compute::condexpr_adjust): Relocate.
* gimple-range-gori.h (gori_compute::condexpr_adjust): Delete.
|
|
Move gori_map dependency and import/export object into a range query and
construct it simultaneously with a gori object.
* gimple-range-cache.cc (ranger_cache::ranger_cache): Use gori_ssa.
(ranger_cache::dump): Likewise.
(ranger_cache::get_global_range): Likewise.
(ranger_cache::set_global_range): Likewise.
(ranger_cache::register_inferred_value): Likewise.
* gimple-range-edge.h (gimple_outgoing_range::map): Remove.
* gimple-range-fold.cc (fold_using_range::range_of_range_op): Use
gori_ssa.
(fold_using_range::range_of_address): Likewise.
(fold_using_range::range_of_phi): Likewise.
(fur_source::register_outgoing_edges): Likewise.
* gimple-range-fold.h (fur_source::query): Make const.
(gori_ssa): New.
* gimple-range-gori.cc (gori_map::dump): Use 'this' pointer.
(gori_compute::gori_compute): Construct with a gori_map.
* gimple-range-gori.h (gori_compute::gori_compute): Change
prototype.
(gori_compute::map): Delete.
(gori_compute::m_map): Change to a reference.
(FOR_EACH_GORI_IMPORT_NAME): Change parameter gori to gorimap.
(FOR_EACH_GORI_EXPORT_NAME): Likewise.
* gimple-range-path.cc (path_range_query::compute_ranges_in_block):
Use gori_ssa method.
(path_range_query::compute_exit_dependencies): Likewise.
* gimple-range.cc (gimple_ranger::range_of_stmt): Likewise.
(gimple_ranger::register_transitive_inferred_ranges): Likewise.
* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges):
Likewise.
* tree-ssa-threadedge.cc (compute_exit_dependencies): Likewise.
* tree-vrp.cc (remove_unreachable::handle_early): Likewise.
(remove_unreachable::remove_and_update_globals): Likewise.
* value-query.cc (range_query::create_gori): Create gori map.
(range_query::share_query): Copy gori map member.
(range_query::range_query): Initialize gori_map member.
* value-query.h (range_query::gori_ssa): New.
(range_query::m_map): New.
|
|
This patch moves the GORI component into the range_query object, and
makes it generally available. This makes it much easier to share
between ranger and the passes.
* gimple-range-cache.cc (ranger_cache::ranger_cache): Create
GORI via the range_query instead of a local member.
(ranger_cache::dump_bb): Use gori from the range_query parent.
(ranger_cache::get_global_range): Likewise.
(ranger_cache::set_global_range): Likewise.
(ranger_cache::edge_range): Likewise.
(ranger_cache::block_range): Likewise.
(ranger_cache::fill_block_cache): Likewise.
(ranger_cache::range_from_dom): Likewise.
(ranger_cache::register_inferred_value): Likewise.
* gimple-range-cache.h (ranger_cache::m_gori): Delete.
* gimple-range-fold.cc (fur_source::fur_source): Set m_depend_p.
(fur_depend::fur_depend): Remove gori parameter.
* gimple-range-fold.h (fur_source::gori): Adjust.
(fur_source::m_gori): Delete.
(fur_source::m_depend): New.
(fur_depend::fur_depend): Adjust prototype.
* gimple-range-path.cc (path_range_query::path_range_query): Share
ranger oracles.
(path_range_query::range_defined_in_block): Use oracle directly.
(path_range_query::compute_ranges_in_block): Use new gori() method.
(path_range_query::adjust_for_non_null_uses): Use oracle directly.
(path_range_query::compute_exit_dependencies): Likewise.
(jt_fur_source::jt_fur_source): No gori in the parameters.
(path_range_query::range_of_stmt): Likewise.
(path_range_query::compute_outgoing_relations): Likewise.
* gimple-range.cc (gimple_ranger::fold_range_internal): Likewise.
(gimple_ranger::range_of_stmt): Access gori via gori () method.
(assume_query::range_of_expr): Create a gori object.
(assume_query::~assume_query): Destroy a gori object.
(assume_query::calculate_op): Remove old gori() accessor.
* gimple-range.h (gimple_ranger::gori): Delete.
(assume_query::~assume_query): New.
(assume_query::m_gori): Delete.
* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges): Use
gori () method.
* tree-ssa-threadedge.cc (compute_exit_dependencies): Likewise.
* value-query.cc (default_gori): New.
(range_query::create_gori): New.
(range_query::destroy_gori): New.
(range_query::share_oracles): Set m_gori.
(range_query::range_query): Set m_gori to default.
(range_query::~range_query): Call destroy_gori.
* value-query.h (range_query): Adjust prototypes.
(range_query::m_gori): New.
|
|
Make gimple_outgoing_range a base class for the GORI API, and provide
base routines returning false. gori_compute inherits from
gimple_outgoing_range and no longer needs it as a private member.
Rename outgoing_edge_range_p to edge_range_p.
* gimple-range-cache.cc (ranger_cache::ranger_cache): Adjust
m_gori constructor.
(ranger_cache::edge_range): Use renamed edge_range_p name.
(ranger_cache::range_from_dom): Likewise.
* gimple-range-edge.h (gimple_outgoing_range::condexpr_adjust): New.
(gimple_outgoing_range::has_edge_range_p): New.
(gimple_outgoing_range::dump): New.
(gimple_outgoing_range::compute_operand_range): New.
(gimple_outgoing_range::map): New.
* gimple-range-fold.cc (fur_source::register_outgoing_edges): Use
renamed edge_range_p routine.
* gimple-range-gori.cc (gori_compute::gori_compute): Adjust
constructor.
(gori_compute::~gori_compute): New.
(gori_compute::edge_range_p): Rename from outgoing_edge_range_p
and use inherited routine instead of member method.
* gimple-range-gori.h (class gori_compute): Inherit from
gimple_outgoing_range, adjust prototypes.
(gori_compute::outgoing): Delete.
* gimple-range-path.cc (path_range_query::compute_ranges_in_block): Use
renamed edge_range_p routine.
* tree-ssa-loop-unswitch.cc (evaluate_control_stmt_using_entry_checks):
Likewise.
|
|
This patch moves the gori_compute object away from inheriting a
gori_map object and instead keeps it as a local member. Export it via map ().
* gimple-range-cache.cc (ranger_cache::ranger_cache): Access
gori_map via member call.
(ranger_cache::dump_bb): Likewise.
(ranger_cache::get_global_range): Likewise.
(ranger_cache::set_global_range): Likewise.
(ranger_cache::register_inferred_value): Likewise.
* gimple-range-fold.cc (fold_using_range::range_of_range_op): Likewise.
(fold_using_range::range_of_address): Likewise.
(fold_using_range::range_of_phi): Likewise.
* gimple-range-gori.cc (gori_compute::compute_operand_range_switch):
Likewise.
(gori_compute::compute_operand_range): Likewise.
(gori_compute::compute_logical_operands): Likewise.
(gori_compute::refine_using_relation): Likewise.
(gori_compute::compute_operand1_and_operand2_range): Likewise.
(gori_compute::may_recompute_p): Likewise.
(gori_compute::has_edge_range_p): Likewise.
(gori_compute::outgoing_edge_range_p): Likewise.
(gori_compute::condexpr_adjust): Likewise.
* gimple-range-gori.h (class gori_compute): Do not inherit from
gori_map.
(gori_compute::m_map): New.
* gimple-range-path.cc (gimple-range-path.cc): Use gori_map member.
(path_range_query::compute_exit_dependencies): Likewise.
* gimple-range.cc (gimple_ranger::range_of_stmt): Likewise.
(gimple_ranger::register_transitive_inferred_ranges): Likewise.
* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges): Likewise.
* tree-ssa-threadedge.cc (compute_exit_dependencies): Likewise.
* tree-vrp.cc (remove_unreachable::handle_early): Likewise.
(remove_unreachable::remove_and_update_globals): Likewise.
|
|
Change the default constructor to not process switches, add method to
enable/disable switch processing.
* gimple-range-edge.cc (gimple_outgoing_range::gimple_outgoing_range):
Do not allocate a range allocator at construction time.
(gimple_outgoing_range::~gimple_outgoing_range): Delete allocator
if one was allocated.
(gimple_outgoing_range::set_switch_limit): New.
(gimple_outgoing_range::switch_edge_range): Create an allocator if one
does not exist.
(gimple_outgoing_range::edge_range_p): Check for zero edges.
* gimple-range-edge.h (class gimple_outgoing_range): Adjust prototypes.
|
|
Gimple_range_fold contains some shorthand fold_range routines for
easy user consumption of the range-ops interface, but there are no
equivalent routines for op1_range and op2_range. This patch provides basic versions.
Any range-op entry which has an op1_range or op2_range implemented can
potentially also provide inferred ranges. This is a step towards
PR 113879. Default is currently OFF for performance reasons as it
dramatically increases the number of inferred ranges.
PR tree-optimization/113879
* gimple-range-fold.cc (op1_range): New.
(op2_range): New.
* gimple-range-fold.h (op1_range): New prototypes.
(op2_range): New prototypes.
* gimple-range-infer.cc (gimple_infer_range::add_range): Do not
add an inferred range if it is VARYING.
(gimple_infer_range::gimple_infer_range): Add inferred ranges
for any range-op statements if requested.
* gimple-range-infer.h (gimple_infer_range): Add parameter.
|
|
Turn the infer_manager class into an always available oracle accessible via a
range_query object. Also associate each inferred range with its
originating stmt.
* gimple-range-cache.cc (ranger_cache::ranger_cache): Create an infer
oracle instead of a local member.
(ranger_cache::~ranger_cache): Destroy the oracle.
(ranger_cache::edge_range): Use oracle.
(ranger_cache::fill_block_cache): Likewise.
(ranger_cache::range_from_dom): Likewise.
(ranger_cache::apply_inferred_ranges): Likewise.
* gimple-range-cache.h (ranger_cache::m_exit): Delete.
* gimple-range-infer.cc (infer_oracle): New static object.
(class infer_oracle): New.
(non_null_wrapper::non_null_wrapper): New.
(non_null_wrapper::add_nonzero): New.
(non_null_wrapper::add_range): New.
(non_null_loadstore): Use non_null_wrapper.
(gimple_infer_range::gimple_infer_range): New alternate constructor.
(exit_range::stmt): New.
(infer_range_manager::has_range_p): Combine separate methods.
(infer_range_manager::maybe_adjust_range): Adjust has_range_p call.
(infer_range_manager::add_ranges): New.
(infer_range_manager::add_range): Take stmt rather than BB.
(infer_range_manager::add_nonzero): Adjust from BB to stmt.
* gimple-range-infer.h (class gimple_infer_range): Adjust methods.
(infer_range_oracle): New.
(class infer_range_manager): Inherit from infer_range_oracle.
Adjust methods.
* gimple-range-path.cc (path_range_query::range_defined_in_block): Use
oracle.
(path_range_query::adjust_for_non_null_uses): Likewise.
* gimple-range.cc (gimple_ranger::range_on_edge): Likewise.
(gimple_ranger::register_transitive_inferred_ranges): Likewise.
* value-query.cc (default_infer_oracle): New.
(range_query::create_infer_oracle): New.
(range_query::destroy_infer_oracle): New.
(range_query::share_query): Copy infer pointer.
(range_query::range_query): Initialize infer pointer.
(range_query::~range_query): Destroy infer object.
* value-query.h (range_query::infer_oracle): New.
(range_query::create_infer_oracle): New prototype.
(range_query::destroy_infer_oracle): New prototype.
(range_query::m_infer): New.
|
|
Ranger and the ranger cache need to share components; this provides a
blessed way to do so.
* gimple-range.cc (gimple_ranger::gimple_ranger): Share the
components from ranger_cache.
(gimple_ranger::~gimple_ranger): Don't clear pointer.
* value-query.cc (range_query::share_query): New.
(range_query::range_query): Clear shared component flag.
(range_query::~range_query): Don't free shared component copies.
* value-query.h (share_query): New prototype.
(m_shared_copy_p): New member.
|
|
With more oracles incoming, rename the range_query oracle () method to
relation (), and remove the redundant 'relation' text from register and query
methods, resulting in calls that look like:
relation ()->record (...) and
relation ()->query (...)
* gimple-range-cache.cc (ranger_cache::dump_bb): Use m_relation.
(ranger_cache::fill_block_cache): Likewise.
* gimple-range-fold.cc (fur_stmt::get_phi_operand): Use new names.
(fur_depend::register_relation): Likewise.
(fold_using_range::range_of_phi): Likewise.
* gimple-range-path.cc (path_range_query::path_range_query): Likewise.
(path_range_query::~path_range_query): Likewise.
(path_range_query::compute_ranges): Likewise.
(jt_fur_source::register_relation): Likewise.
(jt_fur_source::query_relation): Likewise.
(path_range_query::maybe_register_phi_relation): Likewise.
* gimple-range-path.h (get_path_oracle): Likewise.
* gimple-range.cc (gimple_ranger::gimple_ranger): Likewise.
(gimple_ranger::~gimple_ranger): Likewise.
* value-query.cc (range_query::create_relation_oracle): Likewise.
(range_query::destroy_relation_oracle): Likewise.
(range_query::share_oracles): Likewise.
(range_query::range_query): Likewise.
* value-query.h (value_query::relation): Rename from oracle.
(m_relation): Rename from m_oracle.
* value-relation.cc (relation_oracle::query): Rename from
query_relation.
(equiv_oracle::query): Likewise.
(equiv_oracle::record): Rename from register_relation.
(relation_oracle::record): Likewise.
(dom_oracle::record): Likewise.
(dom_oracle::query): Rename from query_relation.
(path_oracle::record): Rename from register_relation.
(path_oracle::query): Rename from query_relation.
* value-relation.h (*::record): Rename from register_relation.
(*::query): Rename from query_relation.
|
|
This eliminates the need to check if the relation oracle pointer is NULL
before every call by providing a default oracle which does nothing.
Remove unused routines and unify register_relation method names.
* gimple-range-cache.cc (ranger_cache::dump_bb): Remove check for
NULL oracle pointer.
(ranger_cache::fill_block_cache): Likewise.
* gimple-range-fold.cc (fur_stmt::get_phi_operand): Likewise.
(fur_depend::fur_depend): Likewise.
(fur_depend::register_relation): Likewise, use query_relation.
(fold_using_range::range_of_phi): Likewise.
(fold_using_range::relation_fold_and_or): Likewise.
* gimple-range-fold.h (fur_source::m_oracle): Delete. Oracle
can be accessed directly via m_query now.
* gimple-range-path.cc (path_range_query::path_range_query):
Adjust for oracle reference pointer.
(path_range_query::compute_ranges): Likewise.
(jt_fur_source::jt_fur_source): Adjust for no m_oracle member.
(jt_fur_source::register_relation): Do not check for NULL
pointer.
(jt_fur_source::query_relation): Likewise.
* gimple-range.cc (gimple_ranger::gimple_ranger): Adjust for
reference pointer.
* value-query.cc (default_relation_oracle): New.
(range_query::create_relation_oracle): Relocate from header.
Ensure not being added to global query.
(range_query::destroy_relation_oracle): Relocate from header.
(range_query::range_query): Initialize to default oracle.
(range_query::~range_query): Call destroy_relation_oracle.
* value-query.h (class range_query): Adjust prototypes.
(range_query::create_relation_oracle): Move to source file.
(range_query::destroy_relation_oracle): Move to source file.
* value-relation.cc (relation_oracle::validate_relation): Delete.
(relation_oracle::register_stmt): Rename to register_relation.
(relation_oracle::register_edge): Likewise.
* value-relation.h (register_stmt): Rename to register_relation and
provide default function in base class.
(register_edge): Likewise.
(relation_oracle::validate_relation): Delete.
(relation_oracle::query_relation): Provide default in base class.
(relation_oracle::dump): Likewise.
(relation_oracle::equiv_set): Likewise.
(default_relation_oracle): New extenal reference.
(partial_equiv_set, add_partial_equiv): Move to protected.
|
|
Move relation queries from range_query object into the relation oracle.
* gimple-range-cache.cc (ranger_cache::ranger_cache): Call
create_relation_oracle.
(ranger_cache::~ranger_cache): Call destroy_relation_oracle.
* gimple-range-fold.cc (fur_stmt::get_phi_operand): Check for
relation oracle before calling query_relation.
(fold_using_range::range_of_phi): Likewise.
* gimple-range-path.cc (path_range_query::~path_range_query): Set
relation oracle pointer to NULL when done.
* gimple-range.cc (gimple_ranger::~gimple_ranger): Likewise.
* value-query.cc (range_query::~range_query): Ensure any
relation oracle is destroyed.
(range_query::query_relation): relocate to relation_oracle object.
* value-query.h (class range_query): Adjust method prototypes.
(range_query::create_relation_oracle): New.
(range_query::destroy_relation_oracle): New.
* value-relation.cc (relation_oracle::query_relation): Relocate
from range query class.
* value-relation.h (Call relation_oracle): New prototypes.
|
|
Decaying the array temporary to a pointer and then deleting that crashes in
verify_gimple_stmt, because the TARGET_EXPR is first evaluated inside the
TRY_FINALLY_EXPR, but the cleanup point is outside. Fixed by using
get_target_expr instead of save_expr.
I also adjust the stabilize_expr comment to prevent me from again thinking
it's a suitable replacement.
PR c++/115187
gcc/cp/ChangeLog:
* init.cc (build_delete): Use get_target_expr instead of save_expr.
* tree.cc (stabilize_expr): Update comment.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/array-prvalue3.C: New test.
|
|
This avoids generating invalid Ada code for a function with a multidimensional
array parameter and also cleans things up left and right.
gcc/c-family/
* c-ada-spec.cc (check_type_name_conflict): Add guard.
(is_char_array): Simplify.
(dump_ada_array_type): Use strip_array_types.
(dump_ada_node) <POINTER_TYPE>: Deal with anonymous array types.
(dump_nested_type): Use strip_array_types.
|
|
There are several sorts of match patterns for SAT-related cases, and there
will be duplicated code to check that the dest, op_0 and op_1 have the
same tree type, aka a ternary tree-type match. Thus, add an overloaded
types_match function to do this and avoid match-code duplication.
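As an illustration, such an overload could look like this self-contained
sketch (the 'tree' typedef is a stand-in; the 2-type types_match is
assumed to exist as in the match-head files):
typedef struct tree_node *tree;        // stand-in for GCC's tree
extern bool types_match (tree, tree);  // existing 2-type helper
static inline bool
types_match (tree t1, tree t2, tree t3)
{
  return types_match (t1, t2) && types_match (t2, t3);
}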
The below test suites are passed for this patch:
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 regression test.
gcc/ChangeLog:
* generic-match-head.cc (types_match): Add overloaded types_match
for 3 types.
* gimple-match-head.cc (types_match): Ditto.
* match.pd: Leverage overloaded types_match.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
Forgot a check for an SSA name before trying to replace a PHI arg with
its current definition.
PR tree-optimization/115197
* tree-loop-distribution.cc (copy_loop_before): Constant PHI
args remain the same.
* gcc.dg/pr115197.c: New testcase.
|
|
When processing a &ANYTHING = X constraint we treat it as *ANYTHING = X
during constraint processing but then end up recording it as
&ANYTHING = X anyway, breaking constraint graph building. This is
because we only update the local copy of the LHS and not the constraint
itself.
PR tree-optimization/115199
* tree-ssa-structalias.cc (process_constraint): Also
record &ANYTHING = X as *ANYTHING = X in the end.
* gcc.dg/torture/pr115199.c: New testcase.
|
|
I failed to realize we do not represent FUNCTION_DECLs or LABEL_DECLs
in vars explicitly and thus have to compare pt.vars_contains_nonlocal.
PR tree-optimization/115138
* tree-ssa-alias.cc (ptrs_compare_unequal): Make sure
pt.vars_contains_nonlocal differs since we do not represent
FUNCTION_DECLs or LABEL_DECLs in vars explicitly.
* gcc.dg/torture/pr115138.c: New testcase.
|
|
Case pr106550.c tests constant building for a 64-bit
register. It fails with -m32 without the expected rldimi,
so the case requires the has_arch_ppc64 effective target.
Bootstrapped and regtested on ppc64{,le}.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106550.c: Adjust by requiring has_arch_ppc64
effective target. And remove power10_ok.
|
|
gcc.dg/vect/vect-pr111779.c FAILs on 32 and 64-bit Solaris/SPARC:
FAIL: gcc.dg/vect/vect-pr111779.c -flto -ffat-lto-objects scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-pr111779.c scan-tree-dump vect "LOOP VECTORIZED"
This patch implements Richard's analysis from the PR, skipping the
scan-tree-dump part for big-endian targets without vect_shift_char
support.
Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11 (32 and 64-bit each).
2024-05-22 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
gcc/testsuite:
PR tree-optimization/114072
* gcc.dg/vect/vect-pr111779.c (scan-tree-dump): Require
vect_shift_char on big-endian targets.
|
|
2024-05-23 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/103312
* dependency.cc (gfc_dep_compare_expr): Handle component call
expressions. Return -2 as default and return 0 if compared with
a function expression that is from an interface body and has
the same name.
* expr.cc (gfc_reduce_init_expr): If the expression is a comp
call do not attempt to reduce, defer to resolution and return
false.
* trans-types.cc (gfc_get_dtype_rank_type,
gfc_get_nodesc_array_type): Fix whitespace.
gcc/testsuite/
PR fortran/103312
* gfortran.dg/pr103312.f90: New test.
|
|
Consider a NOCE conversion as profitable if there is at least one
conditional move.
gcc/ChangeLog:
PR target/109549
* config/s390/s390.cc (TARGET_NOCE_CONVERSION_PROFITABLE_P):
Define.
(s390_noce_conversion_profitable_p): Implement.
gcc/testsuite/ChangeLog:
* gcc.target/s390/ccor.c: The order of the loads is reversed now; as a
consequence the condition has to be reversed.
|
|
Some of the asm opcodes expected by pr79004 depend on
-mlong-double-128 to be output. E.g., without this flag, the
conditions of patterns @extenddf<mode>2 and extendsf<mode>2 do not
hold, and so GCC resorts to libcalls instead of even trying
rs6000_expand_float128_convert.
Perhaps the conditions are too strict, and they could enable the use
of conversion insns involving __ieee128/_Float128 even with 64-bit
long doubles.
For now, xfail the opcodes that are not available on longdouble64.
While at that, drop long_double_64bit, since it's broken and sort of
redundant.
for gcc/testsuite/ChangeLog
PR target/105359
* gcc.target/powerpc/pr79004.c: Xfail opcodes not available on
longdouble64.
* lib/target-supports.exp
(check_effective_target_long_double_64bit): Drop.
(add_options_for_long_double_64bit): Likewise.
|
|
Fix a use of int_range_max in phiopt that should be a type-agnostic
range, because it could be either a pointer or an int.
PR tree-optimization/115191
gcc/ChangeLog:
* tree-ssa-phiopt.cc (value_replacement): Use Value_Range instead
of int_range_max.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr115191.c: New test.
|
|
This patch adds Qualcomm's new oryon-1 core; this is enough
to recognize the core, and later on the tuning structure will be added.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (oryon-1): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document oryon-1.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Co-authored-by: Joel Jones <quic_joeljone@quicinc.com>
Co-authored-by: Wei Zhao <quic_wezhao@quicinc.com>
|
|
Here the member functions QList::g and QList::h are given the same
function type by build_cp_fntype_variant since their noexcept-specs are
equivalent according to cp_tree_equal. In doing so however this means
that the function type of QList::h refers to a function parameter from
QList::g, which ends up confusing modules streaming.
I'm not sure if modules can be fixed to handle this situation, but
regardless it seems weird in principle that a function parameter can
escape in such a way. The analogous situation with a trailing return
type and decltype
auto g(QList &other) -> decltype(f(other));
auto h(QList &other) -> decltype(f(other));
behaves better because we don't canonicalize decltype, and so the
function types of g and h are non-canonical and therefore not shared.
In light of this, it seems natural to treat function types with complex
noexcept-specs as non-canonical as well so that each such function
declaration is given a unique function type node. (The main benefit of
type canonicalization is to speed up repeated type comparisons, but it
should be rare to repeatedly compare two otherwise compatible function
types with complex noexcept-specs.)
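A hypothetical reduction of the situation described:
template <typename T> void f (T &);
template <typename U>
struct QList
{
  // Equivalent under cp_tree_equal, but each noexcept-spec names its
  // own 'other' parameter, so the two declarations should not share
  // one canonical function type.
  void g (QList &other) noexcept (noexcept (f (other)));
  void h (QList &other) noexcept (noexcept (f (other)));
};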
To that end, this patch strengthens the ce_exact case of comp_except_specs
to require identity instead of equivalence of the noexcept-spec so that
build_cp_fntype_variant doesn't reuse a variant when it shouldn't. In
turn we need to use structural equality for types with a complex eh spec.
This lets us get rid of the tricky handling of canonical types when updating
unparsed noexcept-spec variants.
PR c++/115159
gcc/cp/ChangeLog:
* tree.cc (build_cp_fntype_variant): Always use structural
equality for types with a complex exception specification.
(fixup_deferred_exception_variants): Use structural equality
for adjusted variants.
* typeck.cc (comp_except_specs): Require == instead of
cp_tree_equal for ce_exact noexcept-spec comparison.
gcc/testsuite/ChangeLog:
* g++.dg/modules/noexcept-2_a.H: New test.
* g++.dg/modules/noexcept-2_b.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
This patch is a follow-up of r15-697-ga2e4fe5a53cf75 to also fold vget_high_*
intrinsics to BIT_FIELD_REF and remove the vget_high_* definitions from
arm_neon.h to use the new intrinsics framework.
PR target/102171
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc (AARCH64_SIMD_VGET_HIGH_BUILTINS):
New macro to create definitions for all vget_high intrinsics.
(VGET_HIGH_BUILTIN): Likewise.
(enum aarch64_builtins): Add vget_high function codes.
(AARCH64_SIMD_VGET_LOW_BUILTINS): Delete duplicate macro.
(aarch64_general_fold_builtin): Fold vget_high calls.
* config/aarch64/aarch64-simd-builtins.def: Delete vget_high builtins.
* config/aarch64/aarch64-simd.md (aarch64_get_high<mode>): Delete.
(aarch64_vget_hi_halfv8bf): Likewise.
* config/aarch64/arm_neon.h (__attribute__): Delete.
(vget_high_f16): Likewise.
(vget_high_f32): Likewise.
(vget_high_f64): Likewise.
(vget_high_p8): Likewise.
(vget_high_p16): Likewise.
(vget_high_p64): Likewise.
(vget_high_s8): Likewise.
(vget_high_s16): Likewise.
(vget_high_s32): Likewise.
(vget_high_s64): Likewise.
(vget_high_u8): Likewise.
(vget_high_u16): Likewise.
(vget_high_u32): Likewise.
(vget_high_u64): Likewise.
(vget_high_bf16): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vget_high_2.c: New test.
* gcc.target/aarch64/vget_high_2_be.c: New test.
Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
|
|
Add regression test to the existing zero/sign extend tests for CMSE to
verify that r0, r1, r2 and r3 are properly extended, not just r0.
The boolCharShortEnumSecureFunc test is done using -O0 to ensure the
instructions are in a predictable order.
gcc/testsuite/ChangeLog:
* gcc.target/arm/cmse/extend-param.c: Add regression test. Add
-fshort-enums.
* gcc.target/arm/cmse/extend-return.c: Add -fshort-enums option.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
|
|
The problem comes directly from the -ffold-mem-offsets pass messing with
the prologue and the frame-related instructions, which is a no-no with SEH,
so the fix simply disconnects the pass in these circumstances.
gcc/
PR rtl-optimization/115038
* fold-mem-offsets.cc (fold_offsets): Return 0 if the defining
instruction of the register is frame related.
gcc/testsuite/
* g++.dg/opt/fmo1.C: New test.
|
|
This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
which considers an instruction loading a 64-bit constant to be significantly
cheaper than loading a 32-bit (or smaller) constant.
Consider the two functions:
unsigned long long foo() { return 0x0123456789abcdefULL; }
unsigned int bar() { return 10; }
and the corresponding lines from combine's dump file:
insn_cost 1 for #: r98:DI=0x123456789abcdef
insn_cost 4 for #: ax:SI=0xa
The same issue can be seen in -dP assembler output.
movabsq $81985529216486895, %rax # 5 [c=1 l=10] *movdi_internal/4
The problem is that pattern_costs interpretation of rtx_costs contains
"return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for
example a register or small immediate constant) is considered special,
and equivalent to a single instruction, but all other values are treated
as verbatim. Hence to x86_64's 10-byte long movabsq instruction slightly
more expensive than a simple constant, rtx_costs needs to return
COSTS_N_INSNS(1)+1 and not 1. With this change, the insn_cost of
movabsq is the intended value 5:
insn_cost 5 for #: r98:DI=0x123456789abcdef
and
movabsq $81985529216486895, %rax # 5 [c=5 l=10] *movdi_internal/4
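A minimal model of the quirk and the fix (illustrative, not the real
ix86_rtx_costs; COSTS_N_INSNS is ((N) * 4) as in GCC's rtl.h):
#define COSTS_N_INSNS(N) ((N) * 4)
static int
const_int_cost (int is_x86_64_immediate)
{
  if (is_x86_64_immediate)
    return 0;                    /* special: treated as one cheap insn */
  return COSTS_N_INSNS (1) + 1;  /* movabsq: slightly costlier than one
                                    insn, giving insn_cost 5 */
}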
2024-05-22 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>:
A CONST_INT that isn't x86_64_immediate_operand requires an extra
(expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.
|