path: root/gcc
2024-05-24  Fortran: improve attribute conflict checking [PR93635]  (Harald Anlauf; 4 files, -40/+54)
gcc/fortran/ChangeLog:

	PR fortran/93635
	* symbol.cc (conflict_std): Helper function for reporting attribute
	conflicts depending on the Fortran standard version.
	(conf_std): Helper macro for checking standard-dependent conflicts.
	(gfc_check_conflict): Use it.

gcc/testsuite/ChangeLog:

	PR fortran/93635
	* gfortran.dg/c-interop/c1255-2.f90: Adjust pattern.
	* gfortran.dg/pr87907.f90: Likewise.
	* gfortran.dg/pr93635.f90: New test.

Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
2024-05-24  Fortran: fix bounds check for assignment, class component [PR86100]  (Harald Anlauf; 3 files, -19/+60)
gcc/fortran/ChangeLog:

	PR fortran/86100
	* trans-array.cc (gfc_conv_ss_startstride): Use abridged_ref_name
	to generate a more user-friendly name for bounds-check messages.
	* trans-expr.cc (gfc_copy_class_to_class): Fix bounds check for
	rank>1 by looping over the dimensions.

gcc/testsuite/ChangeLog:

	PR fortran/86100
	* gfortran.dg/bounds_check_25.f90: New test.
2024-05-24  Small enhancement to implementation of -fdump-ada-spec  (Eric Botcazou; 1 file, -9/+61)
This lets it recognize more preprocessing floating constants.

gcc/c-family/
	* c-ada-spec.cc (is_cpp_float): New predicate.
	(dump_number): Deal with more preprocessing floating constants.
	(dump_ada_macros) <CPP_NUMBER>: Use is_cpp_float.
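As an illustration only (the macro names below are made up, not from the patch), -fdump-ada-spec scans object-like macros, and constants in shapes such as these are what dump_number has to classify:

    /* hypothetical input header for -fdump-ada-spec */
    #define GRAVITY  9.80665        /* plain decimal floating constant */
    #define EPSILON  1.0e-9f        /* exponent and suffix */
    #define SCALE    0x1.8p3        /* C99 hexadecimal floating constant */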
2024-05-24  c: Fix for some variably modified types not being recognized [PR114831]  (Martin Uecker; 5 files, -0/+80)
We did not evaluate expressions with variably modified types correctly in
typeof and did not produce warnings when jumping over declarations using
typeof. After address-of or array-to-pointer decay we construct new pointer
types that have to be marked variably modified if the pointer target is
variably modified.

2024-05-18  Martin Uecker  <uecker@tugraz.at>

	PR c/114831

gcc/c/
	* c-typeck.cc (array_to_pointer_conversion, build_unary_op):
	Propagate flag to pointer target.

gcc/testsuite/
	* gcc.dg/pr114831-1.c: New test.
	* gcc.dg/pr114831-2.c: New test.
	* gcc.dg/gnu23-varmod-1.c: New test.
	* gcc.dg/gnu23-varmod-2.c: New test.
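A minimal sketch of the kind of case involved (an assumed illustration, not one of the new tests):

    void f (int n)
    {
      int a[n];             /* variably modified (VM) type */
      typeof (&a) p = &a;   /* the pointer type constructed for &a must also
                               be marked variably modified, since its target
                               type is */
      (void) p;
    }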
2024-05-25  c++/modules: Improve errors for bad module-directives [PR115200]  (Nathaniel Shead; 5 files, -6/+64)
This fixes an ICE when a module directive is not given at global scope.
Although not explicitly mentioned, it seems implied from [basic.link] p1 and
[module.global.frag] that a module-declaration must appear at the global
scope after preprocessing. Apart from this, the patch also slightly improves
the errors given when accidentally using a module control-line in other
situations where it is not expected.

	PR c++/115200

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_error_1): Special-case unexpected module
	directives for better diagnostics.
	(cp_parser_module_declaration): Check that the module declaration
	is at global scope.
	(cp_parser_import_declaration): Sync error message with that in
	cp_parser_error_1.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/mod-decl-1.C: Update error messages.
	* g++.dg/modules/mod-decl-6.C: New test.
	* g++.dg/modules/mod-decl-7.C: New test.
	* g++.dg/modules/mod-decl-8.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
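A sketch of the rejected construct (assumed shape, not the committed testcase; compiled with -fmodules-ts):

    // a module-declaration nested in a namespace is not at global scope
    // and is now diagnosed cleanly instead of ICEing
    namespace ns {
      module M;   // error: module-declaration not at global scope
    }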
2024-05-25  c++/modules: Remember that header units have CMIs  (Nathaniel Shead; 3 files, -7/+6)
This appears to be an oversight in the definition of module_has_cmi_p. The
change will allow us to use the function directly in more places that only
need to do additional work when generating a module CMI, so that in the
future we do that work only when we know we need it.

gcc/cp/ChangeLog:

	* cp-tree.h (module_has_cmi_p): Also include header units.
	(module_maybe_has_cmi_p): Update comment.
	* module.cc (set_defining_module): Only need to track declarations
	for later exporting if the module may have a CMI.
	(set_defining_module_for_partial_spec): Likewise.
	* name-lookup.cc (pushdecl): Likewise.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-05-25  c++/modules: Fix treatment of unnamed types  (Nathaniel Shead; 5 files, -15/+7)
In r14-9530 we relaxed "depending on type with no-linkage" errors for
declarations that could actually be accessed from different TUs anyway.
However, this also enabled it for unnamed types, which never work.

In a normal module interface, an unnamed type is TU-local by [basic.link]
p15.2, and so cannot be exposed or the program is ill-formed. We don't yet
implement this checking but we should assume that we will later; currently
supporting this actually causes ICEs when attempting to create the mangled
name in some situations.

For a header unit, by [module.import] p5.3 it is unspecified whether two TUs
importing a header unit providing such a declaration are importing the same
header unit. In this case, we would require name mangling changes to somehow
allow the (anonymous) type exported by such a header unit to correspond
across different TUs in the presence of other anonymous declarations, so for
this patch just assume that this case would be an ODR violation instead.

gcc/cp/ChangeLog:

	* tree.cc (no_linkage_check): Anonymous types can't be accessed in
	a different TU.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/linkage-1_a.C: Remove anonymous type test.
	* g++.dg/modules/linkage-1_b.C: Likewise.
	* g++.dg/modules/linkage-1_c.C: Likewise.
	* g++.dg/modules/linkage-2.C: Add note about anonymous types.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
2024-05-24  [to-be-committed,v2,RISC-V] Use bclri in constant synthesis  (Jeff Law; 4 files, -7/+138)
Testing with Zbs enabled by default showed a minor logic error. After the
loop clearing things with bclri, we can only use the sequence if we were
able to clear all the necessary bits. If any bits are still on, then the
bclr sequence turned out to not be profitable.

--

So this is conceptually similar to how we handled direct generation of
bseti for constant synthesis, but this time for bclr. In the bclr case, we
already have an expander for AND. So we just needed to adjust the predicate
to accept another class of constant operands (those with a single bit
clear).

With that in place, constant synthesis is adjusted so that it counts the
number of bits clear in the high 33 bits of a 64-bit word. If that number
is small relative to the current best cost, then we try to generate the
constant with a lui-based sequence for the low half, which implicitly sets
the upper 32 bits as well. Then we bclri one or more of those upper 33
bits.

So as an example, this code goes from 4 instructions down to 3:

> unsigned long foo_0xfffffffbfffff7ff(void) { return 0xfffffffbfffff7ffUL; }

Note the use of 33 bits above. That's meant to capture cases like this:

> unsigned long foo_0xfffdffff7ffff7ff(void) { return 0xfffdffff7ffff7ffUL; }

We can use lui+addi+bclri+bclri to synthesize that in 4 instructions
instead of 5.

I'm including a handful of cases covering the two basic ideas above that
were found by the testing code.

And, no, we're not done yet. I see at least one more notable idiom missing
before exploring zbkb's potential to improve things.

Tested in my tester and waiting on the Rivos CI system before moving
forward.

gcc/
	* config/riscv/predicates.md (arith_operand_or_mode_mask): Rename
	to ...
	(arith_or_mode_mask_or_zbs_operand): New predicate.
	* config/riscv/riscv.md (and<mode>3): Update predicate for
	operand 2.
	* config/riscv/riscv.cc (riscv_build_integer_1): Use bclri to
	clear bits, particularly bits 31..63, when profitable to do so.

gcc/testsuite/
	* gcc.target/riscv/synthesis-6.c: New test.
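One plausible 3-instruction sequence for the first constant, assuming standard RV64 lui/addi sign-extension and Zbs bclri semantics (a sketch, not compiler output from the patch):

    lui   a0, 0xfffff       # a0 = 0xfffffffffffff000 (sign-extended)
    addi  a0, a0, 0x7ff     # a0 = 0xfffffffffffff7ff
    bclri a0, a0, 34        # clear bit 34 -> 0xfffffffbfffff7ff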
2024-05-24  vect: Fix access size alignment assumption [PR115192]  (Richard Sandiford; 2 files, -1/+32)
create_intersect_range_checks checks whether two access ranges a and b are
alias-free using something equivalent to:

  end_a <= start_b || end_b <= start_a

It has two ways of doing this: a "vanilla" way that calculates the exact
exclusive end pointers, and another way that uses the last inclusive
aligned pointers (and changes the comparisons accordingly). The comment for
the latter is:

  /* Calculate the minimum alignment shared by all four pointers,
     then arrange for this alignment to be subtracted from the
     exclusive maximum values to get inclusive maximum values.
     This "- min_align" is cumulative with a "+ access_size"
     in the calculation of the maximum values.  In the best
     (and common) case, the two cancel each other out, leaving
     us with an inclusive bound based only on seg_len.  In the
     worst case we're simply adding a smaller number than before.

The problem is that the associated code implicitly assumed that the access
size was a multiple of the pointer alignment, and so the alignment could be
carried over to the exclusive end pointer.

The testcase started failing after g:9fa5b473b5b8e289b6542 because that
commit improved the alignment information for the accesses.

gcc/
	PR tree-optimization/115192
	* tree-data-ref.cc (create_intersect_range_checks): Take the
	alignment of the access sizes into account.

gcc/testsuite/
	PR tree-optimization/115192
	* gcc.dg/vect/pr115192.c: New test.
2024-05-24  modula2: fix xref fourth parameter in documentation, change from gm2 to m2  (Gaius Mulley; 1 file, -13/+13)
This patch corrects the gm2.texi xref for the modula-2 documentation.

gcc/ChangeLog:

	* doc/gm2.texi: Replace all occurrences of xref {, , , gm2} with
	xref {, , , m2}.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-05-24  MATCH: Look through VIEW_CONVERT when folding VEC_PERM_EXPRs.  (Manolis Tsamis; 2 files, -6/+24)
The match.pd patterns that merge two vector permutes into one fail when a
potentially no-op view convert expression sits between the two permutes.
This change lifts that restriction.

gcc/ChangeLog:

	* match.pd: Allow no-op view_convert between permutes.

gcc/testsuite/ChangeLog:

	* gcc.dg/fold-perm-2.c: New test.
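A sketch of the affected shape in GNU C (assumed; the new fold-perm-2.c testcase is presumably along these lines):

    typedef int v4si __attribute__ ((vector_size (16)));
    typedef unsigned int v4usi __attribute__ ((vector_size (16)));

    v4si f (v4si x)
    {
      v4si m1 = { 1, 2, 3, 0 };
      v4si m2 = { 3, 2, 1, 0 };
      /* The cast between the two shuffles is a no-op VIEW_CONVERT_EXPR;
         the two permutes can now still be merged into one.  */
      v4usi t = (v4usi) __builtin_shuffle (x, m1);
      return (v4si) __builtin_shuffle (t, (v4usi) m2);
    }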
2024-05-24  testsuite: adjust iteration count for ppc costmodel 76b  (Alexandre Oliva; 1 file, -1/+1)
For some hardware which doesn't support unaligned vector memory access,
test case costmodel-vect-76b.c expects to see that cost modeling makes the
decision that peeling is not profitable, according to the commit history,
the test case comments and the way it checks.

For now, the existing loop bound 14 works well for Power7, but it does not
for some targets on which the cost of the vec_perm operation differs from
Power7, such as Power6, where it is 3 vs. 1. This difference in turn causes
a difference (10 vs. 12) in the minimum iteration count for profitability
and causes the failure.

To keep the original test point, this patch tweaks the loop bound to ensure
it's not profitable to be vectorized for !vect_no_align with peeling.

Co-Authored-By: Kewen Lin <linkw@linux.ibm.com>

for gcc/testsuite/ChangeLog

	* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
2024-05-24  Fix gcc.dg/vect/vect-gather-4.c for cascadelake  (Richard Biener; 1 file, -1/+1)
There's not really a good way to test what the testcase wants to test; the
following exchanges one dump scan for another (imperfect) one.

	* gcc.dg/vect/vect-gather-4.c: Scan for not vectorizing using SLP.
2024-05-24  tree-optimization/115144 - improve sinking destination choice  (Richard Biener; 2 files, -34/+86)
When sinking code closer to its uses we already try to minimize the
distance we move by inserting at the start of the basic block. The
following makes sure to sink closest to the control dependence check of the
region we want to sink to, as well as make sure to ignore control
dependences that are only guarding exceptional code. This restores somewhat
the old profile check, but without requiring nearly even probabilities. The
patch also makes sure to not give up completely when the best sink location
is one we do not want to sink to, but possibly then choose the next best
one.

	PR tree-optimization/115144
	* tree-ssa-sink.cc (do_not_sink): New function, split out from ...
	(select_best_block): Here. First pick valid block to sink to. From
	that search for the best valid block, avoiding sinking across
	conditions to exceptional code.
	(sink_code_in_bb): When updating vuses of stores in paths we do
	not sink a store to, make sure we didn't pick a dominating sink
	location.

	* gcc.dg/tree-ssa/ssa-sink-22.c: New testcase.
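A minimal sketch of the kind of sinking decision involved (a hypothetical example, not the new testcase):

    void use (int);

    int f (int x, int y, int cond)
    {
      int t = x + y;   /* only used under the condition below; sinking it */
      if (cond)        /* next to the use avoids computing it on paths    */
        use (t);       /* where cond is false                             */
      return x;
    }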
2024-05-24  Fix typo in the testcase.  (liuhongt; 1 file, -5/+5)
gcc/testsuite/ChangeLog:

	PR target/114148
	* gcc.target/i386/pr106010-7b.c: Refine testcase.
2024-05-23  Use simple_dce_from_worklist in phiprop  (Andrew Pinski; 1 file, -10/+18)
I noticed that phiprop leaves around phi nodes which define an ssa name
that is unused. This just adds a bitmap to mark those ssa names and then
calls simple_dce_from_worklist at the very end to remove those phi nodes
and all of their dependencies, if there were any. This might allow us to
optimize something earlier due to the removal of the phi which was taking
the address of the variables.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* tree-ssa-phiprop.cc (phiprop_insert_phi): Add dce_ssa_names
	argument. Add the phi's result to it.
	(propagate_with_phi): Add dce_ssa_names argument. Update call to
	phiprop_insert_phi.
	(pass_phiprop::execute): Update call to propagate_with_phi. Call
	simple_dce_from_worklist if there was a change.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-05-24  Avoid splitting store dataref groups during SLP discovery  (Richard Biener; 9 files, -49/+240)
The following avoids splitting store dataref groups during SLP discovery
but instead forces (eventually single-lane) consecutive lane SLP discovery
for all lanes of the group, creating VEC_PERM SLP nodes merging them so the
store will always cover the whole group.

With this, for example

  int x[1024], y[1024], z[1024], w[1024];
  void foo (void)
  {
    for (int i = 0; i < 256; i++)
      {
        x[4*i+0] = y[2*i+0];
        x[4*i+1] = y[2*i+1];
        x[4*i+2] = z[i];
        x[4*i+3] = w[i];
      }
  }

which was previously using hybrid SLP can now be fully SLPed, and the
generated SSE code looks better (but of course you never know, I didn't
actually benchmark). We of course need a VF of four here.

  .L2:
        movdqa    z(%rax), %xmm0
        movdqa    w(%rax), %xmm4
        movdqa    y(%rax,%rax), %xmm2
        movdqa    y+16(%rax,%rax), %xmm1
        movdqa    %xmm0, %xmm3
        punpckhdq %xmm4, %xmm0
        punpckldq %xmm4, %xmm3
        movdqa    %xmm2, %xmm4
        shufps    $238, %xmm3, %xmm2
        movaps    %xmm2, x+16(,%rax,4)
        movdqa    %xmm1, %xmm2
        shufps    $68, %xmm3, %xmm4
        shufps    $68, %xmm0, %xmm2
        movaps    %xmm4, x(,%rax,4)
        shufps    $238, %xmm0, %xmm1
        movaps    %xmm2, x+32(,%rax,4)
        movaps    %xmm1, x+48(,%rax,4)
        addq      $16, %rax
        cmpq      $1024, %rax
        jne       .L2

The extra permute nodes merging distinct branches of the SLP tree might be
unexpected for some code, esp. since SLP_TREE_REPRESENTATIVE cannot be
meaningfully set and we cannot populate SLP_TREE_SCALAR_STMTS or
SLP_TREE_SCALAR_OPS consistently, as we can have a mix of both.

The patch keeps the sub-trees formed from consecutive lanes, but that's in
principle not necessary if we for example have an even/odd split which now
would result in N single-lane sub-trees. That's left for future
improvements.

The interesting part is how VLA vector ISAs handle merging of two vectors
that's not trivial even/odd merging. The strategy of how to build the
permute tree might need adjustments for that (in the end, splitting each
branch to single lanes and then doing even/odd merging would be the
brute-force fallback). Not sure how much we can or should rely on the SLP
optimize pass to handle this.

The gcc.dg/vect/slp-12a.c case is interesting as we currently split the
8-lane store group into lanes 0-5, which we SLP with an unroll factor of
two (on x86-64 with SSE), and the remaining two lanes are using
interleaving vectorization with a final unroll factor of four. Thus we're
using hybrid SLP within a single store group. After the change we discover
the same 0-5 lane SLP part as well as two single-lane parts feeding the
full store group. But that results in a load permutation that isn't
supported (I have WIP patches to rectify that). So we end up cancelling SLP
and vectorizing the whole loop with interleaving, which is IMO good and
results in better code.

This is similar for gcc.target/i386/pr52252-atom.c where interleaving
generates much better code than hybrid SLP. I'm unsure how to update the
testcase though. gcc.dg/vect/slp-21.c runs into similar situations.

Note that when we discard an instance while analyzing SLP operations we
currently force the full loop to have no SLP, because hybrid detection is
broken. It's probably not worth fixing this at this moment.

For gcc.dg/vect/pr97428.c we are not splitting the 16-lane store group into
two but merge the two 8-lane loads into one before doing the store, and
thus have only a single SLP instance. A similar situation happens in
gcc.dg/vect/slp-11c.c, but there the branches feeding the single SLP store
only have a single lane. Likewise for gcc.dg/vect/vect-complex-5.c and
gcc.dg/vect/vect-gather-2.c.

gcc.dg/vect/slp-cond-1.c has an additional SLP vectorization with an SLP
store group of size two but two single-lane branches.

	* tree-vect-slp.cc (vect_build_slp_instance): Do not split store
	dataref groups on loop SLP discovery failure but create a single
	SLP instance for the stores, branching to SLP sub-trees and
	merging with a series of VEC_PERM nodes.

	* gcc.dg/vect/pr97428.c: Expect a single store SLP group.
	* gcc.dg/vect/slp-11c.c: Likewise, if !vect_load_lanes.
	* gcc.dg/vect/vect-complex-5.c: Likewise.
	* gcc.dg/vect/slp-12a.c: Do not expect SLP.
	* gcc.dg/vect/slp-21.c: Remove not important scanning for SLP.
	* gcc.dg/vect/slp-cond-1.c: Expect one more SLP if
	!vect_load_lanes.
	* gcc.dg/vect/vect-gather-2.c: Expect SLP to be used.
	* gcc.target/i386/pr52252-atom.c: XFAIL test for palignr.
2024-05-24  Daily bump.  (GCC Administrator; 6 files, -1/+485)
2024-05-24  c++/modules: Ensure all partial specialisations are tracked [PR114947]  (Nathaniel Shead; 5 files, -4/+34)
Constrained partial specialisations aren't all necessarily tracked on the
instantiation table. The modules code uses a separate
'partial_specializations' table to track them instead, to ensure that they
get walked and emitted when emitting a module, but currently this does not
always happen.

The attached testcase fails in two ways. First, because the partial
specialisation is just a declaration (and not a definition),
'set_defining_module' never ends up getting called on it and so it never
gets added to the partial specialisation table. We fix this by ensuring
that when partial specializations are created they always get added, and so
we never miss one. To prevent adding partial specialisations multiple times
we split this out as a new function.

The second way it fails is that when exporting the primary interface for a
module with partitions, we also re-walk the specializations of all imported
partitions to merge them into a single BMI. So this patch ensures that
after calling 'match_mergeable_specialization' we also ensure that if the
name came from a partition it gets added to the specialization table so
that a dependency is correctly created for it.

	PR c++/114947

gcc/cp/ChangeLog:

	* cp-tree.h (set_defining_module_for_partial_spec): Declare.
	* module.cc (trees_in::decl_value): Track partial specs coming
	from partitions.
	(set_defining_module): Don't track partial specialisations here
	anymore.
	(set_defining_module_for_partial_spec): New function.
	* pt.cc (process_partial_specialization): Call it.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/partial-4_a.C: New test.
	* g++.dg/modules/partial-4_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-05-23  Move condexpr_adjust into gimple-range-fold  (Andrew MacLeod; 5 files, -130/+113)
Certain components of GORI were needed in order to process a COND_EXPR
expression and calculate the 2 operands as if they were true and false
edges based on the condition. With GORI available from the range_query
object now, this can be moved into the fold_using_range code where it
really belongs.

	* gimple-range-edge.h (range_query::condexpr_adjust): Delete.
	* gimple-range-fold.cc (fold_using_range::range_of_range_op): Use
	gori_ssa routine.
	(fold_using_range::range_of_address): Likewise.
	(fold_using_range::range_of_phi): Likewise.
	(fold_using_range::condexpr_adjust): Relocated from gori_compute.
	(fold_using_range::range_of_cond_expr): Use local condexpr_adjust.
	(fur_source::register_outgoing_edges): Use gori_ssa routine.
	* gimple-range-fold.h (gori_ssa): Rename from gori_bb.
	(fold_using_range::condexpr_adjust): Add prototype.
	* gimple-range-gori.cc (gori_compute::condexpr_adjust): Relocate.
	* gimple-range-gori.h (gori_compute::condexpr_adjust): Delete.
2024-05-23  Make gori_map a shared component.  (Andrew MacLeod; 13 files, -45/+53)
Move the gori_map dependency and import/export object into the range query,
and construct it simultaneously with the gori object.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Use gori_ssa.
	(ranger_cache::dump): Likewise.
	(ranger_cache::get_global_range): Likewise.
	(ranger_cache::set_global_range): Likewise.
	(ranger_cache::register_inferred_value): Likewise.
	* gimple-range-edge.h (gimple_outgoing_range::map): Remove.
	* gimple-range-fold.cc (fold_using_range::range_of_range_op): Use
	gori_ssa.
	(fold_using_range::range_of_address): Likewise.
	(fold_using_range::range_of_phi): Likewise.
	(fur_source::register_outgoing_edges): Likewise.
	* gimple-range-fold.h (fur_source::query): Make const.
	(gori_ssa): New.
	* gimple-range-gori.cc (gori_map::dump): Use 'this' pointer.
	(gori_compute::gori_compute): Construct with a gori_map.
	* gimple-range-gori.h (gori_compute::gori_compute): Change
	prototype.
	(gori_compute::map): Delete.
	(gori_compute::m_map): Change to a reference.
	(FOR_EACH_GORI_IMPORT_NAME): Change parameter gori to gorimap.
	(FOR_EACH_GORI_EXPORT_NAME): Likewise.
	* gimple-range-path.cc (path_range_query::compute_ranges_in_block):
	Use gori_ssa method.
	(path_range_query::compute_exit_dependencies): Likewise.
	* gimple-range.cc (gimple_ranger::range_of_stmt): Likewise.
	(gimple_ranger::register_transitive_inferred_ranges): Likewise.
	* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges):
	Likewise.
	* tree-ssa-threadedge.cc (compute_exit_dependencies): Likewise.
	* tree-vrp.cc (remove_unreachable::handle_early): Likewise.
	(remove_unreachable::remove_and_update_globals): Likewise.
	* value-query.cc (range_query::create_gori): Create gori map.
	(range_query::share_query): Copy gori map member.
	(range_query::range_query): Initialize gori_map member.
	* value-query.h (range_query::gori_ssa): New.
	(range_query::m_map): New.
2024-05-23  Make GORI a range_query component.  (Andrew MacLeod; 11 files, -49/+75)
This patch moves the GORI component into the range_query object and makes
it generally available. This makes it much easier to share between ranger
and the passes.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Create GORI
	via the range_query instead of a local member.
	(ranger_cache::dump_bb): Use gori from the range_query parent.
	(ranger_cache::get_global_range): Likewise.
	(ranger_cache::set_global_range): Likewise.
	(ranger_cache::edge_range): Likewise.
	(ranger_cache::block_range): Likewise.
	(ranger_cache::fill_block_cache): Likewise.
	(ranger_cache::range_from_dom): Likewise.
	(ranger_cache::register_inferred_value): Likewise.
	* gimple-range-cache.h (ranger_cache::m_gori): Delete.
	* gimple-range-fold.cc (fur_source::fur_source): Set m_depend_p.
	(fur_depend::fur_depend): Remove gori parameter.
	* gimple-range-fold.h (fur_source::gori): Adjust.
	(fur_source::m_gori): Delete.
	(fur_source::m_depend): New.
	(fur_depend::fur_depend): Adjust prototype.
	* gimple-range-path.cc (path_range_query::path_range_query): Share
	ranger oracles.
	(path_range_query::range_defined_in_block): Use oracle directly.
	(path_range_query::compute_ranges_in_block): Use new gori ()
	method.
	(path_range_query::adjust_for_non_null_uses): Use oracle directly.
	(path_range_query::compute_exit_dependencies): Likewise.
	(jt_fur_source::jt_fur_source): No gori in the parameters.
	(path_range_query::range_of_stmt): Likewise.
	(path_range_query::compute_outgoing_relations): Likewise.
	* gimple-range.cc (gimple_ranger::fold_range_internal): Likewise.
	(gimple_ranger::range_of_stmt): Access gori via gori () method.
	(assume_query::range_of_expr): Create a gori object.
	(assume_query::~assume_query): Destroy a gori object.
	(assume_query::calculate_op): Remove old gori () accessor.
	* gimple-range.h (gimple_ranger::gori): Delete.
	(assume_query::~assume_query): New.
	(assume_query::m_gori): Delete.
	* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges): Use
	gori () method.
	* tree-ssa-threadedge.cc (compute_exit_dependencies): Likewise.
	* value-query.cc (default_gori): New.
	(range_query::create_gori): New.
	(range_query::destroy_gori): New.
	(range_query::share_oracles): Set m_gori.
	(range_query::range_query): Set m_gori to default.
	(range_query::~range_query): Call destroy_gori.
	* value-query.h (range_query): Adjust prototypes.
	(range_query::m_gori): New.
2024-05-23  Gori_compute inherits from gimple_outgoing_range.  (Andrew MacLeod; 7 files, -20/+36)
Make gimple_outgoing_range a base class for the GORI API, and provide base
routines returning false. gori_compute inherits from gimple_outgoing_range
and no longer needs it as a private member. Rename outgoing_edge_range_p to
edge_range_p.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Adjust
	m_gori constructor.
	(ranger_cache::edge_range): Use renamed edge_range_p name.
	(ranger_cache::range_from_dom): Likewise.
	* gimple-range-edge.h (gimple_outgoing_range::condexpr_adjust):
	New.
	(gimple_outgoing_range::has_edge_range_p): New.
	(gimple_outgoing_range::dump): New.
	(gimple_outgoing_range::compute_operand_range): New.
	(gimple_outgoing_range::map): New.
	* gimple-range-fold.cc (fur_source::register_outgoing_edges): Use
	renamed edge_range_p routine.
	* gimple-range-gori.cc (gori_compute::gori_compute): Adjust
	constructor.
	(gori_compute::~gori_compute): New.
	(gori_compute::edge_range_p): Rename from outgoing_edge_range_p
	and use inherited routine instead of member method.
	* gimple-range-gori.h (class gori_compute): Inherit from
	gimple_outgoing_range, adjust prototypes.
	(gori_compute::outgoing): Delete.
	* gimple-range-path.cc (path_range_query::compute_ranges_in_block):
	Use renamed edge_range_p routine.
	* tree-ssa-loop-unswitch.cc
	(evaluate_control_stmt_using_entry_checks): Likewise.
2024-05-23  Gori_compute no longer inherits from gori_map.  (Andrew MacLeod; 9 files, -50/+54)
This patch moves the gori_compute object away from inheriting a gori_map
object and instead keeps it as a local member, exported via map ().

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Access
	gori_map via member call.
	(ranger_cache::dump_bb): Likewise.
	(ranger_cache::get_global_range): Likewise.
	(ranger_cache::set_global_range): Likewise.
	(ranger_cache::register_inferred_value): Likewise.
	* gimple-range-fold.cc (fold_using_range::range_of_range_op):
	Likewise.
	(fold_using_range::range_of_address): Likewise.
	(fold_using_range::range_of_phi): Likewise.
	* gimple-range-gori.cc (gori_compute::compute_operand_range_switch):
	Likewise.
	(gori_compute::compute_operand_range): Likewise.
	(gori_compute::compute_logical_operands): Likewise.
	(gori_compute::refine_using_relation): Likewise.
	(gori_compute::compute_operand1_and_operand2_range): Likewise.
	(gori_compute::may_recompute_p): Likewise.
	(gori_compute::has_edge_range_p): Likewise.
	(gori_compute::outgoing_edge_range_p): Likewise.
	(gori_compute::condexpr_adjust): Likewise.
	* gimple-range-gori.h (class gori_compute): Do not inherit from
	gori_map.
	(gori_compute::m_map): New.
	* gimple-range-path.cc (path_range_query::compute_ranges_in_block):
	Use gori_map member.
	(path_range_query::compute_exit_dependencies): Likewise.
	* gimple-range.cc (gimple_ranger::range_of_stmt): Likewise.
	(gimple_ranger::register_transitive_inferred_ranges): Likewise.
	* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges):
	Likewise.
	* tree-ssa-threadedge.cc (compute_exit_dependencies): Likewise.
	* tree-vrp.cc (remove_unreachable::handle_early): Likewise.
	(remove_unreachable::remove_and_update_globals): Likewise.
2024-05-23  Default gimple_outgoing_range to not process switches.  (Andrew MacLeod; 2 files, -7/+28)
Change the default constructor to not process switches, and add a method to
enable/disable switch processing.

	* gimple-range-edge.cc
	(gimple_outgoing_range::gimple_outgoing_range): Do not allocate a
	range allocator at construction time.
	(gimple_outgoing_range::~gimple_outgoing_range): Delete allocator
	if one was allocated.
	(gimple_outgoing_range::set_switch_limit): New.
	(gimple_outgoing_range::switch_edge_range): Create an allocator if
	one does not exist.
	(gimple_outgoing_range::edge_range_p): Check for zero edges.
	* gimple-range-edge.h (class gimple_outgoing_range): Adjust
	prototypes.
2024-05-23  Add inferred ranges for range-ops based statements.  (Andrew MacLeod; 4 files, -2/+119)
Gimple_range_fold contains some shorthand fold_range routines for easy user
consumption of the range-ops interface, but there are no equivalent
routines for op1_range and op2_range. This patch provides basic versions.

Any range-op entry which has an op1_range or op2_range implemented can
potentially also provide inferred ranges. This is a step towards PR 113879.

The default is currently OFF for performance reasons, as it dramatically
increases the number of inferred ranges.

	PR tree-optimization/113879
	* gimple-range-fold.cc (op1_range): New.
	(op2_range): New.
	* gimple-range-fold.h (op1_range): New prototypes.
	(op2_range): New prototypes.
	* gimple-range-infer.cc (gimple_infer_range::add_range): Do not
	add an inferred range if it is VARYING.
	(gimple_infer_range::gimple_infer_range): Add inferred ranges for
	any range-op statements if requested.
	* gimple-range-infer.h (gimple_infer_range): Add parameter.
2024-05-23  Move infer_manager to a range_query oracle.  (Andrew MacLeod; 8 files, -58/+131)
Turn the infer_manager class into an always-available oracle accessible via
a range_query object. Also associate each inferred range with its
originating stmt.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Create an
	infer oracle instead of a local member.
	(ranger_cache::~ranger_cache): Destroy the oracle.
	(ranger_cache::edge_range): Use oracle.
	(ranger_cache::fill_block_cache): Likewise.
	(ranger_cache::range_from_dom): Likewise.
	(ranger_cache::apply_inferred_ranges): Likewise.
	* gimple-range-cache.h (ranger_cache::m_exit): Delete.
	* gimple-range-infer.cc (infer_oracle): New static object.
	(class infer_oracle): New.
	(non_null_wrapper::non_null_wrapper): New.
	(non_null_wrapper::add_nonzero): New.
	(non_null_wrapper::add_range): New.
	(non_null_loadstore): Use nonnull_wrapper.
	(gimple_infer_range::gimple_infer_range): New alternate
	constructor.
	(exit_range::stmt): New.
	(infer_range_manager::has_range_p): Combine separate methods.
	(infer_range_manager::maybe_adjust_range): Adjust has_range_p
	call.
	(infer_range_manager::add_ranges): New.
	(infer_range_manager::add_range): Take stmt rather than BB.
	(infer_range_manager::add_nonzero): Adjust from BB to stmt.
	* gimple-range-infer.h (class gimple_infer_range): Adjust methods.
	(infer_range_oracle): New.
	(class infer_range_manager): Inherit from infer_range_oracle.
	Adjust methods.
	* gimple-range-path.cc (path_range_query::range_defined_in_block):
	Use oracle.
	(path_range_query::adjust_for_non_null_uses): Likewise.
	* gimple-range.cc (gimple_ranger::range_on_edge): Likewise.
	(gimple_ranger::register_transitive_inferred_ranges): Likewise.
	* value-query.cc (default_infer_oracle): New.
	(range_query::create_infer_oracle): New.
	(range_query::destroy_infer_oracle): New.
	(range_query::share_query): Copy infer pointer.
	(range_query::range_query): Initialize infer pointer.
	(range_query::~range_query): Destroy infer object.
	* value-query.h (range_query::infer_oracle): New.
	(range_query::create_infer_oracle): New prototype.
	(range_query::destroy_infer_oracle): New prototype.
	(range_query::m_infer): New.
2024-05-23  Allow components to be shared among range-queries.  (Andrew MacLeod; 3 files, -3/+17)
Ranger and the ranger cache need to share components; this provides a
blessed way to do so.

	* gimple-range.cc (gimple_ranger::gimple_ranger): Share the
	components from ranger_cache.
	(gimple_ranger::~gimple_ranger): Don't clear pointer.
	* value-query.cc (range_query::share_query): New.
	(range_query::range_query): Clear shared component flag.
	(range_query::~range_query): Don't free shared component copies.
	* value-query.h (share_query): New prototype.
	(m_shared_copy_p): New member.
2024-05-23  Rename relation oracle and API.  (Andrew MacLeod; 9 files, -79/+69)
With more oracles incoming, rename the range_query oracle () method to
relation (), and remove the redundant 'relation' text from the register and
query methods, resulting in calls that look like:

  relation ()->record (...)
  relation ()->query (...)

	* gimple-range-cache.cc (ranger_cache::dump_bb): Use m_relation.
	(ranger_cache::fill_block_cache): Likewise.
	* gimple-range-fold.cc (fur_stmt::get_phi_operand): Use new names.
	(fur_depend::register_relation): Likewise.
	(fold_using_range::range_of_phi): Likewise.
	* gimple-range-path.cc (path_range_query::path_range_query):
	Likewise.
	(path_range_query::~path_range_query): Likewise.
	(path_range_query::compute_ranges): Likewise.
	(jt_fur_source::register_relation): Likewise.
	(jt_fur_source::query_relation): Likewise.
	(path_range_query::maybe_register_phi_relation): Likewise.
	* gimple-range-path.h (get_path_oracle): Likewise.
	* gimple-range.cc (gimple_ranger::gimple_ranger): Likewise.
	(gimple_ranger::~gimple_ranger): Likewise.
	* value-query.cc (range_query::create_relation_oracle): Likewise.
	(range_query::destroy_relation_oracle): Likewise.
	(range_query::share_oracles): Likewise.
	(range_query::range_query): Likewise.
	* value-query.h (value_query::relation): Rename from oracle.
	(m_relation): Rename from m_oracle.
	* value-relation.cc (relation_oracle::query): Rename from
	query_relation.
	(equiv_oracle::query): Likewise.
	(equiv_oracle::record): Rename from register_relation.
	(relation_oracle::record): Likewise.
	(dom_oracle::record): Likewise.
	(dom_oracle::query): Rename from query_relation.
	(path_oracle::record): Rename from register_relation.
	(path_oracle::query): Rename from query_relation.
	* value-relation.h (*::record): Rename from register_relation.
	(*::query): Rename from query_relation.
2024-05-23  Move to an always available relation oracle.  (Andrew MacLeod; 9 files, -197/+119)
This eliminates the need to check whether the relation oracle pointer is
NULL before every call by providing a default oracle which does nothing.
Remove unused routines, and unify the register_relation method names.

	* gimple-range-cache.cc (ranger_cache::dump_bb): Remove check for
	NULL oracle pointer.
	(ranger_cache::fill_block_cache): Likewise.
	* gimple-range-fold.cc (fur_stmt::get_phi_operand): Likewise.
	(fur_depend::fur_depend): Likewise.
	(fur_depend::register_relation): Likewise, use query_relation.
	(fold_using_range::range_of_phi): Likewise.
	(fold_using_range::relation_fold_and_or): Likewise.
	* gimple-range-fold.h (fur_source::m_oracle): Delete. Oracle can
	be accessed directly via m_query now.
	* gimple-range-path.cc (path_range_query::path_range_query):
	Adjust for oracle reference pointer.
	(path_range_query::compute_ranges): Likewise.
	(jt_fur_source::jt_fur_source): Adjust for no m_oracle member.
	(jt_fur_source::register_relation): Do not check for NULL pointer.
	(jt_fur_source::query_relation): Likewise.
	* gimple-range.cc (gimple_ranger::gimple_ranger): Adjust for
	reference pointer.
	* value-query.cc (default_relation_oracle): New.
	(range_query::create_relation_oracle): Relocate from header.
	Ensure it is not being added to the global query.
	(range_query::destroy_relation_oracle): Relocate from header.
	(range_query::range_query): Initialize to default oracle.
	(range_query::~range_query): Call destroy_relation_oracle.
	* value-query.h (class range_query): Adjust prototypes.
	(range_query::create_relation_oracle): Move to source file.
	(range_query::destroy_relation_oracle): Move to source file.
	* value-relation.cc (relation_oracle::validate_relation): Delete.
	(relation_oracle::register_stmt): Rename to register_relation.
	(relation_oracle::register_edge): Likewise.
	* value-relation.h (register_stmt): Rename to register_relation
	and provide a default function in the base class.
	(register_edge): Likewise.
	(relation_oracle::validate_relation): Delete.
	(relation_oracle::query_relation): Provide default in base class.
	(relation_oracle::dump): Likewise.
	(relation_oracle::equiv_set): Likewise.
	(default_relation_oracle): New external reference.
	(partial_equiv_set, add_partial_equiv): Move to protected.
2024-05-23  Move all relation queries into relation_oracle.  (Andrew MacLeod; 8 files, -65/+76)
Move the relation queries from the range_query object into the relation
oracle.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Call
	create_relation_oracle.
	(ranger_cache::~ranger_cache): Call destroy_relation_oracle.
	* gimple-range-fold.cc (fur_stmt::get_phi_operand): Check for a
	relation oracle before calling query_relation.
	(fold_using_range::range_of_phi): Likewise.
	* gimple-range-path.cc (path_range_query::~path_range_query): Set
	relation oracle pointer to NULL when done.
	* gimple-range.cc (gimple_ranger::~gimple_ranger): Likewise.
	* value-query.cc (range_query::~range_query): Ensure any relation
	oracle is destroyed.
	(range_query::query_relation): Relocate to relation_oracle object.
	* value-query.h (class range_query): Adjust method prototypes.
	(range_query::create_relation_oracle): New.
	(range_query::destroy_relation_oracle): New.
	* value-relation.cc (relation_oracle::query_relation): Relocate
	from range query class.
	* value-relation.h (relation_oracle::query_relation): New
	prototypes.
2024-05-23  c++: deleting array temporary [PR115187]  (Jason Merrill; 3 files, -2/+21)
Decaying the array temporary to a pointer and then deleting that crashes in
verify_gimple_stmt, because the TARGET_EXPR is first evaluated inside the
TRY_FINALLY_EXPR, but the cleanup point is outside. Fixed by using
get_target_expr instead of save_expr.

I also adjust the stabilize_expr comment to prevent me from again thinking
it's a suitable replacement.

	PR c++/115187

gcc/cp/ChangeLog:

	* init.cc (build_delete): Use get_target_expr instead of
	save_expr.
	* tree.cc (stabilize_expr): Update comment.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1z/array-prvalue3.C: New test.
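A plausible reduction (assumed, not necessarily the committed array-prvalue3.C testcase):

    void f ()
    {
      using A = int[3];
      delete A{};   // the array prvalue decays to a pointer that is then
                    // deleted; this used to crash in verify_gimple_stmt
    }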
2024-05-23  Another small fix to implementation of -fdump-ada-spec  (Eric Botcazou; 1 file, -19/+11)
This avoids generating invalid Ada code for a function with a
multidimensional array parameter, and also cleans things up left and right.

gcc/c-family/
	* c-ada-spec.cc (check_type_name_conflict): Add guard.
	(is_char_array): Simplify.
	(dump_ada_array_type): Use strip_array_types.
	(dump_ada_node) <POINTER_TYPE>: Deal with anonymous array types.
	(dump_nested_type): Use strip_array_types.
2024-05-23  Match: Add overloaded types_match to avoid code dup [NFC]  (Pan Li; 3 files, -20/+30)
There are various match patterns for SAT-related cases which duplicate code
checking that the dest, op_0 and op_1 have the same tree type, i.e. a
ternary tree-type match. Thus, add an overloaded types_match function to do
this and avoid the code duplication.

The below test suites are passed for this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 regression test.

gcc/ChangeLog:

	* generic-match-head.cc (types_match): Add overloaded types_match
	for 3 types.
	* gimple-match-head.cc (types_match): Ditto.
	* match.pd: Leverage overloaded types_match.

Signed-off-by: Pan Li <pan2.li@intel.com>
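A minimal sketch of what the overload presumably boils down to (simplified; the actual code in generic-match-head.cc/gimple-match-head.cc may differ):

    /* Return true if all three types match pairwise, chaining the
       existing two-type check.  */
    static inline bool
    types_match (tree t1, tree t2, tree t3)
    {
      return types_match (t1, t2) && types_match (t2, t3);
    }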
2024-05-23  tree-optimization/115197 - fix ICE w/ constant in LC PHI and loop distribution  (Richard Biener; 2 files, -2/+19)
Forgot a check for an SSA name before trying to replace a PHI arg with its
current definition.

	PR tree-optimization/115197
	* tree-loop-distribution.cc (copy_loop_before): Constant PHI args
	remain the same.

	* gcc.dg/pr115197.c: New testcase.
2024-05-23  tree-optimization/115199 - fix PTA constraint processing for &ANYTHING LHS  (Richard Biener; 2 files, -1/+25)
When processing a &ANYTHING = X constraint we treat it as *ANYTHING = X
during constraint processing, but then end up recording it as
&ANYTHING = X anyway, breaking constraint graph building. This is because
we only update the local copy of the LHS and not the constraint itself.

	PR tree-optimization/115199
	* tree-ssa-structalias.cc (process_constraint): Also record
	&ANYTHING = X as *ANYTHING = X in the end.

	* gcc.dg/torture/pr115199.c: New testcase.
2024-05-23  tree-optimization/115138 - ptr-vs-ptr and FUNCTION_DECLs  (Richard Biener; 2 files, -0/+34)
I failed to realize we do not represent FUNCTION_DECLs or LABEL_DECLs in
vars explicitly and thus have to compare pt.vars_contains_nonlocal.

	PR tree-optimization/115138
	* tree-ssa-alias.cc (ptrs_compare_unequal): Make sure
	pt.vars_contains_nonlocal differs since we do not represent
	FUNCTION_DECLs or LABEL_DECLs in vars explicitly.

	* gcc.dg/torture/pr115138.c: New testcase.
2024-05-23  missing require target has_arch_ppc64 for pr106550.c  (Jiufu Guo; 1 file, -1/+1)
Hi,

Case pr106550.c is testing constant building for a 64-bit register. It
fails with -m32 without having the expected rldimi. So, this case requires
the has_arch_ppc64 effective target.

Bootstrap and regtest pass on ppc64{,le}. Is this ok for trunk?

BR,
Jeff (Jiufu) Guo

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr106550.c: Adjust by requiring
	has_arch_ppc64 effective target. And remove power10_ok.
2024-05-23  testsuite: vect: Fix gcc.dg/vect/vect-pr111779.c on SPARC [PR114072]  (Rainer Orth; 1 file, -1/+1)
gcc.dg/vect/vect-pr111779.c FAILs on 32 and 64-bit Solaris/SPARC:

FAIL: gcc.dg/vect/vect-pr111779.c -flto -ffat-lto-objects scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-pr111779.c scan-tree-dump vect "LOOP VECTORIZED"

This patch implements Richard's analysis from the PR, skipping the
scan-tree-dump part for big-endian targets without vect_shift_char support.

Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11 (32 and 64-bit
each).

2024-05-22  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:

	PR tree-optimization/114072
	* gcc.dg/vect/vect-pr111779.c (scan-tree-dump): Require
	vect_shift_char on big-endian targets.
2024-05-23  Fortran: Fix ICEs due to comp calls in initialization exprs [PR103312]  (Paul Thomas; 4 files, -2/+126)
2024-05-23  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/103312
	* dependency.cc (gfc_dep_compare_expr): Handle component call
	expressions. Return -2 as default and return 0 if compared with a
	function expression that is from an interface body and has the
	same name.
	* expr.cc (gfc_reduce_init_expr): If the expression is a comp
	call do not attempt to reduce, defer to resolution and return
	false.
	* trans-types.cc (gfc_get_dtype_rank_type,
	gfc_get_nodesc_array_type): Fix whitespace.

gcc/testsuite/
	PR fortran/103312
	* gfortran.dg/pr103312.f90: New test.
2024-05-23  s390: Implement TARGET_NOCE_CONVERSION_PROFITABLE_P [PR109549]  (Stefan Schulze Frielinghaus; 2 files, -2/+34)
Consider a NOCE conversion as profitable if there is at least one
conditional move.

gcc/ChangeLog:

	PR target/109549
	* config/s390/s390.cc (TARGET_NOCE_CONVERSION_PROFITABLE_P):
	Define.
	(s390_noce_conversion_profitable_p): Implement.

gcc/testsuite/ChangeLog:

	* gcc.target/s390/ccor.c: The order of the loads is now reversed;
	as a consequence, the condition has to be reversed.
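A sketch of the kind of source that if-converts to a single conditional move (a hypothetical example, not the ccor.c testcase):

    /* On s390 this can become a compare plus one load-on-condition style
       conditional move, which the hook now deems profitable.  */
    int max (int a, int b)
    {
      return a > b ? a : b;
    }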
2024-05-23  [testsuite] xfail pr79004 on longdouble64; drop long_double_64bit  (Alexandre Oliva; 2 files, -49/+8)
Some of the asm opcodes expected by pr79004 depend on -mlong-double-128 to
be output. E.g., without this flag, the conditions of patterns
@extenddf<mode>2 and extendsf<mode>2 do not hold, and so GCC resorts to
libcalls instead of even trying rs6000_expand_float128_convert.

Perhaps the conditions are too strict, and they could enable the use of
conversion insns involving __ieee128/_Float128 even with 64-bit long
doubles. For now, xfail the opcodes that are not available on longdouble64.

While at that, drop long_double_64bit, since it's broken and sort of
redundant.

for gcc/testsuite/ChangeLog

	PR target/105359
	* gcc.target/powerpc/pr79004.c: Xfail opcodes not available on
	longdouble64.
	* lib/target-supports.exp
	(check_effective_target_long_double_64bit): Drop.
	(add_options_for_long_double_64bit): Likewise.
2024-05-23  [prange] Use type agnostic range in phiopt [PR115191]  (Aldy Hernandez; 2 files, -3/+12)
Fix a use of int_range_max in phiopt that should be a type agnostic range,
because it could be either a pointer or an int.

	PR tree-optimization/115191

gcc/ChangeLog:

	* tree-ssa-phiopt.cc (value_replacement): Use Value_Range instead
	of int_range_max.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr115191.c: New test.
2024-05-22  AARCH64: Add Qualcomm oryon-1 core  (Andrew Pinski; 3 files, -1/+7)
This patch adds Qualcomm's new oryon-1 core; this is enough to recognize
the core, and the tuning structure will be added later on.

gcc/ChangeLog:

	* config/aarch64/aarch64-cores.def (oryon-1): New entry.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* doc/invoke.texi (AArch64 Options): Document oryon-1.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Co-authored-by: Joel Jones <quic_joeljone@quicinc.com>
Co-authored-by: Wei Zhao <quic_wezhao@quicinc.com>
2024-05-23  Daily bump.  (GCC Administrator; 4 files, -1/+165)
2024-05-22  c++: canonicity of fn types w/ complex eh specs [PR115159]  (Patrick Palka; 4 files, -39/+41)
Here the member functions QList::g and QList::h are given the same function
type by build_cp_fntype_variant since their noexcept-specs are equivalent
according to cp_tree_equal. In doing so however this means that the
function type of QList::h refers to a function parameter from QList::g,
which ends up confusing modules streaming.

I'm not sure if modules can be fixed to handle this situation, but
regardless it seems weird in principle that a function parameter can escape
in such a way. The analogous situation with a trailing return type and
decltype

  auto g(QList &other) -> decltype(f(other));
  auto h(QList &other) -> decltype(f(other));

behaves better because we don't canonicalize decltype, and so the function
types of g and h are non-canonical and therefore not shared.

In light of this, it seems natural to treat function types with complex
noexcept-specs as non-canonical as well, so that each such function
declaration is given a unique function type node. (The main benefit of type
canonicalization is to speed up repeated type comparisons, but it should be
rare to repeatedly compare two otherwise compatible function types with
complex noexcept-specs.)

To that end, this patch strengthens the ce_exact case of comp_except_specs
to require identity instead of equivalence of the noexcept-spec so that
build_cp_fntype_variant doesn't reuse a variant when it shouldn't. In turn
we need to use structural equality for types with a complex eh spec. This
lets us get rid of the tricky handling of canonical types when updating
unparsed noexcept-spec variants.

	PR c++/115159

gcc/cp/ChangeLog:

	* tree.cc (build_cp_fntype_variant): Always use structural
	equality for types with a complex exception specification.
	(fixup_deferred_exception_variants): Use structural equality for
	adjusted variants.
	* typeck.cc (comp_except_specs): Require == instead of
	cp_tree_equal for ce_exact noexcept-spec comparison.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/noexcept-2_a.H: New test.
	* g++.dg/modules/noexcept-2_b.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
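A sketch of the situation described (an assumed shape, not the committed testcase; f is made static here so the sketch is self-contained):

    struct QList
    {
      static void f (QList &other);
      // equivalent but not identical noexcept-specs: each refers to its
      // own function's parameter 'other'.  Previously both declarations
      // shared one canonical function type, so h's type ended up
      // referring to g's parameter; now each gets its own type node.
      void g (QList &other) noexcept (noexcept (f (other)));
      void h (QList &other) noexcept (noexcept (f (other)));
    };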
2024-05-22  aarch64: Fold vget_high_* intrinsics to BIT_FIELD_REF [PR102171]  (Pengxuan Zheng; 6 files, -149/+104)
This patch is a follow-up of r15-697-ga2e4fe5a53cf75 to also fold
vget_high_* intrinsics to BIT_FIELD_REF and remove the vget_high_*
definitions from arm_neon.h to use the new intrinsics framework.

	PR target/102171

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.cc
	(AARCH64_SIMD_VGET_HIGH_BUILTINS): New macro to create definitions
	for all vget_high intrinsics.
	(VGET_HIGH_BUILTIN): Likewise.
	(enum aarch64_builtins): Add vget_high function codes.
	(AARCH64_SIMD_VGET_LOW_BUILTINS): Delete duplicate macro.
	(aarch64_general_fold_builtin): Fold vget_high calls.
	* config/aarch64/aarch64-simd-builtins.def: Delete vget_high
	builtins.
	* config/aarch64/aarch64-simd.md (aarch64_get_high<mode>): Delete.
	(aarch64_vget_hi_halfv8bf): Likewise.
	* config/aarch64/arm_neon.h (__attribute__): Delete.
	(vget_high_f16): Likewise.
	(vget_high_f32): Likewise.
	(vget_high_f64): Likewise.
	(vget_high_p8): Likewise.
	(vget_high_p16): Likewise.
	(vget_high_p64): Likewise.
	(vget_high_s8): Likewise.
	(vget_high_s16): Likewise.
	(vget_high_s32): Likewise.
	(vget_high_s64): Likewise.
	(vget_high_u8): Likewise.
	(vget_high_u16): Likewise.
	(vget_high_u32): Likewise.
	(vget_high_u64): Likewise.
	(vget_high_bf16): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vget_high_2.c: New test.
	* gcc.target/aarch64/vget_high_2_be.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
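For example (a minimal illustration, not taken from the testsuite):

    #include <arm_neon.h>

    /* After the patch this folds to a BIT_FIELD_REF extracting v's upper
       64 bits in GIMPLE instead of remaining an opaque builtin call.  */
    int32x2_t hi (int32x4_t v)
    {
      return vget_high_s32 (v);
    }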
2024-05-22  testsuite: Verify r0-r3 are extended with CMSE  (Torbjörn SVENSSON; 2 files, -6/+19)
Add a regression test to the existing zero/sign extend tests for CMSE to
verify that r0, r1, r2 and r3 are properly extended, not just r0.

The boolCharShortEnumSecureFunc test is done using -O0 to ensure the
instructions are in a predictable order.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/cmse/extend-param.c: Add regression test. Add
	-fshort-enums.
	* gcc.target/arm/cmse/extend-return.c: Add -fshort-enums option.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
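A sketch of the property being tested (a hypothetical function, assuming the usual CMSE rule that a secure entry function must not trust the extension of narrow arguments coming from a non-secure caller; built with -mcmse):

    /* each narrow argument arrives in one of r0-r3 and must be
       re-extended by the callee, not just the one in r0 */
    unsigned char __attribute__ ((cmse_nonsecure_entry))
    sum (unsigned char a, unsigned char b, unsigned char c, unsigned char d)
    {
      return (unsigned char) (a + b + c + d);
    }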
2024-05-22  Fix internal error in seh_cfa_offset with -O2 -fno-omit-frame-pointer  (Eric Botcazou; 2 files, -1/+26)
The problem comes directly from the -ffold-mem-offsets pass interfering
with the prologue and the frame-related instructions, which is a no-no with
SEH, so the fix simply disconnects the pass in these circumstances.

gcc/
	PR rtl-optimization/115038
	* fold-mem-offsets.cc (fold_offsets): Return 0 if the defining
	instruction of the register is frame related.

gcc/testsuite/
	* g++.dg/opt/fmo1.C: New test.
2024-05-22  i386: Correct insn_cost of movabsq.  (Roger Sayle; 1 file, -1/+2)
This single-line patch fixes a strange quirk/glitch in i386's rtx_costs,
which considers an instruction loading a 64-bit constant to be
significantly cheaper than loading a 32-bit (or smaller) constant.

Consider the two functions:

  unsigned long long foo() { return 0x0123456789abcdefULL; }
  unsigned int bar() { return 10; }

and the corresponding lines from combine's dump file:

  insn_cost 1 for #: r98:DI=0x123456789abcdef
  insn_cost 4 for #: ax:SI=0xa

The same issue can be seen in -dP assembler output.

  movabsq $81985529216486895, %rax  # 5 [c=1 l=10] *movdi_internal/4

The problem is that pattern_cost's interpretation of rtx_costs contains
"return cost > 0 ? cost : COSTS_N_INSNS (1)", where a zero value (for
example a register or small immediate constant) is considered special, and
equivalent to a single instruction, but all other values are treated as
verbatim. Hence, to make x86_64's 10-byte-long movabsq instruction slightly
more expensive than a simple constant, rtx_costs needs to return
COSTS_N_INSNS (1) + 1 and not 1. With this change, the insn_cost of movabsq
is the intended value 5:

  insn_cost 5 for #: r98:DI=0x123456789abcdef

and

  movabsq $81985529216486895, %rax  # 5 [c=5 l=10] *movdi_internal/4

2024-05-22  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>: A
	CONST_INT that isn't x86_64_immediate_operand requires an extra
	(expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.