aboutsummaryrefslogtreecommitdiff
path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2023-04-19rtl-optimization/109237 - quadraticness in delete_trivially_dead_insnsRichard Biener1-15/+24
The following addresses quadraticness in processing debug insns in delete_trivially_dead_insns and insn_live_p by using TREE_VISITED on the INSN_VAR_LOCATION_DECL to indicate a later debug bind with the same decl and no intervening real insn or debug marker. That gets rid of the NEXT_INSN walk in insn_live_p in favor of first clearing TREE_VISITED in the first loop over insn and the book-keeping of decls we set the bit since we need to clear them when visiting a real or debug marker insn. That improves the time spent in delete_trivially_dead_insns from 10.6s to 2.2s for the testcase. PR rtl-optimization/109237 * cse.cc (insn_live_p): Remove NEXT_INSN walk, instead check TREE_VISITED on INSN_VAR_LOCATION_DECL. (delete_trivially_dead_insns): Maintain TREE_VISITED on active debug bind INSN_VAR_LOCATION_DECL.
2023-04-19rtl-optimization/109237 - speedup bb_is_just_returnRichard Biener1-2/+2
For the testcase bb_is_just_return is on top of the profile, changing it to walk BB insns backwards puts it off the profile. That's because in the forward walk you have to process possibly many debug insns but in a backward walk you very likely run into control insns first. PR rtl-optimization/109237 * cfgcleanup.cc (bb_is_just_return): Walk insns backwards.
2023-04-19testsuite: Fix up pr109524.C for -std=c++23 [PR109524]Jakub Jelinek1-1/+1
This testcase was reduced such that it isn't valid C++23, so with my usual testing with GXX_TESTSUITE_STDS=98,11,14,17,20,2b it fails: FAIL: g++.dg/pr109524.C -std=gnu++2b (test for excess errors) .../gcc/testsuite/g++.dg/pr109524.C: In function 'nn hh(nn)': .../gcc/testsuite/g++.dg/pr109524.C:35:12: error: cannot bind non-const lvalue reference of type 'nn&' to an rvalue of type 'nn' .../gcc/testsuite/g++.dg/pr109524.C:17:6: note: initializing argument 1 of 'nn::nn(nn&)' The following patch fixes that and I've verified it doesn't change anything on what the test was testing, it still ICEs in r13-7198 and passes in r13-7203, now in all language modes (except for 98 where it is intentionally UNSUPPORTED). 2023-04-19 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/109524 * g++.dg/pr109524.C (nn::nn): Change argument type from nn & to const nn &.
2023-04-19install.texi: Document --enable-decimal-float for AArch64Christophe Lyon1-7/+8
When I committed the patches to enable support for DFP on AArch64, I forgot to update the installation documentation. This patch adds AArch64 as needed (same as i386/x86_64). 2023-04-17 Christophe Lyon <christophe.lyon@arm.com> gcc/ * doc/install.texi (enable-decimal-float): Add AArch64.
2023-04-19Check hard_regno_mode_ok before setting lowest memory move cost for the mode ↵liuhongt1-0/+4
with different reg classes. There's a potential performance issue when backend returns some unreasonable value for the mode which can be never be allocate with reg class. gcc/ChangeLog: PR rtl-optimization/109351 * ira.cc (setup_class_subset_and_memory_move_costs): Check hard_regno_mode_ok before setting lowest memory move cost for the mode with different reg classes.
2023-04-19Daily bump.GCC Administrator4-1/+339
2023-04-18doc: remove stray @golJason Merrill1-1/+1
@gol was removed in r13-6778, new doc additions can't use it. gcc/ChangeLog: * doc/invoke.texi: Remove stray @gol.
2023-04-18ifcvt.cc: Prevent excessive if-conversion for conditional movesTakayuki 'January June' Suwa1-1/+1
gcc/ * ifcvt.cc (cond_move_process_if_block): Consider the result of targetm.noce_conversion_profitable_p() when replacing the original sequence with the converted one.
2023-04-18Add -gcodeview optionMark Harmstone4-0/+18
gcc/ * common.opt (gcodeview): Add new option. * gcc.cc (driver_handle_option); Handle OPT_gcodeview. * opts.cc (command_handle_option): Similarly. * doc/invoke.texi: Add documentation for -gcodeview.
2023-04-18PHIOPT: Move tree_ssa_cs_elim into pass_cselim::execute.Andrew Pinski1-61/+57
This moves around the code for tree_ssa_cs_elim slightly improving code readability and removing declarations that are no longer needed. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * tree-ssa-phiopt.cc (tree_ssa_phiopt_worker): Remove declaration. (make_pass_phiopt): Make execute out of line. (tree_ssa_cs_elim): Move code into ... (pass_cselim::execute): here.
2023-04-18gcc: Drop obsolete INCLUDE_PTHREAD_HSam James1-4/+0
gcc/ChangeLog: * system.h: Drop unused INCLUDE_PTHREAD_H.
2023-04-18vect: Verify that GET_MODE_UNITS is greater than one for ↵Kevin Lee1-0/+2
vect_grouped_store_supported gcc/ChangeLog: * tree-vect-data-refs.cc (vect_grouped_store_supported): Add new condition.
2023-04-18Add TARGET_ZBKB to the condition of bswapsi2, bswapdi2 and rotr<mode>3 patternsSinan Lin1-3/+3
gcc/ * config/riscv/bitmanip.md (rotr<mode>3 expander): Enable for ZBKB. (bswapdi2, bswapsi2): Similarly.
2023-04-18i386: Improve permutations with INSERTPS instruction [PR94908]Uros Bizjak8-9/+164
INSERTPS can select any element from src and insert into any place of the dest. For SSE4.1 targets, compiler can generate e.g. insertps $64, %xmm0, %xmm1 to insert element 1 from %xmm1 to element 0 of %xmm0. gcc/ChangeLog: PR target/94908 * config/i386/i386-builtin.def (__builtin_ia32_insertps128): Use CODE_FOR_sse4_1_insertps_v4sf. * config/i386/i386-expand.cc (expand_vec_perm_insertps): New. (expand_vec_perm_1): Call expand_vec_per_insertps. * config/i386/i386.md ("unspec"): Declare UNSPEC_INSERTPS here. * config/i386/mmx.md (mmxscalarmode): New mode attribute. (@sse4_1_insertps_<mode>): New insn pattern. * config/i386/sse.md (@sse4_1_insertps_<mode>): Macroize insn pattern from sse4_1_insertps using VI4F_128 mode iterator. gcc/testsuite/ChangeLog: PR target/94908 * gcc.target/i386/pr94908.c: New test. * gcc.target/i386/sse4_1-insertps-5.c: New test. * gcc.target/i386/vperm-v4sf-2-sse4.c: New test.
2023-04-18Add GTY support for vrange.Aldy Hernandez2-37/+99
IPA currently puts *some* irange's in GC memory. When I contribute support for generic ranges in IPA, we'll need to change this to vrange. This patch adds GTY support for both vrange and frange. gcc/ChangeLog: * value-range.cc (gt_ggc_mx): New. (gt_pch_nx): New. * value-range.h (class vrange): Add GTY marker. (class frange): Same. (gt_ggc_mx): Remove. (gt_pch_nx): Remove.
2023-04-18constraint: fix relaxed memory and repeated constraint handlingVictor L. Do Nascimento2-4/+38
The function `constrain_operands' lacked the logic to consider relaxed memory constraints when "traditional" memory constraints were not satisfied, creating potential issues as observed during the reload compilation pass. In addition, it was observed that while `constrain_operands' chooses to disregard constraints when more than one alternative is provided, e.g. "m,r" using CONSTRAINT__UNKNOWN, it has no checks in place to determine whether the multiple constraints in a given string are in fact repetitions of the same constraint and should thus in fact be treated as a single constraint, as ought to be the case for something like "m,m". Both of these issues are dealt with here, thus ensuring that we get appropriate pattern matching. gcc/ * lra-constraints.cc (constraint_unique): New. (process_address_1): Apply constraint_unique test. * recog.cc (constrain_operands): Allow relaxed memory constaints.
2023-04-18Docs: Add doc for RISC-V vector intrinsicsKito Cheng1-0/+9
Document which version of RISC-V vector intrinsics has implemented in GCC. gcc/ChangeLog: * doc/extend.texi (Target Builtins): Add RISC-V Vector Intrinsics. (RISC-V Vector Intrinsics): Document GCC implemented which version of RISC-V vector intrinsics and its reference.
2023-04-18middle-end/108786 - add bitmap_clear_first_set_bitRichard Biener9-20/+48
This adds bitmap_clear_first_set_bit and uses it where previously bitmap_clear_bit followed bitmap_first_set_bit. The advantage is speeding up the search and avoiding to clobber ->current. PR middle-end/108786 * bitmap.h (bitmap_clear_first_set_bit): New. * bitmap.cc (bitmap_first_set_bit_worker): Rename from bitmap_first_set_bit and add optional clearing of the bit. (bitmap_first_set_bit): Wrap bitmap_first_set_bit_worker. (bitmap_clear_first_set_bit): Likewise. * df-core.cc (df_worklist_dataflow_doublequeue): Use bitmap_clear_first_set_bit. * graphite-scop-detection.cc (scop_detection::merge_sese): Likewise. * sanopt.cc (sanitize_asan_mark_unpoison): Likewise. (sanitize_asan_mark_poison): Likewise. * tree-cfgcleanup.cc (cleanup_tree_cfg_noloop): Likewise. * tree-into-ssa.cc (rewrite_blocks): Likewise. * tree-ssa-dce.cc (simple_dce_from_worklist): Likewise. * tree-ssa-sccvn.cc (do_rpo_vn_1): Likewise.
2023-04-18Shrink points-to analysis dumps when not dumping with -detailsRichard Biener19-46/+53
The following allows to get PTA stats with -stats without blowing up your filesystem by guarding constraint and solution dumping with TDF_DETAILS and the SSA points-to info with TDF_DETAILS or TDF_ALIAS. * tree-ssa-structalias.cc (dump_sa_stats): Split out from... (dump_sa_points_to_info): ... this function. (compute_points_to_sets): Guard large dumps with TDF_DETAILS, and call dump_sa_stats guarded with TDF_STATS. (ipa_pta_execute): Likewise. (compute_may_aliases): Guard dump_alias_info with TDF_DETAILS|TDF_ALIAS. * gcc.dg/ipa/ipa-pta-16.c: Use -details for dump. * gcc.dg/tm/alias-1.c: Likewise. * gcc.dg/tm/alias-2.c: Likewise. * gcc.dg/torture/ipa-pta-1.c: Likewise. * gcc.dg/torture/pr39074-2.c: Likewise. * gcc.dg/torture/pr39074.c: Likewise. * gcc.dg/torture/pta-callused-1.c: Likewise. * gcc.dg/torture/pta-escape-1.c: Likewise. * gcc.dg/torture/pta-ptrarith-1.c: Likewise. * gcc.dg/torture/pta-ptrarith-2.c: Likewise. * gcc.dg/torture/pta-ptrarith-3.c: Likewise. * gcc.dg/torture/pta-structcopy-1.c: Likewise. * gcc.dg/torture/ssa-pta-fn-1.c: Likewise. * gcc.dg/tree-ssa/alias-19.c: Likewise. * gcc.dg/tree-ssa/pta-callused.c: Likewise. * gcc.dg/tree-ssa/pta-fp.c: Likewise. * gcc.dg/tree-ssa/pta-ptrarith-1.c: Likewise. * gcc.dg/tree-ssa/pta-ptrarith-2.c: Likewise.
2023-04-18PHIOPT: add folding/simplification detail to the dumpAndrew Pinski1-0/+29
While debugging PHI-OPT with match-and-simplify, I found that adding more dumping to the debug dumps made it easier to understand what was going on rather than stepping in the debugger so this adds them. Note I used TDF_FOLDING rather than TDF_DETAILS as these debug messages can be chatty and only needed if you are debugging match and simplify with PHI-OPT and match and simplify uses TDF_FOLDING as its check. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * tree-ssa-phiopt.cc (gimple_simplify_phiopt): Dump the expression that is being tried when TDF_FOLDING is true. (phiopt_worker::match_simplify_replacement): Dump the sequence which was created by gimple_simplify_phiopt when TDF_FOLDING is true.
2023-04-18PHIOPT: small cleanup in match_simplify_replacementAndrew Pinski1-3/+2
We know that the statement we are moving is already have a SSA_NAME on the lhs so we don't need to check that and can also just call reset_flow_sensitive_info with the name we already got. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * tree-ssa-phiopt.cc (match_simplify_replacement): Simplify code that does the movement slightly.
2023-04-18aarch64: Use standard RTL codes for __rev16 intrinsic expansionKyrylo Tkachov1-9/+17
I noticed for the expansion of the __rev16* arm_acle.h intrinsics we don't need to use an unspec just because it doesn't match neatly to a bswap code. We have organic combine patterns for it that we can reuse. This patch removes the define_insn using UNSPEC_REV (should it have been an UNSPEC_REV16?) and adds an expander to emit the patterns we have for rev16 using standard RTL codes. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64.md (@aarch64_rev16<mode>): Change to define_expand. (rev16<mode>2): Rename to... (aarch64_rev16<mode>2_alt1): ... This. (rev16<mode>2_alt): Rename to... (*aarch64_rev16<mode>2_alt2): ... This.
2023-04-18Declare dconstm0 to go along with dconst0 and friends.Aldy Hernandez5-9/+11
Negating dconst0 is getting pretty old, and we will keep adding copies of the same idiom. Fixed by adding a dconstm0 constant to go along with dconst1, dconstm1, etc. gcc/ChangeLog: * emit-rtl.cc (init_emit_once): Initialize dconstm0. * gimple-range-op.cc (class cfn_signbit): Remove dconstm0 declaration. * range-op-float.cc (zero_range): Use dconstm0. (zero_to_inf_range): Same. * real.h (dconstm0): New. * value-range.cc (frange::flush_denormals_to_zero): Use dconstm0. (frange::set_zero): Do not declare dconstm0.
2023-04-18RAII auto_mpfr and autp_mpzRichard Biener4-15/+41
The following adds two RAII classes, one for mpz_t and one for mpfr_t making object lifetime management easier. Both formerly require explicit initialization with {mpz,mpfr}_init and release with {mpz,mpfr}_clear. I've converted two example places (where lifetime is trivial). * system.h (class auto_mpz): New, * realmpfr.h (class auto_mpfr): Likewise. * fold-const-call.cc (do_mpfr_arg1): Use auto_mpfr. (do_mpfr_arg2): Likewise. * tree-ssa-loop-niter.cc (bound_difference): Use auto_mpz;
2023-04-18aarch64: Use intrinsic flags information rather than hardcoding FLAG_AUTO_FPKyrylo Tkachov1-1/+1
We record the flags to use for the intrinsics in aarch64_simd_intrinsic_data, so use it when initialising them rather than using a hardcoded FLAG_AUTO_FP. The current vreinterpret intrinsics use FLAG_AUTO_FP anyway so this patch is an NFC but this will be needed as we migrate more builtins into the intrinsics infrastructure. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (aarch64_init_simd_intrinsics): Take builtin flags from intrinsic data rather than hardcoded FLAG_AUTO_FP.
2023-04-18Fixed typo.Jin Ma1-1/+1
gcc/ada * gcc-interface/utils.cc (unchecked_convert): Fixed typo.
2023-04-18Return true from operator== for two identical ranges containing NAN.Aldy Hernandez1-10/+0
The == operator for ranges signifies that two ranges contain the same thing, not that they are ultimately equal. So [2,4] == [2,4], even though one may be a 2 and the other may be a 3. Similarly with two VARYING ranges. There is an oversight in frange::operator== where we are returning false for two identical NANs. This is causing us to never cache NANs in sbr_sparse_bitmap::set_bb_range. gcc/ChangeLog: * value-range.cc (frange::operator==): Adjust for NAN. (range_tests_nan): Remove some NAN tests.
2023-04-18Add inchash support for vrange.Aldy Hernandez4-0/+95
This patch provides inchash support for vrange. It is along the lines of the streaming support I just posted and will be used for IPA hashing of ranges. gcc/ChangeLog: * inchash.cc (hash::add_real_value): New. * inchash.h (class hash): Add add_real_value. * value-range.cc (add_vrange): New. * value-range.h (inchash::add_vrange): New.
2023-04-18tree-optimization/109539 - restrict PHI handling in access diagnosticsRichard Biener1-11/+45
Access diagnostics visits the SSA def-use chains to diagnose things like dangling pointer uses. When that runs into PHIs it tries to prove all incoming pointers of which one is the currently visited use are related to decide whether to keep looking for the PHI def uses. That turns out to be overly optimistic and thus costly. The following scraps the existing handling for simply requiring that we eventually visit all incoming pointers of the PHI during the def-use chain analysis and only then process uses of the PHI def. Note this handles backedges of natural loops optimistically, diagnosing the first iteration. There's gcc.dg/Wuse-after-free-2.c containing a testcase requiring this. PR tree-optimization/109539 * gimple-ssa-warn-access.cc (pass_waccess::check_pointer_uses): Re-implement pointer relatedness for PHIs.
2023-04-18amdgcn: HardFP divideAndrew Stubbs4-97/+144
Implement FP division using hardware instructions. This replaces both the softfp library calls, and the --fast-math inaccurate divsion we had previously. The GCN architecture does not have a single divide instruction, but it does have a number of support instructions designed to make multiply-by-reciprocal sufficiently accurate for non-fast-math usage. gcc/ChangeLog: * config/gcn/gcn-valu.md (SV_SFDF): New iterator. (SV_FP): New iterator. (scalar_mode, SCALAR_MODE): Add identity mappings for scalar modes. (recip<mode>2): Unify the two patterns using SV_FP. (div_scale<mode><exec_vcc>): New insn. (div_fmas<mode><exec>): New insn. (div_fixup<mode><exec>): New insn. (div<mode>3): Unify the two expanders and rewrite using hardfp. * config/gcn/gcn.cc (gcn_md_reorg): Support "vccwait" attribute. * config/gcn/gcn.md (unspec): Add UNSPEC_DIV_SCALE, UNSPEC_DIV_FMAS, and UNSPEC_DIV_FIXUP. (vccwait): New attribute. gcc/testsuite/ChangeLog: * gcc.target/gcn/fpdiv.c: Remove the -ffast-math requirement.
2023-04-18aarch64: Give hint for -mcpu options that match -march insteadKyrylo Tkachov2-0/+19
We should redirect users of the erroneous -mcpu=armv8.2-a to use -march instead. There is an equivalent hint for -march used with a CPU name. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_validate_mcpu): Add hint to use -march if the argument matches that. gcc/testsuite/ChangeLog: * gcc.target/aarch64/spellcheck_11.c: New test.
2023-04-18aarch64: Add QI -> HI zero-extension for LDAPRKyrylo Tkachov2-3/+11
This patch is a straightforward extension of the zero-extending LDAPR pattern to represent QI -> HI load-extends. This maps down to a LDAPRB-W instruction. This lets us remove a redundant zero-extend in the new test function. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/atomics.md (*aarch64_atomic_load<ALLX:mode>_rcpc_zext): Use SD_HSDI for destination mode iterator. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldapr-zext.c: Add test for u8 to u16 extension.
2023-04-18RISC-V: Adjust the parsing order of extensions to be consistent with ↵Jin Ma2-7/+7
riscv-spec and binutils. The current order of gcc and binutils parsing extensions is inconsistent. According to latest risc-v spec, the canonical order in which extension names must appear in the name string specified in Table 29.1 is different from before. In the latest table, non-standard extensions must be listed after all standard extensions. To keep consistent, we now change the parsing order. Related llvm patch links: https://reviews.llvm.org/D148315 gcc/ChangeLog: * common/config/riscv/riscv-common.cc (multi_letter_subset_rank): Swap the order of z-extensions and s-extensions. (riscv_subset_list::parse): Likewise. gcc/testsuite/ChangeLog: * gcc.target/riscv/arch-5.c: Likewise.
2023-04-18match.pd: Improve fneg/fadd optimization [PR109240]Jakub Jelinek3-54/+175
match.pd has mostly for AArch64 an optimization in which it optimizes certain forms of __builtin_shuffle of x + y and x - y vectors into fneg using twice as wide element type so that every other sign is changed, followed by fadd. The following patch extends that optimization, so that it can handle other forms as well, using the same fneg but fsub instead of fadd. As the plus is commutative and minus is not and I want to handle vec_perm with plus minus and minus plus order preferrably in one pattern, I had to do the matching operand checks by hand. 2023-04-18 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/109240 * match.pd (fneg/fadd): Rewrite such that it handles both plus as first vec_perm operand and minus as second using fneg/fadd and minus as first vec_perm operand and plus as second using fneg/fsub. * gcc.target/aarch64/simd/addsub_2.c: New test. * gcc.target/aarch64/sve/addsub_2.c: New test.
2023-04-18Abstract out REAL_VALUE_TYPE streaming.Aldy Hernandez4-25/+38
In upcoming patches I will contribute code to stream out frange's as well as vrange's. This patch abstracts out the REAL_VALUE_TYPE streaming into their own functions, so that they may be used elsewhere. gcc/ChangeLog: * data-streamer.cc (bp_pack_real_value): New. (bp_unpack_real_value): New. * data-streamer.h (bp_pack_real_value): New. (bp_unpack_real_value): New. * tree-streamer-in.cc (unpack_ts_real_cst_value_fields): Use bp_unpack_real_value. * tree-streamer-out.cc (pack_ts_real_cst_value_fields): Use bp_pack_real_value.
2023-04-18Abstract out calculation of max HWIs per wide int.Aldy Hernandez1-5/+7
I'm about to add one more use of the same snippet of code, for a total of 4 identical calculations in the code base. gcc/ChangeLog: * wide-int.h (WIDE_INT_MAX_HWIS): New. (class fixed_wide_int_storage): Use it. (trailing_wide_ints <N>::set_precision): Use it. (trailing_wide_ints <N>::extra_size): Use it.
2023-04-18LoongArch: Optimize additions with immediatesXi Ruoyao9-16/+246
1. Use addu16i.d for TARGET_64BIT and suitable immediates. 2. Split one addition with immediate into two addu16i.d or addi.{d/w} instructions if possible. This can avoid using a temp register w/o increase the count of instructions. Inspired by https://reviews.llvm.org/D143710 and https://reviews.llvm.org/D147222. Bootstrapped and regtested on loongarch64-linux-gnu. Ok for GCC 14? gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_addu16i_imm12_operand_p): New function prototype. (loongarch_split_plus_constant): Likewise. * config/loongarch/loongarch.cc (loongarch_addu16i_imm12_operand_p): New function. (loongarch_split_plus_constant): Likewise. * config/loongarch/loongarch.h (ADDU16I_OPERAND): New macro. (DUAL_IMM12_OPERAND): Likewise. (DUAL_ADDU16I_OPERAND): Likewise. * config/loongarch/constraints.md (La, Lb, Lc, Ld, Le): New constraint. * config/loongarch/predicates.md (const_dual_imm12_operand): New predicate. (const_addu16i_operand): Likewise. (const_addu16i_imm12_di_operand): Likewise. (const_addu16i_imm12_si_operand): Likewise. (plus_di_operand): Likewise. (plus_si_operand): Likewise. (plus_si_extend_operand): Likewise. * config/loongarch/loongarch.md (add<mode>3): Convert to define_insn_and_split. Use plus_<mode>_operand predicate instead of arith_operand. Add alternatives for La, Lb, Lc, Ld, and Le constraints. (*addsi3_extended): Convert to define_insn_and_split. Use plus_si_extend_operand instead of arith_operand. Add alternatives for La and Le alternatives. gcc/testsuite/ChangeLog: * gcc.target/loongarch/add-const.c: New test. * gcc.target/loongarch/stack-check-cfa-1.c: Adjust for stack frame size change. * gcc.target/loongarch/stack-check-cfa-2.c: Likewise.
2023-04-18Add two new methods to Value_Range.Aldy Hernandez1-0/+9
This is for upcoming work in this area. gcc/ChangeLog: * value-range.h (Value_Range::Value_Range): New. (Value_Range::contains_p): New.
2023-04-18Constify invariant fields of vrange and irange.Aldy Hernandez1-10/+11
The discriminator in vrange cannot change after construction, similarly the number of allocated ranges in an irange. It's best to make them constant to avoid invalid changes. gcc/ChangeLog: * value-range.h (class vrange): Make m_discriminator const. (class irange): Make m_max_ranges const. Adjust constructors accordingly. (class unsupported_range): Construct vrange appropriately. (class frange): Same.
2023-04-18LoongArch: Remove the definition of the macro LOGICAL_OP_NON_SHORT_CIRCUIT ↵Lulu Cheng1-1/+0
under the architecture and use the default definition instead. In some cases, setting this macro as the default can reduce the number of conditional branch instructions. gcc/ChangeLog: * config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Remove the macro definition.
2023-04-18LoongArch: Add built-in functions description of LoongArch Base instruction ↵Lulu Cheng1-0/+129
set instructions. gcc/ChangeLog: * doc/extend.texi: Add section for LoongArch Base Built-in functions.
2023-04-18Daily bump.GCC Administrator5-1/+172
2023-04-17RISC-V: make the stack manipulation codes more readable.Fei Gao1-7/+10
gcc/ChangeLog: * config/riscv/riscv.cc (riscv_first_stack_step): Make codes more readable. (riscv_expand_epilogue): Likewise.
2023-04-17c++: bound ttp level lowering [PR109531]Patrick Palka3-19/+50
Here when level lowering the bound ttp TT<typename T::type> via the substitution T=C, we're neglecting to canonicalize (and thereby strip of simple typedefs) the substituted template arguments {A<int>} before determining the new canonical type via hash table lookup. This leads to a hash mismatch ICE for the two equivalent types TT<int> and TT<A<int>> since iterative_hash_template_arg assumes type arguments are already canonicalized. We can fix this by canonicalizing or coercing the substituted arguments directly, but seeing as creation and ordinary substitution of bound ttps both go through lookup_template_class, which in turn performs the desired coercion/canonicalization, it seems preferable to make this code path go through lookup_template_class as well. PR c++/109531 gcc/cp/ChangeLog: * pt.cc (tsubst) <case BOUND_TEMPLATE_TEMPLATE_PARM>: In the level-lowering case just use lookup_template_class to rebuild the bound ttp. gcc/testsuite/ChangeLog: * g++.dg/template/canon-type-20.C: New test. * g++.dg/template/ttp36.C: New test.
2023-04-17RISC-V: optimize stack manipulation in save-restoreFei Gao2-24/+66
The stack that save-restore reserves is not well accumulated in stack allocation and deallocation. This patch allows less instructions to be used in stack allocation and deallocation if save-restore enabled. before patch: bar: call t0,__riscv_save_4 addi sp,sp,-64 ... li t0,-12288 addi t0,t0,-1968 # optimized out after patch add sp,sp,t0 # prologue ... li t0,12288 # epilogue addi t0,t0,2000 # optimized out after patch add sp,sp,t0 ... addi sp,sp,32 tail __riscv_restore_4 after patch: bar: call t0,__riscv_save_4 addi sp,sp,-2032 ... li t0,-12288 add sp,sp,t0 # prologue ... li t0,12288 # epilogue add sp,sp,t0 ... addi sp,sp,2032 tail __riscv_restore_4 gcc/ * config/riscv/riscv.cc (riscv_expand_prologue): Consider save-restore in stack allocation. (riscv_expand_epilogue): Consider save-restore in stack deallocation. gcc/testsuite * gcc.target/riscv/stack_save_restore.c: New test.
2023-04-17PHIOPT: Remove gate_hoist_loads prototypeAndrew Pinski1-1/+0
gate_hoist_loads is defined before its usage so there is no reason for the declaration (prototype) to be there. Committed as obvious after a bootstrap/test on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * tree-ssa-phiopt.cc (gate_hoist_loads): Remove prototype.
2023-04-17Do not export global ranges from -Walloca pass.Aldy Hernandez1-2/+1
A warning pass should not be exporting global ranges it finds along the way, because that will alter the behavior of future passes. The reason the present behavior was there was because of some long ago forgotten regression in another pass. This regression is no longer there, and if there's ever any fallout from cleaning this up, we can address it in the pass that is missing some information. gcc/ChangeLog: * gimple-ssa-warn-alloca.cc (pass_walloca::execute): Do not export global ranges.
2023-04-17RISC-V: Force ilp32d for the T-Head FMV testPalmer Dabbelt1-1/+1
These functions are NOPs on the soft-float ABIs. Since we're already forcing the ISA, let's just force the ABI too. gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadfmv-fmv.c: Force the ilp32d ABI.
2023-04-17RISC-V: Set the ABI for the RVV testsPalmer Dabbelt1-1/+3
The RVV test harness currently sets the ISA according to the target tuple, but doesn't also set the ABI. This just sets the ABI to match the ISA, though we should really also be respecting the user's specific ISA to test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.exp (gcc_mabi): New variable.
2023-04-17RISC-V: Clean up the pr106602.c testcasePalmer Dabbelt3-1/+30
The test case that was added is rv64i-specific, as there's better ways to generate this code on rv32i (where the long/int cast is a NOP) and on rv64i_zba (where we have word shifts). This renames the original test case and adds two more for those targets. gcc/testsuite/ChangeLog: PR target/106602 * gcc.target/riscv/pr106602.c: Moved to... * gcc.target/riscv/pr106602-rv64i.c: ...here. * gcc.target/riscv/pr106602-rv32i.c: New test. * gcc.target/riscv/pr106602-rv64i_zba.c: New test.