aboutsummaryrefslogtreecommitdiff
path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2023-06-06reload1: Change return type of predicate function from int to boolUros Bizjak2-3/+3
gcc/ChangeLog: * rtl.h (function_invariant_p): Change return type from int to bool. * reload1.cc (function_invariant_p): Change return type from int to bool and adjust function body accordingly.
2023-06-06RISC-V: Add RVV vwmacc/vwmaccu/vwmaccsu combine lowering optmizationJuzhe-Zhong8-0/+346
Fix according to comments from Robin of V1 patch. This patch add combine optimization for following case: __attribute__ ((noipa)) void vwmaccsu (int16_t *__restrict dst, int8_t *__restrict a, uint8_t *__restrict b, int n) { for (int i = 0; i < n; i++) dst[i] += (int16_t) a[i] * (int16_t) b[i]; } Before this patch: ... vsext.vf2 vzext.vf2 vmadd.vv .. After this patch: ... vwmaccsu.vv ... gcc/ChangeLog: * config/riscv/autovec-opt.md (*<optab>_fma<mode>): New pattern. (*single_<optab>mult_plus<mode>): Ditto. (*double_<optab>mult_plus<mode>): Ditto. (*sign_zero_extend_fma): Ditto. (*zero_sign_extend_fma): Ditto. * config/riscv/riscv-protos.h (enum insn_type): New enum. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/widen/widen-8.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-9.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-5.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-6.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run-8.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run-9.c: New test.
2023-06-06openmp: Add support for the 'present' modifierTobias Burnus23-51/+797
This implements support for the OpenMP 5.1 'present' modifier, which can be used in map clauses in the 'target', 'target data', 'target data enter' and 'target data exit' constructs, and in the 'to' and 'from' clauses of the 'target update' construct. It is also supported in defaultmap. The modifier triggers a fatal runtime error if the data specified by the clause is not already present on the target device. It can also be combined with 'always' in map clauses. 2023-06-06 Kwok Cheung Yeung <kcy@codesourcery.com> Tobias Burnus <tobias@codesourcery.com> gcc/c/ * c-parser.cc (c_parser_omp_clause_defaultmap, c_parser_omp_clause_map): Parse 'present'. (c_parser_omp_clause_to, c_parser_omp_clause_from): Remove. (c_parser_omp_clause_from_to): New; parse to/from clauses with optional present modifer. (c_parser_omp_all_clauses): Update call. (c_parser_omp_target_data, c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): Handle new map enum values for 'present' mapping. gcc/cp/ * parser.cc (cp_parser_omp_clause_defaultmap, cp_parser_omp_clause_map): Parse 'present'. (cp_parser_omp_clause_from_to): New; parse to/from clauses with optional 'present' modifier. (cp_parser_omp_all_clauses): Update call. (cp_parser_omp_target_data, cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data): Handle new enum value for 'present' mapping. * semantics.cc (finish_omp_target): Likewise. gcc/fortran/ * dump-parse-tree.cc (show_omp_namelist): Display 'present' map modifier. (show_omp_clauses): Display 'present' motion modifier for 'to' and 'from' clauses. * gfortran.h (enum gfc_omp_map_op): Add entries with 'present' modifiers. (struct gfc_omp_namelist): Add 'present_modifer'. * openmp.cc (gfc_match_motion_var_list): New, handles optional 'present' modifier for to/from clauses. (gfc_match_omp_clauses): Call it for to/from clauses; parse 'present' in defaultmap and map clauses. (resolve_omp_clauses): Allow 'present' modifiers on 'target', 'target data', 'target enter' and 'target exit' directives. * trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers to tree node for 'map', 'to' and 'from' clauses. Apply 'present' for defaultmap. gcc/ * gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag set. (omp_get_attachment): Handle map clauses with 'present' modifier. (omp_group_base): Likewise. (gimplify_scan_omp_clauses): Reorder present maps to come first. Set GOVD flags for present defaultmaps. (gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps. * omp-low.cc (scan_sharing_clauses): Handle 'always, present' map clauses. (lower_omp_target): Handle map clauses with 'present' modifier. Handle 'to' and 'from' clauses with 'present'. * tree-core.h (enum omp_clause_defaultmap_kind): Add OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind. * tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and 'from' clauses with 'present' modifier. Handle present defaultmap. * tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define. include/ * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New. (GOMP_MAP_FLAG_FORCE): Redefine. (GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New. (enum gomp_map_kind): Add map kinds with 'present' modifiers. (GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for map variants with 'present' (GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true for map variants with 'always, present' modifiers. (GOMP_MAP_ALWAYS): Redefine. (GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New. libgomp/ * libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses. * target.c (gomp_to_device_kind_p): Add map kinds with 'present' modifier. (gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro. (gomp_map_vars_internal, gomp_update, gomp_target_rev): Emit runtime error if memory region not present. * testsuite/libgomp.c-c++-common/target-present-1.c: New test. * testsuite/libgomp.c-c++-common/target-present-2.c: New test. * testsuite/libgomp.c-c++-common/target-present-3.c: New test. * testsuite/libgomp.fortran/target-present-1.f90: New test. * testsuite/libgomp.fortran/target-present-2.f90: New test. * testsuite/libgomp.fortran/target-present-3.f90: New test. gcc/testsuite/ * c-c++-common/gomp/map-6.c: Update dg-error, extend to test for duplicated 'present' and extend scan-dump tests for 'present'. * gfortran.dg/gomp/defaultmap-1.f90: Update dg-error. * gfortran.dg/gomp/map-7.f90: Extend parse and dump test for 'present'. * gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present' modifier checking. * c-c++-common/gomp/defaultmap-4.c: New test. * c-c++-common/gomp/map-9.c: New test. * c-c++-common/gomp/target-update-1.c: New test. * gfortran.dg/gomp/defaultmap-8.f90: New test. * gfortran.dg/gomp/map-11.f90: New test. * gfortran.dg/gomp/map-12.f90: New test. * gfortran.dg/gomp/target-update-1.f90: New test.
2023-06-06rs6000: genfusion: Delete dead codeSegher Boessenkool1-3/+0
2023-06-06 Segher Boessenkool <segher@kernel.crashing.org> * config/rs6000/genfusion.pl: Delete some dead code.
2023-06-06rs6000: genfusion: Rewrite load/compare codeSegher Boessenkool1-82/+103
This makes the code more readable, more digestible, more maintainable, more extensible. That kind of thing. It does that by pulling things apart a bit, but also making what stays together more cohesive lumps. The original function was a bunch of loops and early-outs, and then quite a bit of stuff done per iteration, with the iterations essentially independent of each other. This patch moves the stuff done for one iteration to a new _one function. The second big thing is the stuff printed to the .md file is done in "here documents" now, which is a lot more readable than having to quote and escape and double-escape pieces of text. Whitespace inside the here-document is significant (will be printed as-is), which is a bit awkward sometimes, or might take some getting used to, but it is also one of the benefits of using them. Local variables are declared at first use (or close to first use). There also shouldn't be many at all, often you can write easier to read and manage code by omitting to name something that is hard to name in the first place. Finally some things are done in more typical, more modern, and tighter Perl style, for example REs in "if"s or "qw" for lists of constants. 2023-06-06 Segher Boessenkool <segher@kernel.crashing.org> * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): New, rewritten and split out from... (gen_ld_cmpi_p10): ... this.
2023-06-06rs6000: Remove duplicate expression [PR106907]Jeevitha Palanisamy1-1/+0
PR106907 has few warnings spotted from cppcheck. In that addressing duplicate expression issue here. Here the same expression is used twice in logical AND(&&) operation which result in same result so removing that. 2023-06-06 Jeevitha Palanisamy <jeevitha@linux.ibm.com> gcc/ PR target/106907 * config/rs6000/rs6000.cc (vec_const_128bit_to_bytes): Remove duplicate expression.
2023-06-06aarch64: Improve representation of vpaddd intrinsicsKyrylo Tkachov4-14/+3
The aarch64_addpdi pattern is redundant as the reduc_plus_scal_<mode> pattern can already generate the required form of the ADDP instruction, and is mostly folded to GIMPLE early on so can benefit from more optimisations. Though it turns out that we were missing the folding for the unsigned variants. This patch adds that and wires up the vpaddd_u64 and vpaddd_s64 intrinsics through the above pattern instead so that we can remove a redundant pattern and get more optimisation earlier. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (aarch64_general_gimple_fold_builtin): Handle unsigned reduc_plus_scal_ builtins. * config/aarch64/aarch64-simd-builtins.def (addp): Delete DImode instances. * config/aarch64/aarch64-simd.md (aarch64_addpdi): Delete. * config/aarch64/arm_neon.h (vpaddd_s64): Reimplement with __builtin_aarch64_reduc_plus_scal_v2di. (vpaddd_u64): Reimplement with __builtin_aarch64_reduc_plus_scal_v2di_uu.
2023-06-06aarch64: Reimplement URSHR,SRSHR patterns with standard RTL codesKyrylo Tkachov2-7/+93
Having converted the patterns for the URSRA,SRSRA instructions to standard RTL codes we can also easily convert the non-accumulating forms URSHR,SRSHR. This patch does that, reusing the various helpers and predicates from that patch in a straightforward way. This allows GCC to perform the optimisations in the testcase, matching what Clang does. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_<sur>shr_n<mode>): Delete. (aarch64_<sra_op>rshr_n<mode><vczle><vczbe>_insn): New define_insn. (aarch64_<sra_op>rshr_n<mode>): New define_expand. gcc/testsuite/ChangeLog: * gcc.target/aarch64/simd/vrshr_1.c: New test.
2023-06-06aarch64: Simplify SHRN, RSHRN expanders and patternsKyrylo Tkachov2-81/+14
Now that we've got the <vczle><vczbe> annotations we can get rid of explicit !BYTES_BIG_ENDIAN and BYTES_BIG_ENDIAN patterns for the narrowing shift instructions. This allows us to clean up the expanders as well. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_shrn<mode>_insn_le): Delete. (aarch64_shrn<mode>_insn_be): Delete. (*aarch64_<srn_op>shrn<mode>_vect): Rename to... (*aarch64_<srn_op>shrn<mode><vczle><vczbe>): ... This. (aarch64_shrn<mode>): Remove reference to the above deleted patterns. (aarch64_rshrn<mode>_insn_le): Delete. (aarch64_rshrn<mode>_insn_be): Delete. (aarch64_rshrn<mode><vczle><vczbe>_insn): New define_insn. (aarch64_rshrn<mode>): Remove references to the above deleted patterns. gcc/testsuite/ChangeLog: * gcc.target/aarch64/simd/pr99195_5.c: Add testing for shrn_n, rshrn_n intrinsics.
2023-06-06aarch64: Improve representation of ADDLV instructionsKyrylo Tkachov6-11/+168
We've received requests to optimise the attached intrinsics testcase. We currently generate: foo_1: uaddlp v0.4s, v0.8h uaddlv d31, v0.4s fmov x0, d31 ret foo_2: uaddlp v0.4s, v0.8h addv s31, v0.4s fmov w0, s31 ret foo_3: saddlp v0.4s, v0.8h addv s31, v0.4s fmov w0, s31 ret The widening pair-wise addition addlp instructions can be omitted if we're just doing an ADDV afterwards. Making this optimisation would be quite simple if we had a standard RTL PLUS vector reduction code. As we don't, we can use UNSPEC_ADDV as a stand in. This patch expresses the SADDLV and UADDLV instructions as an UNSPEC_ADDV over a widened input, thus removing the need for separate UNSPEC_SADDLV and UNSPEC_UADDLV codes. To optimise the testcases involved we add two splitters that match a vector addition where all participating elements are taken and widened from the same vector and then fed into an UNSPEC_ADDV. In that case we can just remove the vector PLUS and just emit the simple RTL for SADDLV/UADDLV. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_parallel_select_half_p): Define prototype. (aarch64_pars_overlap_p): Likewise. * config/aarch64/aarch64-simd.md (aarch64_<su>addlv<mode>): Express in terms of UNSPEC_ADDV. (*aarch64_<su>addlv<VDQV_L:mode>_ze<GPI:mode>): Likewise. (*aarch64_<su>addlv<mode>_reduction): Define. (*aarch64_uaddlv<mode>_reduction_2): Likewise. * config/aarch64/aarch64.cc (aarch64_parallel_select_half_p): Define. (aarch64_pars_overlap_p): Likewise. * config/aarch64/iterators.md (UNSPEC_SADDLV, UNSPEC_UADDLV): Delete. (VQUADW): New mode attribute. (VWIDE2X_S): Likewise. (USADDLV): Delete. (su): Delete handling of UNSPEC_SADDLV, UNSPEC_UADDLV. * config/aarch64/predicates.md (vect_par_cnst_select_half): Define. gcc/testsuite/ChangeLog: * gcc.target/aarch64/simd/addlv_1.c: New test.
2023-06-06middle-end/110055 - avoid CLOBBERing static variablesRichard Biener2-1/+19
The gimplifier can elide initialized constant automatic variables to static storage in which case TARGET_EXPR gimplification needs to avoid emitting a CLOBBER for them since their lifetime is no longer limited. Failing to do so causes spurious dangling-pointer diagnostics on the added testcase for some targets. PR middle-end/110055 * gimplify.cc (gimplify_target_expr): Do not emit CLOBBERs for variables which have static storage duration after gimplifying their initializers. * g++.dg/warn/Wdangling-pointer-pr110055.C: New testcase.
2023-06-06tree-optimization/109143 - improve PTA compile timeRichard Biener1-13/+25
The following improves solution_set_expand to require one less iteration over the bitmap and avoid changing the bitmap we iterate over. Plus we handle adjacent subvars in the ID space (the common case) and use bitmap_set_range. This cuts a bit less than 10% off the PTA time from the testcase in the PR. PR tree-optimization/109143 * tree-ssa-structalias.cc (solution_set_expand): Avoid one bitmap iteration and optimize bit range setting.
2023-06-06bootstrap rtl-checking: Fix XVEC vs XVECEXP in postreload.ccHans-Peter Nilsson1-2/+2
PR bootstrap/110120 * postreload.cc (reload_cse_move2add, move2add_use_add2_insn): Use XVECEXP, not XEXP, to access first item of a PARALLEL.
2023-06-05RISC-V] add TC for save-restore cfi directives.Fei Gao1-0/+17
gcc/testsuite/ChangeLog: * gcc.target/riscv/save-restore-cfi.c: New test to check save-restore cfi directives.
2023-06-06RISC-V: Support RVV FP16 ZVFH Reduction floating-point intrinsic APIPan Li3-2/+75
This patch support the intrinsic API of FP16 ZVFH Reduction floating-point. Aka SEW=16 for below instructions: vfredosum vfredusum vfredmax vfredmin vfwredosum vfwredusum Then users can leverage the instrinsic APIs to perform the FP=16 related reduction operations. Please note not all the instrinsic APIs are coverred in the test files, only pick some typical ones due to too many. We will perform the FP16 related instrinsic API test entirely soon. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-types.def (vfloat16mf4_t): Add vfloat16mf4_t to WF operations. (vfloat16mf2_t): Likewise. (vfloat16m1_t): Likewise. (vfloat16m2_t): Likewise. (vfloat16m4_t): Likewise. (vfloat16m8_t): Likewise. * config/riscv/vector-iterators.md: Add FP=16 to VWF, VWF_ZVE64, VWLMUL1, VWLMUL1_ZVE64, vwlmul1 and vwlmul1_zve64. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Add new test cases.
2023-06-05[RISC-V] correct machine mode in save-restore cfi RTL.Fei Gao2-5/+21
gcc/ChangeLog: * config/riscv/riscv.cc (riscv_adjust_libcall_cfi_prologue): Use Pmode for cfi reg/mem machmode (riscv_adjust_libcall_cfi_epilogue): Use Pmode for cfi reg machmode gcc/testsuite/ChangeLog: * gcc.target/riscv/save-restore-cfi-2.c: New test to check machmode for cfi reg/mem.
2023-06-06RISC-V: Fix 'REQUIREMENT' for machine_mode 'MODE' in vector-iterators.md.Li Xu2-16/+16
gcc/ChangeLog: * config/riscv/vector-iterators.md: Fix 'REQUIREMENT' for machine_mode 'MODE'. * config/riscv/vector.md (@pred_indexed_<order>store<VNX16_QHS:mode> <VNX16_QHSI:mode>): change VNX16_QHSI to VNX16_QHSDI. (@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>): Ditto.
2023-06-06RISC-V: Fix some typo in vector-iterators.mdPan Li1-4/+4
This patch would like to fix some typo in vector-iterators.md, aka: [-"vnx1DI")-]{+"vnx1di")+} [-"vnx2SI")-]{+"vnx2si")+} [-"vnx1SI")-]{+"vnx1si")+} Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/vector-iterators.md: Fix typo in mode attr.
2023-06-06Daily bump.GCC Administrator4-1/+229
2023-06-05Remove widen_plus/minus_expr tree codesAndre Vieira14-137/+2
This patch removes the old widen plus/minus tree codes which have been replaced by internal functions. 2023-06-05 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> gcc/ChangeLog: * doc/generic.texi: Remove old tree codes. * expr.cc (expand_expr_real_2): Remove old tree code cases. * gimple-pretty-print.cc (dump_binary_rhs): Likewise. * optabs-tree.cc (optab_for_tree_code): Likewise. (supportable_half_widening_operation): Likewise. * tree-cfg.cc (verify_gimple_assign_binary): Likewise. * tree-inline.cc (estimate_operator_cost): Likewise. (op_symbol_code): Likewise. * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise. (vect_analyze_data_ref_accesses): Likewise. * tree-vect-generic.cc (expand_vector_operations_1): Likewise. * cfgexpand.cc (expand_debug_expr): Likewise. * tree-vect-stmts.cc (vectorizable_conversion): Likewise. (supportable_widening_operation): Likewise. * gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard): Likewise. * optabs.def (vec_widen_ssubl_hi_optab, vec_widen_ssubl_lo_optab, vec_widen_saddl_hi_optab, vec_widen_saddl_lo_optab, vec_widen_usubl_hi_optab, vec_widen_usubl_lo_optab, vec_widen_uaddl_hi_optab, vec_widen_uaddl_lo_optab): Remove optabs. * tree-pretty-print.cc (dump_generic_node): Remove tree code definition. * tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR, VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR, VEC_WIDEN_MINUS_LO_EXPR): Likewise.
2023-06-05internal-fn,vect: Refactor widen_plus as internal_fnAndre Vieira12-47/+375
DEF_INTERNAL_WIDENING_OPTAB_FN and DEF_INTERNAL_NARROWING_OPTAB_FN are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively. With the exception that they provide convenience wrappers for a single vector to vector conversion, a hi/lo split or an even/odd split. Each definition for <NAME> will require either signed optabs named <UOPTAB> and <SOPTAB> (for widening) or a single <OPTAB> (for narrowing) for each of the five functions it creates. For example, for widening addition the DEF_INTERNAL_WIDENING_OPTAB_FN will create five internal functions: IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO, IFN_VEC_WIDEN_PLUS_EVEN and IFN_VEC_WIDEN_PLUS_ODD. Each requiring two optabs, one for signed and one for unsigned. Aarch64 implements the hi/lo split optabs: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2 IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. 2023-06-05 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> Tamar Christina <tamar.christina@arm.com> gcc/ChangeLog: * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename this ... (vec_widen_<su>add_lo_<mode>): ... to this. (vec_widen_<su>addl_hi_<mode>): Rename this ... (vec_widen_<su>add_hi_<mode>): ... to this. (vec_widen_<su>subl_lo_<mode>): Rename this ... (vec_widen_<su>sub_lo_<mode>): ... to this. (vec_widen_<su>subl_hi_<mode>): Rename this ... (vec_widen_<su>sub_hi_<mode>): ...to this. * doc/generic.texi: Document new IFN codes. * internal-fn.cc (lookup_hilo_internal_fn): Add lookup function. (commutative_binary_fn_p): Add widen_plus fn's. (widening_fn_p): New function. (narrowing_fn_p): New function. (direct_internal_fn_optab): Change visibility. * internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an internal_fn that expands into multiple internal_fns for widening. (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO, IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD, IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, IFN_VEC_WIDEN_MINUS_LO, IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening plus,minus functions. * internal-fn.h (direct_internal_fn_optab): Declare new prototype. (lookup_hilo_internal_fn): Likewise. (widening_fn_p): Likewise. (Narrowing_fn_p): Likewise. * optabs.cc (commutative_optab_p): Add widening plus optabs. * optabs.def (OPTAB_D): Define widen add, sub optabs. * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support patterns with a hi/lo or even/odd split. (vect_recog_sad_pattern): Refactor to use new IFN codes. (vect_recog_widen_plus_pattern): Likewise. (vect_recog_widen_minus_pattern): Likewise. (vect_recog_average_pattern): Likewise. * tree-vect-stmts.cc (vectorizable_conversion): Add support for _HILO IFNs. (supportable_widening_operation): Likewise. * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-add.c: Test that new IFN_VEC_WIDEN_PLUS is being used. * gcc.target/aarch64/vect-widen-sub.c: Test that new IFN_VEC_WIDEN_MINUS is being used.
2023-06-05vect: Refactor to allow internal_fn'sAndre Vieira4-99/+180
Refactor vect-patterns to allow patterns to be internal_fns starting with widening_plus/minus patterns 2023-06-05 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> gcc/ChangeLog: * tree-vect-patterns.cc: Add include for gimple-iterator. (vect_recog_widen_op_pattern): Refactor to use code_helper. (vect_gimple_build): New function. * tree-vect-stmts.cc (simple_integer_narrowing): Refactor to use code_helper. (vectorizable_call): Likewise. (vect_gen_widened_results_half): Likewise. (vect_create_vectorized_demotion_stmts): Likewise. (vect_create_vectorized_promotion_stmts): Likewise. (vect_create_half_widening_stmts): Likewise. (vectorizable_conversion): Likewise. (supportable_widening_operation): Likewise. (supportable_narrowing_operation): Likewise. * tree-vectorizer.h (supportable_widening_operation): Change prototype to use code_helper. (supportable_narrowing_operation): Likewise. (vect_gimple_build): New function prototype. * tree.h (code_helper::safe_as_tree_code): New function. (code_helper::safe_as_fn_code): New function.
2023-06-05d: Warn when declared size of a special enum does not match its intrinsic type.Iain Buclaw5-0/+49
All special enums have declarations in the D runtime library, but the compiler will recognize and treat them specially if declared in any module. When the underlying base type of a special enum is a different size to its matched intrinsic, then this can cause undefined behavior at runtime. Detect and warn about when such a mismatch occurs. gcc/d/ChangeLog: * gdc.texi (Warnings): Document -Wextra and -Wmismatched-special-enum. * implement-d.texi (Special Enums): Add reference to warning option -Wmismatched-special-enum. * lang.opt: Add -Wextra and -Wmismatched-special-enum. * types.cc (TypeVisitor::visit (TypeEnum *)): Warn when declared special enum size mismatches its intrinsic type. gcc/testsuite/ChangeLog: * gdc.dg/Wmismatched_enum.d: New test.
2023-06-05New wi::bitreverse function.Roger Sayle2-0/+42
This patch provides a wide-int implementation of bitreverse, that implements both of Richard Sandiford's suggestions from the review at https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html of an improved API (as a stand-alone function matching the bswap refactoring), and an implementation that works with any bit-width precision. 2023-06-05 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * wide-int.cc (wi::bitreverse_large): New function implementing bit reversal of an integer. * wide-int.h (wi::bitreverse): New (template) function prototype. (bitreverse_large): Prototype helper function/implementation. (wi::bitreverse): New template wrapper around bitreverse_large.
2023-06-05Testsuite: Fix a fail about xtheadcondmov-indirect-rv64.cLiao Shihua2-50/+50
I find fail of the xtheadcondmov-indirect-rv64.c test case and provide a way to solve it. In this patch, I take Kito's advice that I modify the form of the function bodies.It likes *[a-x0-9]. gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadcondmov-indirect-rv32.c: Generalize to be less sensitive to register allocation choices. * gcc.target/riscv/xtheadcondmov-indirect-rv64.c: Similarly.
2023-06-05print-rtl: Change return type of two print functions from int to voidUros Bizjak3-30/+28
Also change one internal variable to bool. gcc/ChangeLog: * rtl.h (print_rtl_single): Change return type from int to void. (print_rtl_single_with_indent): Ditto. * print-rtl.h (class rtx_writer): Ditto. Change m_sawclose to bool. * print-rtl.cc (rtx_writer::rtx_writer): Update for m_sawclose change. (rtx_writer::print_rtx_operand_code_0): Ditto. (rtx_writer::print_rtx_operand_codes_E_and_V): Ditto. (rtx_writer::print_rtx_operand_code_i): Ditto. (rtx_writer::print_rtx_operand_code_u): Ditto. (rtx_writer::print_rtx_operand): Ditto. (rtx_writer::print_rtx): Ditto. (rtx_writer::finish_directive): Ditto. (print_rtl_single): Change return type from int to void and adjust function body accordingly. (rtx_writer::print_rtl_single_with_indent): Ditto.
2023-06-05reginfo: Change return type of predicate functions from int to boolUros Bizjak2-6/+6
gcc/ChangeLog: * rtl.h (reg_classes_intersect_p): Change return type from int to bool. (reg_class_subset_p): Ditto. * reginfo.cc (reg_classes_intersect_p): Ditto. (reg_class_subset_p): Ditto.
2023-06-05RISC-V: Support RVV FP16 ZVFH floating-point intrinsic APIPan Li3-0/+471
This patch support the intrinsic API of FP16 ZVFH floating-point. Aka SEW=16 for below instructions: vfadd vfsub vfrsub vfwadd vfwsub vfmul vfdiv vfrdiv vfwmul vfmacc vfnmacc vfmsac vfnmsac vfmadd vfnmadd vfmsub vfnmsub vfwmacc vfwnmacc vfwmsac vfwnmsac vfsqrt vfrsqrt7 vfrec7 vfmin vfmax vfsgnj vfsgnjn vfsgnjx vmfeq vmfne vmflt vmfle vmfgt vmfge vfclass vfmerge vfmv vfcvt vfwcvt vfncvt Then users can leverage the instrinsic APIs to perform the FP=16 related operations. Please note not all the instrinsic APIs are coverred in the test files, only pick some typical ones due to too many. We will perform the FP16 related instrinsic API test entirely soon. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-types.def (vfloat32mf2_t): New type for DEF_RVV_WEXTF_OPS. (vfloat32m1_t): Ditto. (vfloat32m2_t): Ditto. (vfloat32m4_t): Ditto. (vfloat32m8_t): Ditto. (vint16mf4_t): New type for DEF_RVV_CONVERT_I_OPS. (vint16mf2_t): Ditto. (vint16m1_t): Ditto. (vint16m2_t): Ditto. (vint16m4_t): Ditto. (vint16m8_t): Ditto. (vuint16mf4_t): New type for DEF_RVV_CONVERT_U_OPS. (vuint16mf2_t): Ditto. (vuint16m1_t): Ditto. (vuint16m2_t): Ditto. (vuint16m4_t): Ditto. (vuint16m8_t): Ditto. (vint32mf2_t): New type for DEF_RVV_WCONVERT_I_OPS. (vint32m1_t): Ditto. (vint32m2_t): Ditto. (vint32m4_t): Ditto. (vint32m8_t): Ditto. (vuint32mf2_t): New type for DEF_RVV_WCONVERT_U_OPS. (vuint32m1_t): Ditto. (vuint32m2_t): Ditto. (vuint32m4_t): Ditto. (vuint32m8_t): Ditto. * config/riscv/vector-iterators.md: Add FP=16 support for V, VWCONVERTI, VCONVERT, VNCONVERT, VMUL1 and vlmul1. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/zvfh-intrinsic.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2023-06-05Fix PR 110085: `make clean` in GCC directory on sh target causes a failureAndrew Pinski1-7/+0
On sh target, there is a MULTILIB_DIRNAMES (or is it MULTILIB_OPTIONS) named m2, this conflicts with the langauge m2. So when you do a `make clean`, it will remove the m2 directory and then a build will fail. Now since r0-78222-gfa9585134f6f58, the multilib directories are no longer created in the gcc directory as libgcc was moved to the toplevel. So we can remove the part of clean that removes those directories. Tested on x86_64-linux-gnu and a cross to sh-elf that `make clean` followed by `make` works again. OK? gcc/ChangeLog: PR bootstrap/110085 * Makefile.in (clean): Remove the removing of MULTILIB_DIR/MULTILIB_OPTIONS directories.
2023-06-05MIPS: Add speculation_barrier supportYunQiang Su3-0/+26
speculation_barrier for MIPS needs sync+jr.hb (r2+), so we implement __speculation_barrier in libgcc, like arm32 does. gcc/ChangeLog: * config/mips/mips-protos.h (mips_emit_speculation_barrier): New prototype. * config/mips/mips.cc (speculation_barrier_libfunc): New static variable. (mips_init_libfuncs): Initialize it. (mips_emit_speculation_barrier): New function. * config/mips/mips.md (speculation_barrier): Call mips_emit_speculation_barrier. libgcc/ChangeLog: * config/mips/lib1funcs.S: New file. define __speculation_barrier and include mips16.S. * config/mips/t-mips: define LIB1ASMSRC as mips/lib1funcs.S. define LIB1ASMFUNCS as _speculation_barrier. set version info for __speculation_barrier. * config/mips/libgcc-mips.ver: New file. * config/mips/t-mips16: don't define LIB1ASMSRC as mips16.S included in lib1funcs.S now.
2023-06-05RISC-V: Reorganize riscv-v.ccJuzhe-Zhong1-248/+249
This patch is just reorganizing the functions for the following patch. I put rvv_builder and emit_* functions located before expand_const_vector function since I will use them in expand_const_vector in the following patch. gcc/ChangeLog: * config/riscv/riscv-v.cc (class rvv_builder): Reorganize functions. (rvv_builder::can_duplicate_repeating_sequence_p): Ditto. (rvv_builder::repeating_sequence_use_merge_profitable_p): Ditto. (rvv_builder::get_merged_repeating_sequence): Ditto. (rvv_builder::get_merge_scalar_mask): Ditto. (emit_scalar_move_insn): Ditto. (emit_vlmax_integer_move_insn): Ditto. (emit_nonvlmax_integer_move_insn): Ditto. (emit_vlmax_gather_insn): Ditto. (emit_vlmax_masked_gather_mu_insn): Ditto. (get_repeating_sequence_dup_machine_mode): Ditto.
2023-06-05RISC-V: Split arguments of expand_vec_permJuzhe-Zhong3-7/+4
Since the following patch will calls expand_vec_perm with splitted arguments, change the expand_vec_perm interface in this patch. gcc/ChangeLog: * config/riscv/autovec.md: Split arguments. * config/riscv/riscv-protos.h (expand_vec_perm): Ditto. * config/riscv/riscv-v.cc (expand_vec_perm): Ditto.
2023-06-05Daily bump.GCC Administrator4-1/+130
2023-06-04Improve do_store_flag for comparing single bit against that bitAndrew Pinski1-3/+8
This is a case which I noticed while working on the previous patch. Sometimes we end up with `a == CST` instead of comparing against 0. This happens in the following code: ``` unsigned f(unsigned t) { if (t & ~(1<<30)) __builtin_unreachable(); t ^= (1<<30); return t != 0; } ``` We should handle the case where the nonzero bits is the same as the comparison operand. Changes from v1: * v2: Updated for the bit extraction changes. OK? Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * expr.cc (do_store_flag): Improve for single bit testing not against zero but against that single bit.
2023-06-04Improve do_store_flag for single bit comparison against 0Andrew Pinski1-5/+20
While working something else, I noticed we could improve the following function code generation: ``` unsigned f(unsigned t) { if (t & ~(1<<30)) __builtin_unreachable(); return t != 0; } ``` Right know we just emit a comparison against 0 instead of just a shift right by 30. There is code in do_store_flag which already optimizes `(t & 1<<30) != 0` to `(t >> 30) & 1` (using bit extraction if available). This patch extends it to handle the case where we know t has a nonzero of just one bit set. Changes from v1: * v2: Updated for the bit extraction improvements. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * expr.cc (do_store_flag): Extend the one bit checking case to handle the case where we don't have an and but rather still one bit is known to be non-zero.
2023-06-04Convert H8 port to LRAJeff Law3-34/+1
With Vlad's recent LRA fix to the elimination code, the H8 can be converted to LRA. This patch has two changes of note. First, this turns Zz into a standard constraint. This helps reloading for the H8/SX movqi pattern. Second, this drops the whole pattern for the SX bit memory operations. I can't see why those exist to begin with. They should be handled by the standard bit manipulation patterns. If someone wants to try and improve SX bit support, that'd be great and they can do so within the LRA framework :-) Pushed to the trunk... gcc/ * config/h8300/constraints.md (Zz): Make this a normal constraint. * config/h8300/h8300.cc (TARGET_LRA_P): Remove. * config/h8300/logical.md (H8/SX bit patterns): Remove.
2023-06-04xtensa: Optimize boolean evaluation or branching when EQ/NE to INT_MINTakayuki 'January June' Suwa1-0/+65
This patch optimizes both the boolean evaluation of and the branching of EQ/NE against INT_MIN (-2147483648), by taking advantage of the specifi- cation the ABS machine instruction on Xtensa returns INT_MIN iff INT_MIN, otherwise non-negative value. /* example */ int test0(int x) { return (x == -2147483648); } int test1(int x) { return (x != -2147483648); } extern void foo(void); void test2(int x) { if(x == -2147483648) foo(); } void test3(int x) { if(x != -2147483648) foo(); } ;; before test0: movi.n a9, -1 slli a9, a9, 31 add.n a2, a2, a9 nsau a2, a2 srli a2, a2, 5 ret.n test1: movi.n a9, -1 slli a9, a9, 31 add.n a9, a2, a9 movi.n a2, 1 moveqz a2, a9, a9 ret.n test2: movi.n a9, -1 slli a9, a9, 31 bne a2, a9, .L3 j.l foo, a9 .L3: ret.n test3: movi.n a9, -1 slli a9, a9, 31 beq a2, a9, .L5 j.l foo, a9 .L5: ret.n ;; after test0: abs a2, a2 extui a2, a2, 31, 1 ret.n test1: abs a2, a2 srai a2, a2, 31 addi.n a2, a2, 1 ret.n test2: abs a2, a2 bbci a2, 31, .L3 j.l foo, a9 .L3: ret.n test3: abs a2, a2 bbsi a2, 31, .L5 j.l foo, a9 .L5: ret.n gcc/ChangeLog: * config/xtensa/xtensa.md (*btrue_INT_MIN, *eqne_INT_MIN): New insn_and_split patterns.
2023-06-04RISC-V: Remove redundant vlmul_ext_* patterns to fix PR110109Juzhe-Zhong4-144/+529
This patch is to fix PR110109 issue. This issue happens is because: (define_insn_and_split "*vlmul_extx2<mode>" [(set (match_operand:<VLMULX2> 0 "register_operand" "=vr, ?&vr") (subreg:<VLMULX2> (match_operand:VLMULEXT2 1 "register_operand" " 0, vr") 0))] "TARGET_VECTOR" "#" "&& reload_completed" [(const_int 0)] { emit_insn (gen_rtx_SET (gen_lowpart (<MODE>mode, operands[0]), operands[1])); DONE; }) Such pattern generate such codes in insn-recog.cc: static int pattern57 (rtx x1) { rtx * const operands ATTRIBUTE_UNUSED = &recog_data.operand[0]; rtx x2; int res ATTRIBUTE_UNUSED; if (maybe_ne (SUBREG_BYTE (x1).to_constant (), 0)) return -1; ... PR110109 ICE at maybe_ne (SUBREG_BYTE (x1).to_constant (), 0) since for scalable RVV modes can not be accessed as SUBREG_BYTE (x1).to_constant () I create that patterns is to optimize the following test: vfloat32m2_t test_vlmul_ext_v_f32mf2_f32m2(vfloat32mf2_t op1) { return __riscv_vlmul_ext_v_f32mf2_f32m2(op1); } codegen: test_vlmul_ext_v_f32mf2_f32m2: vsetvli a5,zero,e32,m2,ta,ma vmv.v.i v2,0 vsetvli a5,zero,e32,mf2,ta,ma vle32.v v2,0(a1) vs2r.v v2,0(a0) ret There is a redundant 'vmv.v.i' here, Since GCC doesn't undefine IR (unlike LLVM, LLVM has undef/poison). For vlmul_ext_* RVV intrinsic, GCC will initiate all zeros into register. However, I think it's not a big issue after we support subreg livness tracking. PR target/110109 gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Change expand approach. * config/riscv/vector.md (@vlmul_extx2<mode>): Remove it. (@vlmul_extx4<mode>): Ditto. (@vlmul_extx8<mode>): Ditto. (@vlmul_extx16<mode>): Ditto. (@vlmul_extx32<mode>): Ditto. (@vlmul_extx64<mode>): Ditto. (*vlmul_extx2<mode>): Ditto. (*vlmul_extx4<mode>): Ditto. (*vlmul_extx8<mode>): Ditto. (*vlmul_extx16<mode>): Ditto. (*vlmul_extx32<mode>): Ditto. (*vlmul_extx64<mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr110109-1.c: New test. * gcc.target/riscv/rvv/base/pr110109-2.c: New test.
2023-06-04RISC-V: Support RVV FP16 ZVFHMIN intrinsic APIPan Li4-1/+70
This patch support the 2 intrinsic API of FP16 ZVFHMIN extension. Aka SEW=16 for below instructions vfwcvt.f.f.v vfncvt.f.f.w Then users can leverage the instrinsic APIs to perform the conversion between RVV vector single float point and half float point. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-types.def (vfloat32mf2_t): Add vfloat32mf2_t type to vfncvt.f.f.w operations. (vfloat32m1_t): Likewise. (vfloat32m2_t): Likewise. (vfloat32m4_t): Likewise. (vfloat32m8_t): Likewise. * config/riscv/riscv-vector-builtins.def: Fix typo in comments. * config/riscv/vector-iterators.md: Add single to half machine mode conversion. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: New test.
2023-06-04RISC-V: Move optimization patterns into autovec-opt.mdJuzhe-Zhong3-91/+92
Move all optimization patterns into autovec-opt.md to make organization easier maintain. gcc/ChangeLog: * config/riscv/autovec-opt.md (*<optab>not<mode>): Move to autovec-opt.md. (*n<optab><mode>): Ditto. * config/riscv/autovec.md (*<optab>not<mode>): Ditto. (*n<optab><mode>): Ditto. * config/riscv/vector.md: Ditto.
2023-06-04PR target/110083: Fix-up REG_EQUAL notes on COMPARE in STV.Roger Sayle2-0/+59
This patch fixes PR target/110083, an ICE-on-valid regression exposed by my recent PTEST improvements (to address PR target/109973). The latent bug (admittedly mine) is that the scalar-to-vector (STV) pass doesn't update or delete REG_EQUAL notes attached to COMPARE instructions. As a result the operands of COMPARE would be mismatched, with the register transformed to V1TImode, but the immediate operand left as const_wide_int, which is valid for TImode but not V1TImode. This remained latent when the STV conversion converted the mode of the COMPARE to CCmode, with later passes recognizing the REG_EQUAL note is obviously invalid as the modes didn't match, but now that we (correctly) preserve the CCZmode on COMPARE, the mismatched operand modes trigger a sanity checking ICE downstream. Fixed by updating (or deleting) any REG_EQUAL notes in convert_compare. Before: (expr_list:REG_EQUAL (compare:CCZ (reg:V1TI 119 [ ivin.29_38 ]) (const_wide_int 0x80000000000000000000000000000000)) After: (expr_list:REG_EQUAL (compare:CCZ (reg:V1TI 119 [ ivin.29_38 ]) (const_vector:V1TI [ (const_wide_int 0x80000000000000000000000000000000) ])) 2023-06-04 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR target/110083 * config/i386/i386-features.cc (scalar_chain::convert_compare): Update or delete REG_EQUAL notes, converting CONST_INT and CONST_WIDE_INT immediate operands to a suitable CONST_VECTOR. gcc/testsuite/ChangeLog PR target/110083 * gcc.target/i386/pr110083.c: New test case.
2023-06-03c++: use __cxa_call_terminate for MUST_NOT_THROW [PR97720]Jason Merrill6-4/+53
[except.handle]/7 says that when we enter std::terminate due to a throw, that is considered an active handler. We already implemented that properly for the case of not finding a handler (__cxa_throw calls __cxa_begin_catch before std::terminate) and the case of finding a callsite with no landing pad (the personality function calls __cxa_call_terminate which calls __cxa_begin_catch), but for the case of a throw in a try/catch in a noexcept function, we were emitting a cleanup that calls std::terminate directly without ever calling __cxa_begin_catch to handle the exception. A straightforward way to fix this seems to be calling __cxa_call_terminate instead. However, that requires exporting it from libstdc++, which we have not previously done. Despite the name, it isn't actually part of the ABI standard. Nor is __cxa_call_unexpected, as far as I can tell, but that one is also used by clang. For this case they use __clang_call_terminate; it seems reasonable to me for us to stick with __cxa_call_terminate. I also change __cxa_call_terminate to take void* for simplicity in the front end (and consistency with __cxa_call_unexpected) but that isn't necessary if it's undesirable for some reason. This patch does not fix the issue that representing the noexcept as a cleanup is wrong, and confuses the handler search; since it looks like a cleanup in the EH tables, the unwinder keeps looking until it finds the catch in main(), which it should never have gotten to. Without the try/catch in main, the unwinder would reach the end of the stack and say no handler was found. The noexcept is a handler, and should be treated as one, as it is when the landing pad is omitted. The best fix for that issue seems to me to be to represent an ERT_MUST_NOT_THROW after an ERT_TRY in an action list as though it were an ERT_ALLOWED_EXCEPTIONS (since indeed it is an exception-specification). The actual code generation shouldn't need to change (apart from the change made by this patch), only the action table entry. PR c++/97720 gcc/cp/ChangeLog: * cp-tree.h (enum cp_tree_index): Add CPTI_CALL_TERMINATE_FN. (call_terminate_fn): New macro. * cp-gimplify.cc (gimplify_must_not_throw_expr): Use it. * except.cc (init_exception_processing): Set it. (cp_protect_cleanup_actions): Return it. gcc/ChangeLog: * tree-eh.cc (lower_resx): Pass the exception pointer to the failure_decl. * except.h: Tweak comment. libstdc++-v3/ChangeLog: * libsupc++/eh_call.cc (__cxa_call_terminate): Take void*. * config/abi/pre/gnu.ver: Add it. gcc/testsuite/ChangeLog: * g++.dg/eh/terminate2.C: New test.
2023-06-04reload_cse_move2add: Handle trivial single_set:sHans-Peter Nilsson1-29/+36
The reload_cse_move2add part of "postreload" handled only insns whose PATTERN was a SET. That excludes insns that e.g. clobber a flags register, which it does only for "simplicity". The patch extends the "simplicity" to most single_set insns. For a subset of those insns there's still an assumption; that the single_set of a PARALLEL insn is the first element in the PARALLEL. If the assumption fails, it's no biggie; the optimization just isn't performed. Don't let the name deceive you, this optimization doesn't hit often, but as often (or as rarely) for LRA as for reload at least on e.g. cris-elf where the biggest effect was seen in reducing repeated addresses in copies from fixed-address arrays, like in gcc.c-torture/compile/pr78694.c. * postreload.cc (move2add_use_add2_insn): Handle trivial single_sets. Rename variable PAT to SET. (move2add_use_add3_insn, reload_cse_move2add): Similar.
2023-06-04RISC-V: Support RVV zvfh{min} vfloat16*_t mov and spillPan Li6-0/+257
This patch would like to allow the mov and spill operation for the RVV vfloat16*_t types. The involved machine mode includes VNx1HF, VNx2HF, VNx4HF, VNx8HF, VNx16HF, VNx32HF and VNx64HF. Signed-off-by: Pan Li <pan2.li@intel.com> Co-Authored by: Juzhe-Zhong <juzhe.zhong@rivai.ai> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-types.def (vfloat16mf4_t): Add the float16 type to DEF_RVV_F_OPS. (vfloat16mf2_t): Likewise. (vfloat16m1_t): Likewise. (vfloat16m2_t): Likewise. (vfloat16m4_t): Likewise. (vfloat16m8_t): Likewise. * config/riscv/riscv.md: Add vfloat16*_t to attr mode. * config/riscv/vector-iterators.md: Add vfloat16*_t machine mode to V, V_WHOLE, V_FRACT, VINDEX, VM, VEL and sew. * config/riscv/vector.md: Add vfloat16*_t machine mode to sew, vlmul and ratio. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/mov-14.c: New test. * gcc.target/riscv/rvv/base/spill-13.c: New test.
2023-06-04Daily bump.GCC Administrator5-1/+79
2023-06-03[RISC-V] fix cfi issue in save-restore.Fei Gao1-2/+2
This patch fixes a cfi issue introduced by https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=60524be1e3929d83e15fceac6e2aa053c8a6fb20 Test code: char my_getchar(); float getf(); int test_f0() { int s0 = my_getchar(); float f0 = getf(); int b = my_getchar(); return f0+s0+b; } cflags: -g -Os -march=rv32imafc -mabi=ilp32f -msave-restore -mcmodel=medlow before patch: test_f0: ... .cfi_startproc call t0,__riscv_save_1 .cfi_offset 8, -8 .cfi_offset 1, -4 .cfi_def_cfa_offset 16 ... addi sp,sp,-16 .cfi_def_cfa_offset 32 ... addi sp,sp,16 .cfi_def_cfa_offset 0 // issue here ... tail __riscv_restore_1 .cfi_restore 8 .cfi_restore 1 .cfi_def_cfa_offset -16 // issue here .cfi_endproc after patch: test_f0: ... .cfi_startproc call t0,__riscv_save_1 .cfi_offset 8, -8 .cfi_offset 1, -4 .cfi_def_cfa_offset 16 ... addi sp,sp,-16 .cfi_def_cfa_offset 32 ... addi sp,sp,16 .cfi_def_cfa_offset 16 // corrected here ... tail __riscv_restore_1 .cfi_restore 8 .cfi_restore 1 .cfi_def_cfa_offset 0 // corrected here .cfi_endproc gcc/ChangeLog: * config/riscv/riscv.cc (riscv_expand_epilogue): fix cfi issue with correct offset.
2023-06-03Remove unnecessary md pattern for TARGET_XTHEADCONDMOVDie Li1-14/+1
There are 2 small changes in this patch, but they do not affect the result. 1. Remove unnecessary md pattern for TARGET_XTHEADCONDMOV in thead.md. The operands[4] in "if_then_else" are always comparison operations, so the generated rtl does not match the pattern that is expected to be deleted. 2. Change operands[4] from const0_rtx to operands[1] to maintain rtl consistency. Although when output assembly, only operands[4] CODE will affect the output result. Signed-off-by: Die Li <lidie@eswincomputing.com> gcc/ChangeLog: * config/riscv/thead.md (*th_cond_gpr_mov<GPR:mode><GPR2:mode>): Delete.
2023-06-03PR modula2/110003 Wrong source line listed for unused parametersGaius Mulley1-4/+5
Ensure that the parameter token position is recorded for both definition and implementation modules. The shadow variable is created inside BuildFormalParameterSection. The shadow variable needs to have the other definition or implementation module token position set when CheckFormalParameterSection is called. This allows the MetaError family of procedures to request the implementation module token position when reporting unused parameters. gcc/m2/ChangeLog: PR modula2/110003 * gm2-compiler/P2SymBuild.mod (GetParameterShadowVar): Import. (CheckFormalParameterSection): Call PutDeclared for the shadow variable associated with the parameter. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-06-03c++: is_specialization_of_friend confusion [PR109923]Patrick Palka2-0/+21
The check for a non-template member function of a class template in is_specialization_of_friend is overbroad, and accidentally holds for a non-template hidden friend too, which for the testcase below causes the predicate to bogusly return true for decl = void non_templ_friend(A<int>, A<void>) friend_decl = void non_templ_friend(A<void>, A<void>) This patch refines the check appropriately. PR c++/109923 gcc/cp/ChangeLog: * pt.cc (is_specialization_of_friend): Fix overbroad check for a non-template member function of a class template. gcc/testsuite/ChangeLog: * g++.dg/template/friend79.C: New test.
2023-06-03c++: simplify TEMPLATE_TEMPLATE_PARM hashingPatrick Palka1-12/+1
r10-7815-gaa576f2a860c82 added special hashing for TEMPLATE_TEMPLATE_PARM to work around non-lowered ttps having TYPE_CANONICAL set but lowered ttps did not. But ever since r13-737-gd0ef9e06197d14 this is no longer the case, and all ttps should now have TYPE_CANONICAL set. So this special hashing is now unnecessary and we can fall back to always using TYPE_CANONICAL. gcc/cp/ChangeLog: * pt.cc (iterative_hash_template_arg): Don't hash TEMPLATE_TEMPLATE_PARM specially.