Age | Commit message | Author | Files | Lines
2023-06-28 | cprop_hardreg: fix ORIGINAL_REGNO/REG_ATTRS/REG_POINTER handling | Manolis Tsamis | 2 | -16/+65
Fixes: 6a2e8dcbbd4bab3 Propagation for the stack pointer in regcprop was enabled in 6a2e8dcbbd4bab3, but set ORIGINAL_REGNO/REG_ATTRS/REG_POINTER for stack_pointer_rtx, which caused regressions (e.g., PR 110313, PR 110308). This fix adds special handling for stack_pointer_rtx in the places where maybe_mode_change is called. It also adds a check in maybe_mode_change to return the stack pointer only when the requested mode matches the mode of stack_pointer_rtx. PR debug/110308 gcc/ChangeLog: * regcprop.cc (maybe_mode_change): Check stack_pointer_rtx mode. (maybe_copy_reg_attrs): New function. (find_oldest_value_reg): Use maybe_copy_reg_attrs. (copyprop_hardreg_forward_1): Ditto. gcc/testsuite/ChangeLog: * g++.dg/torture/pr110308.C: New test. Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
2023-06-28 | tree-optimization/110434 - avoid <retval> ={v} {CLOBBER} from NRV | Richard Biener | 1 | -1/+11
When NRV replaces a local variable with <retval> it also replaces occurrences in clobbers. This leads to <retval> being clobbered before it is returned, which is strictly invalid but harmless in practice since there's no pass after NRV which would remove earlier stores. The following fixes this nevertheless. PR tree-optimization/110434 * tree-nrv.cc (pass_nrv::execute): Remove CLOBBERs of VAR we replace with <retval>.
2023-06-28 | Make mve_fp_fpu[12].c accept single or double precision FPU | Christophe Lyon | 2 | -2/+2
These tests currently expect a directive containing .fpu fpv5-sp-d16 and thus may fail if they are executed, for instance, with -march=armv8.1-m.main+mve.fp+fp.dp. This patch accepts either fpv5-sp-d16 or fpv5-d16 to avoid the failure. 2023-06-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ * gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: Fix .fpu scan-assembler. * gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
2023-06-28 | Make nomve_fp_1.c require arm_fp | Christophe Lyon | 1 | -0/+2
If GCC is configured with the default (soft) -mfloat-abi, and we don't override the target_board test flags appropriately, gcc.target/arm/mve/general-c/nomve_fp_1.c fails for lack of -mfloat-abi=softfp or -mfloat-abi=hard, because it doesn't use dg-add-options arm_v8_1m_mve (on purpose, see comment in the test). Require and use the options needed for arm_fp to fix this problem. 2023-06-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ * gcc.target/arm/mve/general-c/nomve_fp_1.c: Require arm_fp.
2023-06-28 | tree-optimization/110451 - hoist invariant compare after interchange | Richard Biener | 2 | -1/+61
The following adjusts the cost model of invariant motion to consider [VEC_]COND_EXPRs and comparisons producing a data value as expensive. For 503.bwaves_r this avoids an unnecessarily high vectorization factor because of an integer comparison besides data operations on double. PR tree-optimization/110451 * tree-ssa-loop-im.cc (stmt_cost): [VEC_]COND_EXPR and tcc_comparison are expensive. * gfortran.dg/vect/pr110451.f: New testcase.
2023-06-28 | Fortran: Enable class expressions in structure constructors [PR49213] | Paul Thomas | 5 | -12/+166
2023-06-28 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/49213 * expr.cc (gfc_is_ptr_fcn): Remove reference to class_pointer. * resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow associate names with pointer function targets to be used in variable definition context. * trans-decl.cc (get_symbol_decl): Remove extraneous line. * trans-expr.cc (alloc_scalar_allocatable_subcomponent): Obtain size of intrinsic and character expressions. (gfc_trans_subcomponent_assign): Expand assignment to class components to include intrinsic and character expressions. gcc/testsuite/ PR fortran/49213 * gfortran.dg/pr49213.f90 : New test
2023-06-28 | i386: Add cbranchti4 pattern to i386.md (for -m32 compare_by_pieces). | Roger Sayle | 4 | -3/+45
This patch fixes some very odd (unanticipated) code generation by compare_by_pieces with -m32 -mavx, since the recent addition of the cbranchoi4 pattern. The issue is that cbranchoi4 is available with TARGET_AVX, but cbranchti4 is currently conditional on TARGET_64BIT which results in the odd behaviour (thanks to OPTAB_WIDEN) that with -m32 -mavx, compare_by_pieces ends up (inefficiently) widening 128-bit comparisons to 256-bits before performing PTEST.

This patch fixes this by providing a cbranchti4 pattern that's available with either TARGET_64BIT or TARGET_SSE4_1.

For the test case below (again from PR 104610):

int foo(char *a)
{
  static const char t[] = "0123456789012345678901234567890";
  return __builtin_memcmp(a, &t[0], sizeof(t)) == 0;
}

GCC with -m32 -O2 -mavx currently produces the bonkers:

foo:    pushl   %ebp
        movl    %esp, %ebp
        andl    $-32, %esp
        subl    $64, %esp
        movl    8(%ebp), %eax
        vmovdqa .LC0, %xmm4
        movl    $0, 48(%esp)
        vmovdqu (%eax), %xmm2
        movl    $0, 52(%esp)
        movl    $0, 56(%esp)
        movl    $0, 60(%esp)
        movl    $0, 16(%esp)
        movl    $0, 20(%esp)
        movl    $0, 24(%esp)
        movl    $0, 28(%esp)
        vmovdqa %xmm2, 32(%esp)
        vmovdqa %xmm4, (%esp)
        vmovdqa (%esp), %ymm5
        vpxor   32(%esp), %ymm5, %ymm0
        vptest  %ymm0, %ymm0
        jne     .L2
        vmovdqu 16(%eax), %xmm7
        movl    $0, 48(%esp)
        movl    $0, 52(%esp)
        vmovdqa %xmm7, 32(%esp)
        vmovdqa .LC1, %xmm7
        movl    $0, 56(%esp)
        movl    $0, 60(%esp)
        movl    $0, 16(%esp)
        movl    $0, 20(%esp)
        movl    $0, 24(%esp)
        movl    $0, 28(%esp)
        vmovdqa %xmm7, (%esp)
        vmovdqa (%esp), %ymm1
        vpxor   32(%esp), %ymm1, %ymm0
        vptest  %ymm0, %ymm0
        je      .L6
.L2:    movl    $1, %eax
        xorl    $1, %eax
        vzeroupper
        leave
        ret
.L6:    xorl    %eax, %eax
        xorl    $1, %eax
        vzeroupper
        leave
        ret

with this patch, we now generate the (slightly) more sensible:

foo:    vmovdqa .LC0, %xmm0
        movl    4(%esp), %eax
        vpxor   (%eax), %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        jne     .L2
        vmovdqa .LC1, %xmm0
        vpxor   16(%eax), %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        je      .L5
.L2:    movl    $1, %eax
        xorl    $1, %eax
        ret
.L5:    xorl    %eax, %eax
        xorl    $1, %eax
        ret

2023-06-28 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386-expand.cc (ix86_expand_branch): Also use ptest for TImode comparisons on 32-bit architectures.
	* config/i386/i386.md (cbranch<mode>4): Change from SDWIM to SWIM1248x to exclude/avoid TImode being conditional on -m64.
	(cbranchti4): New define_expand for TImode on both TARGET_64BIT and/or with TARGET_SSE4_1.
	* config/i386/predicates.md (ix86_timode_comparison_operator): New predicate that depends upon TARGET_64BIT.
	(ix86_timode_comparison_operand): Likewise.

gcc/testsuite/ChangeLog
	* gcc.target/i386/pieces-memcmp-2.c: New test case.
2023-06-28 | i386: Fix FAIL of gcc.target/i386/pr78794.c on ia32. | Roger Sayle | 1 | -1/+25
This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which is caused by minor STV rtx_cost differences with -march=silvermont. It turns out that generic tuning results in pandn, but the lack of accurate parameterization for COMPARE in compute_convert_gain, combined with small differences in scalar<->SSE costs on silvermont, results in this DImode chain not being converted. The solution is to provide more accurate costs/gains for converting (DImode and SImode) comparisons. I'd been holding off on doing this as I'd thought it would be possible to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector win) but I've recently realized that these optimizations (as I've implemented them) occur in the wrong order (stv2 occurs after combine), so it isn't easy for STV to convert CCZmode into CCCmode. Doh! Perhaps something can be done in peephole2. 2023-06-28 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR target/78794 * config/i386/i386-features.cc (compute_convert_gain): Provide more accurate gains for conversion of scalar comparisons to PTEST.
2023-06-28 | Add cold attribute to throw wrappers and terminate | Jan Hubicka | 3 | -20/+20
PR middle-end/109849 * include/bits/c++config (std::__terminate): Mark cold. * include/bits/functexcept.h: Mark everything as cold. * libsupc++/exception: Mark terminate and unexpected as cold.
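The idea behind this change can be sketched outside libstdc++. The names `throw_out_of_range` and `checked_at` below are hypothetical and only illustrate the pattern of a cold throw wrapper; they are not the functions touched by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical throw wrapper, similar in spirit to the helpers declared in
// bits/functexcept.h. The cold attribute tells the compiler that calls are
// unlikely, so the paths leading to them can be kept off the hot path.
#if defined(__GNUC__)
[[gnu::cold]]
#endif
[[noreturn]] void throw_out_of_range(const char* what)
{
  throw std::out_of_range(what);
}

// Bounds-checked read; the error path funnels into the cold wrapper.
int checked_at(const int* data, std::size_t size, std::size_t index)
{
  assert(data != nullptr);
  if (index >= size)
    throw_out_of_range("checked_at: index out of range");
  return data[index];
}
```

Keeping the throw in a separate cold function leaves only a compare and a call in the caller, which is presumably why the wrappers are worth annotating.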
2023-06-28 | tree-optimization/110443 - prevent SLP splat of gathers | Richard Biener | 2 | -1/+23
The following prevents non-grouped load SLP in case the element to splat is from a gather operation. While it should be possible to support this it is not similar to the single element interleaving case I was trying to mimic here. PR tree-optimization/110443 * tree-vect-slp.cc (vect_build_slp_tree_1): Reject non-grouped gather loads. * gcc.dg/torture/pr110443.c: New testcase.
2023-06-28 | rs6000: Add two peephole patterns for "mr." insn | Haochen Gui | 3 | -4/+155
When investigating the issue mentioned in PR87871#c30 (whether the compare and move pattern is beneficial before RA), I checked the assembly generated for SPEC2017 and found that certain insn sequences aren't converted to "mr." instructions. The following two sequences are never combined into the "mr." pattern, as there is no register link between them. This patch adds two peephole2 patterns to convert them to "mr." instructions.

    cmp 0,3,0
    mr 4,3

    mr 4,3
    cmp 0,3,0

The patch also creates a new mode iterator which is selected by TARGET_POWERPC64. This mode iterator is used in "mr." and its split pattern. The original P iterator is improper when -m32/-mpowerpc64 is set. In this situation, "mr." should compare the whole 64-bit register with 0 rather than only the low 32 bits.

gcc/
	* config/rs6000/rs6000.md (peephole2 for compare_and_move): New.
	(peephole2 for move_and_compare): New.
	(mode_iterator WORD): New. Set the mode to SI/DImode by TARGET_POWERPC64.
	(*mov<mode>_internal2): Change the mode iterator from P to WORD.
	(split pattern for compare_and_move): Likewise.

gcc/testsuite/
	* gcc.dg/rtl/powerpc/move_compare_peephole_32.c: New.
	* gcc.dg/rtl/powerpc/move_compare_peephole_64.c: New.
2023-06-28 | RISC-V: Support vfwmacc combine lowering | Juzhe-Zhong | 5 | -6/+103
This patch adds combine patterns as follows:

1. (set (reg) (fma (float_extend:reg)(float_extend:reg)(reg)))
   This pattern allows combine: vfwcvt + vfwcvt + vfmacc ==> vfwmacc.

2. (set (reg) (fma (float_extend:reg)(reg)(reg)))
   This pattern is the intermediate IR that enhances the combine optimizations. In complicated situations, the combine pass cannot combine both operands of the multiplication at once, so it first combines one extension in a first stage: (set (reg) (fma (float_extend:reg)(reg)(reg))). It then combines the extension of the other operand in a second stage.

This can enhance combine optimization for the following case:

#define TEST_TYPE(TYPE1, TYPE2) \
  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 ( \
    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3, \
    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b, \
    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n) \
  { \
    for (int i = 0; i < n; i++) \
      { \
        dst[i] += (TYPE1) a[i] * (TYPE1) b[i]; \
        dst2[i] += (TYPE1) a2[i] * (TYPE1) b[i]; \
        dst3[i] += (TYPE1) a2[i] * (TYPE1) a[i]; \
        dst4[i] += (TYPE1) a[i] * (TYPE1) b2[i]; \
      } \
  }

#define TEST_ALL() \
  TEST_TYPE (int16_t, int8_t) \
  TEST_TYPE (uint16_t, uint8_t) \
  TEST_TYPE (int32_t, int16_t) \
  TEST_TYPE (uint32_t, uint16_t) \
  TEST_TYPE (int64_t, int32_t) \
  TEST_TYPE (uint64_t, uint32_t) \
  TEST_TYPE (float, _Float16) \
  TEST_TYPE (double, float)

TEST_ALL ()

gcc/ChangeLog:
	* config/riscv/autovec-opt.md (*double_widen_fma<mode>): New pattern.
	(*single_widen_fma<mode>): Ditto.

gcc/testsuite/ChangeLog:
	* gcc.target/riscv/rvv/autovec/widen/widen-8.c: Add floating-point.
	* gcc.target/riscv/rvv/autovec/widen/widen-complicate-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/widen_run-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-8.c: New test.
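As a rough scalar model of what the first pattern describes, the sketch below extends both narrow operands before a fused multiply-add. The function name `widen_fma` is hypothetical, and the float/double pair is just one of the type combinations in the test macro; this is not the vector code GCC emits.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Scalar model of fma (float_extend a) (float_extend b) c: each float operand
// is widened to double, then multiplied and accumulated in a single fused step.
void widen_fma(double* dst, const float* a, const float* b, std::size_t n)
{
  assert(n == 0 || (dst != nullptr && a != nullptr && b != nullptr));
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = std::fma(static_cast<double>(a[i]),
                      static_cast<double>(b[i]),
                      dst[i]);
}
```

Without the combine patterns, the two `static_cast` steps would correspond to separate widening conversions followed by an ordinary multiply-accumulate.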
2023-06-28 | rs6000: Splat vector small V2DI constants with vspltisw and vupkhsw | Haochen Gui | 6 | -1/+81
This patch adds a new insn for vector splat with small V2DI constants on P8. If the value of the constant is in the range [-16, 15] but is not 0 or -1, it can be loaded with vspltisw and vupkhsw on P8. gcc/ PR target/104124 * config/rs6000/altivec.md (*altivec_vupkhs<VU_char>_direct): Rename to... (altivec_vupkhs<VU_char>_direct): ...this. * config/rs6000/predicates.md (vspltisw_vupkhsw_constant_split): New predicate to test if a constant can be loaded with vspltisw and vupkhsw. (easy_vector_constant): Call vspltisw_vupkhsw_constant_p to check if a vector constant can be synthesized with a vspltisw and a vupkhsw. * config/rs6000/rs6000-protos.h (vspltisw_vupkhsw_constant_p): Declare. * config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p): New function to return true if OP mode is V2DI and can be synthesized with vupkhsw and vspltisw. * config/rs6000/vsx.md (*vspltisw_v2di_split): New insn to load up constants with vspltisw and vupkhsw. gcc/testsuite/ PR target/104124 * gcc.target/powerpc/pr104124.c: New.
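The value condition stated above can be written as a small predicate. This is only a sketch of the range check, under the assumption that the bounds are inclusive; `splat_uses_vspltisw_vupkhsw` is a hypothetical name, and the real vspltisw_vupkhsw_constant_p also inspects the operand's mode.

```cpp
#include <cstdint>

// True when a splatted 64-bit element value falls in the range the commit
// message describes: within [-16, 15], excluding 0 and -1.
constexpr bool splat_uses_vspltisw_vupkhsw(std::int64_t value)
{
  return value >= -16 && value <= 15 && value != 0 && value != -1;
}
```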
2023-06-28 | Enable ranger for ipa-prop | Jan Hubicka | 2 | -2/+20
gcc/ChangeLog: PR tree-optimization/110377 * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Pass statement to the ranger query. (ipa_analyze_node): Enable ranger. gcc/testsuite/ChangeLog: PR tree-optimization/110377 * gcc.dg/ipa/pr110377.c: New test.
2023-06-28 | Add testcase for PR 110444 | Andrew Pinski | 1 | -0/+11
This testcase was fixed after r14-2135-gd915762ea9043da85 and there was no testcase for it before so adding one is a good thing. Committed as obvious after testing the testcase to make sure it works. gcc/testsuite/ChangeLog: PR tree-optimization/110444 * gcc.c-torture/compile/pr110444-1.c: New test.
2023-06-28 | Prevent TYPE_PRECISION on VECTOR_TYPEs | Richard Biener | 6 | -8/+10
The following makes sure that using TYPE_PRECISION on VECTOR_TYPE ICEs when tree checking is enabled. This should avoid wrong-code in cases like PR110182 and instead ICE. It also introduces a TYPE_PRECISION_RAW accessor and adjusts places I found that are eligible to use that. * tree.h (TYPE_PRECISION): Check for non-VECTOR_TYPE. (TYPE_PRECISION_RAW): Provide raw access to the precision field. * tree.cc (verify_type_variant): Compare TYPE_PRECISION_RAW. (gimple_canonical_types_compatible_p): Likewise. * tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream TYPE_PRECISION_RAW. * tree-streamer-in.cc (unpack_ts_type_common_value_fields): Likewise. * lto-streamer-out.cc (hash_tree): Hash TYPE_PRECISION_RAW. gcc/lto/ * lto-common.cc (compare_tree_sccs_1): Use TYPE_PRECISION_RAW.
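The checked-versus-raw accessor split can be illustrated with a toy type node. Everything below (`type_node`, `type_kind`, both accessor functions) is hypothetical and far simpler than GCC's tree representation; it only shows why a failing check is preferable to silently reading a field that has no scalar-precision meaning for vectors.

```cpp
#include <stdexcept>

// Toy stand-in for a type node; not GCC's tree structure.
enum class type_kind { integer, real, vector };

struct type_node {
  type_kind kind;
  unsigned precision_field;  // assumed not to be a scalar precision for vectors
};

// Raw accessor: returns the stored field for any kind of type, which is what
// streaming, hashing and variant comparison need.
inline unsigned type_precision_raw(const type_node& t)
{
  return t.precision_field;
}

// Checked accessor: fails loudly on vector types instead of returning a
// value that callers would misinterpret (the analogue of an ICE).
inline unsigned type_precision(const type_node& t)
{
  if (t.kind == type_kind::vector)
    throw std::logic_error("type_precision used on a vector type");
  return t.precision_field;
}
```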
2023-06-28 | c++: inherited constructor attributes | Jason Merrill | 4 | -1/+43
Inherited constructors are like constructor clones; they don't exist from the language perspective, so they should copy the attributes in the same way. But it doesn't make sense to copy alias or ifunc attributes in either case. Unlike handle_copy_attribute, we do want to copy inlining attributes. The discussion of PR110334 pointed out that we weren't copying the always_inline attribute, leading to poor inlining choices. PR c++/110334 gcc/cp/ChangeLog: * cp-tree.h (clone_attrs): Declare. * method.cc (implicitly_declare_fn): Use it for inherited constructor. * optimize.cc (clone_attrs): New. (maybe_clone_body): Use it. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/nodiscard-inh1.C: New test.
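A minimal sketch of the situation being described, with hypothetical `Base` and `Derived` types: the base constructor carries always_inline, and the constructor that `Derived` inherits is the one that should now receive the same attribute.

```cpp
struct Base {
  // always_inline on the base constructor; per the commit message, the
  // inherited constructor in Derived should copy it.
#if defined(__GNUC__)
  [[gnu::always_inline]]
#endif
  explicit Base(int v) : value(v) {}

  int value;
};

struct Derived : Base {
  using Base::Base;  // inherited constructor
};

// Constructing through the inherited constructor.
inline int make_and_read(int v)
{
  return Derived(v).value;
}
```

The program behaves the same with or without the fix; the difference is only in whether the inliner treats the inherited constructor as always_inline.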
2023-06-28 | Add leafy mode for zero-call-used-regs | Alexandre Oliva | 8 | -3/+103
Introduce 'leafy' to auto-select between 'used' and 'all' for leaf and nonleaf functions, respectively. for gcc/ChangeLog * doc/extend.texi (zero-call-used-regs): Document leafy and variants thereof. * flag-types.h (zero_regs_flags): Add LEAFY_MODE, as well as LEAFY and variants. * function.cc (gen_call_used_regs_seq): Set only_used for leaf functions in leafy mode. * opts.cc (zero_call_used_regs_opts): Add leafy and variants. for gcc/testsuite/ChangeLog * c-c++-common/zero-scratch-regs-leafy-1.c: New. * c-c++-common/zero-scratch-regs-leafy-2.c: New. * gcc.target/i386/zero-scratch-regs-leafy-1.c: New. * gcc.target/i386/zero-scratch-regs-leafy-2.c: New.
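The selection rule in the first sentence amounts to a one-line decision. The enum and function below are hypothetical names used only to spell that rule out; they do not mirror the flag encoding in flag-types.h.

```cpp
// The two existing behaviours that 'leafy' chooses between.
enum class zero_regs_choice { used, all };

// Leaf functions zero only the call-used registers they actually use;
// nonleaf functions zero all call-used registers.
constexpr zero_regs_choice leafy_choice(bool is_leaf_function)
{
  return is_leaf_function ? zero_regs_choice::used : zero_regs_choice::all;
}
```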
2023-06-28 | [testsuite] note pitfall in how outputs.exp sets gld | Alexandre Oliva | 1 | -1/+9
This patch documents a glitch in gcc.misc-tests/outputs.exp: it checks whether the linker is GNU ld, and uses that to decide whether to expect collect2 to create .ld1_args files under -save-temps, but collect2 bases that decision on whether HAVE_GNU_LD is set, which may be false even if the linker in use is GNU ld. Configuring --with-gnu-ld fixes this misalignment. Without that, atsave tests are likely to fail, because without HAVE_GNU_LD, collect2 won't use @file syntax to run the linker (so it won't create .ld1_args files). Long version: HAVE_GNU_LD is set when (i) DEFAULT_LINKER is set during configure, pointing at GNU ld; (ii) --with-gnu-ld is passed to configure; or (iii) config.gcc sets gnu_ld=yes. If a port doesn't set gnu_ld, and the toolchain isn't configured so as to assume GNU ld, configure and thus collect2 conservatively assume the linker doesn't support @file arguments. But outputs.exp can't see how configure set HAVE_GNU_LD (it may be used to test an installed compiler), and upon finding that the linker used by the compiler is GNU ld, it will expect collect2 to use @file arguments when running the linker. If that assumption doesn't hold, atsave tests will fail. for gcc/testsuite/ChangeLog * gcc.misc-tests/outputs.exp (gld): Note a known mismatch and record a workaround.
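The mismatch described in the long version can be modelled with a few booleans. The struct and function names below are hypothetical; the sketch only encodes the three conditions listed above and the case in which the test's expectation diverges from collect2's behaviour.

```cpp
// What configure knows, per conditions (i) to (iii) in the commit message.
struct configure_inputs {
  bool default_linker_is_gnu_ld;  // (i) DEFAULT_LINKER points at GNU ld
  bool with_gnu_ld;               // (ii) --with-gnu-ld was passed
  bool port_sets_gnu_ld;          // (iii) config.gcc sets gnu_ld=yes
};

// HAVE_GNU_LD is set when any of the three conditions holds.
inline bool have_gnu_ld(const configure_inputs& in)
{
  return in.default_linker_is_gnu_ld || in.with_gnu_ld || in.port_sets_gnu_ld;
}

// outputs.exp expects .ld1_args files whenever it detects GNU ld, while
// collect2 only creates them when HAVE_GNU_LD is set. The atsave tests are
// at risk when the first is true and the second is false.
inline bool atsave_tests_at_risk(const configure_inputs& in,
                                 bool detected_linker_is_gnu_ld)
{
  return detected_linker_is_gnu_ld && !have_gnu_ld(in);
}
```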
2023-06-27 | c++: C++26 constexpr cast from void* [PR110344] | Jason Merrill | 5 | -1/+665
P2768 allows static_cast from void* to ob* in constant evaluation if the pointer does in fact point to an object of the appropriate type. cxx_fold_indirect_ref already does the work of finding such an object if it happens to be a subobject rather than the outermost object at that address, as in constexpr-voidptr2.C. P2768 PR c++/110344 gcc/c-family/ChangeLog: * c-cppbuiltin.cc (c_cpp_builtins): Update __cpp_constexpr. gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_constant_expression): In C++26, allow cast from void* to the type of a pointed-to object. gcc/testsuite/ChangeLog: * g++.dg/cpp26/constexpr-voidptr1.C: New test. * g++.dg/cpp26/constexpr-voidptr2.C: New test. * g++.dg/cpp26/feat-cxx26.C: New test.
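A small sketch of the kind of code this enables. The function name and the `VOIDPTR_CONSTEXPR` macro are hypothetical, and the feature-test value 202306L is taken from the paper rather than from this commit message, so treat it as an assumption. The guard keeps the example compilable in older language modes, where the cast is only usable at run time.

```cpp
// Only mark the function constexpr when the compiler advertises support for
// casting from void* during constant evaluation.
#if defined(__cpp_constexpr) && __cpp_constexpr >= 202306L
#define VOIDPTR_CONSTEXPR constexpr
#else
#define VOIDPTR_CONSTEXPR inline
#endif

VOIDPTR_CONSTEXPR int read_through_void(const int& object)
{
  const void* erased = &object;             // int* -> void* was already allowed
  return *static_cast<const int*>(erased);  // void* -> int* is the new part
}

#if defined(__cpp_constexpr) && __cpp_constexpr >= 202306L
// The pointer really does point to an int, so this is a constant expression.
constexpr int forty_two = 42;
static_assert(read_through_void(forty_two) == 42,
              "cast from void* in constant evaluation");
#endif
```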
2023-06-27 | testsuite: std_list handling for { target c++26 } | Jason Merrill | 1 | -5/+5
As with c++23, we want to run { target c++26 } tests even though it isn't part of the default std_list. C++17 with Concepts TS is no longer an interesting target configuration. And bump the impcx target to use C++26 mode instead of 23. gcc/testsuite/ChangeLog: * lib/g++-dg.exp (g++-dg-runtest): Update for C++26.
2023-06-28 | RISC-V: Support floating-point vfwadd/vfwsub vv/wv combine lowering | Juzhe-Zhong | 16 | -26/+187
Currently, vfwadd.wv is the pattern with (set (reg) (float_extend:(reg))), which makes the combine pass fail to combine. Change the RTL format of vfwadd.wv to (set (float_extend:(reg)) (reg)) so that the combine pass can combine it. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Adapt expand. * config/riscv/vector.md (@pred_single_widen_<plus_minus:optab><mode>): Remove. (@pred_single_widen_add<mode>): New pattern. (@pred_single_widen_sub<mode>): New pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/widen/widen-1.c: Add floating-point. * gcc.target/riscv/rvv/autovec/widen/widen-2.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen-5.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen-6.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-1.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-2.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-5.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-6.c: New test.
2023-06-28 | i386: Fix mvc17.c test for default target clone under --with-arch | Hongyu Wang | 1 | -1/+1
For target clones, the default clone follows the default march so adjust the testcase to avoid test failure on --with-arch=native build. gcc/testsuite/ChangeLog: * gcc.target/i386/mvc17.c: Add -march=x86-64 to dg-options.
2023-06-28 | Issue a warning for conversion between short and __bf16 under TARGET_AVX512BF16. | liuhongt | 2 | -0/+49
__bfloat16 is redefined from typedef short to real __bf16 since GCC V13. The patch issues a warning for potential silent implicit conversion between __bf16 and short where users may only expect a data movement. To avoid too many false positives, the warning is only issued under TARGET_AVX512BF16. gcc/ChangeLog: * config/i386/i386.cc (ix86_invalid_conversion): New function. (TARGET_INVALID_CONVERSION): Define as ix86_invalid_conversion. gcc/testsuite/ChangeLog: * gcc.target/i386/bf16_short_warn.c: New test.
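The difference between "data movement" and a value conversion can be shown without the `__bf16` type itself. The sketch below emulates bfloat16 as the upper 16 bits of an IEEE binary32 float; both function names are hypothetical, and the code is target-independent rather than the AVX512BF16 case the warning covers.

```cpp
#include <cstdint>
#include <cstring>

// Data movement: reinterpret 16 bits as a bfloat16 value by placing them in
// the upper half of a 32-bit float.
inline float bf16_bits_to_float(std::uint16_t bits)
{
  std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float result;
  std::memcpy(&result, &widened, sizeof result);
  return result;
}

// Value conversion: treat the same 16 bits as a short integer and convert
// its numeric value to floating point.
inline float short_value_to_float(std::int16_t value)
{
  return static_cast<float>(value);
}
```

For the bit pattern 0x3F80 the first function yields 1.0 while the second yields 16256.0, which is the kind of silent change in meaning the warning is meant to flag.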
2023-06-28 | Daily bump. | GCC Administrator | 4 | -1/+303
2023-06-27 | RISC-V: Add autovect widening/narrowing Integer/FP conversions. | Robin Dapp | 22 | -0/+737
This patch implements widening and narrowing float-to-int and int-to-float conversions and adds tests. gcc/ChangeLog: * config/riscv/autovec.md (<optab><vnconvert><mode>2): New expander. (<float_cvt><vnconvert><mode>2): Ditto. (<optab><mode><vnconvert>2): Ditto. (<float_cvt><mode><vnconvert>2): Ditto. * config/riscv/vector-iterators.md: Add vnconvert. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-zvfh-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-zvfh-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-zvfh-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-zvfh-run.c: New test.
2023-06-27 | RISC-V: Add autovec FP widening/narrowing. | Robin Dapp | 12 | -2/+292
This patch adds FP widening and narrowing expanders as well as tests. Conceptually similar to integer extension/truncation, we emulate _Float16 -> double by two vfwcvts and double -> _Float16 by two vfncvts. gcc/ChangeLog: * config/riscv/autovec.md (extend<v_double_trunc><mode>2): New expander. (extend<v_quad_trunc><mode>2): Ditto. (trunc<mode><v_double_trunc>2): Ditto. (trunc<mode><v_quad_trunc>2): Ditto. * config/riscv/vector-iterators.md: Add VQEXTF and HF to V_QUAD_TRUNC and v_quad_trunc. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/conversions/vfncvt-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-zvfh-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-zvfh-run.c: New test.
2023-06-27 | RISC-V: Add autovec FP int->float conversion. | Robin Dapp | 15 | -15/+378
This patch adds the autovec expander for vfcvt.f.x.v and tests for it. gcc/ChangeLog: * config/riscv/autovec.md (<float_cvt><vconvert><mode>2): New expander. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-run.c: Adjust. * gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-rv64gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-template.h: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vncvt-template.h: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vsext-template.h: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vzext-template.h: Ditto. * gcc.target/riscv/rvv/autovec/zvfhmin-1.c: Add int/float conversions. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-zvfh-run.c: New test.
2023-06-27 | RISC-V: Implement autovec copysign. | Robin Dapp | 9 | -9/+377
This adds vector copysign, ncopysign and xorsign as well as the accompanying tests. gcc/ChangeLog: * config/riscv/autovec.md (copysign<mode>3): Add expander. (xorsign<mode>3): Ditto. * config/riscv/riscv-vector-builtins-bases.cc (class vfsgnjn): New class. * config/riscv/vector-iterators.md (copysign): Remove ncopysign. (xorsign): Ditto. (n): Ditto. (x): Ditto. * config/riscv/vector.md (@pred_ncopysign<mode>): Split off. (@pred_ncopysign<mode>_scalar): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/copysign-run.c: New test. * gcc.target/riscv/rvv/autovec/binop/copysign-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/binop/copysign-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/binop/copysign-template.h: New test. * gcc.target/riscv/rvv/autovec/binop/copysign-zvfh-run.c: New test.
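For readers unfamiliar with the three operations, here is a scalar model of their semantics. The `_model` function names are hypothetical, and the vector expanders operate element-wise on whole vectors rather than on single doubles.

```cpp
#include <cmath>

// copysign: magnitude of x, sign of y.
inline double copysign_model(double x, double y)
{
  return std::copysign(x, y);
}

// ncopysign: magnitude of x, negated sign of y.
inline double ncopysign_model(double x, double y)
{
  return std::copysign(x, -y);
}

// xorsign: x with its sign flipped when y is negative, i.e. x * copysign(1, y).
inline double xorsign_model(double x, double y)
{
  return x * std::copysign(1.0, y);
}
```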
2023-06-27 | RISC-V: Split VF iterators for Zvfh(min). | Robin Dapp | 3 | -111/+128
When working on FP widening/narrowing I realized the Zvfhmin handling is not ideal right now: We use the "enabled" insn attribute to disable instructions not available with Zvfhmin but only with Zvfh. However, "enabled == 0" only disables insn alternatives, in our case all of them when the mode is a HFmode. The insn itself remains available (e.g. for combine to match) and we end up with an insn without alternatives that reload cannot handle --> ICE. The proper solution is to disable the instruction for the respective mode altogether. This patch achieves this by splitting the VF as well as VWEXTF iterators into variants with TARGET_ZVFH and TARGET_VECTOR_ELEN_FP_16 (which is true when either TARGET_ZVFH or TARGET_ZVFHMIN are true). Also, VWCONVERTI, VHF and VHF_LMUL1 need adjustments. gcc/ChangeLog: * config/riscv/autovec.md: VF_AUTO -> VF. * config/riscv/vector-iterators.md: Introduce VF_ZVFHMIN, VWEXTF_ZVFHMIN and use TARGET_ZVFH in VWCONVERTI, VHF and VHF_LMUL1. * config/riscv/vector.md: Use new iterators.
2023-06-27 | match.pd: Use element_mode instead of TYPE_MODE. | Robin Dapp | 1 | -2/+4
This patch changes TYPE_MODE into element_mode in a match.pd simplification. As the simplification can be also called with vector types real_can_shorten_arithmetic would ICE in REAL_MODE_FORMAT which expects a scalar mode. Therefore, use element_mode instead of TYPE_MODE. Additionally, check if the target supports the resulting operation. One target that supports e.g. a float addition but not a _Float16 addition is the RISC-V vector extension Zvfhmin. gcc/ChangeLog: * match.pd: Use element_mode and check if target supports operation with new type.
2023-06-28 | [SVE] Fold svdupq to VEC_PERM_EXPR if elements are not constant. | Prathamesh Kulkarni | 2 | -1/+78
gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svdupq_impl::fold_nonconst_dupq): New method. (svdupq_impl::fold): Call fold_nonconst_dupq. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/acle/general/dupq_11.c: New test.
2023-06-27 | Mark asm goto with outputs as volatile | Andrew Pinski | 2 | -1/+32
The manual references asm goto as being implicitly volatile already, and that was done when asm goto could not have outputs. When outputs were added to `asm goto`, only asm goto without outputs was still being marked as volatile. Now some parts of GCC decide that removing the `asm goto` is OK if the output is not used, though without updating the CFG (this happens on both the RTL level and the gimple level). Since the biggest user of `asm goto` is the Linux kernel and they expect them to be volatile (they use them to copy to/from userspace), we should just mark the inline-asm as volatile. OK? Bootstrapped and tested on x86_64-linux-gnu. PR middle-end/110420 PR middle-end/103979 PR middle-end/98619 gcc/ChangeLog: * gimplify.cc (gimplify_asm_expr): Mark asm with labels as volatile. gcc/testsuite/ChangeLog: * gcc.c-torture/compile/asmgoto-6.c: New test.
2023-06-27 | ada: Fix build of GNAT tools | Eric Botcazou | 1 | -3/+6
gcc/ada/ * gcc-interface/Makefile.in (LIBIBERTY): Fix condition. (TOOLS_LIBS): Add @LD_PICFLAG@.
2023-06-27 | ada: Fix bad interaction between inlining and thunk generation | Eric Botcazou | 1 | -3/+6
This may cause the type of the RESULT_DECL of a function which returns by invisible reference to be turned into a reference type twice. gcc/ada/ * gcc-interface/trans.cc (Subprogram_Body_to_gnu): Add guard to the code turning the type of the RESULT_DECL into a reference type. (maybe_make_gnu_thunk): Use a more precise guard in the same case.
2023-06-27 | ada: Make the identification of case expressions more robust | Eric Botcazou | 1 | -5/+3
gcc/ada/ * gcc-interface/trans.cc (Case_Statement_to_gnu): Rename boolean constant and use From_Conditional_Expression flag for its value.
2023-06-27 | ada: Fix double finalization of case expression in concatenation | Eric Botcazou | 4 | -72/+23
This streamlines the expansion of case expressions by not wrapping them in an Expression_With_Actions node when the type is not by copy, which avoids the creation of a temporary and the associated finalization issues. That's the same strategy as the one used for the expansion of if expressions when the type is by reference, unless Back_End_Handles_Limited_Types is set to True. Given that it is never set to True, except by a debug switch, and has never been implemented, this parameter is removed in the process. gcc/ada/ * debug.adb (d.L): Remove documentation. * exp_ch4.adb (Expand_N_Case_Expression): In the not-by-copy case, do not wrap the case statement in an Expression_With_Actions node. (Expand_N_If_Expression): Do not test Back_End_Handles_Limited_Types. * gnat1drv.adb (Adjust_Global_Switches): Do not set it. * opt.ads (Back_End_Handles_Limited_Types): Delete.
2023-06-27 | ada: Fix incorrect handling of iterator specifications in recent change | Eric Botcazou | 1 | -7/+11
Unlike for loop parameter specifications where it references an index, the defining identifier references an element in them. gcc/ada/ * sem_ch12.adb (Check_Generic_Actuals): Check the component type of constants and variables of an array type. (Copy_Generic_Node): Fix bogus handling of iterator specifications.
2023-06-27 | ada: Correct the contract of Ada.Text_IO.Get_Line | Claire Dross | 1 | -9/+13
Item might not be entirely initialized after a call to Get_Line. gcc/ada/ * libgnat/a-textio.ads (Get_Line): Use Relaxed_Initialization on the Item parameter of Get_Line.
2023-06-27 | ada: Fix too late finalization and secondary stack release in iterator loops | Eric Botcazou | 2 | -31/+14
Sem_Ch5 contains an entire machinery to deal with finalization actions and secondary stack releases around iterator loops, so this removes a recent fix that was made in a narrower case and instead refines the condition under which this machinery is triggered. As a side effect, given that finalization and secondary stack management are still entangled in this machinery, this also fixes the counterpart of a leak for the former, which is a finalization occurring too late. gcc/ada/ * exp_ch4.adb (Expand_N_Quantified_Expression): Revert the latest change as it is subsumed by the machinery in Sem_Ch5. * sem_ch5.adb (Prepare_Iterator_Loop): Also wrap the loop statement in a block if the name contains a function call that returns on the secondary stack.
2023-06-27ada: Plug small loophole in the handling of private views in instancesEric Botcazou1-7/+39
This deals with nested instantiations in package bodies. gcc/ada/ * sem_ch12.adb (Scope_Within_Body_Or_Same): New predicate. (Check_Actual_Type): Take into account packages nested in bodies to compute the enclosing scope by means of Scope_Within_Body_Or_Same.
2023-06-27ada: Plug another loophole in the handling of private views in instancesEric Botcazou1-0/+17
This deals with discriminants of types declared in package bodies. gcc/ada/ * sem_ch12.adb (Check_Private_View): Also check the type of visible discriminants in record and concurrent types.
2023-06-27ada: Update printing container aggregates for debuggingViljar Indus1-2/+4
All N_Aggregate nodes were printed with parentheses "()". However the new container aggregates (homogeneous N_Aggregate nodes) should be printed with brackets "[]". gcc/ada/ * sprint.adb (Print_Node_Actual): Print homogeneous N_Aggregate nodes with brackets.
2023-06-27ada: Fix expanding container aggregatesViljar Indus1-0/+1
Ensure that container aggregate expressions are expanded as such and not as records even if the type of the expression is a record. gcc/ada/ * exp_aggr.adb (Expand_N_Aggregate): Ensure that container aggregate expressions do not get expanded as records but instead as container aggregates.
2023-06-27Convert remaining uses of value_range in ipa-*.cc to Value_Range.Aldy Hernandez3-15/+19
Minor cleanups to get rid of value_range in IPA. There's only one left, but it's in the switch code which is integer specific. gcc/ChangeLog: * ipa-cp.cc (decide_whether_version_node): Adjust comment. * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Adjust for Value_Range. (set_switch_stmt_execution_predicate): Same. * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same.
2023-06-27Implement ipa_vr hashing.Aldy Hernandez2-46/+45
Implement hashing for ipa_vr. When all is said and done, all these patches incur a 7.64% slowdown for ipa-cp, which is entirely covered by the similar 7% increase in this area last week. So we get type agnostic ranges with "infinite" range precision close to free. There is no change in overall compilation. gcc/ChangeLog: * ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Adjust for use with ipa_vr instead of value_range. (gt_pch_nx): Same. (gt_ggc_mx): Same. (ipa_get_value_range): Same. * value-range.cc (gt_pch_nx): Move to ipa-prop.cc and adjust for ipa_vr. (gt_ggc_mx): Same.
2023-06-27Convert ipa_jump_func to use ipa_vr instead of a value_range.Aldy Hernandez3-46/+44
This patch converts the ipa_jump_func code to use the type agnostic ipa_vr suitable for GC instead of value_range which is integer specific. I've disabled the range caching to simplify the patch for review, but it is handled in the next patch in the series. gcc/ChangeLog: * ipa-cp.cc (ipa_vr_operation_and_type_effects): New. * ipa-prop.cc (ipa_get_value_range): Adjust for ipa_vr. (ipa_set_jfunc_vr): Take a range. (ipa_compute_jump_functions_for_edge): Pass range to ipa_set_jfunc_vr. (ipa_write_jump_function): Call streamer write helper. (ipa_read_jump_function): Call streamer read helper. * ipa-prop.h (class ipa_vr): Change m_vr to an ipa_vr.
2023-06-27gengtype: Handle braced initialisers in structsRichard Sandiford1-0/+6
I have a patch that adds braced initialisers to a GTY structure. gengtype didn't accept that, because it parsed the "{ ... }" in " = { ... };" as the end of a statement (as "{ ... }" would be in a function definition) and so it didn't expect the following ";". This patch explicitly handles initialiser-like sequences. Arguably, the parser should also skip redundant ";", but that feels more like a workaround rather than the real fix. gcc/ * gengtype-parse.cc (consume_until_comma_or_eos): Parse "= { ... }" as a probable initializer rather than a probable complete statement.
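The idea of treating "= { ... }" as an initializer rather than as the end of a statement can be sketched roughly as below. This is an illustration only, not gengtype's code: end_of_declarator is a hypothetical helper that works on raw text, whereas the real consume_until_comma_or_eos works on gengtype's token stream.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of gengtype): return the index of the ','
// or ';' that ends the declarator starting at `pos`, or npos if there is
// none.  Bracket depth is tracked so that a braced initializer such as
// "= { 0, 1 }" is skipped as a unit instead of ending the statement.
std::size_t
end_of_declarator (const std::string &text, std::size_t pos = 0)
{
  int depth = 0;
  for (std::size_t i = pos; i < text.size (); ++i)
    {
      char c = text[i];
      if (c == '{' || c == '(' || c == '[')
        ++depth;
      else if (c == '}' || c == ')' || c == ']')
        --depth;
      else if (depth == 0 && (c == ',' || c == ';'))
        return i;
    }
  return std::string::npos;
}
```

With this sketch, the comma inside the braces of "int a[2] = { 0, 1 };" is ignored and the trailing ";" is reported as the end of the declarator.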
2023-06-27tree-optimization/96208 - SLP of non-grouped loadsRichard Biener4-70/+127
The following extends SLP discovery to handle non-grouped loads in loop vectorization in the case the same load appears in all lanes. Code generation is adjusted to mimic what we do for the case of single element interleaving (when the load is not unit-stride) which is already handled by SLP. There are some limits we run into because peeling for gap cannot cover all cases and we choose VMAT_CONTIGUOUS. The patch does not try to address these issues yet. The main obstacle is that these loads are not STMT_VINFO_GROUPED_ACCESS and that's a new thing with SLP. I know from the past that it's not a good idea to make them grouped. Instead the following massages places to deal with SLP loads that are not STMT_VINFO_GROUPED_ACCESS. There's already a testcase testing for the case the PR is after, just XFAILed, the following adjusts that instead of adding another. I do expect to have missed some so I don't plan to push this on a Friday. Still there may be feedback, so posting this now. Bootstrapped and tested on x86_64-unknown-linux-gnu. PR tree-optimization/96208 * tree-vect-slp.cc (vect_build_slp_tree_1): Allow a non-grouped load if it is the same for all lanes. (vect_build_slp_tree_2): Handle not grouped loads. (vect_optimize_slp_pass::remove_redundant_permutations): Likewise. (vect_transform_slp_perm_load_1): Likewise. * tree-vect-stmts.cc (vect_model_load_cost): Likewise. (get_group_load_store_type): Likewise. Handle invariant accesses. (vectorizable_load): Likewise. * gcc.dg/vect/slp-46.c: Adjust for new vectorizations. * gcc.dg/vect/bb-slp-pr65935.c: Adjust.
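A loop of the kind this change targets might look like the sketch below. The function name and signature are assumptions made for illustration; this is not the adjusted testcase, and whether a given GCC version and target actually vectorizes it is not claimed here.

```cpp
#include <cstddef>

// Illustrative only: the two stores of one iteration form the two SLP
// lanes, and both lanes read the same element in[i].  That load is a
// plain unit-stride access, so it is not part of a grouped access.
void
duplicate_lanes (double *out, const double *in, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      out[2 * i] = in[i];
      out[2 * i + 1] = in[i];
    }
}
```

The stores to out[2*i] and out[2*i+1] are a grouped access, while the load from in[i] is the "same load in all lanes" case described above.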
2023-06-27Refine maskstore patterns with UNSPEC_MASKMOV.liuhongt1-12/+57
Similar to r14-2070-gc79476da46728e. If mem_addr points to a memory region with less than whole vector size bytes of accessible memory and k is a mask that would prevent reading the inaccessible bytes from mem_addr, add UNSPEC_MASKMOV to prevent it from being transformed into any other whole memory access instruction. gcc/ChangeLog: PR rtl-optimization/110237 * config/i386/sse.md (<avx512>_store<mode>_mask): Refine with UNSPEC_MASKMOV. (maskstore<mode><avx512fmaskmodelower>): Ditto. (*<avx512>_store<mode>_mask): New define_insn, it's renamed from original <avx512>_store<mode>_mask.